diff --git a/CHANGELOG.md b/CHANGELOG.md index 7d77bbe..9058451 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -7,14 +7,20 @@ Version numbers follow [Semantic Versioning](https://semver.org/spec/v2.0.0.html --- -## [Unreleased] +## [1.6.21] — 2026-04-20 ### Added +- **Local-file scan test fixtures** — `tests/fixtures/local_files/` contains 13 ready-made files (`.txt`, `.csv`, `.docx`, `.xlsx`) covering every detection scenario: CPR with explicit label, mod-11–valid CPR without label, post-2007 CPR with/without context keyword, protected number (day+40), multiple CPRs in one file, mixed PII (CPR + email + Art. 9 health data), and three true-negative cases (clean content, invoice false-positive, post-2007 serial number without context). All CPR numbers are mathematically valid; false-positive fixtures are verified to produce zero hits. Run `generate_fixtures.py` to regenerate the binary files. + - **Interface PIN** — optional session-level authentication gate for the main scanner interface. Set a 4–8 digit PIN in **Settings → Security → Interface PIN**; anyone reaching `http://host:5100` is redirected to `/login` and must enter the PIN before accessing scan controls, settings, or results. Viewer tokens and the `/view` route are completely unaffected — reviewers continue to use their own auth chain. The PIN is stored as a salted SHA-256 hash in `config.json`. Brute-force protection: 5 failed attempts per IP locks out for 5 minutes. A `POST /api/interface/logout` endpoint clears the session. PIN management via `GET/POST/DELETE /api/interface/pin`. ### Fixed +- **"Vælg" (select mode) button did nothing** — `toggleSelectMode`, `toggleCardSelect`, `selectAllVisible`, and `applyBulkDisposition` were defined inside an ES module but never assigned to `window`, so all `onclick` attributes calling them silently failed. Added the four missing `window.*` exports at the bottom of `results.js`. 
+ +- **Progress counter frozen at M365 total during Google/file scan** — the `scan_progress` handler in `scan.js` only updated `progressStats` and `progressEta` for `source === "m365"`. When M365 finished first, the counter stayed at its final value (e.g. "15083 / 15083 ETA 0s") for the entire duration of the Google and file scans. Fixed in two places: `scan_done` now clears the stats/ETA elements immediately when another scan is still running; `scan_progress` for Google/file sources now shows a running `"X scanned"` count (using the `scanned` field those engines already send) and clears ETA, but only while M365 is not running — M365 stats continue to dominate during concurrent scans. + - **PDF OCR kills process on large files** — `document_scanner` previously called `convert_from_path()` once for the entire PDF before the processing loop, allocating all page images in memory simultaneously. A 50-page A4 PDF at 300 DPI required ~1.3 GB in a single allocation, triggering the OS OOM killer. Fixed by rendering one page at a time with `convert_from_path(first_page=N, last_page=N)` inside the loop across `scan_pdf`, `redact_fitz_pdf`, and `redact_pdf`. Peak OCR memory is now bounded to roughly one page (~26 MB at 300 DPI) regardless of document length. - **No bulk disposition tagging** — each result card had to be opened individually to set a disposition. Added a Select mode (filter bar "Vælg" button) that reveals per-card checkboxes. Selecting one or more items shows a bulk tag bar at the bottom of the grid with a disposition dropdown and Apply button. Calls `POST /api/db/disposition/bulk`; updates all selected items in-memory and clears the selection. "Select all visible" / "Deselect all" toggle available in the bar. Hidden in viewer mode. diff --git a/CLAUDE.md b/CLAUDE.md index 61df833..1ad10d0 100644 --- a/CLAUDE.md +++ b/CLAUDE.md @@ -44,6 +44,10 @@ python -m pytest tests/ -q 128 tests in `tests/`. 
No integration tests for Flask routes or live M365/Google connections. +**Local-file scan fixtures** — `tests/fixtures/local_files/` holds 13 documents for manual/UI-level testing of the file scanner. 10 should be flagged; 3 are true negatives. All CPR numbers verified against `is_valid_cpr`. `generate_fixtures.py` (requires `python-docx` + `openpyxl`, already in venv) regenerates the binary `.docx`/`.xlsx` files. + +**`_CPR_PREFIX_NOISE` in `.docx` fixtures** — `scan_docx` builds a single string by concatenating all run texts with no separators between paragraphs. If a CPR value run is immediately followed by text from the next paragraph without a word boundary, `\b` in `CPR_PATTERN` fails and the number is silently missed. The fixture generator appends a trailing `" "` to every value run so CPRs are always surrounded by word boundaries after concatenation. Do not remove this trailing space — the detection will silently regress. + ## Viewer mode (#33) — routes/viewer.py + static/js/viewer.js Read-only access for DPOs and reviewers. Key invariants: diff --git a/README.md b/README.md index dffa78f..e430e46 100644 --- a/README.md +++ b/README.md @@ -609,6 +609,28 @@ Each new module (`cpr_detector.py`, `app_config.py`, `checkpoint.py`, `gdpr_db.p The test suite should be run before every release and after any change to `document_scanner.py`, `cpr_detector.py`, or `gdpr_db.py`. CPR detection is the legal core of the tool — a false negative means a real GDPR violation goes undetected. +#### Local-file scan fixtures + +`tests/fixtures/local_files/` provides 13 hand-crafted documents for end-to-end testing of the file scanner via the UI or `file_scanner.py`. Drop the folder as a local source and run a scan — all 10 PII-bearing files should be flagged and all 3 negative-case files should produce zero hits. 
+ +| File | Format | Expected | Scenario | +|---|---|---|---| +| `01_cpr_with_context_label.txt` | TXT | Flag | CPR with explicit `CPR-nummer:` label | +| `02_cpr_mod11_valid_bare.txt` | TXT | Flag | mod-11–valid CPR without any context keyword | +| `03_cpr_post2007_with_context.txt` | TXT | Flag | Post-2007 birth (fails mod-11), detected via `Personnummer:` keyword | +| `04_multiple_cprs.txt` | TXT | Flag | 3 distinct CPR numbers in one staff-records file | +| `05_student_register.csv` | CSV | Flag | 8 students incl. one protected-address (day+40) CPR | +| `06_employee_list.csv` | CSV | Flag | 5 employees with CPRs | +| `07_protected_number.txt` | TXT | Flag | Protected CPR (`410172-1200`, day+40 encoding) | +| `08_mixed_pii.txt` | TXT | Flag | CPR + email + phone + GDPR Art. 9 health category | +| `09_cpr_in_docx.docx` | DOCX | Flag | 2 CPRs in a Word document (paragraph format) | +| `10_clean_no_pii.txt` | TXT | **No flag** | Meeting minutes — no personal data | +| `11_false_positive_invoice.txt` | TXT | **No flag** | Invoice: CPR-shaped numbers suppressed by `faktura`/`varenr` context | +| `12_post2007_no_context.txt` | TXT | **No flag** | Equipment serial that looks like a post-2007 CPR but has no context keyword | +| `13_cpr_in_xlsx.xlsx` | XLSX | Flag | Excel workbook with two sheets: students + employees | + +All CPR numbers are mathematically valid (verified against `is_valid_cpr`). Run `generate_fixtures.py` inside the venv to regenerate the `.docx` and `.xlsx` binary files after any changes. + ### Roadmap See [SUGGESTIONS.md](SUGGESTIONS.md) for the full feature roadmap with implementation status. 
diff --git a/VERSION b/VERSION index c45801e..49e1fe3 100644 --- a/VERSION +++ b/VERSION @@ -1 +1 @@ -1.6.20 +1.6.21 diff --git a/static/js/results.js b/static/js/results.js index 17c63c7..11e9dc9 100644 --- a/static/js/results.js +++ b/static/js/results.js @@ -1016,6 +1016,10 @@ window._autoConnectSSEIfRunning = _autoConnectSSEIfRunning; window._loadViewerResults = _loadViewerResults; window.executeBulkDelete = executeBulkDelete; window.applyFilters = applyFilters; +window.toggleSelectMode = toggleSelectMode; +window.toggleCardSelect = toggleCardSelect; +window.selectAllVisible = selectAllVisible; +window.applyBulkDisposition = applyBulkDisposition; window.exportExcel = exportExcel; window.exportArticle30 = exportArticle30; window.clearFilters = clearFilters; diff --git a/static/js/scan.js b/static/js/scan.js index 8831095..deb0411 100644 --- a/static/js/scan.js +++ b/static/js/scan.js @@ -320,16 +320,16 @@ function _attachScanListeners(source) { var fill = document.getElementById('progressFill_' + src); if (fill) fill.style.width = pct + '%'; document.getElementById('progressFile').textContent = d.file || ''; - // Only update stats/ETA from M365 (has meaningful totals and ETA) + var statsEl = document.getElementById('progressStats'); + var etaEl = document.getElementById('progressEta'); if (src === 'm365') { - var statsEl = document.getElementById('progressStats'); - if (statsEl && d.total) { - statsEl.textContent = (d.index || 0) + ' / ' + d.total; - } - var etaEl = document.getElementById('progressEta'); - if (etaEl && d.eta !== undefined) { - etaEl.textContent = d.eta ? ('ETA ' + d.eta) : ''; - } + // M365 sends index + total + ETA — show exact counter + if (statsEl && d.total) statsEl.textContent = (d.index || 0) + ' / ' + d.total; + if (etaEl && d.eta !== undefined) etaEl.textContent = d.eta ? 
('ETA ' + d.eta) : ''; + } else if (!S._m365ScanRunning) { + // Google / file: no total known upfront — show running count once M365 is done + if (statsEl && d.scanned !== undefined) statsEl.textContent = d.scanned + ' scanned'; + if (etaEl) etaEl.textContent = ''; } }); source.addEventListener('scan_file', function(e) { @@ -369,6 +369,13 @@ function _attachScanListeners(source) { S._m365ScanRunning = false; _renderProgressSegments(); var _anyRunning = S._googleScanRunning || S._fileScanRunning; + // Clear M365 counter/ETA so Google/file progress can take over the display + if (_anyRunning) { + var _se = document.getElementById('progressStats'); + var _ee = document.getElementById('progressEta'); + if (_se) _se.textContent = ''; + if (_ee) _ee.textContent = ''; + } // Only close SSE once all concurrent scans have finished. // Closing early would drop google_scan_done / file_scan_done events and // leave the UI stuck in scanning state. diff --git a/tests/fixtures/local_files/01_cpr_with_context_label.txt b/tests/fixtures/local_files/01_cpr_with_context_label.txt new file mode 100644 index 0000000..23fb9da --- /dev/null +++ b/tests/fixtures/local_files/01_cpr_with_context_label.txt @@ -0,0 +1,19 @@ +Personoplysninger — Elevakt +=========================== + +Elevens navn: Lars Bjerregaard Nielsen +Klasse: 8B +Skole: Gudenaaskolen + +CPR-nummer: 010172-1019 +Fødselsdato: 1. januar 1972 +Adresse: Skolevej 14, 8680 Ry +Telefon: +45 86 89 12 34 +E-mail: lars.nielsen@privat.dk + +Notater: +Eleven har haft fravær i uge 12 og 14. Forældrene er kontaktet. +Der er afholdt møde den 3. marts 2024 med klasselærer og skoleleder. 
+ +Underskrift: _______________________ +Dato: ___________________ diff --git a/tests/fixtures/local_files/02_cpr_mod11_valid_bare.txt b/tests/fixtures/local_files/02_cpr_mod11_valid_bare.txt new file mode 100644 index 0000000..7f24f9e --- /dev/null +++ b/tests/fixtures/local_files/02_cpr_mod11_valid_bare.txt @@ -0,0 +1,15 @@ +Besøgslog — Sundhedscenter Skanderborg +======================================= + +Dato: 28. april 2024 +Sagsbehandler: M. Andersen + +Borger: Hanne Kirstine Pedersen +Registreringsnummer: 280490-0120 +Henvendelse vedrørende: Sygedagpenge, paragraf 7 opfølgning + +Samtalen fandt sted kl. 10:15 og varede 45 minutter. +Borger mødte op til tiden og var forberedt. + +Aftale om næste møde: 26. maj 2024 kl. 10:00 +Sted: Mødelokale 3, Adelgade 44, 8660 Skanderborg diff --git a/tests/fixtures/local_files/03_cpr_post2007_with_context.txt b/tests/fixtures/local_files/03_cpr_post2007_with_context.txt new file mode 100644 index 0000000..448d4a0 --- /dev/null +++ b/tests/fixtures/local_files/03_cpr_post2007_with_context.txt @@ -0,0 +1,24 @@ +Tilmelding til SFO — Gudenaaskolen +=================================== + +Barnets navn: Emma Sofie Christensen +Personnummer: 150315-4321 +Klasse: 1A (skolestart august 2022) + +Forældrenes oplysninger +----------------------- +Forældrenes navn: Søren og Pia Christensen +Adresse: Birkevej 7, 8680 Ry +Telefon: +45 23 45 67 89 +E-mail: soeren.christensen@familie.dk + +Fremmødetider valgt: + Morgen-SFO: 07:00–08:00 + Eftermiddag: 13:00–17:00 + +Særlige oplysninger til pædagoger: +Emma har en lettere nøddeallergi (jordnødder og cashewnødder). +Kontaktperson ved allergi: Pia Christensen, tlf. 23 45 67 89 + +Dato for tilmelding: 15. 
marts 2022 +Underskrift: _______________________ diff --git a/tests/fixtures/local_files/04_multiple_cprs.txt b/tests/fixtures/local_files/04_multiple_cprs.txt new file mode 100644 index 0000000..eca88ec --- /dev/null +++ b/tests/fixtures/local_files/04_multiple_cprs.txt @@ -0,0 +1,31 @@ +Personalemappe — Fortroligt +============================ +Afdeling: Administrationen, Skanderborg Kommune + +Medarbejder 1 +------------- +Navn: Christian Bøgh Hansen +CPR: 150365-1102 +Stilling: Skoleleder +Ansættelsesdato: 1. august 2005 +Løngruppe: L4 + +Medarbejder 2 +------------- +Navn: Lise Ravn Johansen +CPR: 020898-0203 +Stilling: Pædagog, fuldtid +Ansættelsesdato: 15. september 2021 +Løngruppe: L2 + +Medarbejder 3 +------------- +Navn: Anders Munk Mortensen +CPR: 010172-1019 +Stilling: Administrativ medarbejder +Ansættelsesdato: 1. marts 2010 +Løngruppe: L3 + +Dokument oprettet: 20. april 2026 +Sidst opdateret: 20. april 2026 +Udarbejdet af: HR-afdelingen diff --git a/tests/fixtures/local_files/05_student_register.csv b/tests/fixtures/local_files/05_student_register.csv new file mode 100644 index 0000000..8913ce6 --- /dev/null +++ b/tests/fixtures/local_files/05_student_register.csv @@ -0,0 +1,9 @@ +Klasse,Navn,CPR-nummer,Adresse,Forælder tlf,Bemærkninger +7A,Magnus Lund Eriksen,010172-1019,Egevej 3 8680 Ry,+45 40 12 34 56, +7A,Nora Bjerrum Nielsen,280490-0120,Møllevej 11 8680 Ry,+45 50 23 45 67,Brillebærer +7A,Oliver Skov Madsen,250372-0100,Kirkegade 2 8660 Skanderborg,+45 60 34 56 78, +7A,Ida Holst Andersen,020898-0203,Skovbrynet 19 8680 Ry,+45 70 45 67 89,Kontaktperson: Far +7B,Rasmus Dal Kristensen,150365-1102,Rosenvej 5 8680 Ry,+45 21 56 78 90, +7B,Sofie Holm Thomsen,111111-1010,Birkevej 22 8660 Skanderborg,+45 31 67 89 01,Allergi: nødder +7B,Emil Sand Jensen,010107-4102,Hybenvej 7 8680 Ry,+45 41 78 90 12, +7B,Laura Bak Møller,410172-1200,Pilevej 4 8660 Skanderborg,+45 51 89 01 23,Beskyttet adresse diff --git a/tests/fixtures/local_files/06_employee_list.csv 
b/tests/fixtures/local_files/06_employee_list.csv new file mode 100644 index 0000000..ab1d9b5 --- /dev/null +++ b/tests/fixtures/local_files/06_employee_list.csv @@ -0,0 +1,6 @@ +Medarbejder-ID,Navn,Personnummer,Afdeling,Stilling,E-mail,Telefon,Ansættelses-dato +EMP-001,Christian Bøgh Hansen,150365-1102,Ledelse,Skoleleder,c.hansen@gudenaaskolen.dk,+45 86 89 10 01,2005-08-01 +EMP-002,Mette Dahl Andersen,280490-0120,Administration,Sekretær,m.andersen@gudenaaskolen.dk,+45 86 89 10 02,2012-01-15 +EMP-003,Søren Lykke Jakobsen,010172-1019,Pædagogik,Lærer,s.jakobsen@gudenaaskolen.dk,+45 86 89 10 03,2009-08-01 +EMP-004,Hanne Frost Pedersen,250372-0100,Pædagogik,Lærer,h.pedersen@gudenaaskolen.dk,+45 86 89 10 04,2015-08-01 +EMP-005,Lise Ravn Johansen,020898-0203,SFO,Pædagog,l.johansen@gudenaaskolen.dk,+45 86 89 10 05,2021-09-15 diff --git a/tests/fixtures/local_files/07_protected_number.txt b/tests/fixtures/local_files/07_protected_number.txt new file mode 100644 index 0000000..b001c80 --- /dev/null +++ b/tests/fixtures/local_files/07_protected_number.txt @@ -0,0 +1,16 @@ +Fortrolig personoplysning — Navne- og adressebeskyttelse +========================================================== + +VIGTIGT: Denne person har navne- og adressebeskyttelse i CPR-registeret. +Oplysningerne må ikke videregives uden samtykke. + +Navn: Laura Bak Møller +CPR-nummer: 410172-1200 + (Dag + 40 angiver beskyttet adresse) + +Kontaktoplysninger administreres af kommunen. +Henvendelse via: Borgerservice, Skanderborg Kommune +Telefon: 86 52 10 00 + +Dokumentet er klassificeret FORTROLIGT. +Opbevares i aflåst arkiv — ikke i fællesnetværk. 
diff --git a/tests/fixtures/local_files/08_mixed_pii.txt b/tests/fixtures/local_files/08_mixed_pii.txt new file mode 100644 index 0000000..aa1b9f8 --- /dev/null +++ b/tests/fixtures/local_files/08_mixed_pii.txt @@ -0,0 +1,21 @@ +Lægeerklæring — Helbredsattest +================================ +Udstedt af: Skanderborg Lægepraksis, Adelgade 10, 8660 Skanderborg +Praktiserende læge: Dr. P. Holm + +Patient: Søren Lykke Jakobsen +Fødselsdato / CPR: 010172-1019 +Adresse: Skolevej 22, 8680 Ry +Telefon: +45 22 33 44 55 +E-mail: soeren.jakobsen@privat.dk + +Diagnose (ICD-10): F41.1 — Generaliseret angst +Behandling: Psykoterapi + medicinsk behandling (SSRI) +Særlig kategori: Psykisk lidelse — GDPR Art. 9 + +Erklæringens formål: Sygedagpenge, §7-opfølgning +Periode: 1. april 2026 – 30. juni 2026 + +Lægens underskrift: _______________________ +Dato: 20. april 2026 +Stempel: [Skanderborg Lægepraksis] diff --git a/tests/fixtures/local_files/09_cpr_in_docx.docx b/tests/fixtures/local_files/09_cpr_in_docx.docx new file mode 100644 index 0000000..43c66cf Binary files /dev/null and b/tests/fixtures/local_files/09_cpr_in_docx.docx differ diff --git a/tests/fixtures/local_files/10_clean_no_pii.txt b/tests/fixtures/local_files/10_clean_no_pii.txt new file mode 100644 index 0000000..9990fc0 --- /dev/null +++ b/tests/fixtures/local_files/10_clean_no_pii.txt @@ -0,0 +1,25 @@ +Mødereferat — Pædagogisk råd +============================== +Dato: 20. april 2026 +Sted: Personalerummet, Gudenaaskolen +Ordstyrer: Skolelederen +Referent: Administrationen + +Dagsorden: +1. Godkendelse af referat fra seneste møde +2. Orientering om skoleårets planlægning 2026/2027 +3. Status på inklusion og trivselsundersøgelse +4. Eventuelt + +Ad 1: Referatet fra mødet den 15. marts 2026 blev godkendt uden bemærkninger. + +Ad 2: Skolelederen orienterede om planlægningen for det kommende skoleår. +Skemaerne for 0.-9. klasse offentliggøres i Aula senest 1. juni 2026. 
+Der er planlagt en fælles pædagogisk dag den 10. august 2026. + +Ad 3: Trivselsundersøgelsen viste generelt gode resultater. +Inklusionsvejlederen præsenterer en handlingsplan på næste møde. + +Ad 4: Intet til eventuelt. + +Næste møde: Tirsdag den 19. maj 2026 kl. 14:00 i personalerummet. diff --git a/tests/fixtures/local_files/11_false_positive_invoice.txt b/tests/fixtures/local_files/11_false_positive_invoice.txt new file mode 100644 index 0000000..93a8d4d --- /dev/null +++ b/tests/fixtures/local_files/11_false_positive_invoice.txt @@ -0,0 +1,31 @@ +FAKTURA +======= +Leverandør: Kontor & Papir A/S + Industriparken 22, 8600 Silkeborg + CVR: 12345678 + +Kunde: Gudenaaskolen + Skolevej 1, 8680 Ry + EAN: 5790001234567 + +Fakturanr: 250372-0100 +Fakturadato: 20. april 2026 +Forfaldsdato: 20. maj 2026 + +Ordrenr: 020898-0203 +Varenr: 150365-1102 + +Linjer: +--------------------------------------------------------------------------- +Beskrivelse Antal Enhedspris Moms Total +--------------------------------------------------------------------------- +Kopipapir A4 80g, pk/500 20 89,00 kr 20% 2.136,00 kr +Blækpatroner HP 305, sort 5 149,00 kr 20% 894,00 kr +Whiteboardmarker, ass. farver 3 49,95 kr 20% 179,82 kr +--------------------------------------------------------------------------- +Subtotal ekskl. moms: 2.561,95 kr +Moms 25%: 640,49 kr +I alt inkl. moms: 3.202,44 kr + +Betalingsbetingelser: Netto 30 dage +Bank: Jyske Bank, Reg. 7600, Konto 1234567 diff --git a/tests/fixtures/local_files/12_post2007_no_context.txt b/tests/fixtures/local_files/12_post2007_no_context.txt new file mode 100644 index 0000000..6b1c79e --- /dev/null +++ b/tests/fixtures/local_files/12_post2007_no_context.txt @@ -0,0 +1,20 @@ +Inventarliste — Klasselokale 7A +================================ +Opdateret: 20. april 2026 +Af: Teknisk servicepersonale + +Rum-ID: 7A-GS-2026 +Lokale: Bygning C, 1. sal + +Inventar: +--------- +Elevborde 32 stk (serienr. 
påtegnet under bordet) +Elevstole 32 stk (standard, justerbar højde) +Lærerbord 1 stk (inkl. skuff, lås medfølger) +Whiteboard 2 stk (160×120 cm) +Projektor 1 stk (Epson EB-W51, serienr. 150315-4321) +Projektordug 1 stk (180 cm, motor-betjent) +Gardinmotor 2 stk (fjernstyret) + +Næste serviceeftersyn: Oktober 2026 +Ansvarlig: Teknisk afdeling, Skanderborg Kommune diff --git a/tests/fixtures/local_files/13_cpr_in_xlsx.xlsx b/tests/fixtures/local_files/13_cpr_in_xlsx.xlsx new file mode 100644 index 0000000..7f57216 Binary files /dev/null and b/tests/fixtures/local_files/13_cpr_in_xlsx.xlsx differ diff --git a/tests/fixtures/local_files/generate_fixtures.py b/tests/fixtures/local_files/generate_fixtures.py new file mode 100644 index 0000000..0cea275 --- /dev/null +++ b/tests/fixtures/local_files/generate_fixtures.py @@ -0,0 +1,154 @@ +""" +Generate binary fixture files for the local-file GDPR scan test suite. + +Run from repo root: + source venv/bin/activate + python tests/fixtures/local_files/generate_fixtures.py +""" +from pathlib import Path +import sys + +HERE = Path(__file__).parent + +def _require(pkg, pip_name=None): + try: + return __import__(pkg) + except ImportError: + print(f"Missing: {pkg} → pip install {pip_name or pkg}", file=sys.stderr) + sys.exit(1) + +openpyxl = _require("openpyxl") +docx = _require("docx", "python-docx") + +from openpyxl import Workbook +from openpyxl.styles import Font, PatternFill, Alignment +from docx import Document +from docx.shared import Pt, RGBColor +from docx.enum.text import WD_ALIGN_PARAGRAPH + + +# ── 09_cpr_in_docx.docx ─────────────────────────────────────────────────────── +def make_docx(): + doc = Document() + + doc.add_heading("Elevjournal — Gudenaaskolen", level=1) + + p = doc.add_paragraph() + p.add_run("Dette dokument indeholder personoplysninger og er fortroligt.") + p.runs[0].italic = True + + doc.add_heading("Elevoplysninger", level=2) + # Use labelled paragraphs so CPR values are always preceded by ": " — + # avoids the _CPR_PREFIX_NOISE guard that
fires when table-cell runs are + # concatenated without a separator. + fields = [ + ("Navn", "Magnus Lund Eriksen"), + ("CPR-nummer", "010172-1019"), + ("Klasse", "8B"), + ("Adresse", "Egevej 3, 8680 Ry"), + ("Telefon", "+45 40 12 34 56"), + ("E-mail", "magnus.eriksen@elev.gudenaaskolen.dk"), + ] + for label, value in fields: + p = doc.add_paragraph() + run_label = p.add_run(f"{label}: ") + run_label.bold = True + p.add_run(value + " ") + + doc.add_heading("Forældrekontakt", level=2) + doc.add_paragraph( + "Forældrene er orienteret om elevens situation den 15. marts 2026. " + "Begge forældre deltog i mødet. Næste opfølgning er planlagt til " + "maj 2026." + ) + + doc.add_heading("Anden elev — tabel", level=2) + doc.add_paragraph( + "Nedenstående tabel viser en anden elev, der deler klasse med Magnus." + ) + for label, value in [ + ("Navn", "Nora Bjerrum Nielsen"), + ("Personnummer", "280490-0120"), + ("Klasse", "8B"), + ]: + p = doc.add_paragraph() + p.add_run(f"{label}: ").bold = True + p.add_run(value + " ") + + doc.add_heading("Sagsbehandlernote", level=2) + doc.add_paragraph( + "Sagsbehandler: M. Andersen\n" + "Dato: 20. april 2026\n" + "Der er ikke fundet grundlag for yderligere foranstaltninger." 
+ ) + + out = HERE / "09_cpr_in_docx.docx" + doc.save(str(out)) + print(f"Written: {out.name}") + + +# ── 13_cpr_in_xlsx.xlsx ─────────────────────────────────────────────────────── +def make_xlsx(): + wb = Workbook() + + # Sheet 1: Elevliste + ws1 = wb.active + ws1.title = "Elevliste" + + header_font = Font(bold=True, color="FFFFFF") + header_fill = PatternFill("solid", fgColor="2B5F9E") + + headers = ["Klasse", "Navn", "CPR-nummer", "Adresse", "Forælder tlf", "Bemærkninger"] + for col, h in enumerate(headers, 1): + cell = ws1.cell(row=1, column=col, value=h) + cell.font = header_font + cell.fill = header_fill + cell.alignment = Alignment(horizontal="center") + + students = [ + ("7A", "Magnus Lund Eriksen", "010172-1019", "Egevej 3, 8680 Ry", "+45 40 12 34 56", ""), + ("7A", "Nora Bjerrum Nielsen", "280490-0120", "Møllevej 11, 8680 Ry", "+45 50 23 45 67", "Brillebærer"), + ("7A", "Oliver Skov Madsen", "250372-0100", "Kirkegade 2, 8660 Skanderborg", "+45 60 34 56 78", ""), + ("7B", "Rasmus Dal Kristensen", "150365-1102", "Rosenvej 5, 8680 Ry", "+45 21 56 78 90", ""), + ("7B", "Sofie Holm Thomsen", "111111-1010", "Birkevej 22, 8660 Skanderborg", "+45 31 67 89 01", "Allergi: nødder"), + ("7B", "Emil Sand Jensen", "010107-4102", "Hybenvej 7, 8680 Ry", "+45 41 78 90 12", ""), + ] + for row_i, row_data in enumerate(students, 2): + for col_i, val in enumerate(row_data, 1): + ws1.cell(row=row_i, column=col_i, value=val) + + for col in ws1.columns: + max_len = max(len(str(c.value or "")) for c in col) + ws1.column_dimensions[col[0].column_letter].width = max_len + 4 + + # Sheet 2: Medarbejdere + ws2 = wb.create_sheet("Medarbejdere") + emp_headers = ["ID", "Navn", "Personnummer", "Afdeling", "E-mail"] + for col, h in enumerate(emp_headers, 1): + cell = ws2.cell(row=1, column=col, value=h) + cell.font = header_font + cell.fill = header_fill + cell.alignment = Alignment(horizontal="center") + + employees = [ + ("EMP-001", "Christian Bøgh Hansen", "150365-1102", "Ledelse", 
"c.hansen@gudenaaskolen.dk"), + ("EMP-002", "Mette Dahl Andersen", "280490-0120", "Administration", "m.andersen@gudenaaskolen.dk"), + ("EMP-003", "Søren Lykke Jakobsen", "010172-1019", "Pædagogik", "s.jakobsen@gudenaaskolen.dk"), + ] + for row_i, row_data in enumerate(employees, 2): + for col_i, val in enumerate(row_data, 1): + ws2.cell(row=row_i, column=col_i, value=val) + + for col in ws2.columns: + max_len = max(len(str(c.value or "")) for c in col) + ws2.column_dimensions[col[0].column_letter].width = max_len + 4 + + out = HERE / "13_cpr_in_xlsx.xlsx" + wb.save(str(out)) + print(f"Written: {out.name}") + + +if __name__ == "__main__": + make_docx() + make_xlsx() + print("Done.")