fix: select mode onclick exports, multi-source progress counter, OCR page-by-page
StyxX65 2026-04-21 13:12:54 +02:00
parent d8083eb0c0
commit 7c1afca80b
20 changed files with 425 additions and 11 deletions

@@ -7,14 +7,20 @@ Version numbers follow [Semantic Versioning](https://semver.org/spec/v2.0.0.html
 ---
-## [Unreleased]
+## [1.6.21] — 2026-04-20
 ### Added
+- **Local-file scan test fixtures** — `tests/fixtures/local_files/` contains 13 ready-made files (`.txt`, `.csv`, `.docx`, `.xlsx`) covering every detection scenario: CPR with explicit label, mod-11-valid CPR without label, post-2007 CPR with/without context keyword, protected number (day+40), multiple CPRs in one file, mixed PII (CPR + email + Art. 9 health data), and three true-negative cases (clean content, invoice false-positive, post-2007 serial number without context). All CPR numbers are mathematically valid; false-positive fixtures are verified to produce zero hits. Run `generate_fixtures.py` to regenerate the binary files.
 - **Interface PIN** — optional session-level authentication gate for the main scanner interface. Set a 4–8 digit PIN in **Settings → Security → Interface PIN**; anyone reaching `http://host:5100` is redirected to `/login` and must enter the PIN before accessing scan controls, settings, or results. Viewer tokens and the `/view` route are completely unaffected — reviewers continue to use their own auth chain. The PIN is stored as a salted SHA-256 hash in `config.json`. Brute-force protection: 5 failed attempts per IP locks out for 5 minutes. A `POST /api/interface/logout` endpoint clears the session. PIN management via `GET/POST/DELETE /api/interface/pin`.
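The Interface PIN entry describes two mechanisms: a salted SHA-256 hash and a per-IP lockout after 5 failed attempts. The sketch below shows one way these could fit together. It is not the project's code: the record layout, the function names, and the `LoginThrottle` class are all assumptions made for illustration; only the 5-attempt and 5-minute figures come from the changelog.

```python
import hashlib
import hmac
import secrets
import time
from typing import Optional

MAX_ATTEMPTS = 5          # failed attempts per IP before lockout (from the changelog)
LOCKOUT_SECONDS = 5 * 60  # lockout duration (from the changelog)


def hash_pin(pin: str, salt: Optional[bytes] = None) -> dict:
    """Return a salted SHA-256 record. The dict layout is hypothetical."""
    if not (pin.isascii() and pin.isdigit() and 4 <= len(pin) <= 8):
        raise ValueError("PIN must be 4 to 8 digits")
    if salt is None:
        salt = secrets.token_bytes(16)
    digest = hashlib.sha256(salt + pin.encode()).hexdigest()
    return {"salt": salt.hex(), "hash": digest}


def verify_pin(pin: str, record: dict) -> bool:
    """Recompute the hash with the stored salt and compare in constant time."""
    salt = bytes.fromhex(record["salt"])
    digest = hashlib.sha256(salt + pin.encode()).hexdigest()
    return hmac.compare_digest(digest, record["hash"])


class LoginThrottle:
    """Hypothetical per-IP failure counter with a fixed lockout window."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._failures = {}      # ip -> consecutive failed attempts
        self._locked_until = {}  # ip -> clock value at which the lock expires

    def is_locked(self, ip: str) -> bool:
        until = self._locked_until.get(ip)
        if until is None:
            return False
        if self._clock() >= until:
            # Lock expired: forget it and start counting from zero again.
            del self._locked_until[ip]
            self._failures.pop(ip, None)
            return False
        return True

    def record(self, ip: str, success: bool) -> None:
        if success:
            self._failures.pop(ip, None)
            return
        self._failures[ip] = self._failures.get(ip, 0) + 1
        if self._failures[ip] >= MAX_ATTEMPTS:
            self._locked_until[ip] = self._clock() + LOCKOUT_SECONDS
```

A login handler would call `is_locked(ip)` before checking the PIN and `record(ip, verify_pin(...))` afterwards. The injectable `clock` exists only to make the lockout window easy to test.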
 ### Fixed
+- **"Vælg" (select mode) button did nothing** — `toggleSelectMode`, `toggleCardSelect`, `selectAllVisible`, and `applyBulkDisposition` were defined inside an ES module but never assigned to `window`, so all `onclick` attributes calling them silently failed. Added the four missing `window.*` exports at the bottom of `results.js`.
+- **Progress counter frozen at M365 total during Google/file scan** — the `scan_progress` handler in `scan.js` only updated `progressStats` and `progressEta` for `source === "m365"`. When M365 finished first, the counter stayed at its final value (e.g. "15083 / 15083 ETA 0s") for the entire duration of the Google and file scans. Fixed in two places: `scan_done` now clears the stats/ETA elements immediately when another scan is still running; `scan_progress` for Google/file sources now shows a running `"X scanned"` count (using the `scanned` field those engines already send) and clears ETA, but only while M365 is not running — M365 stats continue to dominate during concurrent scans.
 - **PDF OCR kills process on large files** — `document_scanner` previously called `convert_from_path()` once for the entire PDF before the processing loop, allocating all page images in memory simultaneously. A 50-page A4 PDF at 300 DPI required ~1.3 GB in a single allocation, triggering the OS OOM killer. Fixed by rendering one page at a time with `convert_from_path(first_page=N, last_page=N)` inside the loop across `scan_pdf`, `redact_fitz_pdf`, and `redact_pdf`. Peak OCR memory is now bounded to roughly one page (~26 MB at 300 DPI) regardless of document length.
 - **No bulk disposition tagging** — each result card had to be opened individually to set a disposition. Added a Select mode (filter bar "Vælg" button) that reveals per-card checkboxes. Selecting one or more items shows a bulk tag bar at the bottom of the grid with a disposition dropdown and Apply button. Calls `POST /api/db/disposition/bulk`; updates all selected items in-memory and clears the selection. "Select all visible" / "Deselect all" toggle available in the bar. Hidden in viewer mode.
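The memory figures in the PDF OCR entry can be checked with simple arithmetic, and the page-at-a-time pattern is easy to isolate. The sketch below assumes uncompressed 8-bit RGB pages and an A4 size of 8.27 × 11.69 inches. `page_bytes`, `iter_pages`, and `pdf2image_renderer` are illustrative names, not functions from `document_scanner`; the only library call used is pdf2image's `convert_from_path` with its `dpi`, `first_page`, and `last_page` parameters.

```python
from typing import Callable, Iterator

A4_INCHES = (8.27, 11.69)  # A4 page size in inches
BYTES_PER_PIXEL = 3        # assumption: 8-bit RGB, uncompressed


def page_bytes(dpi: int = 300) -> int:
    """Approximate size of one rendered A4 page held as raw RGB."""
    width_px = round(A4_INCHES[0] * dpi)
    height_px = round(A4_INCHES[1] * dpi)
    return width_px * height_px * BYTES_PER_PIXEL


def iter_pages(render_page: Callable[[int], object], page_count: int) -> Iterator[object]:
    """Yield one rendered page at a time so only a single image needs to be alive."""
    for n in range(1, page_count + 1):  # pdf2image page numbers are 1-based
        yield render_page(n)


def pdf2image_renderer(pdf_path: str, dpi: int = 300) -> Callable[[int], object]:
    """Build a render_page callable backed by pdf2image (imported lazily)."""
    from pdf2image import convert_from_path

    def render(n: int):
        # first_page == last_page restricts rendering to page n only.
        return convert_from_path(pdf_path, dpi=dpi, first_page=n, last_page=n)[0]

    return render


one_page = page_bytes(300)
print(f"one page: {one_page / 1e6:.1f} MB")
print(f"50 pages: {50 * one_page / 1e9:.2f} GB")
```

Under these assumptions one page comes to about 26 MB and fifty pages to about 1.3 GB, which matches the numbers quoted in the entry. A caller would loop over `iter_pages(pdf2image_renderer(path), page_count)` and run OCR on each image before the next one is rendered.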

@@ -44,6 +44,10 @@ python -m pytest tests/ -q
 128 tests in `tests/`. No integration tests for Flask routes or live M365/Google connections.
+**Local-file scan fixtures** — `tests/fixtures/local_files/` holds 13 documents for manual/UI-level testing of the file scanner. 10 should be flagged; 3 are true negatives. All CPR numbers verified against `is_valid_cpr`. `generate_fixtures.py` (requires `python-docx` + `openpyxl`, already in venv) regenerates the binary `.docx`/`.xlsx` files.
+**`_CPR_PREFIX_NOISE` in `.docx` fixtures** — `scan_docx` builds a single string by concatenating all run texts with no separators between paragraphs. If a CPR value run is immediately followed by text from the next paragraph without a word boundary, `\b` in `CPR_PATTERN` fails and the number is silently missed. The fixture generator appends a trailing `" "` to every value run so CPRs are always surrounded by word boundaries after concatenation. Do not remove this trailing space — the detection will silently regress.
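The word-boundary failure described above is easy to reproduce in isolation. The pattern below is a simplified stand-in for the project's `CPR_PATTERN` (an assumption; the real pattern is not shown in this commit), and `join_runs` only imitates the separator-free concatenation that `scan_docx` is described as doing.

```python
import re

# Assumption: a simplified stand-in for the project's CPR_PATTERN.
CPR_PATTERN = re.compile(r"\b\d{6}-\d{4}\b")


def join_runs(runs):
    """Concatenate run texts with no separator between paragraphs."""
    return "".join(runs)


# Value run followed directly by the next paragraph's label.
without_space = join_runs(["CPR-nummer: ", "010172-1019", "Klasse: ", "8B"])
# Same runs, but every value run carries the trailing space the generator adds.
with_space = join_runs(["CPR-nummer: ", "010172-1019 ", "Klasse: ", "8B "])

print(CPR_PATTERN.findall(without_space))  # []
print(CPR_PATTERN.findall(with_space))     # ['010172-1019']
```

In the first string the final digit `9` is followed by the letter `K`. Both are word characters, so `\b` cannot match between them and the whole pattern fails.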
 ## Viewer mode (#33) — routes/viewer.py + static/js/viewer.js
 Read-only access for DPOs and reviewers. Key invariants:

@@ -609,6 +609,28 @@ Each new module (`cpr_detector.py`, `app_config.py`, `checkpoint.py`, `gdpr_db.p
 The test suite should be run before every release and after any change to `document_scanner.py`, `cpr_detector.py`, or `gdpr_db.py`. CPR detection is the legal core of the tool — a false negative means a real GDPR violation goes undetected.
+#### Local-file scan fixtures
+`tests/fixtures/local_files/` provides 13 hand-crafted documents for end-to-end testing of the file scanner via the UI or `file_scanner.py`. Drop the folder as a local source and run a scan — all 10 PII-bearing files should be flagged and all 3 negative-case files should produce zero hits.
+| File | Format | Expected | Scenario |
+|---|---|---|---|
+| `01_cpr_with_context_label.txt` | TXT | Flag | CPR with explicit `CPR-nummer:` label |
+| `02_cpr_mod11_valid_bare.txt` | TXT | Flag | mod-11-valid CPR without any context keyword |
+| `03_cpr_post2007_with_context.txt` | TXT | Flag | Post-2007 birth (fails mod-11), detected via `Personnummer:` keyword |
+| `04_multiple_cprs.txt` | TXT | Flag | 3 distinct CPR numbers in one staff-records file |
+| `05_student_register.csv` | CSV | Flag | 8 students incl. one protected-address (day+40) CPR |
+| `06_employee_list.csv` | CSV | Flag | 5 employees with CPRs |
+| `07_protected_number.txt` | TXT | Flag | Protected CPR (`410172-1200`, day+40 encoding) |
+| `08_mixed_pii.txt` | TXT | Flag | CPR + email + phone + GDPR Art. 9 health category |
+| `09_cpr_in_docx.docx` | DOCX | Flag | 2 CPRs in a Word document (paragraph format) |
+| `10_clean_no_pii.txt` | TXT | **No flag** | Meeting minutes — no personal data |
+| `11_false_positive_invoice.txt` | TXT | **No flag** | Invoice: CPR-shaped numbers suppressed by `faktura`/`varenr` context |
+| `12_post2007_no_context.txt` | TXT | **No flag** | Equipment serial that looks like a post-2007 CPR but has no context keyword |
+| `13_cpr_in_xlsx.xlsx` | XLSX | Flag | Excel workbook with two sheets: students + employees |
+All CPR numbers are mathematically valid (verified against `is_valid_cpr`). Run `generate_fixtures.py` inside the venv to regenerate the `.docx` and `.xlsx` binary files after any changes.
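For readers unfamiliar with the mod-11 check that the fixture table refers to, here is a minimal sketch. It uses the standard CPR weights (4, 3, 2, 7, 6, 5, 4, 3, 2, 1) and tests divisibility by 11. `mod11_ok` is an illustrative name; the project's `is_valid_cpr` is not shown in this commit and presumably performs further checks (date plausibility, post-2007 handling), so treat this as only the checksum part.

```python
WEIGHTS = (4, 3, 2, 7, 6, 5, 4, 3, 2, 1)


def mod11_ok(cpr: str) -> bool:
    """Return True when the weighted digit sum of a CPR number is divisible by 11."""
    digits = [int(c) for c in cpr if c.isdigit()]
    if len(digits) != 10:
        return False
    return sum(w * d for w, d in zip(WEIGHTS, digits)) % 11 == 0


# Numbers taken from the fixtures in this commit.
for number in ("010172-1019", "410172-1200", "150315-4321"):
    print(number, mod11_ok(number))
```

The first two numbers pass the checksum (weighted sums 77 and 88). `150315-4321`, the post-2007 example, does not (weighted sum 101), which is why the table lists it as detectable only through a context keyword.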
 ### Roadmap
 See [SUGGESTIONS.md](SUGGESTIONS.md) for the full feature roadmap with implementation status.

@@ -1 +1 @@
-1.6.20
+1.6.21

@@ -1016,6 +1016,10 @@ window._autoConnectSSEIfRunning = _autoConnectSSEIfRunning;
 window._loadViewerResults = _loadViewerResults;
 window.executeBulkDelete = executeBulkDelete;
 window.applyFilters = applyFilters;
+window.toggleSelectMode = toggleSelectMode;
+window.toggleCardSelect = toggleCardSelect;
+window.selectAllVisible = selectAllVisible;
+window.applyBulkDisposition = applyBulkDisposition;
 window.exportExcel = exportExcel;
 window.exportArticle30 = exportArticle30;
 window.clearFilters = clearFilters;

@@ -320,16 +320,16 @@ function _attachScanListeners(source) {
     var fill = document.getElementById('progressFill_' + src);
     if (fill) fill.style.width = pct + '%';
     document.getElementById('progressFile').textContent = d.file || '';
-    // Only update stats/ETA from M365 (has meaningful totals and ETA)
+    var statsEl = document.getElementById('progressStats');
+    var etaEl = document.getElementById('progressEta');
     if (src === 'm365') {
-      var statsEl = document.getElementById('progressStats');
-      if (statsEl && d.total) {
-        statsEl.textContent = (d.index || 0) + ' / ' + d.total;
-      }
-      var etaEl = document.getElementById('progressEta');
-      if (etaEl && d.eta !== undefined) {
-        etaEl.textContent = d.eta ? ('ETA ' + d.eta) : '';
-      }
+      // M365 sends index + total + ETA — show exact counter
+      if (statsEl && d.total) statsEl.textContent = (d.index || 0) + ' / ' + d.total;
+      if (etaEl && d.eta !== undefined) etaEl.textContent = d.eta ? ('ETA ' + d.eta) : '';
+    } else if (!S._m365ScanRunning) {
+      // Google / file: no total known upfront — show running count once M365 is done
+      if (statsEl && d.scanned !== undefined) statsEl.textContent = d.scanned + ' scanned';
+      if (etaEl) etaEl.textContent = '';
     }
   });
   source.addEventListener('scan_file', function(e) {
@@ -369,6 +369,13 @@ function _attachScanListeners(source) {
     S._m365ScanRunning = false;
     _renderProgressSegments();
     var _anyRunning = S._googleScanRunning || S._fileScanRunning;
+    // Clear M365 counter/ETA so Google/file progress can take over the display
+    if (_anyRunning) {
+      var _se = document.getElementById('progressStats');
+      var _ee = document.getElementById('progressEta');
+      if (_se) _se.textContent = '';
+      if (_ee) _ee.textContent = '';
+    }
     // Only close SSE once all concurrent scans have finished.
     // Closing early would drop google_scan_done / file_scan_done events and
     // leave the UI stuck in scanning state.

@@ -0,0 +1,19 @@
Personoplysninger — Elevakt
===========================
Elevens navn: Lars Bjerregaard Nielsen
Klasse: 8B
Skole: Gudenaaskolen
CPR-nummer: 010172-1019
Fødselsdato: 1. januar 1972
Adresse: Skolevej 14, 8680 Ry
Telefon: +45 86 89 12 34
E-mail: lars.nielsen@privat.dk
Notater:
Eleven har haft fravær i uge 12 og 14. Forældrene er kontaktet.
Der er afholdt møde den 3. marts 2024 med klasselærer og skoleleder.
Underskrift: _______________________
Dato: ___________________

@@ -0,0 +1,15 @@
Besøgslog — Sundhedscenter Skanderborg
=======================================
Dato: 28. april 2024
Sagsbehandler: M. Andersen
Borger: Hanne Kirstine Pedersen
Registreringsnummer: 280490-0120
Henvendelse vedrørende: Sygedagpenge, paragraf 7 opfølgning
Samtalen fandt sted kl. 10:15 og varede 45 minutter.
Borger mødte op til tiden og var forberedt.
Aftale om næste møde: 26. maj 2024 kl. 10:00
Sted: Mødelokale 3, Adelgade 44, 8660 Skanderborg

@@ -0,0 +1,24 @@
Tilmelding til SFO — Gudenaaskolen
===================================
Barnets navn: Emma Sofie Christensen
Personnummer: 150315-4321
Klasse: 1A (skolestart august 2022)
Forældrenes oplysninger
-----------------------
Forældrenes navn: Søren og Pia Christensen
Adresse: Birkevej 7, 8680 Ry
Telefon: +45 23 45 67 89
E-mail: soeren.christensen@familie.dk
Fremmødetider valgt:
Morgen-SFO: 07:00–08:00
Eftermiddag: 13:00–17:00
Særlige oplysninger til pædagoger:
Emma har en lettere nøddeallergi (jordnødder og cashewnødder).
Kontaktperson ved allergi: Pia Christensen, tlf. 23 45 67 89
Dato for tilmelding: 15. marts 2022
Underskrift: _______________________

@@ -0,0 +1,31 @@
Personalemappe — Fortroligt
============================
Afdeling: Administrationen, Skanderborg Kommune
Medarbejder 1
-------------
Navn: Christian Bøgh Hansen
CPR: 150365-1102
Stilling: Skoleleder
Ansættelsesdato: 1. august 2005
Løngruppe: L4
Medarbejder 2
-------------
Navn: Lise Ravn Johansen
CPR: 020898-0203
Stilling: Pædagog, fuldtid
Ansættelsesdato: 15. september 2021
Løngruppe: L2
Medarbejder 3
-------------
Navn: Anders Munk Mortensen
CPR: 010172-1019
Stilling: Administrativ medarbejder
Ansættelsesdato: 1. marts 2010
Løngruppe: L3
Dokument oprettet: 20. april 2026
Sidst opdateret: 20. april 2026
Udarbejdet af: HR-afdelingen

@@ -0,0 +1,9 @@
Klasse,Navn,CPR-nummer,Adresse,Forælder tlf,Bemærkninger
7A,Magnus Lund Eriksen,010172-1019,Egevej 3 8680 Ry,+45 40 12 34 56,
7A,Nora Bjerrum Nielsen,280490-0120,Møllevej 11 8680 Ry,+45 50 23 45 67,Brillebærer
7A,Oliver Skov Madsen,250372-0100,Kirkegade 2 8660 Skanderborg,+45 60 34 56 78,
7A,Ida Holst Andersen,020898-0203,Skovbrynet 19 8680 Ry,+45 70 45 67 89,Kontaktperson: Far
7B,Rasmus Dal Kristensen,150365-1102,Rosenvej 5 8680 Ry,+45 21 56 78 90,
7B,Sofie Holm Thomsen,111111-1010,Birkevej 22 8660 Skanderborg,+45 31 67 89 01,Allergi: nødder
7B,Emil Sand Jensen,010107-4102,Hybenvej 7 8680 Ry,+45 41 78 90 12,
7B,Laura Bak Møller,410172-1200,Pilevej 4 8660 Skanderborg,+45 51 89 01 23,Beskyttet adresse

@@ -0,0 +1,6 @@
Medarbejder-ID,Navn,Personnummer,Afdeling,Stilling,E-mail,Telefon,Ansættelses-dato
EMP-001,Christian Bøgh Hansen,150365-1102,Ledelse,Skoleleder,c.hansen@gudenaaskolen.dk,+45 86 89 10 01,2005-08-01
EMP-002,Mette Dahl Andersen,280490-0120,Administration,Sekretær,m.andersen@gudenaaskolen.dk,+45 86 89 10 02,2012-01-15
EMP-003,Søren Lykke Jakobsen,010172-1019,Pædagogik,Lærer,s.jakobsen@gudenaaskolen.dk,+45 86 89 10 03,2009-08-01
EMP-004,Hanne Frost Pedersen,250372-0100,Pædagogik,Lærer,h.pedersen@gudenaaskolen.dk,+45 86 89 10 04,2015-08-01
EMP-005,Lise Ravn Johansen,020898-0203,SFO,Pædagog,l.johansen@gudenaaskolen.dk,+45 86 89 10 05,2021-09-15

@@ -0,0 +1,16 @@
Fortrolig personoplysning — Navne- og adressebeskyttelse
==========================================================
VIGTIGT: Denne person har navne- og adressebeskyttelse i CPR-registeret.
Oplysningerne må ikke videregives uden samtykke.
Navn: Laura Bak Møller
CPR-nummer: 410172-1200
(Dag + 40 angiver beskyttet adresse)
Kontaktoplysninger administreres af kommunen.
Henvendelse via: Borgerservice, Skanderborg Kommune
Telefon: 86 52 10 00
Dokumentet er klassificeret FORTROLIGT.
Opbevares i aflåst arkiv — ikke i fællesnetværk.

@@ -0,0 +1,21 @@
Lægeerklæring — Helbredsattest
================================
Udstedt af: Skanderborg Lægepraksis, Adelgade 10, 8660 Skanderborg
Praktiserende læge: Dr. P. Holm
Patient: Søren Lykke Jakobsen
Fødselsdato / CPR: 010172-1019
Adresse: Skolevej 22, 8680 Ry
Telefon: +45 22 33 44 55
E-mail: soeren.jakobsen@privat.dk
Diagnose (ICD-10): F41.1 — Generaliseret angst
Behandling: Psykoterapi + medicinsk behandling (SSRI)
Særlig kategori: Psykisk lidelse — GDPR Art. 9
Erklæringens formål: Sygedagpenge, §7-opfølgning
Periode: 1. april 2026 – 30. juni 2026
Lægens underskrift: _______________________
Dato: 20. april 2026
Stempel: [Skanderborg Lægepraksis]

Binary file not shown.

@@ -0,0 +1,25 @@
Mødereferat — Pædagogisk råd
==============================
Dato: 20. april 2026
Sted: Personalerummet, Gudenaaskolen
Ordstyrer: Skolelederen
Referent: Administrationen
Dagsorden:
1. Godkendelse af referat fra seneste møde
2. Orientering om skoleårets planlægning 2026/2027
3. Status på inklusion og trivselsundersøgelse
4. Eventuelt
Ad 1: Referatet fra mødet den 15. marts 2026 blev godkendt uden bemærkninger.
Ad 2: Skolelederen orienterede om planlægningen for det kommende skoleår.
Skemaerne for 0.-9. klasse offentliggøres i Aula senest 1. juni 2026.
Der er planlagt en fælles pædagogisk dag den 10. august 2026.
Ad 3: Trivselsundersøgelsen viste generelt gode resultater.
Inklusionsvejlederen præsenterer en handlingsplan på næste møde.
Ad 4: Intet til eventuelt.
Næste møde: Tirsdag den 19. maj 2026 kl. 14:00 i personalerummet.

@@ -0,0 +1,31 @@
FAKTURA
=======
Leverandør: Kontor & Papir A/S
Industriparken 22, 8600 Silkeborg
CVR: 12345678
Kunde: Gudenaaskolen
Skolevej 1, 8680 Ry
EAN: 5790001234567
Fakturanr: 250372-0100
Fakturadato: 20. april 2026
Forfaldsdato: 20. maj 2026
Ordrenr: 020898-0203
Varenr: 150365-1102
Linjer:
---------------------------------------------------------------------------
Beskrivelse Antal Enhedspris Moms Total
---------------------------------------------------------------------------
Kopipapir A4 80g, pk/500 20 89,00 kr 20% 2.136,00 kr
Blækpatroner HP 305, sort 5 149,00 kr 20% 894,00 kr
Whiteboardmarker, ass. farver 3 49,95 kr 20% 179,82 kr
---------------------------------------------------------------------------
Subtotal ekskl. moms: 2.561,95 kr
Moms 25%: 640,49 kr
I alt inkl. moms: 3.202,44 kr
Betalingsbetingelser: Netto 30 dage
Bank: Jyske Bank, Reg. 7600, Konto 1234567

@@ -0,0 +1,20 @@
Inventarliste — Klasselokale 7A
================================
Opdateret: 20. april 2026
Af: Teknisk servicepersonale
Rum-ID: 7A-GS-2026
Lokale: Bygning C, 1. sal
Inventar:
---------
Elevborde 32 stk (serienr. påtegnet under bordet)
Elevstole 32 stk (standard, justerbar højde)
Lærerbord 1 stk (inkl. skuff, lås medfølger)
Whiteboard 2 stk (160×120 cm)
Projektor 1 stk (Epson EB-W51, serienr. 150315-4321)
Projektordug 1 stk (180 cm, motor-betjent)
Gardinmotor 2 stk (fjernstyret)
Næste serviceeftersyn: Oktober 2026
Ansvarlig: Teknisk afdeling, Skanderborg Kommune

Binary file not shown.

@@ -0,0 +1,154 @@
"""
Generate binary fixture files for the local-file GDPR scan test suite.

Run from repo root:
    source venv/bin/activate
    python tests/fixtures/local_files/generate_fixtures.py
"""
from pathlib import Path
import sys

HERE = Path(__file__).parent


def _require(pkg, pip_name=None):
    """Import pkg, or exit with an install hint (pip_name when it differs from pkg)."""
    try:
        return __import__(pkg)
    except ImportError:
        print(f"Missing: {pkg} → pip install {pip_name or pkg}", file=sys.stderr)
        sys.exit(1)


openpyxl = _require("openpyxl")
docx = _require("docx", "python-docx")  # the PyPI name differs from the import name

from openpyxl import Workbook
from openpyxl.styles import Font, PatternFill, Alignment
from docx import Document
from docx.shared import Pt, RGBColor
from docx.enum.text import WD_ALIGN_PARAGRAPH


# ── 09_cpr_in_docx.docx ───────────────────────────────────────────────────────
def make_docx():
    doc = Document()
    doc.add_heading("Elevjournal — Gudenaaskolen", level=1)
    p = doc.add_paragraph()
    p.add_run("Dette dokument indeholder personoplysninger og er fortroligt.")
    p.runs[0].italic = True

    doc.add_heading("Elevoplysninger", level=2)
    # Use labelled paragraphs so CPR values are always preceded by ": " —
    # avoids the _CPR_PREFIX_NOISE guard that fires when table-cell runs are
    # concatenated without a separator.
    fields = [
        ("Navn", "Magnus Lund Eriksen"),
        ("CPR-nummer", "010172-1019"),
        ("Klasse", "8B"),
        ("Adresse", "Egevej 3, 8680 Ry"),
        ("Telefon", "+45 40 12 34 56"),
        ("E-mail", "magnus.eriksen@elev.gudenaaskolen.dk"),
    ]
    for label, value in fields:
        p = doc.add_paragraph()
        run_label = p.add_run(f"{label}: ")
        run_label.bold = True
        # Trailing space keeps a word boundary after the value once runs are joined.
        p.add_run(value + " ")

    doc.add_heading("Forældrekontakt", level=2)
    doc.add_paragraph(
        "Forældrene er orienteret om elevens situation den 15. marts 2026. "
        "Begge forældre deltog i mødet. Næste opfølgning er planlagt til "
        "maj 2026."
    )

    doc.add_heading("Anden elev — tabel", level=2)
    doc.add_paragraph(
        "Nedenstående tabel viser en anden elev, der deler klasse med Magnus."
    )
    for label, value in [
        ("Navn", "Nora Bjerrum Nielsen"),
        ("Personnummer", "280490-0120"),
        ("Klasse", "8B"),
    ]:
        p = doc.add_paragraph()
        p.add_run(f"{label}: ").bold = True
        p.add_run(value + " ")

    doc.add_heading("Sagsbehandlernote", level=2)
    doc.add_paragraph(
        "Sagsbehandler: M. Andersen\n"
        "Dato: 20. april 2026\n"
        "Der er ikke fundet grundlag for yderligere foranstaltninger."
    )

    out = HERE / "09_cpr_in_docx.docx"
    doc.save(str(out))
    print(f"Written: {out.name}")


# ── 13_cpr_in_xlsx.xlsx ───────────────────────────────────────────────────────
def make_xlsx():
    wb = Workbook()

    # Sheet 1: Elevliste
    ws1 = wb.active
    ws1.title = "Elevliste"
    header_font = Font(bold=True, color="FFFFFF")
    header_fill = PatternFill("solid", fgColor="2B5F9E")
    headers = ["Klasse", "Navn", "CPR-nummer", "Adresse", "Forælder tlf", "Bemærkninger"]
    for col, h in enumerate(headers, 1):
        cell = ws1.cell(row=1, column=col, value=h)
        cell.font = header_font
        cell.fill = header_fill
        cell.alignment = Alignment(horizontal="center")
    students = [
        ("7A", "Magnus Lund Eriksen", "010172-1019", "Egevej 3, 8680 Ry", "+45 40 12 34 56", ""),
        ("7A", "Nora Bjerrum Nielsen", "280490-0120", "Møllevej 11, 8680 Ry", "+45 50 23 45 67", "Brillebærer"),
        ("7A", "Oliver Skov Madsen", "250372-0100", "Kirkegade 2, 8660 Skanderborg", "+45 60 34 56 78", ""),
        ("7B", "Rasmus Dal Kristensen", "150365-1102", "Rosenvej 5, 8680 Ry", "+45 21 56 78 90", ""),
        ("7B", "Sofie Holm Thomsen", "111111-1010", "Birkevej 22, 8660 Skanderborg", "+45 31 67 89 01", "Allergi: nødder"),
        ("7B", "Emil Sand Jensen", "010107-4102", "Hybenvej 7, 8680 Ry", "+45 41 78 90 12", ""),
    ]
    for row_i, row_data in enumerate(students, 2):
        for col_i, val in enumerate(row_data, 1):
            ws1.cell(row=row_i, column=col_i, value=val)
    # Size each column to its longest cell plus some padding.
    for col in ws1.columns:
        max_len = max(len(str(c.value or "")) for c in col)
        ws1.column_dimensions[col[0].column_letter].width = max_len + 4

    # Sheet 2: Medarbejdere
    ws2 = wb.create_sheet("Medarbejdere")
    emp_headers = ["ID", "Navn", "Personnummer", "Afdeling", "E-mail"]
    for col, h in enumerate(emp_headers, 1):
        cell = ws2.cell(row=1, column=col, value=h)
        cell.font = header_font
        cell.fill = header_fill
        cell.alignment = Alignment(horizontal="center")
    employees = [
        ("EMP-001", "Christian Bøgh Hansen", "150365-1102", "Ledelse", "c.hansen@gudenaaskolen.dk"),
        ("EMP-002", "Mette Dahl Andersen", "280490-0120", "Administration", "m.andersen@gudenaaskolen.dk"),
        ("EMP-003", "Søren Lykke Jakobsen", "010172-1019", "Pædagogik", "s.jakobsen@gudenaaskolen.dk"),
    ]
    for row_i, row_data in enumerate(employees, 2):
        for col_i, val in enumerate(row_data, 1):
            ws2.cell(row=row_i, column=col_i, value=val)
    for col in ws2.columns:
        max_len = max(len(str(c.value or "")) for c in col)
        ws2.column_dimensions[col[0].column_letter].width = max_len + 4

    out = HERE / "13_cpr_in_xlsx.xlsx"
    wb.save(str(out))
    print(f"Written: {out.name}")


if __name__ == "__main__":
    make_docx()
    make_xlsx()
    print("Done.")