Bugfixes
fix: select mode onclick exports, multi-source progress counter, OCR page-by-page
parent d8083eb0c0
commit 7c1afca80b
@@ -7,14 +7,20 @@ Version numbers follow [Semantic Versioning](https://semver.org/spec/v2.0.0.html
 
 ---
 
-## [Unreleased]
+## [1.6.21] — 2026-04-20
 
 ### Added
 
+- **Local-file scan test fixtures** — `tests/fixtures/local_files/` contains 13 ready-made files (`.txt`, `.csv`, `.docx`, `.xlsx`) covering every detection scenario: CPR with explicit label, mod-11–valid CPR without label, post-2007 CPR with/without context keyword, protected number (day+40), multiple CPRs in one file, mixed PII (CPR + email + Art. 9 health data), and three true-negative cases (clean content, invoice false-positive, post-2007 serial number without context). All CPR numbers are mathematically valid; false-positive fixtures are verified to produce zero hits. Run `generate_fixtures.py` to regenerate the binary files.
+
 - **Interface PIN** — optional session-level authentication gate for the main scanner interface. Set a 4–8 digit PIN in **Settings → Security → Interface PIN**; anyone reaching `http://host:5100` is redirected to `/login` and must enter the PIN before accessing scan controls, settings, or results. Viewer tokens and the `/view` route are completely unaffected — reviewers continue to use their own auth chain. The PIN is stored as a salted SHA-256 hash in `config.json`. Brute-force protection: 5 failed attempts per IP locks out for 5 minutes. A `POST /api/interface/logout` endpoint clears the session. PIN management via `GET/POST/DELETE /api/interface/pin`.
 
 ### Fixed
 
+- **"Vælg" (select mode) button did nothing** — `toggleSelectMode`, `toggleCardSelect`, `selectAllVisible`, and `applyBulkDisposition` were defined inside an ES module but never assigned to `window`, so all `onclick` attributes calling them silently failed. Added the four missing `window.*` exports at the bottom of `results.js`.
+
+- **Progress counter frozen at M365 total during Google/file scan** — the `scan_progress` handler in `scan.js` only updated `progressStats` and `progressEta` for `source === "m365"`. When M365 finished first, the counter stayed at its final value (e.g. "15083 / 15083 ETA 0s") for the entire duration of the Google and file scans. Fixed in two places: `scan_done` now clears the stats/ETA elements immediately when another scan is still running; `scan_progress` for Google/file sources now shows a running `"X scanned"` count (using the `scanned` field those engines already send) and clears ETA, but only while M365 is not running — M365 stats continue to dominate during concurrent scans.
+
 - **PDF OCR kills process on large files** — `document_scanner` previously called `convert_from_path()` once for the entire PDF before the processing loop, allocating all page images in memory simultaneously. A 50-page A4 PDF at 300 DPI required ~1.3 GB in a single allocation, triggering the OS OOM killer. Fixed by rendering one page at a time with `convert_from_path(first_page=N, last_page=N)` inside the loop across `scan_pdf`, `redact_fitz_pdf`, and `redact_pdf`. Peak OCR memory is now bounded to roughly one page (~26 MB at 300 DPI) regardless of document length.
 
 - **No bulk disposition tagging** — each result card had to be opened individually to set a disposition. Added a Select mode (filter bar "Vælg" button) that reveals per-card checkboxes. Selecting one or more items shows a bulk tag bar at the bottom of the grid with a disposition dropdown and Apply button. Calls `POST /api/db/disposition/bulk`; updates all selected items in-memory and clears the selection. "Select all visible" / "Deselect all" toggle available in the bar. Hidden in viewer mode.
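Note on the memory figures in the PDF OCR entry: they match what uncompressed 8-bit RGB page images would occupy. The sketch below redoes that arithmetic. The function name `a4_page_bytes` and the 3-bytes-per-pixel assumption are illustrative only and are not taken from `document_scanner`.

```python
MM_PER_INCH = 25.4


def a4_page_bytes(dpi: int = 300, bytes_per_pixel: int = 3) -> int:
    """Approximate size of one rendered A4 page (210 x 297 mm), uncompressed.

    Assumes 8-bit RGB (3 bytes per pixel); other image modes change the result.
    """
    width_px = round(210 / MM_PER_INCH * dpi)
    height_px = round(297 / MM_PER_INCH * dpi)
    return width_px * height_px * bytes_per_pixel


one_page = a4_page_bytes()
all_pages = 50 * one_page  # every page held in memory at once

print(f"one page: {one_page / 1e6:.1f} MB")   # one page: 26.1 MB
print(f"50 pages: {all_pages / 1e9:.2f} GB")  # 50 pages: 1.30 GB
```

Under these assumptions, rendering a single page per loop iteration keeps the peak near the one-page figure, which agrees with the ~26 MB quoted in the entry.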
@@ -44,6 +44,10 @@ python -m pytest tests/ -q
 
 128 tests in `tests/`. No integration tests for Flask routes or live M365/Google connections.
 
+**Local-file scan fixtures** — `tests/fixtures/local_files/` holds 13 documents for manual/UI-level testing of the file scanner. 10 should be flagged; 3 are true negatives. All CPR numbers verified against `is_valid_cpr`. `generate_fixtures.py` (requires `python-docx` + `openpyxl`, already in venv) regenerates the binary `.docx`/`.xlsx` files.
+
+**`_CPR_PREFIX_NOISE` in `.docx` fixtures** — `scan_docx` builds a single string by concatenating all run texts with no separators between paragraphs. If a CPR value run is immediately followed by text from the next paragraph without a word boundary, `\b` in `CPR_PATTERN` fails and the number is silently missed. The fixture generator appends a trailing `" "` to every value run so CPRs are always surrounded by word boundaries after concatenation. Do not remove this trailing space — the detection will silently regress.
+
 ## Viewer mode (#33) — routes/viewer.py + static/js/viewer.js
 
 Read-only access for DPOs and reviewers. Key invariants:
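The word-boundary problem described in the `_CPR_PREFIX_NOISE` note can be reproduced with a few lines of Python. This is a minimal sketch: `CPR_LIKE` is an assumed stand-in for the project's `CPR_PATTERN` (the real pattern may differ), and `join_runs` is a hypothetical helper that only imitates the separator-free concatenation attributed to `scan_docx`.

```python
import re

# Assumed stand-in for CPR_PATTERN: six digits, a hyphen, four digits,
# with a word boundary on each side.
CPR_LIKE = re.compile(r"\b\d{6}-\d{4}\b")


def join_runs(runs):
    """Concatenate run texts with no separator between them."""
    return "".join(runs)


# Value run without a trailing space: "9" is followed directly by "K",
# both are word characters, so the closing \b cannot match.
without_space = join_runs(["CPR-nummer: ", "010172-1019", "Klasse: ", "8B"])

# Value run with the trailing space the fixture generator appends.
with_space = join_runs(["CPR-nummer: ", "010172-1019 ", "Klasse: ", "8B "])

print(CPR_LIKE.findall(without_space))  # []
print(CPR_LIKE.findall(with_space))     # ['010172-1019']
```

The same number is missed or found depending only on the trailing space, which is why the generator keeps `value + " "`.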
README.md (22 changed lines)
@@ -609,6 +609,28 @@ Each new module (`cpr_detector.py`, `app_config.py`, `checkpoint.py`, `gdpr_db.p
 
 The test suite should be run before every release and after any change to `document_scanner.py`, `cpr_detector.py`, or `gdpr_db.py`. CPR detection is the legal core of the tool — a false negative means a real GDPR violation goes undetected.
 
+#### Local-file scan fixtures
+
+`tests/fixtures/local_files/` provides 13 hand-crafted documents for end-to-end testing of the file scanner via the UI or `file_scanner.py`. Drop the folder as a local source and run a scan — all 10 PII-bearing files should be flagged and all 3 negative-case files should produce zero hits.
+
+| File | Format | Expected | Scenario |
+|---|---|---|---|
+| `01_cpr_with_context_label.txt` | TXT | Flag | CPR with explicit `CPR-nummer:` label |
+| `02_cpr_mod11_valid_bare.txt` | TXT | Flag | mod-11–valid CPR without any context keyword |
+| `03_cpr_post2007_with_context.txt` | TXT | Flag | Post-2007 birth (fails mod-11), detected via `Personnummer:` keyword |
+| `04_multiple_cprs.txt` | TXT | Flag | 3 distinct CPR numbers in one staff-records file |
+| `05_student_register.csv` | CSV | Flag | 8 students incl. one protected-address (day+40) CPR |
+| `06_employee_list.csv` | CSV | Flag | 5 employees with CPRs |
+| `07_protected_number.txt` | TXT | Flag | Protected CPR (`410172-1200`, day+40 encoding) |
+| `08_mixed_pii.txt` | TXT | Flag | CPR + email + phone + GDPR Art. 9 health category |
+| `09_cpr_in_docx.docx` | DOCX | Flag | 2 CPRs in a Word document (paragraph format) |
+| `10_clean_no_pii.txt` | TXT | **No flag** | Meeting minutes — no personal data |
+| `11_false_positive_invoice.txt` | TXT | **No flag** | Invoice: CPR-shaped numbers suppressed by `faktura`/`varenr` context |
+| `12_post2007_no_context.txt` | TXT | **No flag** | Equipment serial that looks like a post-2007 CPR but has no context keyword |
+| `13_cpr_in_xlsx.xlsx` | XLSX | Flag | Excel workbook with two sheets: students + employees |
+
+All CPR numbers are mathematically valid (verified against `is_valid_cpr`). Run `generate_fixtures.py` inside the venv to regenerate the `.docx` and `.xlsx` binary files after any changes.
+
 ### Roadmap
 
 See [SUGGESTIONS.md](SUGGESTIONS.md) for the full feature roadmap with implementation status.
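For readers unfamiliar with the "mod-11" wording in the fixture table, the sketch below applies the standard modulus-11 checksum to some of the fixture numbers. It is not the project's `is_valid_cpr`, which is not shown in this commit and presumably also handles dates, post-2007 numbers and context keywords. `mod11_ok` and `WEIGHTS` are names introduced here for illustration.

```python
# Standard modulus-11 weights for the ten digits of a CPR number.
WEIGHTS = (4, 3, 2, 7, 6, 5, 4, 3, 2, 1)


def mod11_ok(cpr: str) -> bool:
    """Return True if the weighted digit sum is divisible by 11.

    Checks the checksum only; it does not validate the date part.
    """
    digits = [int(c) for c in cpr if c.isdigit()]
    if len(digits) != 10:
        return False
    return sum(w * d for w, d in zip(WEIGHTS, digits)) % 11 == 0


for number in ("010172-1019", "280490-0120", "410172-1200", "150315-4321"):
    print(number, mod11_ok(number))
# 010172-1019 True
# 280490-0120 True
# 410172-1200 True
# 150315-4321 False
```

The last number is the one used in fixtures 03 and 12. It fails the checksum, which is consistent with the table describing it as detectable only through a context keyword.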
@@ -1016,6 +1016,10 @@ window._autoConnectSSEIfRunning = _autoConnectSSEIfRunning;
 window._loadViewerResults = _loadViewerResults;
 window.executeBulkDelete = executeBulkDelete;
 window.applyFilters = applyFilters;
+window.toggleSelectMode = toggleSelectMode;
+window.toggleCardSelect = toggleCardSelect;
+window.selectAllVisible = selectAllVisible;
+window.applyBulkDisposition = applyBulkDisposition;
 window.exportExcel = exportExcel;
 window.exportArticle30 = exportArticle30;
 window.clearFilters = clearFilters;
@@ -320,16 +320,16 @@ function _attachScanListeners(source) {
     var fill = document.getElementById('progressFill_' + src);
     if (fill) fill.style.width = pct + '%';
     document.getElementById('progressFile').textContent = d.file || '';
-    // Only update stats/ETA from M365 (has meaningful totals and ETA)
+    var statsEl = document.getElementById('progressStats');
+    var etaEl = document.getElementById('progressEta');
     if (src === 'm365') {
-      var statsEl = document.getElementById('progressStats');
-      if (statsEl && d.total) {
-        statsEl.textContent = (d.index || 0) + ' / ' + d.total;
-      }
-      var etaEl = document.getElementById('progressEta');
-      if (etaEl && d.eta !== undefined) {
-        etaEl.textContent = d.eta ? ('ETA ' + d.eta) : '';
-      }
+      // M365 sends index + total + ETA — show exact counter
+      if (statsEl && d.total) statsEl.textContent = (d.index || 0) + ' / ' + d.total;
+      if (etaEl && d.eta !== undefined) etaEl.textContent = d.eta ? ('ETA ' + d.eta) : '';
+    } else if (!S._m365ScanRunning) {
+      // Google / file: no total known upfront — show running count once M365 is done
+      if (statsEl && d.scanned !== undefined) statsEl.textContent = d.scanned + ' scanned';
+      if (etaEl) etaEl.textContent = '';
     }
   });
   source.addEventListener('scan_file', function(e) {
@@ -369,6 +369,13 @@ function _attachScanListeners(source) {
     S._m365ScanRunning = false;
     _renderProgressSegments();
     var _anyRunning = S._googleScanRunning || S._fileScanRunning;
+    // Clear M365 counter/ETA so Google/file progress can take over the display
+    if (_anyRunning) {
+      var _se = document.getElementById('progressStats');
+      var _ee = document.getElementById('progressEta');
+      if (_se) _se.textContent = '';
+      if (_ee) _ee.textContent = '';
+    }
     // Only close SSE once all concurrent scans have finished.
     // Closing early would drop google_scan_done / file_scan_done events and
     // leave the UI stuck in scanning state.
tests/fixtures/local_files/01_cpr_with_context_label.txt (19 changed lines, vendored, normal file)
@@ -0,0 +1,19 @@
+Personoplysninger — Elevakt
+===========================
+
+Elevens navn: Lars Bjerregaard Nielsen
+Klasse: 8B
+Skole: Gudenaaskolen
+
+CPR-nummer: 010172-1019
+Fødselsdato: 1. januar 1972
+Adresse: Skolevej 14, 8680 Ry
+Telefon: +45 86 89 12 34
+E-mail: lars.nielsen@privat.dk
+
+Notater:
+Eleven har haft fravær i uge 12 og 14. Forældrene er kontaktet.
+Der er afholdt møde den 3. marts 2024 med klasselærer og skoleleder.
+
+Underskrift: _______________________
+Dato: ___________________
tests/fixtures/local_files/02_cpr_mod11_valid_bare.txt (15 changed lines, vendored, normal file)
@@ -0,0 +1,15 @@
+Besøgslog — Sundhedscenter Skanderborg
+=======================================
+
+Dato: 28. april 2024
+Sagsbehandler: M. Andersen
+
+Borger: Hanne Kirstine Pedersen
+Registreringsnummer: 280490-0120
+Henvendelse vedrørende: Sygedagpenge, paragraf 7 opfølgning
+
+Samtalen fandt sted kl. 10:15 og varede 45 minutter.
+Borger mødte op til tiden og var forberedt.
+
+Aftale om næste møde: 26. maj 2024 kl. 10:00
+Sted: Mødelokale 3, Adelgade 44, 8660 Skanderborg
tests/fixtures/local_files/03_cpr_post2007_with_context.txt (24 changed lines, vendored, normal file)
@@ -0,0 +1,24 @@
+Tilmelding til SFO — Gudenaaskolen
+===================================
+
+Barnets navn: Emma Sofie Christensen
+Personnummer: 150315-4321
+Klasse: 1A (skolestart august 2022)
+
+Forældrenes oplysninger
+-----------------------
+Forældrenes navn: Søren og Pia Christensen
+Adresse: Birkevej 7, 8680 Ry
+Telefon: +45 23 45 67 89
+E-mail: soeren.christensen@familie.dk
+
+Fremmødetider valgt:
+Morgen-SFO: 07:00–08:00
+Eftermiddag: 13:00–17:00
+
+Særlige oplysninger til pædagoger:
+Emma har en lettere nøddeallergi (jordnødder og cashewnødder).
+Kontaktperson ved allergi: Pia Christensen, tlf. 23 45 67 89
+
+Dato for tilmelding: 15. marts 2022
+Underskrift: _______________________
tests/fixtures/local_files/04_multiple_cprs.txt (31 changed lines, vendored, normal file)
@@ -0,0 +1,31 @@
+Personalemappe — Fortroligt
+============================
+Afdeling: Administrationen, Skanderborg Kommune
+
+Medarbejder 1
+-------------
+Navn: Christian Bøgh Hansen
+CPR: 150365-1102
+Stilling: Skoleleder
+Ansættelsesdato: 1. august 2005
+Løngruppe: L4
+
+Medarbejder 2
+-------------
+Navn: Lise Ravn Johansen
+CPR: 020898-0203
+Stilling: Pædagog, fuldtid
+Ansættelsesdato: 15. september 2021
+Løngruppe: L2
+
+Medarbejder 3
+-------------
+Navn: Anders Munk Mortensen
+CPR: 010172-1019
+Stilling: Administrativ medarbejder
+Ansættelsesdato: 1. marts 2010
+Løngruppe: L3
+
+Dokument oprettet: 20. april 2026
+Sidst opdateret: 20. april 2026
+Udarbejdet af: HR-afdelingen
tests/fixtures/local_files/05_student_register.csv (9 changed lines, vendored, normal file)
@@ -0,0 +1,9 @@
+Klasse,Navn,CPR-nummer,Adresse,Forælder tlf,Bemærkninger
+7A,Magnus Lund Eriksen,010172-1019,Egevej 3 8680 Ry,+45 40 12 34 56,
+7A,Nora Bjerrum Nielsen,280490-0120,Møllevej 11 8680 Ry,+45 50 23 45 67,Brillebærer
+7A,Oliver Skov Madsen,250372-0100,Kirkegade 2 8660 Skanderborg,+45 60 34 56 78,
+7A,Ida Holst Andersen,020898-0203,Skovbrynet 19 8680 Ry,+45 70 45 67 89,Kontaktperson: Far
+7B,Rasmus Dal Kristensen,150365-1102,Rosenvej 5 8680 Ry,+45 21 56 78 90,
+7B,Sofie Holm Thomsen,111111-1010,Birkevej 22 8660 Skanderborg,+45 31 67 89 01,Allergi: nødder
+7B,Emil Sand Jensen,010107-4102,Hybenvej 7 8680 Ry,+45 41 78 90 12,
+7B,Laura Bak Møller,410172-1200,Pilevej 4 8660 Skanderborg,+45 51 89 01 23,Beskyttet adresse
tests/fixtures/local_files/06_employee_list.csv (6 changed lines, vendored, normal file)
@@ -0,0 +1,6 @@
+Medarbejder-ID,Navn,Personnummer,Afdeling,Stilling,E-mail,Telefon,Ansættelses-dato
+EMP-001,Christian Bøgh Hansen,150365-1102,Ledelse,Skoleleder,c.hansen@gudenaaskolen.dk,+45 86 89 10 01,2005-08-01
+EMP-002,Mette Dahl Andersen,280490-0120,Administration,Sekretær,m.andersen@gudenaaskolen.dk,+45 86 89 10 02,2012-01-15
+EMP-003,Søren Lykke Jakobsen,010172-1019,Pædagogik,Lærer,s.jakobsen@gudenaaskolen.dk,+45 86 89 10 03,2009-08-01
+EMP-004,Hanne Frost Pedersen,250372-0100,Pædagogik,Lærer,h.pedersen@gudenaaskolen.dk,+45 86 89 10 04,2015-08-01
+EMP-005,Lise Ravn Johansen,020898-0203,SFO,Pædagog,l.johansen@gudenaaskolen.dk,+45 86 89 10 05,2021-09-15
tests/fixtures/local_files/07_protected_number.txt (16 changed lines, vendored, normal file)
@@ -0,0 +1,16 @@
+Fortrolig personoplysning — Navne- og adressebeskyttelse
+==========================================================
+
+VIGTIGT: Denne person har navne- og adressebeskyttelse i CPR-registeret.
+Oplysningerne må ikke videregives uden samtykke.
+
+Navn: Laura Bak Møller
+CPR-nummer: 410172-1200
+(Dag + 40 angiver beskyttet adresse)
+
+Kontaktoplysninger administreres af kommunen.
+Henvendelse via: Borgerservice, Skanderborg Kommune
+Telefon: 86 52 10 00
+
+Dokumentet er klassificeret FORTROLIGT.
+Opbevares i aflåst arkiv — ikke i fællesnetværk.
tests/fixtures/local_files/08_mixed_pii.txt (21 changed lines, vendored, normal file)
@@ -0,0 +1,21 @@
+Lægeerklæring — Helbredsattest
+================================
+Udstedt af: Skanderborg Lægepraksis, Adelgade 10, 8660 Skanderborg
+Praktiserende læge: Dr. P. Holm
+
+Patient: Søren Lykke Jakobsen
+Fødselsdato / CPR: 010172-1019
+Adresse: Skolevej 22, 8680 Ry
+Telefon: +45 22 33 44 55
+E-mail: soeren.jakobsen@privat.dk
+
+Diagnose (ICD-10): F41.1 — Generaliseret angst
+Behandling: Psykoterapi + medicinsk behandling (SSRI)
+Særlig kategori: Psykisk lidelse — GDPR Art. 9
+
+Erklæringens formål: Sygedagpenge, §7-opfølgning
+Periode: 1. april 2026 – 30. juni 2026
+
+Lægens underskrift: _______________________
+Dato: 20. april 2026
+Stempel: [Skanderborg Lægepraksis]
tests/fixtures/local_files/09_cpr_in_docx.docx (BIN, vendored, normal file)
Binary file not shown.
tests/fixtures/local_files/10_clean_no_pii.txt (25 changed lines, vendored, normal file)
@@ -0,0 +1,25 @@
+Mødereferat — Pædagogisk råd
+==============================
+Dato: 20. april 2026
+Sted: Personalerummet, Gudenaaskolen
+Ordstyrer: Skolelederen
+Referent: Administrationen
+
+Dagsorden:
+1. Godkendelse af referat fra seneste møde
+2. Orientering om skoleårets planlægning 2026/2027
+3. Status på inklusion og trivselsundersøgelse
+4. Eventuelt
+
+Ad 1: Referatet fra mødet den 15. marts 2026 blev godkendt uden bemærkninger.
+
+Ad 2: Skolelederen orienterede om planlægningen for det kommende skoleår.
+Skemaerne for 0.-9. klasse offentliggøres i Aula senest 1. juni 2026.
+Der er planlagt en fælles pædagogisk dag den 10. august 2026.
+
+Ad 3: Trivselsundersøgelsen viste generelt gode resultater.
+Inklusionsvejlederen præsenterer en handlingsplan på næste møde.
+
+Ad 4: Intet til eventuelt.
+
+Næste møde: Tirsdag den 19. maj 2026 kl. 14:00 i personalerummet.
tests/fixtures/local_files/11_false_positive_invoice.txt (31 changed lines, vendored, normal file)
@@ -0,0 +1,31 @@
+FAKTURA
+=======
+Leverandør: Kontor & Papir A/S
+Industriparken 22, 8600 Silkeborg
+CVR: 12345678
+
+Kunde: Gudenaaskolen
+Skolevej 1, 8680 Ry
+EAN: 5790001234567
+
+Fakturanr: 250372-0100
+Fakturadato: 20. april 2026
+Forfaldsdato: 20. maj 2026
+
+Ordrenr: 020898-0203
+Varenr: 150365-1102
+
+Linjer:
+---------------------------------------------------------------------------
+Beskrivelse Antal Enhedspris Moms Total
+---------------------------------------------------------------------------
+Kopipapir A4 80g, pk/500 20 89,00 kr 20% 2.136,00 kr
+Blækpatroner HP 305, sort 5 149,00 kr 20% 894,00 kr
+Whiteboardmarker, ass. farver 3 49,95 kr 20% 179,82 kr
+---------------------------------------------------------------------------
+Subtotal ekskl. moms: 2.561,95 kr
+Moms 25%: 640,49 kr
+I alt inkl. moms: 3.202,44 kr
+
+Betalingsbetingelser: Netto 30 dage
+Bank: Jyske Bank, Reg. 7600, Konto 1234567
tests/fixtures/local_files/12_post2007_no_context.txt (20 changed lines, vendored, normal file)
@@ -0,0 +1,20 @@
+Inventarliste — Klasselokale 7A
+================================
+Opdateret: 20. april 2026
+Af: Teknisk servicepersonale
+
+Rum-ID: 7A-GS-2026
+Lokale: Bygning C, 1. sal
+
+Inventar:
+---------
+Elevborde 32 stk (serienr. påtegnet under bordet)
+Elevstole 32 stk (standard, justerbar højde)
+Lærerbord 1 stk (inkl. skuff, lås medfølger)
+Whiteboard 2 stk (160×120 cm)
+Projektor 1 stk (Epson EB-W51, serienr. 150315-4321)
+Projektordug 1 stk (180 cm, motor-betjent)
+Gardinmotor 2 stk (fjernstyret)
+
+Næste serviceeftersyn: Oktober 2026
+Ansvarlig: Teknisk afdeling, Skanderborg Kommune
tests/fixtures/local_files/13_cpr_in_xlsx.xlsx (BIN, vendored, normal file)
Binary file not shown.
tests/fixtures/local_files/generate_fixtures.py (154 changed lines, vendored, normal file)
@@ -0,0 +1,154 @@
+"""
+Generate binary fixture files for the local-file GDPR scan test suite.
+
+Run from repo root:
+    source venv/bin/activate
+    python tests/fixtures/local_files/generate_fixtures.py
+"""
+from pathlib import Path
+import sys
+
+HERE = Path(__file__).parent
+
+def _require(pkg):
+    try:
+        return __import__(pkg)
+    except ImportError:
+        print(f"Missing: {pkg} → pip install {pkg}", file=sys.stderr)
+        sys.exit(1)
+
+openpyxl = _require("openpyxl")
+docx = _require("docx")
+
+from openpyxl import Workbook
+from openpyxl.styles import Font, PatternFill, Alignment
+from docx import Document
+from docx.shared import Pt, RGBColor
+from docx.enum.text import WD_ALIGN_PARAGRAPH
+
+
+# ── 09_cpr_in_docx.docx ───────────────────────────────────────────────────────
+def make_docx():
+    doc = Document()
+
+    doc.add_heading("Elevjournal — Gudenaaskolen", level=1)
+
+    p = doc.add_paragraph()
+    p.add_run("Dette dokument indeholder personoplysninger og er fortroligt.")
+    p.runs[0].italic = True
+
+    doc.add_heading("Elevoplysninger", level=2)
+    # Use labelled paragraphs so CPR values are always preceded by ": " —
+    # avoids the _CPR_PREFIX_NOISE guard that fires when table-cell runs are
+    # concatenated without a separator.
+    fields = [
+        ("Navn", "Magnus Lund Eriksen"),
+        ("CPR-nummer", "010172-1019"),
+        ("Klasse", "8B"),
+        ("Adresse", "Egevej 3, 8680 Ry"),
+        ("Telefon", "+45 40 12 34 56"),
+        ("E-mail", "magnus.eriksen@elev.gudenaaskolen.dk"),
+    ]
+    for label, value in fields:
+        p = doc.add_paragraph()
+        run_label = p.add_run(f"{label}: ")
+        run_label.bold = True
+        p.add_run(value + " ")
+
+    doc.add_heading("Forældrekontakt", level=2)
+    doc.add_paragraph(
+        "Forældrene er orienteret om elevens situation den 15. marts 2026. "
+        "Begge forældre deltog i mødet. Næste opfølgning er planlagt til "
+        "maj 2026."
+    )
+
+    doc.add_heading("Anden elev — tabel", level=2)
+    doc.add_paragraph(
+        "Nedenstående tabel viser en anden elev, der deler klasse med Magnus."
+    )
+    for label, value in [
+        ("Navn", "Nora Bjerrum Nielsen"),
+        ("Personnummer", "280490-0120"),
+        ("Klasse", "8B"),
+    ]:
+        p = doc.add_paragraph()
+        p.add_run(f"{label}: ").bold = True
+        p.add_run(value + " ")
+
+    doc.add_heading("Sagsbehandlernote", level=2)
+    doc.add_paragraph(
+        "Sagsbehandler: M. Andersen\n"
+        "Dato: 20. april 2026\n"
+        "Der er ikke fundet grundlag for yderligere foranstaltninger."
+    )
+
+    out = HERE / "09_cpr_in_docx.docx"
+    doc.save(str(out))
+    print(f"Written: {out.name}")
+
+
+# ── 13_cpr_in_xlsx.xlsx ───────────────────────────────────────────────────────
+def make_xlsx():
+    wb = Workbook()
+
+    # Sheet 1: Elevliste
+    ws1 = wb.active
+    ws1.title = "Elevliste"
+
+    header_font = Font(bold=True, color="FFFFFF")
+    header_fill = PatternFill("solid", fgColor="2B5F9E")
+
+    headers = ["Klasse", "Navn", "CPR-nummer", "Adresse", "Forælder tlf", "Bemærkninger"]
+    for col, h in enumerate(headers, 1):
+        cell = ws1.cell(row=1, column=col, value=h)
+        cell.font = header_font
+        cell.fill = header_fill
+        cell.alignment = Alignment(horizontal="center")
+
+    students = [
+        ("7A", "Magnus Lund Eriksen", "010172-1019", "Egevej 3, 8680 Ry", "+45 40 12 34 56", ""),
+        ("7A", "Nora Bjerrum Nielsen", "280490-0120", "Møllevej 11, 8680 Ry", "+45 50 23 45 67", "Brillebærer"),
+        ("7A", "Oliver Skov Madsen", "250372-0100", "Kirkegade 2, 8660 Skanderborg", "+45 60 34 56 78", ""),
+        ("7B", "Rasmus Dal Kristensen", "150365-1102", "Rosenvej 5, 8680 Ry", "+45 21 56 78 90", ""),
+        ("7B", "Sofie Holm Thomsen", "111111-1010", "Birkevej 22, 8660 Skanderborg", "+45 31 67 89 01", "Allergi: nødder"),
+        ("7B", "Emil Sand Jensen", "010107-4102", "Hybenvej 7, 8680 Ry", "+45 41 78 90 12", ""),
+    ]
+    for row_i, row_data in enumerate(students, 2):
+        for col_i, val in enumerate(row_data, 1):
+            ws1.cell(row=row_i, column=col_i, value=val)
+
+    for col in ws1.columns:
+        max_len = max(len(str(c.value or "")) for c in col)
+        ws1.column_dimensions[col[0].column_letter].width = max_len + 4
+
+    # Sheet 2: Medarbejdere
+    ws2 = wb.create_sheet("Medarbejdere")
+    emp_headers = ["ID", "Navn", "Personnummer", "Afdeling", "E-mail"]
+    for col, h in enumerate(emp_headers, 1):
+        cell = ws2.cell(row=1, column=col, value=h)
+        cell.font = header_font
+        cell.fill = header_fill
+        cell.alignment = Alignment(horizontal="center")
+
+    employees = [
+        ("EMP-001", "Christian Bøgh Hansen", "150365-1102", "Ledelse", "c.hansen@gudenaaskolen.dk"),
+        ("EMP-002", "Mette Dahl Andersen", "280490-0120", "Administration", "m.andersen@gudenaaskolen.dk"),
+        ("EMP-003", "Søren Lykke Jakobsen", "010172-1019", "Pædagogik", "s.jakobsen@gudenaaskolen.dk"),
+    ]
+    for row_i, row_data in enumerate(employees, 2):
+        for col_i, val in enumerate(row_data, 1):
+            ws2.cell(row=row_i, column=col_i, value=val)
+
+    for col in ws2.columns:
+        max_len = max(len(str(c.value or "")) for c in col)
+        ws2.column_dimensions[col[0].column_letter].width = max_len + 4
+
+    out = HERE / "13_cpr_in_xlsx.xlsx"
+    wb.save(str(out))
+    print(f"Written: {out.name}")
+
+
+if __name__ == "__main__":
+    make_docx()
+    make_xlsx()
+    print("Done.")