diff --git a/CHANGELOG.md b/CHANGELOG.md index ab45fbd..d033886 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -11,6 +11,8 @@ Version numbers follow [Semantic Versioning](https://semver.org/spec/v2.0.0.html ### Added +- **Role filter in results + role-scoped exports** — a new **Role** dropdown in the filter bar (All roles / Ansatte / Elever) narrows the results grid to staff or student items. Clicking **Excel** or **Art.30** while a role is selected exports only that group — the `?role=student|staff` param is forwarded to both export endpoints. `_build_excel_bytes()` and `_build_article30_docx()` now accept a `role` param; all internal sheets (GPS, External transfers, Art.30 staff/student tables) respect the filter. Filenames get an `_elever` or `_ansatte` suffix. + - **Scan filter options for student environments** — two new profile options reduce noise when scanning student accounts: - **Ignore GPS in images** (`skip_gps_images`) — images whose only PII signal is an embedded GPS coordinate are not flagged. Smartphones embed location in every camera photo by default, generating large numbers of low-priority flags in school contexts. GPS data is still extracted and shown in the detail card when the image is flagged by another signal (faces, EXIF author/comment). Applies to M365, Google, and file scans. - **Min. CPR count per file** (`min_cpr_count`, default 1) — a file is only flagged if it contains at least this many *distinct* CPR numbers. Set to 2 to avoid reporting a student's own consent form or registration document (one CPR) while still flagging class lists and grade sheets with multiple students' CPRs. Deduplication is by value — a CPR repeated 10 times counts as 1 distinct number. Applies to M365, Google, and file scans. diff --git a/CLAUDE.md b/CLAUDE.md index a53dc7d..5a3cf02 100644 --- a/CLAUDE.md +++ b/CLAUDE.md @@ -86,6 +86,7 @@ Large M365 tenants can generate enormous memory pressure. 
Key rules to preserve: - **`GDPRDb.get_session_sources()`** — returns a `set` of source-key strings (e.g. `{"gmail", "gdrive", "email"}`) for every scan in the current session window. Used by both `_build_excel_bytes()` and `_build_article30_docx()` to include zero-hit sources in summary tables. Do not derive the scanned-source set from `by_source` alone — that dict only contains sources with flagged items. - **Excel Summary sheet vs. per-source tabs** — the Summary sheet shows all scanned sources (even with 0 items). Per-source tabs are only created for sources with items; an empty tab has no value. - **ART.30 breakdown table** — iterates `scanned_sources` (not `by_source`) so Gmail, Google Drive, etc. appear with `0 | 0 | 0 | —` when the scan found nothing. +- **Role-filtered exports** — `_build_excel_bytes(role='')` and `_build_article30_docx(role='')` accept `role='student'` or `role='staff'`. A local `_items` list is built at the top of each function and used everywhere instead of `state.flagged_items` directly — GPS sheet, External transfers sheet, and Art.30 staff/student tables all see only the filtered subset. Route handlers read `request.args.get('role', '')` and forward it. Filenames get `_elever` / `_ansatte` suffix. The `#filterRole` dropdown in the filter bar drives both the client-side grid filter and the export URL param — do not separate them. ## SSE teardown — static/js/scan.js diff --git a/README.md b/README.md index c03bfb2..736f5e7 100644 --- a/README.md +++ b/README.md @@ -145,14 +145,17 @@ Each flagged item appears as a card showing: - **Ext.** / **** badge — external email recipient or externally shared file (Art. 44–46 transfer risk) - **delete button** — appears on hover (grid view) or always visible (list view) -**Filter bar** — always visible above both the results grid and the preview panel. 
Narrow results by source, disposition, transfer risk, and risk level: +**Filter bar** — always visible above both the results grid and the preview panel. Narrow results by source, disposition, transfer risk, risk level, and role: | Filter | Options | |---|---| | Source | All / Email / OneDrive / SharePoint / Teams | | Disposition | All / Unreviewed / Retain (legal/legitimate/contract) / Delete-scheduled / Deleted | | Transfer risk | All / External recipient / External share / Shared | -| Risk level | All risk levels / Art. 9 special category / Photos / biometric | +| Risk level | All risk levels / Art. 9 special category / Photos / biometric | +| **Role** | **All roles / Ansatte (staff) / Elever (students)** | + +The Role filter also scopes exports — selecting **Elever** before clicking **Excel** or **Art.30** produces a report containing only student items. The exported filename gets an `_elever` or `_ansatte` suffix so recipients can distinguish the files. #### Delete items diff --git a/docs/manuals/MANUAL-DA.md b/docs/manuals/MANUAL-DA.md index 2ed905c..9741735 100644 --- a/docs/manuals/MANUAL-DA.md +++ b/docs/manuals/MANUAL-DA.md @@ -226,6 +226,7 @@ Brug filterbjælken over resultaterne til at indsnævre visningen: - **Disposition** — vis elementer efter gennemgangsstatus. - **Deling** — filtrer på delt / ekstern / alle. - **Risiko** — vis kun Art. 9, fotos, GPS eller høj-risiko-elementer. +- **Rolle** — vis kun **Ansatte** eller **Elever**. Påvirker også eksporten: klikker du på **Excel** eller **Art.30**, mens en rolle er valgt, indeholder rapporten kun den pågældende gruppe, og filnavnet får suffikset `_elever` eller `_ansatte`. --- diff --git a/docs/manuals/MANUAL-EN.md b/docs/manuals/MANUAL-EN.md index 2927b40..ab095c9 100644 --- a/docs/manuals/MANUAL-EN.md +++ b/docs/manuals/MANUAL-EN.md @@ -226,6 +226,7 @@ Use the filter bar above the results to narrow down what you see: - **Disposition dropdown** — show items by their review status. 
- **Transfer dropdown** — filter by shared / external / all. - **Risk dropdown** — show only Art. 9, photos, GPS, or high-risk items. +- **Role dropdown** — show only **Ansatte** (staff) or **Elever** (students). Also scopes exports: clicking **Excel** or **Art.30** while a role is selected produces a report containing only that group, with `_elever` or `_ansatte` appended to the filename. --- diff --git a/lang/da.json b/lang/da.json index 698c5fe..491f61d 100644 --- a/lang/da.json +++ b/lang/da.json @@ -564,6 +564,9 @@ "m365_opt_min_cpr": "Min. CPR-antal pr. fil", "m365_opt_min_cpr_hint": "Filer med færre distinkte CPR-numre end denne tærskel rapporteres ikke. Sæt til 2 for at undgå falske positive, når elever har egne CPR-numre i filer.", "m365_filter_photo_only": "📷 Billeder / biometrisk", + "m365_filter_all_roles": "Alle roller", + "m365_filter_staff": "Ansatte", + "m365_filter_student": "Elever", "m365_badge_faces": "ansigter", "a30_photo_items": "Billeder med registrerede ansigter (Art. 9 biometrisk)", "a30_photo_note": "Fotografier af identificerbare personer er biometriske data i henhold til Art. 9 GDPR. Opbevaring kræver et dokumenteret retsgrundlag i henhold til Art. 9(2). For skolefotografier af elever under 15 år er forældrenes samtykke påkrævet (Databeskyttelsesloven §6). Se Datatilsynets vejledning om fotografering i skoler.", diff --git a/lang/de.json b/lang/de.json index d4ba788..aab9bea 100644 --- a/lang/de.json +++ b/lang/de.json @@ -564,6 +564,9 @@ "m365_opt_min_cpr": "Min. CPR-Anzahl pro Datei", "m365_opt_min_cpr_hint": "Dateien mit weniger eindeutigen CPR-Nummern als dieser Schwellenwert werden nicht gemeldet. 
Auf 2 setzen, um Falsch-Positive zu vermeiden, wenn Schüler eigene CPR-Nummern in Dateien haben.", "m365_filter_photo_only": "📷 Fotos / biometrisch", + "m365_filter_all_roles": "Alle Rollen", + "m365_filter_staff": "Personal", + "m365_filter_student": "Schüler", "m365_badge_faces": "Gesichter", "a30_photo_items": "Fotos mit erkannten Gesichtern (Art. 9 biometrisch)", "a30_photo_note": "Fotografien identifizierbarer Personen sind biometrische Daten gemäß Art. 9 DSGVO. Die Aufbewahrung erfordert eine dokumentierte Rechtsgrundlage gemäß Art. 9(2). Für Schulfotos von Schülern unter 15 Jahren ist die elterliche Einwilligung erforderlich (Databeskyttelsesloven §6). Siehe Leitfaden des Datatilsynet zur Schulfotografie.", diff --git a/lang/en.json b/lang/en.json index 600cbc6..62ed6aa 100644 --- a/lang/en.json +++ b/lang/en.json @@ -564,6 +564,9 @@ "m365_opt_min_cpr": "Min. CPR count per file", "m365_opt_min_cpr_hint": "Files with fewer distinct CPR numbers than this threshold are not reported. Set to 2 to avoid false positives when students have their own CPR in documents.", "m365_filter_photo_only": "📷 Photos / biometric", + "m365_filter_all_roles": "All roles", + "m365_filter_staff": "Staff", + "m365_filter_student": "Students", "m365_badge_faces": "faces", "a30_photo_items": "Photos with detected faces (Art. 9 biometric)", "a30_photo_note": "Photographs of identifiable persons are biometric data under Art. 9 GDPR. Retention requires a documented legal basis under Art. 9(2). For school photographs of pupils under 15, parental consent is required (Databeskyttelsesloven §6). 
See Datatilsynet guidance on school photography.", diff --git a/routes/export.py b/routes/export.py index 7ee9d5f..4e26cd1 100644 --- a/routes/export.py +++ b/routes/export.py @@ -24,9 +24,10 @@ bp = Blueprint("export", __name__) logger = logging.getLogger(__name__) -def _build_excel_bytes() -> tuple[bytes, str]: +def _build_excel_bytes(role: str = "") -> tuple[bytes, str]: """Build the M365 scan Excel workbook and return (bytes, filename). - Raises on error. Used by export_excel() and send_report().""" + Raises on error. Used by export_excel() and send_report(). + role: '' = all, 'student' = students only, 'staff' = staff + other.""" from openpyxl import Workbook from openpyxl.styles import Font, PatternFill, Alignment, Border, Side from openpyxl.utils import get_column_letter @@ -131,11 +132,20 @@ def _build_excel_bytes() -> tuple[bytes, str]: ws.auto_filter.ref = f"A1:{get_column_letter(len(COLS))}1" + # Apply role filter — '' means all roles + if role == "student": + _items = [i for i in state.flagged_items if i.get("user_role") == "student"] + elif role == "staff": + _items = [i for i in state.flagged_items if i.get("user_role") != "student"] + else: + _items = list(state.flagged_items) + wb = Workbook() ws_sum = wb.active ws_sum.title = "Summary" ws_sum.sheet_properties.tabColor = "1F3864" - ws_sum["A1"] = "GDPRScanner — Export" + _role_label = {"student": " — Elever", "staff": " — Ansatte"}.get(role, "") + ws_sum["A1"] = f"GDPRScanner — Export{_role_label}" ws_sum["A1"].font = Font(name="Arial", bold=True, size=14, color=HEADER_FG) ws_sum["A1"].fill = _fill(HEADER_BG) ws_sum.merge_cells("A1:D1") @@ -146,8 +156,8 @@ def _build_excel_bytes() -> tuple[bytes, str]: ws_sum["A2"] = "Generated:" ws_sum["B2"] = _dt.datetime.now().strftime("%Y-%m-%d %H:%M") ws_sum["A3"] = "Total flagged items:" - ws_sum["B3"] = len(state.flagged_items) - gps_count = sum(1 for i in state.flagged_items if (i.get("exif") or {}).get("gps")) + ws_sum["B3"] = len(_items) + gps_count = 
sum(1 for i in _items if (i.get("exif") or {}).get("gps")) if gps_count: ws_sum["A4"] = "Items with GPS data:" ws_sum["B4"] = gps_count @@ -168,7 +178,7 @@ def _build_excel_bytes() -> tuple[bytes, str]: ws_sum.column_dimensions["C"].width = 16 by_source: dict = {} - for item in state.flagged_items: + for item in _items: by_source.setdefault(item.get("source_type", "other"), []).append(item) # Determine which sources were actually scanned (even if they found nothing) @@ -204,7 +214,7 @@ def _build_excel_bytes() -> tuple[bytes, str]: _write_sheet(wb.create_sheet(title=clean_label), items, tab_bg) # GPS items sheet - gps_items = [i for i in state.flagged_items if (i.get("exif") or {}).get("gps")] + gps_items = [i for i in _items if (i.get("exif") or {}).get("gps")] if gps_items: ws_gps = wb.create_sheet(title="GPS locations") ws_gps.sheet_properties.tabColor = "1A7A6E" @@ -242,7 +252,7 @@ def _build_excel_bytes() -> tuple[bytes, str]: ws_gps.auto_filter.ref = f"A1:{get_column_letter(len(GPS_COLS))}1" # External transfers sheet - ext_items = [i for i in state.flagged_items + ext_items = [i for i in _items if i.get("transfer_risk") in ("external-recipient", "external-share", "shared")] if ext_items: ws_ext = wb.create_sheet(title="External transfers") @@ -258,8 +268,11 @@ def _build_excel_bytes() -> tuple[bytes, str]: buf = io.BytesIO() wb.save(buf) buf.seek(0) - fname = f"m365_scan_{_dt.datetime.now().strftime('%Y%m%d_%H%M%S')}.xlsx" + _role_suffix = {"student": "_elever", "staff": "_ansatte"}.get(role, "") + fname = f"m365_scan{_role_suffix}_{_dt.datetime.now().strftime('%Y%m%d_%H%M%S')}.xlsx" return buf.read(), fname + + @bp.route("/api/export_excel") def export_excel(): """Export flagged items as an Excel workbook with per-source tabs.""" @@ -275,8 +288,9 @@ def export_excel(): state.flagged_items[:] = db_items except Exception: pass + role = request.args.get("role", "") try: - xl_bytes, fname = _build_excel_bytes() + xl_bytes, fname = _build_excel_bytes(role=role) 
return Response( xl_bytes, mimetype="application/vnd.openxmlformats-officedocument.spreadsheetml.sheet", @@ -292,9 +306,10 @@ def export_excel(): # ── Article 30 report ───────────────────────────────────────────────────────── -def _build_article30_docx() -> tuple[bytes, str]: +def _build_article30_docx(role: str = "") -> tuple[bytes, str]: """Generate a GDPR Article 30 Register of Processing Activities as .docx. - Returns (bytes, filename). Strings are translated using the active state.LANG dict.""" + Returns (bytes, filename). Strings are translated using the active state.LANG dict. + role: '' = all, 'student' = students only, 'staff' = staff + other.""" try: from docx import Document as _Document from docx.shared import Pt, RGBColor, Inches, Cm @@ -314,6 +329,10 @@ def _build_article30_docx() -> tuple[bytes, str]: db = _get_db() if DB_OK else None stats = db.get_stats() if db else {} items = db.get_session_items() if db else list(state.flagged_items) + if role == "student": + items = [i for i in items if i.get("user_role") == "student"] + elif role == "staff": + items = [i for i in items if i.get("user_role") != "student"] trend = db.get_trend(10) if db else [] overdue = db.get_overdue_items(5) if db else [] @@ -357,7 +376,8 @@ def _build_article30_docx() -> tuple[bytes, str]: now_str = _dt.datetime.now().strftime("%Y-%m-%d %H:%M") date_str = _dt.datetime.now().strftime("%Y-%m-%d") - fname = f"article30_{date_str}.docx" + _role_suffix = {"student": "_elever", "staff": "_ansatte"}.get(role, "") + fname = f"article30{_role_suffix}_{date_str}.docx" # Aggregate by source by_source: dict = {} @@ -1121,7 +1141,8 @@ def export_article30(): if not state.flagged_items: return jsonify({"error": "No results to export — run a scan first"}), 400 try: - docx_bytes, fname = _build_article30_docx() + role = request.args.get("role", "") + docx_bytes, fname = _build_article30_docx(role=role) return Response( docx_bytes, 
mimetype="application/vnd.openxmlformats-officedocument.wordprocessingml.document", diff --git a/static/js/results.js b/static/js/results.js index 655da00..7844e2a 100644 --- a/static/js/results.js +++ b/static/js/results.js @@ -669,6 +669,7 @@ function applyFilters() { const dispVal = document.getElementById('filterDisposition')?.value || ''; const transferVal = document.getElementById('filterTransfer')?.value || ''; const specialVal = document.getElementById('filterSpecial')?.value || ''; + const roleVal = document.getElementById('filterRole')?.value || ''; S.filteredData = S.flaggedData.filter(f => { if (search && !f.name.toLowerCase().includes(search)) return false; if (srcVal && f.source_type !== srcVal) return false; @@ -676,6 +677,8 @@ function applyFilters() { if (transferVal && (f.transfer_risk || '') !== transferVal) return false; if (specialVal === '1' && !(f.special_category && f.special_category.length)) return false; if (specialVal === 'photo' && !(f.face_count > 0)) return false; + if (roleVal === 'student' && f.user_role !== 'student') return false; + if (roleVal === 'staff' && f.user_role === 'student') return false; return true; }); const grid = document.getElementById('grid'); @@ -721,7 +724,8 @@ async function exportExcel() { return; } // Browser / localhost fallback: fetch as blob and trigger download - const r = await fetch('/api/export_excel'); + const _roleParam = document.getElementById('filterRole')?.value || ''; + const r = await fetch('/api/export_excel' + (_roleParam ? 
'?role=' + encodeURIComponent(_roleParam) : '')); if (!r.ok) { const err = await r.json().catch(() => ({error: 'Export failed'})); log('Export error: ' + (err.error || r.status), 'err'); @@ -762,7 +766,8 @@ async function exportArticle30() { const btn = document.getElementById('exportA30Btn'); if (btn) { btn.disabled = true; btn.textContent = '⏳'; } try { - const r = await fetch('/api/export_article30'); + const _roleParam30 = document.getElementById('filterRole')?.value || ''; + const r = await fetch('/api/export_article30' + (_roleParam30 ? '?role=' + encodeURIComponent(_roleParam30) : '')); if (!r.ok) { const err = await r.json().catch(() => ({error: 'Export failed'})); log('Article 30 export error: ' + (err.error || r.status), 'err'); @@ -796,6 +801,8 @@ function clearFilters() { if (ft) ft.value = ''; const fs = document.getElementById('filterSpecial'); if (fs) fs.value = ''; + const fr = document.getElementById('filterRole'); + if (fr) fr.value = ''; applyFilters(); } diff --git a/templates/index.html b/templates/index.html index 5527440..56f2ba8 100644 --- a/templates/index.html +++ b/templates/index.html @@ -361,6 +361,11 @@ document.addEventListener('DOMContentLoaded', applyI18n); +
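The role semantics that this diff inlines in `_build_excel_bytes()` and `_build_article30_docx()` can be sketched in isolation. This is a minimal sketch, not the project's code: `filter_by_role`, `export_filename`, and `ROLE_SUFFIX` are hypothetical names introduced here for illustration, and the sample items are invented. It assumes, as the diff shows, that `staff` means "anything whose `user_role` is not `student`" and that any other `role` value leaves the item list unfiltered.

```python
# Hypothetical helper names; the diff inlines this logic in each export function.
ROLE_SUFFIX = {"student": "_elever", "staff": "_ansatte"}


def filter_by_role(items, role=""):
    """Return the items visible for a role; '' or an unknown value keeps all."""
    if role == "student":
        return [i for i in items if i.get("user_role") == "student"]
    if role == "staff":
        # Mirrors the diff: "staff" is everything that is not a student,
        # including items with a missing or unrecognised user_role.
        return [i for i in items if i.get("user_role") != "student"]
    return list(items)


def export_filename(prefix, stamp, ext, role=""):
    """Build '<prefix><role suffix>_<stamp>.<ext>'; no suffix for other roles."""
    return f"{prefix}{ROLE_SUFFIX.get(role, '')}_{stamp}.{ext}"


# Invented sample data for illustration only.
sample = [
    {"name": "a.docx", "user_role": "student"},
    {"name": "b.xlsx", "user_role": "staff"},
    {"name": "c.pdf"},  # no user_role recorded
]

students = filter_by_role(sample, "student")
staff = filter_by_role(sample, "staff")
everyone = filter_by_role(sample, "")
```

Two consequences follow from this shape. An item with no `user_role` lands in the staff export, which matches the `'staff' = staff + other` wording in the docstrings. And because the route handlers forward `request.args.get("role", "")` as-is, an unexpected value such as `?role=teacher` would produce an unfiltered export with no filename suffix.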