feat: role filter in results grid + role-scoped Excel and Art.30 exports

- New Role dropdown in filter bar (All / Ansatte / Elever) — filters the
  results grid client-side via applyFilters() and clearFilters().

- Exports respect the active role: exportExcel() and exportArticle30()
  append ?role=student|staff to the fetch URL when a role is selected.

- _build_excel_bytes(role='') and _build_article30_docx(role='') filter
  to a local _items list at the top; all internal sheets (Summary, GPS,
  External transfers, Art.30 staff/student tables) see only the filtered
  subset. Filenames get _elever or _ansatte suffix.

- i18n: m365_filter_all_roles / m365_filter_staff / m365_filter_student
  added to en/da/de.json.

- CLAUDE.md, README.md, CHANGELOG.md, MANUAL-EN.md, MANUAL-DA.md updated.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
StyxX65 2026-04-12 09:02:52 +02:00
parent 28c9effd17
commit 0c35a7a83d
11 changed files with 68 additions and 18 deletions


@ -11,6 +11,8 @@ Version numbers follow [Semantic Versioning](https://semver.org/spec/v2.0.0.html
### Added
- **Role filter in results + role-scoped exports** — a new **Role** dropdown in the filter bar (All roles / Ansatte / Elever) narrows the results grid to staff or student items. Clicking **Excel** or **Art.30** while a role is selected exports only that group — the `?role=student|staff` param is forwarded to both export endpoints. `_build_excel_bytes()` and `_build_article30_docx()` now accept a `role` param; all internal sheets (GPS, External transfers, Art.30 staff/student tables) respect the filter. Filenames get an `_elever` or `_ansatte` suffix.
- **Scan filter options for student environments** — two new profile options reduce noise when scanning student accounts:
- **Ignore GPS in images** (`skip_gps_images`) — images whose only PII signal is an embedded GPS coordinate are not flagged. Smartphones embed location in every camera photo by default, generating large numbers of low-priority flags in school contexts. GPS data is still extracted and shown in the detail card when the image is flagged by another signal (faces, EXIF author/comment). Applies to M365, Google, and file scans.
- **Min. CPR count per file** (`min_cpr_count`, default 1) — a file is only flagged if it contains at least this many *distinct* CPR numbers. Set to 2 to avoid reporting a student's own consent form or registration document (one CPR) while still flagging class lists and grade sheets with multiple students' CPRs. Deduplication is by value — a CPR repeated 10 times counts as 1 distinct number. Applies to M365, Google, and file scans.
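
The distinct-count rule for `min_cpr_count` can be illustrated with a minimal sketch. The regex, the hyphen normalisation, and the helper names `distinct_cpr_count` and `should_flag` are assumptions made for illustration; the scanner's actual detector is not part of this diff.

```python
import re

# Hypothetical pattern: six digits, optional hyphen, four digits (DDMMYY-XXXX).
# The real detector is not shown in this commit and may also validate dates.
CPR_RE = re.compile(r"\b(\d{6})-?(\d{4})\b")


def distinct_cpr_count(text: str) -> int:
    """Count distinct CPR-like values.

    Assumption: '010190-1234' and '0101901234' are treated as the same value.
    """
    return len({m.group(1) + m.group(2) for m in CPR_RE.finditer(text)})


def should_flag(text: str, min_cpr_count: int = 1) -> bool:
    """Flag only when the number of distinct CPR values reaches the threshold."""
    return distinct_cpr_count(text) >= min_cpr_count


# One CPR repeated three times: counts as a single distinct value.
consent_form = "Elev: 010190-1234. CPR igen: 010190-1234, 0101901234."
# Three different CPRs, as in a class list.
class_list = "010190-1234\n020291-5678\n030392-9012"

print(should_flag(consent_form, min_cpr_count=2))
print(should_flag(class_list, min_cpr_count=2))
```

With the threshold at 2, the repeated single CPR stays below the limit while the class list reaches it; with the default of 1, both would be flagged.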


@ -86,6 +86,7 @@ Large M365 tenants can generate enormous memory pressure. Key rules to preserve:
- **`GDPRDb.get_session_sources()`** — returns a `set` of source-key strings (e.g. `{"gmail", "gdrive", "email"}`) for every scan in the current session window. Used by both `_build_excel_bytes()` and `_build_article30_docx()` to include zero-hit sources in summary tables. Do not derive the scanned-source set from `by_source` alone — that dict only contains sources with flagged items.
- **Excel Summary sheet vs. per-source tabs** — the Summary sheet shows all scanned sources (even with 0 items). Per-source tabs are only created for sources with items; an empty tab has no value.
- **ART.30 breakdown table** — iterates `scanned_sources` (not `by_source`) so Gmail, Google Drive, etc. appear with `0 | 0 | 0 | —` when the scan found nothing.
- **Role-filtered exports** — `_build_excel_bytes(role='')` and `_build_article30_docx(role='')` accept `role='student'` or `role='staff'`. A local `_items` list is built at the top of each function and used everywhere instead of `state.flagged_items` directly — GPS sheet, External transfers sheet, and Art.30 staff/student tables all see only the filtered subset. Route handlers read `request.args.get('role', '')` and forward it. Filenames get `_elever` / `_ansatte` suffix. The `#filterRole` dropdown in the filter bar drives both the client-side grid filter and the export URL param — do not separate them.
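
The role filter described in the last bullet can be sketched as a standalone function. This mirrors the comparison logic visible in the export hunks of this commit, but the helper name `filter_by_role` and the sample items are hypothetical. Note that `staff` means "everything that is not a student", so items without a `user_role` end up in the staff export.

```python
def filter_by_role(items: list[dict], role: str = "") -> list[dict]:
    """Return the subset of flagged items matching the requested role.

    'student' keeps only items whose user_role is exactly 'student'.
    'staff' keeps everything else, including items with no user_role.
    Any other value (including '') returns a copy of the full list.
    """
    if role == "student":
        return [i for i in items if i.get("user_role") == "student"]
    if role == "staff":
        return [i for i in items if i.get("user_role") != "student"]
    return list(items)


# Hypothetical sample data for illustration only.
flagged = [
    {"name": "klasseliste.xlsx", "user_role": "student"},
    {"name": "loenseddel.pdf", "user_role": "staff"},
    {"name": "shared_notes.docx"},  # no user_role recorded
]

students = filter_by_role(flagged, "student")
staff = filter_by_role(flagged, "staff")
print([i["name"] for i in students])
print([i["name"] for i in staff])
```

An unrecognised role value falls through to the unfiltered copy, which matches the `else` branch in `_build_excel_bytes()`.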
## SSE teardown — static/js/scan.js


@ -145,14 +145,17 @@ Each flagged item appears as a card showing:
- **Ext.** / **** badge — external email recipient or externally shared file (Art. 44–46 transfer risk)
- **delete button** — appears on hover (grid view) or always visible (list view)
**Filter bar** — always visible above both the results grid and the preview panel. Narrow results by source, disposition, transfer risk, and risk level:
**Filter bar** — always visible above both the results grid and the preview panel. Narrow results by source, disposition, transfer risk, risk level, and role:
| Filter | Options |
|---|---|
| Source | All / Email / OneDrive / SharePoint / Teams |
| Disposition | All / Unreviewed / Retain (legal/legitimate/contract) / Delete-scheduled / Deleted |
| Transfer risk | All / External recipient / External share / Shared |
| Risk level | All risk levels / Art. 9 special category / Photos / biometric |
| Risk level | All risk levels / Art. 9 special category / Photos / biometric |
| **Role** | **All roles / Ansatte (staff) / Elever (students)** |
The Role filter also scopes exports — selecting **Elever** before clicking **Excel** or **Art.30** produces a report containing only student items. The exported filename gets an `_elever` or `_ansatte` suffix so recipients can distinguish the files.
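
As a rough illustration of the filename suffix, the sketch below reproduces the two filename shapes that appear in the export code of this commit. The helper `export_filename` and the fixed timestamp are assumptions for the example; the real code builds the names inline.

```python
import datetime as dt

# Same mapping as in the export code; an empty or unknown role adds no suffix.
ROLE_SUFFIX = {"student": "_elever", "staff": "_ansatte"}


def export_filename(prefix: str, role: str, stamp: str, ext: str) -> str:
    """Build '<prefix><role suffix>_<stamp>.<ext>' (hypothetical helper)."""
    return f"{prefix}{ROLE_SUFFIX.get(role, '')}_{stamp}.{ext}"


when = dt.datetime(2026, 4, 12, 9, 2, 52)  # fixed timestamp for a stable example
excel_name = export_filename("m365_scan", "student", when.strftime("%Y%m%d_%H%M%S"), "xlsx")
a30_name = export_filename("article30", "staff", when.strftime("%Y-%m-%d"), "docx")
all_name = export_filename("m365_scan", "", when.strftime("%Y%m%d_%H%M%S"), "xlsx")

print(excel_name)  # m365_scan_elever_20260412_090252.xlsx
print(a30_name)    # article30_ansatte_2026-04-12.docx
print(all_name)    # m365_scan_20260412_090252.xlsx
```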
#### Delete items


@ -226,6 +226,7 @@ Brug filterbjælken over resultaterne til at indsnævre visningen:
- **Disposition** — vis elementer efter gennemgangsstatus.
- **Deling** — filtrer på delt / ekstern / alle.
- **Risiko** — vis kun Art. 9, fotos, GPS eller høj-risiko-elementer.
- **Rolle** — vis kun **Ansatte** eller **Elever**. Påvirker også eksporten: klikker du på **Excel** eller **Art.30**, mens en rolle er valgt, indeholder rapporten kun den pågældende gruppe, og filnavnet får suffikset `_elever` eller `_ansatte`.
---


@ -226,6 +226,7 @@ Use the filter bar above the results to narrow down what you see:
- **Disposition dropdown** — show items by their review status.
- **Transfer dropdown** — filter by shared / external / all.
- **Risk dropdown** — show only Art. 9, photos, GPS, or high-risk items.
- **Role dropdown** — show only **Ansatte** (staff) or **Elever** (students). Also scopes exports: clicking **Excel** or **Art.30** while a role is selected produces a report containing only that group, with `_elever` or `_ansatte` appended to the filename.
---


@ -564,6 +564,9 @@
"m365_opt_min_cpr": "Min. CPR-antal pr. fil",
"m365_opt_min_cpr_hint": "Filer med færre distinkte CPR-numre end denne tærskel rapporteres ikke. Sæt til 2 for at undgå falske positive, når elever har egne CPR-numre i filer.",
"m365_filter_photo_only": "📷 Billeder / biometrisk",
"m365_filter_all_roles": "Alle roller",
"m365_filter_staff": "Ansatte",
"m365_filter_student": "Elever",
"m365_badge_faces": "ansigter",
"a30_photo_items": "Billeder med registrerede ansigter (Art. 9 biometrisk)",
"a30_photo_note": "Fotografier af identificerbare personer er biometriske data i henhold til Art. 9 GDPR. Opbevaring kræver et dokumenteret retsgrundlag i henhold til Art. 9(2). For skolefotografier af elever under 15 år er forældrenes samtykke påkrævet (Databeskyttelsesloven §6). Se Datatilsynets vejledning om fotografering i skoler.",


@ -564,6 +564,9 @@
"m365_opt_min_cpr": "Min. CPR-Anzahl pro Datei",
"m365_opt_min_cpr_hint": "Dateien mit weniger eindeutigen CPR-Nummern als dieser Schwellenwert werden nicht gemeldet. Auf 2 setzen, um Falsch-Positive zu vermeiden, wenn Schüler eigene CPR-Nummern in Dateien haben.",
"m365_filter_photo_only": "📷 Fotos / biometrisch",
"m365_filter_all_roles": "Alle Rollen",
"m365_filter_staff": "Personal",
"m365_filter_student": "Schüler",
"m365_badge_faces": "Gesichter",
"a30_photo_items": "Fotos mit erkannten Gesichtern (Art. 9 biometrisch)",
"a30_photo_note": "Fotografien identifizierbarer Personen sind biometrische Daten gemäß Art. 9 DSGVO. Die Aufbewahrung erfordert eine dokumentierte Rechtsgrundlage gemäß Art. 9(2). Für Schulfotos von Schülern unter 15 Jahren ist die elterliche Einwilligung erforderlich (Databeskyttelsesloven §6). Siehe Leitfaden des Datatilsynet zur Schulfotografie.",


@ -564,6 +564,9 @@
"m365_opt_min_cpr": "Min. CPR count per file",
"m365_opt_min_cpr_hint": "Files with fewer distinct CPR numbers than this threshold are not reported. Set to 2 to avoid false positives when students have their own CPR in documents.",
"m365_filter_photo_only": "📷 Photos / biometric",
"m365_filter_all_roles": "All roles",
"m365_filter_staff": "Staff",
"m365_filter_student": "Students",
"m365_badge_faces": "faces",
"a30_photo_items": "Photos with detected faces (Art. 9 biometric)",
"a30_photo_note": "Photographs of identifiable persons are biometric data under Art. 9 GDPR. Retention requires a documented legal basis under Art. 9(2). For school photographs of pupils under 15, parental consent is required (Databeskyttelsesloven §6). See Datatilsynet guidance on school photography.",


@ -24,9 +24,10 @@ bp = Blueprint("export", __name__)
logger = logging.getLogger(__name__)
def _build_excel_bytes() -> tuple[bytes, str]:
def _build_excel_bytes(role: str = "") -> tuple[bytes, str]:
"""Build the M365 scan Excel workbook and return (bytes, filename).
Raises on error. Used by export_excel() and send_report()."""
Raises on error. Used by export_excel() and send_report().
role: '' = all, 'student' = students only, 'staff' = staff + other."""
from openpyxl import Workbook
from openpyxl.styles import Font, PatternFill, Alignment, Border, Side
from openpyxl.utils import get_column_letter
@ -131,11 +132,20 @@ def _build_excel_bytes() -> tuple[bytes, str]:
ws.auto_filter.ref = f"A1:{get_column_letter(len(COLS))}1"
# Apply role filter — '' means all roles
if role == "student":
_items = [i for i in state.flagged_items if i.get("user_role") == "student"]
elif role == "staff":
_items = [i for i in state.flagged_items if i.get("user_role") != "student"]
else:
_items = list(state.flagged_items)
wb = Workbook()
ws_sum = wb.active
ws_sum.title = "Summary"
ws_sum.sheet_properties.tabColor = "1F3864"
ws_sum["A1"] = "GDPRScanner — Export"
_role_label = {"student": " — Elever", "staff": " — Ansatte"}.get(role, "")
ws_sum["A1"] = f"GDPRScanner — Export{_role_label}"
ws_sum["A1"].font = Font(name="Arial", bold=True, size=14, color=HEADER_FG)
ws_sum["A1"].fill = _fill(HEADER_BG)
ws_sum.merge_cells("A1:D1")
@ -146,8 +156,8 @@ def _build_excel_bytes() -> tuple[bytes, str]:
ws_sum["A2"] = "Generated:"
ws_sum["B2"] = _dt.datetime.now().strftime("%Y-%m-%d %H:%M")
ws_sum["A3"] = "Total flagged items:"
ws_sum["B3"] = len(state.flagged_items)
gps_count = sum(1 for i in state.flagged_items if (i.get("exif") or {}).get("gps"))
ws_sum["B3"] = len(_items)
gps_count = sum(1 for i in _items if (i.get("exif") or {}).get("gps"))
if gps_count:
ws_sum["A4"] = "Items with GPS data:"
ws_sum["B4"] = gps_count
@ -168,7 +178,7 @@ def _build_excel_bytes() -> tuple[bytes, str]:
ws_sum.column_dimensions["C"].width = 16
by_source: dict = {}
for item in state.flagged_items:
for item in _items:
by_source.setdefault(item.get("source_type", "other"), []).append(item)
# Determine which sources were actually scanned (even if they found nothing)
@ -204,7 +214,7 @@ def _build_excel_bytes() -> tuple[bytes, str]:
_write_sheet(wb.create_sheet(title=clean_label), items, tab_bg)
# GPS items sheet
gps_items = [i for i in state.flagged_items if (i.get("exif") or {}).get("gps")]
gps_items = [i for i in _items if (i.get("exif") or {}).get("gps")]
if gps_items:
ws_gps = wb.create_sheet(title="GPS locations")
ws_gps.sheet_properties.tabColor = "1A7A6E"
@ -242,7 +252,7 @@ def _build_excel_bytes() -> tuple[bytes, str]:
ws_gps.auto_filter.ref = f"A1:{get_column_letter(len(GPS_COLS))}1"
# External transfers sheet
ext_items = [i for i in state.flagged_items
ext_items = [i for i in _items
if i.get("transfer_risk") in ("external-recipient", "external-share", "shared")]
if ext_items:
ws_ext = wb.create_sheet(title="External transfers")
@ -258,8 +268,11 @@ def _build_excel_bytes() -> tuple[bytes, str]:
buf = io.BytesIO()
wb.save(buf)
buf.seek(0)
fname = f"m365_scan_{_dt.datetime.now().strftime('%Y%m%d_%H%M%S')}.xlsx"
_role_suffix = {"student": "_elever", "staff": "_ansatte"}.get(role, "")
fname = f"m365_scan{_role_suffix}_{_dt.datetime.now().strftime('%Y%m%d_%H%M%S')}.xlsx"
return buf.read(), fname
@bp.route("/api/export_excel")
def export_excel():
"""Export flagged items as an Excel workbook with per-source tabs."""
@ -275,8 +288,9 @@ def export_excel():
state.flagged_items[:] = db_items
except Exception:
pass
role = request.args.get("role", "")
try:
xl_bytes, fname = _build_excel_bytes()
xl_bytes, fname = _build_excel_bytes(role=role)
return Response(
xl_bytes,
mimetype="application/vnd.openxmlformats-officedocument.spreadsheetml.sheet",
@ -292,9 +306,10 @@ def export_excel():
# ── Article 30 report ─────────────────────────────────────────────────────────
def _build_article30_docx() -> tuple[bytes, str]:
def _build_article30_docx(role: str = "") -> tuple[bytes, str]:
"""Generate a GDPR Article 30 Register of Processing Activities as .docx.
Returns (bytes, filename). Strings are translated using the active state.LANG dict."""
Returns (bytes, filename). Strings are translated using the active state.LANG dict.
role: '' = all, 'student' = students only, 'staff' = staff + other."""
try:
from docx import Document as _Document
from docx.shared import Pt, RGBColor, Inches, Cm
@ -314,6 +329,10 @@ def _build_article30_docx() -> tuple[bytes, str]:
db = _get_db() if DB_OK else None
stats = db.get_stats() if db else {}
items = db.get_session_items() if db else list(state.flagged_items)
if role == "student":
items = [i for i in items if i.get("user_role") == "student"]
elif role == "staff":
items = [i for i in items if i.get("user_role") != "student"]
trend = db.get_trend(10) if db else []
overdue = db.get_overdue_items(5) if db else []
@ -357,7 +376,8 @@ def _build_article30_docx() -> tuple[bytes, str]:
now_str = _dt.datetime.now().strftime("%Y-%m-%d %H:%M")
date_str = _dt.datetime.now().strftime("%Y-%m-%d")
fname = f"article30_{date_str}.docx"
_role_suffix = {"student": "_elever", "staff": "_ansatte"}.get(role, "")
fname = f"article30{_role_suffix}_{date_str}.docx"
# Aggregate by source
by_source: dict = {}
@ -1121,7 +1141,8 @@ def export_article30():
if not state.flagged_items:
return jsonify({"error": "No results to export — run a scan first"}), 400
try:
docx_bytes, fname = _build_article30_docx()
role = request.args.get("role", "")
docx_bytes, fname = _build_article30_docx(role=role)
return Response(
docx_bytes,
mimetype="application/vnd.openxmlformats-officedocument.wordprocessingml.document",


@ -669,6 +669,7 @@ function applyFilters() {
const dispVal = document.getElementById('filterDisposition')?.value || '';
const transferVal = document.getElementById('filterTransfer')?.value || '';
const specialVal = document.getElementById('filterSpecial')?.value || '';
const roleVal = document.getElementById('filterRole')?.value || '';
S.filteredData = S.flaggedData.filter(f => {
if (search && !f.name.toLowerCase().includes(search)) return false;
if (srcVal && f.source_type !== srcVal) return false;
@ -676,6 +677,8 @@ function applyFilters() {
if (transferVal && (f.transfer_risk || '') !== transferVal) return false;
if (specialVal === '1' && !(f.special_category && f.special_category.length)) return false;
if (specialVal === 'photo' && !(f.face_count > 0)) return false;
if (roleVal === 'student' && f.user_role !== 'student') return false;
if (roleVal === 'staff' && f.user_role === 'student') return false;
return true;
});
const grid = document.getElementById('grid');
@ -721,7 +724,8 @@ async function exportExcel() {
return;
}
// Browser / localhost fallback: fetch as blob and trigger download
const r = await fetch('/api/export_excel');
const _roleParam = document.getElementById('filterRole')?.value || '';
const r = await fetch('/api/export_excel' + (_roleParam ? '?role=' + encodeURIComponent(_roleParam) : ''));
if (!r.ok) {
const err = await r.json().catch(() => ({error: 'Export failed'}));
log('Export error: ' + (err.error || r.status), 'err');
@ -762,7 +766,8 @@ async function exportArticle30() {
const btn = document.getElementById('exportA30Btn');
if (btn) { btn.disabled = true; btn.textContent = '⏳'; }
try {
const r = await fetch('/api/export_article30');
const _roleParam30 = document.getElementById('filterRole')?.value || '';
const r = await fetch('/api/export_article30' + (_roleParam30 ? '?role=' + encodeURIComponent(_roleParam30) : ''));
if (!r.ok) {
const err = await r.json().catch(() => ({error: 'Export failed'}));
log('Article 30 export error: ' + (err.error || r.status), 'err');
@ -796,6 +801,8 @@ function clearFilters() {
if (ft) ft.value = '';
const fs = document.getElementById('filterSpecial');
if (fs) fs.value = '';
const fr = document.getElementById('filterRole');
if (fr) fr.value = '';
applyFilters();
}
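
For scripted use outside the browser, the URL shape that `exportExcel()` and `exportArticle30()` produce could be built in Python roughly as follows. The base URL (host and port) and the helper name `export_url` are assumptions; only the endpoint paths and the `role` parameter come from this commit.

```python
from urllib.parse import urlencode


def export_url(endpoint: str, role: str = "",
               base: str = "http://localhost:5000") -> str:
    """Append ?role=<value> only when a role is selected.

    'base' is a hypothetical address; adjust it to wherever the app runs.
    """
    query = urlencode({"role": role}) if role else ""
    return f"{base}{endpoint}" + (f"?{query}" if query else "")


print(export_url("/api/export_excel"))
print(export_url("/api/export_excel", "student"))
print(export_url("/api/export_article30", "staff"))
```

Omitting the parameter entirely when no role is selected matches the client code, and the server treats a missing `role` the same as an empty string.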


@ -361,6 +361,11 @@ document.addEventListener('DOMContentLoaded', applyI18n);
<option value="1" data-i18n="m365_filter_special_only">⚠ Art. 9 only</option>
<option value="photo" data-i18n="m365_filter_photo_only">📷 Photos / biometric</option>
</select>
<select id="filterRole" onchange="applyFilters()" style="width:120px">
<option value="" data-i18n="m365_filter_all_roles">All roles</option>
<option value="staff" data-i18n="m365_filter_staff">Ansatte</option>
<option value="student" data-i18n="m365_filter_student">Elever</option>
</select>
<button class="filter-clear" onclick="clearFilters()" data-i18n="m365_filter_clear">Ryd</button>
<div class="spacer"></div>
<button id="exportBtn" onclick="exportExcel()" style="background:none;border:1px solid var(--border);color:var(--muted)" data-i18n="m365_btn_export_excel" title="Export results as Excel">Excel</button>