19 Commits

Author SHA1 Message Date
StyxX65
8a446509c6 Hide landing/last-scan card whenever results render
The live scan_file_flagged handler showed the grid but never hid
#emptyState / #lastScanSummary, so when a scan ran with the landing
card visible, results appeared underneath it until a manual refresh
(which re-ran loadOpenItems and cleared it). Hide both panels in
renderGrid whenever files are present, covering every render path
(live SSE, open-items load, history, filters).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-22 14:45:50 +02:00
StyxX65
b661a94f98 Restore user/group badges on DB-loaded result cards
The card badge only rendered when f.account_name was set, and the
group (role) badge was nested inside that same check. But save_item
never persisted account_name — only account_id (a GUID) and user_role.
Live SSE cards carried account_name so badges showed during a scan;
now that the grid loads finalized scans from the DB, the gap is exposed
and both badges vanish for earlier scans.

- Persist account_name (migration 11 + save_item) so future scans show
  the user badge. Both M365 and Google cards already carry it.
- _accountPill() in results.js drives the group badge off user_role
  alone (shows for legacy rows) and resolves a best-effort user label:
  account_name → S._allUsers (id/email) → email-style account_id → omit.
  Both card layouts share the one helper.

Legacy rows still lack account_name (never captured), but now show the
group badge and a resolved/email user label where possible.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-22 10:15:19 +02:00
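The label fallback chain described in b661a94f98 can be sketched as below. The real helper, _accountPill(), lives in results.js; this Python rendition is illustrative only. The function name `resolve_user_label`, the shape of the user list (dicts with `id`, `email`, `name`), and the email pattern are all assumptions, not taken from the project.

```python
import re
from typing import Optional

# Loose "looks like an email address" check; an assumption, not the project's rule.
_EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def resolve_user_label(item: dict, all_users: list) -> Optional[str]:
    """Return a best-effort user label for a result card, or None to omit it."""
    # 1. Prefer the persisted display name (rows saved after migration 11).
    name = item.get("account_name")
    if name:
        return name

    account_id = item.get("account_id") or ""

    # 2. Look the account up in the known-users list by id or email.
    for user in all_users:
        if account_id and account_id in (user.get("id"), user.get("email")):
            label = user.get("name") or user.get("email")
            if label:
                return label

    # 3. Fall back to the account_id itself when it is email-shaped.
    if _EMAIL_RE.match(account_id):
        return account_id

    # 4. Nothing usable (for example a bare GUID): omit the user badge.
    return None
```

The group badge is independent of this chain: per the commit, it is driven by user_role alone, so legacy rows still show it even when this function returns None.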
StyxX65
f84c8516df Reliably restore last session on refresh after a server restart
The page-load restore was one-shot and bailed when a completed scan's
replayed scan_phase left a running flag set; sse_replay_done (the other
retry) only fires for a non-empty replay buffer, which is empty after a
restart — so refreshing post-update showed a blank grid despite the
results being in the DB. The watchdog now retries the restore on each
4s poll while nothing is shown and no scan runs, clearing stale flags
first. /api/scan/status also reports google_running separately so a
refresh during a live Google scan is no longer treated as idle.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-16 11:53:07 +02:00
StyxX65
95f1f39a1f Keep data-subject-deleted cards in grid until next scan
Apply the keep-until-next-scan behaviour to deleteSubjectItems: mark the
deleted items _deleted (using deleted_ids from the response) and keep them
greyed in the grid instead of filtering them out. Also fixes a latent bug
where renderGrid() was called with no argument and threw on files.forEach,
which the surrounding try/catch swallowed as a false "Delete failed" after a
successful erasure.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-10 11:47:52 +02:00
StyxX65
386831c423 Keep bulk-deleted cards in grid until next scan
Extend the keep-until-next-scan behaviour to the bulk delete modal: instead
of removing matched cards on success, mark them _deleted and keep them greyed
with a "🗑 Deleted" badge and hidden buttons. /api/delete_bulk now returns
deleted_ids so the grid marks exactly the items the server actually deleted —
partial failures stay active and re-deletable. Already-handled (_deleted /
_redacted) items are excluded from the bulk-delete match set so they aren't
re-counted or re-processed.

201 tests pass.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-10 11:46:14 +02:00
StyxX65
ed3c3a80d6 Keep deleted cards in grid until next scan
Mirror the redact behaviour for the card delete button (🗑): instead of
removing the card on success, mark the item _deleted and keep it in the grid
— greyed via card-resolved, shown with a red "🗑 Deleted" badge, action
buttons hidden so it can't be re-processed. The grid is rebuilt on the next
scan run, clearing the markers. results.js only — no server change.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-10 11:44:10 +02:00
StyxX65
7c1c2b390d Keep selected card in view when opening preview
Opening the preview panel narrows .grid-area and reflows the auto-fill grid
to fewer columns, moving the clicked card to a new row. The single-frame
scrollIntoView ran while the browser's scroll-anchoring re-adjusted scrollTop
mid-reflow, so the card scrolled out of view. Disable scroll anchoring on
.grid-area (overflow-anchor:none) and defer the scroll by two animation
frames against the settled layout, centring the card (block:'center').

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-10 11:35:04 +02:00
StyxX65
d82a0d6004 Keep redacted cards in grid until next scan
Redacting a card (✏) previously removed it from the grid and from
S.flaggedData/S.filteredData immediately. Now the item is marked _redacted
and kept: greyed via card-resolved styling, shown with a "✏ Redacted" badge,
and its delete/redact buttons hidden so it can't be re-processed. The grid is
rebuilt on the next scan run, which clears the markers. results.js only — no
server change.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-10 11:30:41 +02:00
StyxX65
b6d2915d49 Harden XSS escaping and encrypt Claude API key at rest
- results.js: add esc() helper and apply to all scan-derived fields
  (name, account_name, folder, source, modified, label, img alt) across
  card/list/preview/subject-lookup/related views. Scan-derived strings can
  carry attacker-controlled markup (e.g. a OneDrive file named with HTML),
  so they must be escaped before innerHTML/attribute embedding. Also escape
  the related-docs onclick JSON to match the delete/redact &quot; pattern.
- cpr_detector._placeholder_svg: escape label/name before embedding — served
  as image/svg+xml via /api/thumb?name=, so an unescaped value was a
  reflected-XSS vector when the URL is opened directly.
- cpr_detector: remove 44-line unreachable duplicate of the face-detection
  body left inside _extract_audio_metadata after its return.
- app_config: encrypt claude_api_key at rest with the machine-keyed Fernet
  (same as the SMTP password); add get_claude_api_key() for decryption.
  Legacy plaintext keys still read and are re-encrypted on next save.
  Update readers in document_scanner.py and routes/app_routes.py.

201 tests pass.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-10 11:06:36 +02:00
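The SVG placeholder fix in b6d2915d49 amounts to escaping request-derived strings before embedding them in markup. A minimal sketch, assuming a simplified placeholder layout: the function name `placeholder_svg` and the SVG structure are hypothetical and do not reproduce cpr_detector._placeholder_svg.

```python
from html import escape


def placeholder_svg(label: str, name: str) -> str:
    """Build a small placeholder SVG with label and file name safely embedded."""
    # escape() converts &, < and >, and with quote=True also both quote
    # characters, so the values cannot break out of text nodes or attributes.
    safe_label = escape(label, quote=True)
    safe_name = escape(name, quote=True)
    return (
        '<svg xmlns="http://www.w3.org/2000/svg" width="160" height="120">'
        f"<title>{safe_name}</title>"
        '<rect width="100%" height="100%" fill="#eee"/>'
        f'<text x="80" y="64" text-anchor="middle">{safe_label}</text>'
        "</svg>"
    )
```

Because the response is served as image/svg+xml, a browser opening the URL directly would run any script element that survived unescaped, which is why the escaping has to happen on the server rather than in results.js.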
StyxX65
034ced943e Extended document redaction to Google Drive, SFTP, SMB, and local PDFs
Extends the ✂ in-place redaction feature beyond local DOCX/XLSX/CSV/TXT files
to cover all remaining file source types and adds PDF support for local files.
2026-05-28 17:47:02 +02:00
StyxX65
23b9555dcf Built-in file redaction for local files
2026-05-27 14:49:06 +02:00
StyxX65
78fb406422 Fixed two bugs: selected cards staying visible after preview opens, and stale history results showing when a new scan starts.
2026-04-29 15:18:58 +02:00
StyxX65
d84e57239a Add CPR cross-referencing (related documents)
Clicking any flagged card that contains CPR hits now shows a "Related documents"
section in the preview panel, listing other items from the same scan session
that share at least one CPR number. Items are ordered by number of shared CPRs;
clicking any entry opens it in the preview panel. Works in both live mode and
scan history mode.

Implementation
- GDPRDb.get_related_items(): SQL self-join on the existing cpr_index table
  using the same symmetric 300 s session window as get_session_items. No new
  data collection needed.
- GET /api/db/related/<item_id>?ref=N: new endpoint in routes/database.py,
  consistent with the ?ref convention used by /api/db/flagged.
- #previewRelated div injected between the metadata block and disposition row
  in the preview panel.
- _loadRelated(f) in results.js fetches and renders the list;
  window._openRelated() resolves items from the live grid or falls back to the
  API response for history-mode items.

Also
- Added keyword/FTS5 search as a deferred idea in SUGGESTIONS.md
- Updated CHANGELOG.md, README.md, and CLAUDE.md
2026-04-25 21:15:50 +02:00
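The self-join behind get_related_items() in d84e57239a might look roughly like the sketch below. The schema is an assumption: the commit names only the cpr_index table, so the columns (`item_id`, `cpr_hash`), the `items` table with an integer `scanned_at`, and the way the 300 s window is anchored on the reference item's timestamp are all hypothetical.

```python
import sqlite3

# Pair every index row of the reference item with rows of other items that
# carry the same CPR value, then count the distinct shared values per item.
RELATED_SQL = """
SELECT other.item_id, COUNT(DISTINCT other.cpr_hash) AS shared
FROM cpr_index AS mine
JOIN cpr_index AS other
  ON other.cpr_hash = mine.cpr_hash AND other.item_id <> mine.item_id
JOIN items AS a ON a.id = mine.item_id
JOIN items AS b ON b.id = other.item_id
WHERE mine.item_id = ?
  AND ABS(b.scanned_at - a.scanned_at) <= 300  -- symmetric session window
GROUP BY other.item_id
ORDER BY shared DESC, other.item_id
"""


def get_related_items(conn: sqlite3.Connection, item_id: str) -> list:
    """Return (item_id, shared_count) rows, most shared CPRs first."""
    return conn.execute(RELATED_SQL, (item_id,)).fetchall()


def demo_connection() -> sqlite3.Connection:
    """In-memory database with a few rows for trying the query out."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE items (id TEXT PRIMARY KEY, scanned_at INTEGER);
        CREATE TABLE cpr_index (item_id TEXT, cpr_hash TEXT);
        """
    )
    conn.executemany(
        "INSERT INTO items VALUES (?, ?)",
        [("a", 1000), ("b", 1010), ("c", 1020), ("d", 9000)],
    )
    conn.executemany(
        "INSERT INTO cpr_index VALUES (?, ?)",
        [("a", "h1"), ("a", "h2"), ("b", "h1"), ("b", "h2"),
         ("c", "h2"), ("d", "h1")],
    )
    return conn
```

With the demo data, item "a" is related to "b" (two shared values) and "c" (one); "d" shares a value but falls outside the 300 s window and is excluded.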
StyxX65
2254e00481 Added email and phone number detection as opt-in scan options across all three engines, plus translation fixes. CHANGELOG and SUGGESTIONS are updated.
2026-04-25 19:33:28 +02:00
StyxX65
7c1afca80b Bugfixes
fix: select mode onclick exports, multi-source progress counter, OCR page-by-page
2026-04-21 13:12:54 +02:00
StyxX65
d8083eb0c0 feat: interface PIN, bulk disposition tagging, Google Drive delta scan, OCR memory fixes
- Interface PIN: optional session-level auth gate for the main scanner UI
  (Settings → Security → Interface PIN). Salted SHA-256 in config.json,
  rate-limited (5 attempts/5 min per IP). /view and viewer auth exempt.
  New /login page, before_request hook, GET/POST/DELETE /api/interface/pin,
  POST /api/interface/pin/verify, POST /api/interface/logout.

- Bulk disposition tagging: Select mode (filter bar "Vælg" button) reveals
  per-card checkboxes. Bulk tag bar at bottom of grid; POST /api/db/disposition/bulk.
  Disposition stats bar (total · unreviewed · retain · delete · % reviewed)
  updates after every save.

- Google Drive delta scan: uses Drive Changes API when delta is enabled.
  Per-user token stored as gdrive:{email} in delta.json. Load-then-merge
  save avoids racing with concurrent M365 token writes.

- PDF OCR OOM fix: render one page at a time with convert_from_path
  (first_page=N, last_page=N). Added _ocr_mem_ok() psutil guard (500 MB
  threshold) before each page render across scan_pdf, redact_fitz_pdf,
  redact_pdf.

- Email test message translation fix: routes/email.py returns structured
  {ok, method, recipients} instead of a hardcoded English string;
  scheduler.js builds the translated message client-side.

- Docs: CHANGELOG, README, TODO, MANUAL-EN, MANUAL-DA all updated.
  Lang files (en/da/de) extended with bulk, interface PIN, and SMTP keys.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-18 18:46:45 +02:00
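The Interface PIN entry in d8083eb0c0 combines a salted SHA-256 hash with a per-IP limit of 5 attempts per 5 minutes. A minimal sketch of both pieces, assuming an in-memory sliding window; the names `hash_pin`, `verify_pin`, `AttemptLimiter`, and the stored `{salt, hash}` layout are hypothetical and not taken from the project's config.json format.

```python
import hashlib
import hmac
import secrets
import time
from collections import defaultdict, deque
from typing import Optional

MAX_ATTEMPTS = 5          # attempts allowed per window
WINDOW_SECONDS = 5 * 60   # five-minute window


def hash_pin(pin: str, salt: Optional[bytes] = None) -> dict:
    """Hash a PIN with a random (or supplied) salt using SHA-256."""
    salt = salt or secrets.token_bytes(16)
    digest = hashlib.sha256(salt + pin.encode("utf-8")).hexdigest()
    return {"salt": salt.hex(), "hash": digest}


def verify_pin(pin: str, stored: dict) -> bool:
    """Re-hash the candidate with the stored salt and compare in constant time."""
    candidate = hash_pin(pin, bytes.fromhex(stored["salt"]))["hash"]
    return hmac.compare_digest(candidate, stored["hash"])


class AttemptLimiter:
    """Sliding-window limiter keyed by client IP (assumed in-memory only)."""

    def __init__(self, max_attempts: int = MAX_ATTEMPTS,
                 window: float = WINDOW_SECONDS) -> None:
        self.max_attempts = max_attempts
        self.window = window
        self._attempts = defaultdict(deque)

    def allow(self, ip: str, now: Optional[float] = None) -> bool:
        """Record an attempt and return False once the IP is over the limit."""
        now = time.monotonic() if now is None else now
        attempts = self._attempts[ip]
        # Drop attempts that have aged out of the window.
        while attempts and now - attempts[0] >= self.window:
            attempts.popleft()
        if len(attempts) >= self.max_attempts:
            return False
        attempts.append(now)
        return True
```

In a Flask-style before_request hook, the limiter would be consulted before verify_pin, so a blocked IP never reaches the hash comparison.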
StyxX65
c9aab19a97 feat: scan history browser, user-scoped viewer tokens, export fixes, email fixes (v1.6.20)
- Scan history browser (history.js, GET /api/db/sessions, get_sessions(),
  get_session_items(ref_scan_id)) — review any past session without rescanning
- User-scoped viewer tokens (#34) — scope by individual employee across M365
  and GWS; autocomplete from Accounts list; dual-email support
- Fix: GWS scan never marked finished (end_scan → finish_scan) and emitted
  wrong SSE event (scan_done → google_scan_done), excluding GWS items from all
  exports
- Fix: file scan begin_scan called with wrong keyword args (TypeError swallowed),
  so local/SMB items were never written to DB
- Fix: Graph sendMail reported failure on success — _post() now returns {} on
  empty 202 response instead of raising JSONDecodeError
- Fix: Graph error hidden behind generic "No SMTP host" message when both Graph
  and SMTP were unavailable
- Fix: Gmail vs Google Workspace SMTP error messages distinguished by username
  domain; Workspace errors point to admin console, not personal security settings
- Docs: update README, MANUAL-EN, MANUAL-DA, CLAUDE.md, TODO.md, CHANGELOG.md

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-18 13:57:54 +02:00
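The sendMail fix in c9aab19a97 comes down to treating an empty success body as an empty result instead of parsing it as JSON. A sketch of that idea, assuming the response has already been reduced to a status code and raw body; `parse_graph_response` is a hypothetical name and does not mirror the project's _post() signature.

```python
import json


def parse_graph_response(status: int, body: bytes) -> dict:
    """Decode a Graph API response, tolerating empty success bodies."""
    if status >= 400:
        raise RuntimeError(f"Graph request failed with HTTP {status}")
    # sendMail answers 202 Accepted with no body; json.loads(b"") would
    # raise JSONDecodeError and make a successful send look like a failure.
    if not body.strip():
        return {}
    return json.loads(body)
```

Returning an empty dict keeps callers that do `result.get(...)` working without special-casing the 202 path.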
StyxX65
0c35a7a83d feat: role filter in results grid + role-scoped Excel and Art.30 exports
- New Role dropdown in filter bar (All / Ansatte / Elever) — filters the
  results grid client-side via applyFilters() and clearFilters().

- Exports respect the active role: exportExcel() and exportArticle30()
  append ?role=student|staff to the fetch URL when a role is selected.

- _build_excel_bytes(role='') and _build_article30_docx(role='') filter
  to a local _items list at the top; all internal sheets (Summary, GPS,
  External transfers, Art.30 staff/student tables) see only the filtered
  subset. Filenames get _elever or _ansatte suffix.

- i18n: m365_filter_all_roles / m365_filter_staff / m365_filter_student
  added to en/da/de.json.

- CLAUDE.md, README.md, CHANGELOG.md, MANUAL-EN.md, MANUAL-DA.md updated.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-12 09:02:52 +02:00
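The role-scoped export in 0c35a7a83d filters once at the top so that every sheet sees the same subset. A minimal sketch of that pattern; the helper names `filter_by_role` and `export_suffix` are hypothetical, and keying on a `user_role` field with the values `student` and `staff` is an assumption based on the ?role= parameter described above.

```python
def filter_by_role(items: list, role: str = "") -> list:
    """Return the items for one role, or a copy of all items when role is empty."""
    if not role:
        return list(items)
    return [item for item in items if item.get("user_role") == role]


def export_suffix(role: str = "") -> str:
    """Filename suffix for a role-scoped export; empty for unfiltered exports."""
    return {"student": "_elever", "staff": "_ansatte"}.get(role, "")
```

Filtering into a local list up front means the Summary, GPS, and External transfers sheets cannot drift apart, since none of them reads the unfiltered data.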
Henrik Højmark
9c7df76fbd Initial commit
2026-04-11 04:38:11 +02:00