97 Commits

Author SHA1 Message Date
StyxX65
68076eba52 Show all open (unactioned) items by default, not just the last scan
The default results view loaded only the latest scan session (±300s
window), so items dropped out of sight once a newer scan started — and
a long scheduled scan could show little or nothing on browser open.

Add get_open_items(): every flagged item with no disposition (or status
'unreviewed') across all scans, deduped by id to the latest finished
scan. GET /api/db/flagged now serves it when no ?ref is given; ?ref=N
still loads a specific past session. Frontend loadHistorySession(null)
routes to a new loadOpenItems() loader. Rename the banner button to
"Open items" (da/de/en).

get_session_items() default is unchanged — export.py and
scan_scheduler.py still rely on latest-session for the current scan's
report/email.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-22 09:19:55 +02:00
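The selection rule described in 68076eba52 can be sketched roughly as follows. This is a minimal in-memory illustration, not the project's get_open_items(): the row shape (dicts with "id" and "scan_id"), the set of finished scan ids, and the per-item disposition mapping are all assumptions made for the example.

```python
from typing import Dict, Iterable, List, Optional, Set


def open_items_sketch(
    items: Iterable[dict],
    finished_scan_ids: Set[int],
    dispositions: Dict[str, Optional[str]],
) -> List[dict]:
    """Return open items, one row per id, taken from the latest finished scan.

    Hypothetical inputs: `items` rows carry "id" and "scan_id";
    `dispositions` maps item id -> status (missing means no disposition).
    """
    latest: Dict[str, dict] = {}
    for item in items:
        # Ignore rows that belong to a scan that has not finished yet.
        if item["scan_id"] not in finished_scan_ids:
            continue
        # "Open" means no disposition at all, or still 'unreviewed'.
        if dispositions.get(item["id"]) not in (None, "unreviewed"):
            continue
        previous = latest.get(item["id"])
        # Keep only the row from the highest-numbered finished scan.
        if previous is None or item["scan_id"] > previous["scan_id"]:
            latest[item["id"]] = item
    return sorted(latest.values(), key=lambda row: row["id"])
```

Under these assumptions an item seen in several scans appears once, and an item that only exists in a still-running scan stays hidden until that scan finishes.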
StyxX65
67f66c8441 Document self-update system and related changes in CLAUDE.md
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-16 12:16:14 +02:00
StyxX65
8bb482925f Release 1.7.8
- CHANGELOG: cut the 1.7.8 release (dated 2026-06-16); reset Unreleased.
- VERSION: 1.7.7 -> 1.7.8.
- Manuals (DA + EN): bump version stamps.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-16 11:56:12 +02:00
StyxX65
f84c8516df Reliably restore last session on refresh after a server restart
The page-load restore was one-shot and bailed when a completed scan's
replayed scan_phase left a running flag set; sse_replay_done (the other
retry) only fires for a non-empty replay buffer, which is empty after a
restart — so refreshing post-update showed a blank grid despite the
results being in the DB. The watchdog now retries the restore on each
4s poll while nothing is shown and no scan runs, clearing stale flags
first. /api/scan/status also reports google_running separately so a
refresh during a live Google scan is no longer treated as idle.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-16 11:53:07 +02:00
StyxX65
9fd1aa1f8a Manuals: describe new share-link create flow
After Create the form clears and the new link appears highlighted in
the Active links list, copied from there — not from a preview row.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-15 10:52:22 +02:00
StyxX65
da356fb310 Release 1.7.7
- CHANGELOG: cut the 1.7.7 release (dated 2026-06-15); reset Unreleased.
- VERSION: 1.7.6 -> 1.7.7.
- Manuals (DA + EN): bump version stamps.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-15 10:12:00 +02:00
StyxX65
bdba80e72d Remove stale link preview from share modal after create
The generated-link "Copy link:" row stayed visible after creating,
looking like the form hadn't reset — but the new link was already in
the Active links list with its own Copy button. Drop the redundant
preview row; on create, reset the form and briefly highlight the new
entry in the active list. Removes the now-dead shareNewLinkRow markup
and copyShareLink().

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-15 10:11:03 +02:00
StyxX65
c26dd7d320 Add Zoraxy HTTPS setup guide, correct SECURITY.md bind address
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-11 15:20:33 +02:00
StyxX65
841311a6bd Release 1.7.6
- CHANGELOG: cut the 1.7.6 release (dated 2026-06-11); reset Unreleased.
- VERSION: 1.7.5 -> 1.7.6.
- Manuals (DA + EN): bump version stamps.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-11 15:02:44 +02:00
StyxX65
dd19be8bbf Close leaked listening socket on update restart
Werkzeug sets its server socket inheritable unconditionally, so the
os.execv restart carried it into the new process as a zombie listener:
one PID listening on both 5100 (never accepted) and 5101 (the real
server). Mark all fds above stderr close-on-exec before exec'ing so
the old socket dies and the new server rebinds the original port.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-11 15:01:17 +02:00
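A minimal sketch of the technique named in dd19be8bbf, assuming a POSIX system: clear the inheritable flag on every descriptor above stderr so that os.execv does not carry the old listening socket into the new process. The function names and the /proc/self/fd enumeration are illustrative choices, not taken from the project.

```python
import os
import sys
from typing import List


def fds_above_stderr() -> List[int]:
    """List candidate descriptors above stderr (fd 2)."""
    try:
        # Linux: enumerate the descriptors that are actually open.
        fds = [int(name) for name in os.listdir("/proc/self/fd")]
    except OSError:
        # Fallback assumption: probe a fixed range instead.
        fds = list(range(3, 1024))
    return [fd for fd in fds if fd > 2]


def mark_close_on_exec() -> int:
    """Make every open fd above stderr non-inheritable; return how many."""
    marked = 0
    for fd in fds_above_stderr():
        try:
            os.set_inheritable(fd, False)  # sets FD_CLOEXEC on POSIX
        except OSError:
            continue  # fd was not open (or closed in the meantime)
        marked += 1
    return marked


def restart_in_place() -> None:
    """Re-exec the current program; never returns on success."""
    mark_close_on_exec()
    os.execv(sys.executable, [sys.executable, *sys.argv])
```

With the flag cleared, the kernel closes the old server socket during exec, so the replacement process can bind the original port again.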
StyxX65
c43725ca7f Release 1.7.5
- CHANGELOG: cut the 1.7.5 release (dated 2026-06-11); reset Unreleased.
- VERSION: 1.7.4 -> 1.7.5.
- Manuals (DA + EN): bump version stamps.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-11 14:42:06 +02:00
StyxX65
a1712ae178 Make static files revalidate so the UI is fresh after updates
No Cache-Control header meant browsers cached JS/CSS heuristically for
days; after a server update (including the in-app self-update reload)
the backend was new but the frontend stayed stale.
SEND_FILE_MAX_AGE_DEFAULT=0 forces ETag revalidation: 304 when
unchanged, fresh file immediately after an update.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-11 14:39:45 +02:00
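The revalidation behaviour that a1712ae178 relies on can be illustrated without a web framework. The sketch below is a stdlib-only model of a conditional GET, not the Flask code path: the ETag scheme (a truncated SHA-256 of the body) and the exact Cache-Control value are assumptions for the example.

```python
import hashlib
from typing import Dict, Optional, Tuple


def serve_static(
    body: bytes, if_none_match: Optional[str]
) -> Tuple[int, Dict[str, str], bytes]:
    """Model one conditional GET for a static file.

    `if_none_match` is the ETag the browser cached earlier, if any.
    """
    # Hypothetical ETag scheme: any change to the file changes the tag.
    etag = '"' + hashlib.sha256(body).hexdigest()[:16] + '"'
    headers = {"ETag": etag, "Cache-Control": "no-cache"}
    if if_none_match == etag:
        # Cached copy is still current: empty 304, browser reuses its copy.
        return 304, headers, b""
    # First request, or the file changed since it was cached.
    return 200, headers, body
```

The browser asks on every load, but an unchanged file costs only a header exchange; after an update the tag no longer matches and the new file is sent at once.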
StyxX65
c1cddb8ea7 Release 1.7.4
- CHANGELOG: cut the 1.7.4 release (dated 2026-06-10); reset Unreleased.
- VERSION: 1.7.3 -> 1.7.4.
- Manuals (DA + EN): bump version stamps.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-10 15:33:16 +02:00
StyxX65
9cbd93e1f5 Reset all share modal fields after creating a link
Create only cleared the label; scope type, user email, date range, and
expiry carried over, so the next link silently inherited the previous
link's scope. Extracted openShareModal's reset logic into
_resetShareForm() and call it after every successful create.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-10 15:32:03 +02:00
StyxX65
d4cf2db347 Release 1.7.3
- CHANGELOG: cut the 1.7.3 release (dated 2026-06-10); reset Unreleased.
- VERSION: 1.7.2 -> 1.7.3.
- Manuals (DA + EN): bump version stamps.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-10 15:24:27 +02:00
StyxX65
d6bf80a68a Keep the same port across app restarts
The port probe did a plain bind() without SO_REUSEADDR, so TIME_WAIT
connections left by the previous instance (e.g. the in-app update
restart) made the port look occupied and the app hopped to the next
one. Probe with SO_REUSEADDR like Werkzeug binds, and give the
requested port a 10-second grace period before auto-incrementing.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-10 15:23:18 +02:00
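A rough sketch of the probing strategy from d6bf80a68a, assuming Linux socket semantics. The function names, the polling interval, and the cap on how many ports to try are hypothetical; only SO_REUSEADDR and the 10-second grace period come from the commit message.

```python
import socket
import time


def port_is_free(port: int, host: str = "127.0.0.1") -> bool:
    """Probe a port the way the server will bind it, with SO_REUSEADDR."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        # Without this option, TIME_WAIT leftovers make the port look busy.
        probe.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        try:
            probe.bind((host, port))
        except OSError:
            return False
        return True


def pick_port(
    requested: int,
    grace_seconds: float = 10.0,
    poll_seconds: float = 0.5,
    max_tries: int = 20,
) -> int:
    """Wait briefly for the requested port, then fall back to the next ones."""
    deadline = time.monotonic() + grace_seconds
    while True:
        if port_is_free(requested):
            return requested
        if time.monotonic() >= deadline:
            break
        time.sleep(poll_seconds)
    # Grace period expired: auto-increment through the following ports.
    upper = min(requested + 1 + max_tries, 65536)
    for candidate in range(requested + 1, upper):
        if port_is_free(candidate):
            return candidate
    raise RuntimeError("no free port found near %d" % requested)
```

The grace period covers the window in which the previous instance is still shutting down, so a restart normally lands on the same port instead of hopping.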
StyxX65
679f91da2c Use page origin for share links except when browsing at localhost
The LAN-IP rewrite in _getShareBaseUrl() exists to fix unusable
127.0.0.1 links; applying it to every origin meant links copied behind
a reverse proxy pointed at http://<LAN-IP>:5100, bypassing TLS. HTTPS
and non-localhost origins are now used as-is.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-10 15:14:33 +02:00
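The decision rule in 679f91da2c is small enough to state as code. The real _getShareBaseUrl() is frontend JavaScript; this Python sketch only mirrors the rule as the commit message describes it, and the function name, the set of local hostnames, and the way the LAN IP is passed in are assumptions.

```python
from urllib.parse import urlsplit

# Hostnames treated as "browsing at localhost" (assumed set).
LOCAL_HOSTS = {"localhost", "127.0.0.1", "::1"}


def share_base_url(page_origin: str, lan_ip: str) -> str:
    """Pick the base URL to embed in a share link."""
    parts = urlsplit(page_origin)
    # HTTPS and non-localhost origins are used as-is, so links created
    # behind a reverse proxy keep the proxy's scheme and hostname.
    if parts.scheme == "https" or parts.hostname not in LOCAL_HOSTS:
        return page_origin.rstrip("/")
    # Plain-HTTP localhost links are unusable for other people: swap in
    # the LAN address but keep the port.
    port = ":%d" % parts.port if parts.port else ""
    return "http://%s%s" % (lan_ip, port)
```

For example, an origin of http://127.0.0.1:5100 with a LAN address of 192.168.1.20 yields http://192.168.1.20:5100, while an https:// origin is returned unchanged.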
StyxX65
c79e7097ea Release 1.7.2
- CHANGELOG: cut the 1.7.2 release (dated 2026-06-10); reset Unreleased.
- VERSION: 1.7.1 -> 1.7.2.
- Manuals (DA + EN): bump version stamps.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-10 15:10:59 +02:00
StyxX65
35e767b506 Fix copy buttons doing nothing over plain HTTP
navigator.clipboard is undefined in non-secure contexts, so the direct
writeText() call threw synchronously and the execCommand fallback in its
.catch() never ran. _copyText() now feature-detects the API, falls back
to execCommand('copy'), then to a prompt() for manual copying. log.js
reuses the helper; _getShareBaseUrl() caches the LAN-IP lookup so token
Copy buttons stay within the click gesture execCommand requires.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-10 15:09:34 +02:00
StyxX65
652031b31d Release 1.7.1
- CHANGELOG: cut the 1.7.1 release (dated 2026-06-10); reset Unreleased.
- VERSION: 1.7.0 -> 1.7.1.
- Manuals (DA + EN): bump version stamps; document the new
  Settings -> General -> Software update group (check/install/auto-update,
  git-checkout-only, self-restart, refused during scans).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-10 14:50:31 +02:00
StyxX65
df54b20735 Document software updates in README, refresh test suite table
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-10 14:47:58 +02:00
StyxX65
a325349ecd Fix stale ~/.gdpr_scanner_* paths in help text, docs, and UI strings
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-10 14:41:23 +02:00
StyxX65
6a4b0e1706 Show delta token source count, add hint bubble, fix README data paths
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-10 14:27:14 +02:00
StyxX65
c0e45df440 Add software update from Settings GUI and update_gdpr.sh script
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-10 12:54:29 +02:00
StyxX65
fcf32f3751 Release 1.7.0
- CHANGELOG: cut the 1.7.0 release (dated 2026-06-10); reset Unreleased.
- VERSION: 1.6.28 → 1.7.0.
- Manuals (DA + EN): bump version stamps; correct the redaction section
  (cards are now kept/greyed until the next scan, not removed) and add the
  same keep-until-next-scan note to the deletion section, including the
  partial-failure behaviour.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-10 12:06:36 +02:00
StyxX65
95f1f39a1f Keep data-subject-deleted cards in grid until next scan
Apply the keep-until-next-scan behaviour to deleteSubjectItems: mark the
deleted items _deleted (using deleted_ids from the response) and keep them
greyed in the grid instead of filtering them out. Also fixes a latent bug
where renderGrid() was called with no argument and threw on files.forEach,
which the surrounding try/catch swallowed as a false "Delete failed" after a
successful erasure.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-10 11:47:52 +02:00
StyxX65
386831c423 Keep bulk-deleted cards in grid until next scan
Extend the keep-until-next-scan behaviour to the bulk delete modal: instead
of removing matched cards on success, mark them _deleted and keep them greyed
with a "🗑 Deleted" badge and hidden buttons. /api/delete_bulk now returns
deleted_ids so the grid marks exactly the items the server actually deleted —
partial failures stay active and re-deletable. Already-handled (_deleted /
_redacted) items are excluded from the bulk-delete match set so they aren't
re-counted or re-processed.

201 tests pass.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-10 11:46:14 +02:00
StyxX65
ed3c3a80d6 Keep deleted cards in grid until next scan
Mirror the redact behaviour for the card delete button (🗑): instead of
removing the card on success, mark the item _deleted and keep it in the grid
— greyed via card-resolved, shown with a red "🗑 Deleted" badge, action
buttons hidden so it can't be re-processed. The grid is rebuilt on the next
scan run, clearing the markers. results.js only — no server change.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-10 11:44:10 +02:00
StyxX65
7c1c2b390d Keep selected card in view when opening preview
Opening the preview panel narrows .grid-area and reflows the auto-fill grid
to fewer columns, moving the clicked card to a new row. The single-frame
scrollIntoView ran while the browser's scroll-anchoring re-adjusted scrollTop
mid-reflow, so the card scrolled out of view. Disable scroll anchoring on
.grid-area (overflow-anchor:none) and defer the scroll by two animation
frames against the settled layout, centring the card (block:'center').

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-10 11:35:04 +02:00
StyxX65
d82a0d6004 Keep redacted cards in grid until next scan
Redacting a card (✏) previously removed it from the grid and from
S.flaggedData/S.filteredData immediately. Now the item is marked _redacted
and kept: greyed via card-resolved styling, shown with a "✏ Redacted" badge,
and its delete/redact buttons hidden so it can't be re-processed. The grid is
rebuilt on the next scan run, which clears the markers. results.js only — no
server change.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-10 11:30:41 +02:00
StyxX65
1b3d7f5698 Fix card action buttons clipped in grid view (missing position:relative)
The real cause behind the invisible redact/delete buttons: .card lacked
position:relative, so the position:absolute action buttons (delete, redact)
and the bulk-select checkbox anchored to the viewport instead of the card
and were clipped by .card overflow:hidden. They only showed in list view,
where those elements are position:static. Add position:relative to .card so
all three position within each card. Keep the 0.35 baseline opacity on the
redact button for discoverability.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-10 11:24:00 +02:00
StyxX65
39500edfbc Changelog: note redact button visibility fix
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-10 11:21:37 +02:00
StyxX65
35fd00437f Fix redact button invisible in grid view
.card-redact-btn had opacity:0 at rest (only opacity:1 on .card:hover), so
the ✏ redact button was completely invisible in the default grid/thumbnail
view — it only showed in list view, which forces opacity:1. Give it the same
0.35 baseline opacity as .card-delete-btn so it's discoverable at rest and
brightens on hover. The button was always rendered in the DOM; this is a
pure visibility fix.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-10 11:20:06 +02:00
StyxX65
c39d68ca19 Document XSS escaping + secret-encryption hardening
- CHANGELOG: add Unreleased ### Security section covering the stored XSS
  in the results grid, the reflected XSS in /api/thumb, and the Claude API
  key now being encrypted at rest.
- CLAUDE.md / static/js/CLAUDE.md: add the esc() / _html_esc escaping rule
  for scan-derived strings and the onclick-JSON &quot; pattern.
- CLAUDE.md / routes/CLAUDE.md: note that secret config fields use the
  machine-keyed Fernet and must be read via a decrypting accessor
  (get_claude_api_key()), never config.json directly.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-10 11:15:39 +02:00
StyxX65
b6d2915d49 Harden XSS escaping and encrypt Claude API key at rest
- results.js: add esc() helper and apply to all scan-derived fields
  (name, account_name, folder, source, modified, label, img alt) across
  card/list/preview/subject-lookup/related views. Scan-derived strings can
  carry attacker-controlled markup (e.g. a OneDrive file named with HTML),
  so they must be escaped before innerHTML/attribute embedding. Also escape
  the related-docs onclick JSON to match the delete/redact &quot; pattern.
- cpr_detector._placeholder_svg: escape label/name before embedding — served
  as image/svg+xml via /api/thumb?name=, so an unescaped value was a
  reflected-XSS vector when the URL is opened directly.
- cpr_detector: remove 44-line unreachable duplicate of the face-detection
  body left inside _extract_audio_metadata after its return.
- app_config: encrypt claude_api_key at rest with the machine-keyed Fernet
  (same as the SMTP password); add get_claude_api_key() for decryption.
  Legacy plaintext keys still read and are re-encrypted on next save.
  Update readers in document_scanner.py and routes/app_routes.py.

201 tests pass.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-10 11:06:36 +02:00
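The placeholder-SVG fix in b6d2915d49 comes down to escaping request-supplied text before embedding it in markup. A minimal sketch, assuming a two-line text layout that is invented for the example (the real _placeholder_svg output is not shown in the log):

```python
from html import escape


def placeholder_svg(label: str, name: str) -> str:
    """Build a small SVG placeholder with both strings escaped."""
    # escape() rewrites &, <, > and (with quote=True) both quote characters,
    # so the values cannot close the element or open a new one.
    safe_label = escape(label, quote=True)
    safe_name = escape(name, quote=True)
    return (
        '<svg xmlns="http://www.w3.org/2000/svg" width="160" height="120">'
        '<text x="8" y="24">' + safe_label + "</text>"
        '<text x="8" y="48">' + safe_name + "</text>"
        "</svg>"
    )
```

A file name such as `<script>alert(1)</script>` then renders as literal text inside the image instead of executing when the thumbnail URL is opened directly.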
StyxX65
1903115e02 CLAUDE.md restructured 2026-06-08 14:44:37 +02:00
StyxX65
f845a2f686 Fixed: cards not shown after browser refresh
When the browser reconnected to the SSE stream after a completed scan,
the scan_phase events in the replay buffer temporarily set
S._m365ScanRunning = true (all running flags start at false after a page
reload). The watchdog's loadHistorySession call fired in this window and
bailed on the stale flag; once scan_done cleared the flag,
_initialStatusChecked was already true, so loadHistorySession was never
retried. Fixed by having the sse_replay_done handler retry
loadHistorySession(null) when no scan is running and S._historyRefScanId
is still null after replay.
2026-06-08 14:28:24 +02:00
StyxX65
79e589b525 Bugfix in Scheduler 2026-06-04 14:47:01 +02:00
StyxX65
fa6601ffdd Bugfixes 2026-06-01 15:15:43 +02:00
StyxX65
4e5a8934d7 Fix Google scan not stopping cleanly before a new scan starts 2026-05-29 04:53:42 +02:00
StyxX65
66986a16f9 Extended in-place CPR redaction to Google Drive, SFTP, SMB, and local PDFs, then updated CLAUDE.md and both manuals. Everything is committed and all 201 tests pass. 2026-05-28 17:53:53 +02:00
StyxX65
034ced943e Extended document redaction to Google Drive, SFTP, SMB, and local PDFs
Extends the ✂ in-place redaction feature beyond local DOCX/XLSX/CSV/TXT
files to cover all remaining file source types and adds PDF support for
local files.
2026-05-28 17:47:02 +02:00
StyxX65
6ce7583b26 Added NER/AI integration 2026-05-28 11:50:10 +02:00
StyxX65
6e0dc8ee92 Minor changes to layout in Manuals 2026-05-28 11:23:20 +02:00
StyxX65
26c45165b9 v1.6.28 — Scheduled report-only jobs, compliance audit log, and documentation update
- Scheduled jobs can now run in report-only mode (skip scan, email latest DB results)
- Compliance audit log records all significant admin actions in an immutable DB table
- VERSION bumped to 1.6.28; CHANGELOG [Unreleased] sealed as [1.6.28] — 2026-05-28
- Both manuals updated: CPR-only mode, OCR language, file redaction, related documents,
  date-range token scoping, report-only jobs, audit log tab, two new FAQ entries
- TODO.md updated with all completed tasks

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-28 11:08:52 +02:00
StyxX65
744813f4ac Add compliance audit log
Immutable audit_log table in the scanner DB records every significant
admin action (profile save/delete, token create/revoke, PIN changes,
source add/update/delete, scheduler job changes, scan start/stop, SMTP
save, dispositions, item delete/redact). GET /api/audit_log exposes
entries newest-first. New Audit Log tab in the Settings modal renders
the table on demand. Settings modal widened 540→640 px and tab labels
set to white-space:nowrap so the six-tab row fits on one line.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-28 10:51:23 +02:00
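One way to get an append-only table like the one 744813f4ac describes is to reject UPDATE and DELETE with SQLite triggers. The log does not say how the project enforces immutability, so the triggers, the column names, and the helper functions below are all assumptions for illustration.

```python
import sqlite3

# Hypothetical schema: the real audit_log columns are not shown in the log.
SCHEMA = """
CREATE TABLE audit_log (
    id     INTEGER PRIMARY KEY AUTOINCREMENT,
    ts     TEXT NOT NULL DEFAULT (datetime('now')),
    action TEXT NOT NULL,
    detail TEXT NOT NULL DEFAULT ''
);
CREATE TRIGGER audit_log_no_update BEFORE UPDATE ON audit_log
BEGIN
    SELECT RAISE(ABORT, 'audit_log is append-only');
END;
CREATE TRIGGER audit_log_no_delete BEFORE DELETE ON audit_log
BEGIN
    SELECT RAISE(ABORT, 'audit_log is append-only');
END;
"""


def open_audit_db(path: str = ":memory:") -> sqlite3.Connection:
    """Open a database and create the append-only table."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def record(conn: sqlite3.Connection, action: str, detail: str = "") -> None:
    """Append one admin action to the log."""
    conn.execute(
        "INSERT INTO audit_log (action, detail) VALUES (?, ?)",
        (action, detail),
    )
    conn.commit()


def newest_first(conn: sqlite3.Connection) -> list:
    """Return entries in the order the endpoint is described as using."""
    cursor = conn.execute(
        "SELECT action, detail FROM audit_log ORDER BY id DESC"
    )
    return cursor.fetchall()
```

Triggers guard against accidental edits from application code; anyone with direct file access can still drop them, so this is a safeguard, not tamper-proofing.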
StyxX65
4ef2dfb352 Date-range scoping for viewer tokens 2026-05-28 10:34:55 +02:00
StyxX65
c820d6f6db Two bugs in the abort mechanism:
1. POST /api/scan/stop only set state._scan_abort (M365/file abort event)
   but never touched state._google_scan_abort. Now sets both.

2. _check_abort() inside _run_google_scan imported gdpr_scanner._scan_abort
   (= state._scan_abort, the M365 event) instead of using the module-level
   _scan_abort alias (= state._google_scan_abort). This meant the dedicated
   /api/google/scan/cancel endpoint, which correctly sets
   _google_scan_abort, was silently ignored by the scan loop. Fixed to use
   the module-level alias consistently. Also aligned the end-of-scan
   checkpoint-clear check.
2026-05-28 10:20:22 +02:00
StyxX65
7ffd8370f4 Fix Stop button not halting Google Workspace scan
Two bugs in the abort mechanism:

1. POST /api/scan/stop only set state._scan_abort (M365/file abort event)
   but never touched state._google_scan_abort. Now sets both.

2. _check_abort() inside _run_google_scan imported gdpr_scanner._scan_abort
   (= state._scan_abort, the M365 event) instead of using the module-level
   _scan_abort alias (= state._google_scan_abort). This meant the dedicated
   /api/google/scan/cancel endpoint — which correctly sets _google_scan_abort
   — was silently ignored by the scan loop. Fixed to use the module-level
   alias consistently. Also aligned the end-of-scan checkpoint-clear check.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-28 10:19:54 +02:00
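The two fixes in 7ffd8370f4 can be modelled with a pair of threading.Event objects. This is a simplified sketch: the ScanState class and both functions are hypothetical stand-ins, and only the two attribute names come from the commit message.

```python
import threading
from typing import Iterable, List


class ScanState:
    """Stand-in for the shared state module: one abort event per engine."""

    def __init__(self) -> None:
        self._scan_abort = threading.Event()         # M365 / file scans
        self._google_scan_abort = threading.Event()  # Google Workspace scans


def stop_all_scans(state: ScanState) -> None:
    """What a generic Stop should do: signal every engine (fix 1)."""
    state._scan_abort.set()
    state._google_scan_abort.set()


def run_google_scan(state: ScanState, items: Iterable[str]) -> List[str]:
    """Process items until the Google abort event is set (fix 2)."""
    # Bind the Google-specific event once and check only that one; checking
    # state._scan_abort here would reproduce the original bug.
    abort = state._google_scan_abort
    processed: List[str] = []
    for item in items:
        if abort.is_set():
            break
        processed.append(item)
    return processed
```

Keeping one clearly named event per engine, and resolving it in a single place, makes it harder for a loop to watch the wrong flag.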
StyxX65
2c5f5d3283 Add OCR language override setting
Operators can now choose Tesseract language pack(s) per profile via a
sidebar select (#optOcrLang) and profile editor (#peOptOcrLang). Presets:
dan+eng (default), dan, eng, dan+eng+deu, dan+eng+swe, dan+eng+fra. The
ocr_lang option flows from the UI through all three scan engines (M365
files/attachments, Google Drive, Gmail) down to document_scanner.scan_pdf
and scan_image — including the spawned PDF-OCR subprocess worker.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-28 09:59:40 +02:00