GDPRScanner

Author	SHA1	Message	Date
StyxX65	c26dd7d320	Add Zoraxy HTTPS setup guide, correct SECURITY.md bind address Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>	2026-06-11 15:20:33 +02:00
StyxX65	841311a6bd	Release 1.7.6 - CHANGELOG: cut the 1.7.6 release (dated 2026-06-11); reset Unreleased. - VERSION: 1.7.5 -> 1.7.6. - Manuals (DA + EN): bump version stamps. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>	2026-06-11 15:02:44 +02:00
StyxX65	dd19be8bbf	Close leaked listening socket on update restart Werkzeug sets its server socket inheritable unconditionally, so the os.execv restart carried it into the new process as a zombie listener: one PID listening on both 5100 (never accepted) and 5101 (the real server). Mark all fds above stderr close-on-exec before exec'ing so the old socket dies and the new server rebinds the original port. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>	2026-06-11 15:01:17 +02:00
StyxX65	c43725ca7f	Release 1.7.5 - CHANGELOG: cut the 1.7.5 release (dated 2026-06-11); reset Unreleased. - VERSION: 1.7.4 -> 1.7.5. - Manuals (DA + EN): bump version stamps. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>	2026-06-11 14:42:06 +02:00
StyxX65	a1712ae178	Make static files revalidate so the UI is fresh after updates No Cache-Control header meant browsers cached JS/CSS heuristically for days; after a server update (including the in-app self-update reload) the backend was new but the frontend stayed stale. SEND_FILE_MAX_AGE_DEFAULT=0 forces ETag revalidation: 304 when unchanged, fresh file immediately after an update. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>	2026-06-11 14:39:45 +02:00
StyxX65	c1cddb8ea7	Release 1.7.4 - CHANGELOG: cut the 1.7.4 release (dated 2026-06-10); reset Unreleased. - VERSION: 1.7.3 -> 1.7.4. - Manuals (DA + EN): bump version stamps. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>	2026-06-10 15:33:16 +02:00
StyxX65	9cbd93e1f5	Reset all share modal fields after creating a link Create only cleared the label; scope type, user email, date range, and expiry carried over, so the next link silently inherited the previous link's scope. Extracted openShareModal's reset logic into _resetShareForm() and call it after every successful create. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>	2026-06-10 15:32:03 +02:00
StyxX65	d4cf2db347	Release 1.7.3 - CHANGELOG: cut the 1.7.3 release (dated 2026-06-10); reset Unreleased. - VERSION: 1.7.2 -> 1.7.3. - Manuals (DA + EN): bump version stamps. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>	2026-06-10 15:24:27 +02:00
StyxX65	d6bf80a68a	Keep the same port across app restarts The port probe did a plain bind() without SO_REUSEADDR, so TIME_WAIT connections left by the previous instance (e.g. the in-app update restart) made the port look occupied and the app hopped to the next one. Probe with SO_REUSEADDR like Werkzeug binds, and give the requested port a 10-second grace period before auto-incrementing. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>	2026-06-10 15:23:18 +02:00
StyxX65	679f91da2c	Use page origin for share links except when browsing at localhost The LAN-IP rewrite in _getShareBaseUrl() exists to fix unusable 127.0.0.1 links; applying it to every origin meant links copied behind a reverse proxy pointed at http://<LAN-IP>:5100, bypassing TLS. HTTPS and non-localhost origins are now used as-is. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>	2026-06-10 15:14:33 +02:00
StyxX65	c79e7097ea	Release 1.7.2 - CHANGELOG: cut the 1.7.2 release (dated 2026-06-10); reset Unreleased. - VERSION: 1.7.1 -> 1.7.2. - Manuals (DA + EN): bump version stamps. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>	2026-06-10 15:10:59 +02:00
StyxX65	35e767b506	Fix copy buttons doing nothing over plain HTTP navigator.clipboard is undefined in non-secure contexts, so the direct writeText() call threw synchronously and the execCommand fallback in its .catch() never ran. _copyText() now feature-detects the API, falls back to execCommand('copy'), then to a prompt() for manual copying. log.js reuses the helper; _getShareBaseUrl() caches the LAN-IP lookup so token Copy buttons stay within the click gesture execCommand requires. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>	2026-06-10 15:09:34 +02:00
StyxX65	652031b31d	Release 1.7.1 - CHANGELOG: cut the 1.7.1 release (dated 2026-06-10); reset Unreleased. - VERSION: 1.7.0 -> 1.7.1. - Manuals (DA + EN): bump version stamps; document the new Settings -> General -> Software update group (check/install/auto-update, git-checkout-only, self-restart, refused during scans). Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>	2026-06-10 14:50:31 +02:00
StyxX65	df54b20735	Document software updates in README, refresh test suite table Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>	2026-06-10 14:47:58 +02:00
StyxX65	a325349ecd	Fix stale ~/.gdpr_scanner_* paths in help text, docs, and UI strings Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>	2026-06-10 14:41:23 +02:00
StyxX65	6a4b0e1706	Show delta token source count, add hint bubble, fix README data paths Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>	2026-06-10 14:27:14 +02:00
StyxX65	c0e45df440	Add software update from Settings GUI and update_gdpr.sh script Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>	2026-06-10 12:54:29 +02:00
StyxX65	fcf32f3751	Release 1.7.0 - CHANGELOG: cut the 1.7.0 release (dated 2026-06-10); reset Unreleased. - VERSION: 1.6.28 → 1.7.0. - Manuals (DA + EN): bump version stamps; correct the redaction section (cards are now kept/greyed until the next scan, not removed) and add the same keep-until-next-scan note to the deletion section, including the partial-failure behaviour. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>	2026-06-10 12:06:36 +02:00
StyxX65	95f1f39a1f	Keep data-subject-deleted cards in grid until next scan Apply the keep-until-next-scan behaviour to deleteSubjectItems: mark the deleted items _deleted (using deleted_ids from the response) and keep them greyed in the grid instead of filtering them out. Also fixes a latent bug where renderGrid() was called with no argument and threw on files.forEach, which the surrounding try/catch swallowed as a false "Delete failed" after a successful erasure. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>	2026-06-10 11:47:52 +02:00
StyxX65	386831c423	Keep bulk-deleted cards in grid until next scan Extend the keep-until-next-scan behaviour to the bulk delete modal: instead of removing matched cards on success, mark them _deleted and keep them greyed with a "🗑 Deleted" badge and hidden buttons. /api/delete_bulk now returns deleted_ids so the grid marks exactly the items the server actually deleted — partial failures stay active and re-deletable. Already-handled (_deleted / _redacted) items are excluded from the bulk-delete match set so they aren't re-counted or re-processed. 201 tests pass. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>	2026-06-10 11:46:14 +02:00
StyxX65	ed3c3a80d6	Keep deleted cards in grid until next scan Mirror the redact behaviour for the card delete button (🗑): instead of removing the card on success, mark the item _deleted and keep it in the grid — greyed via card-resolved, shown with a red "🗑 Deleted" badge, action buttons hidden so it can't be re-processed. The grid is rebuilt on the next scan run, clearing the markers. results.js only — no server change. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>	2026-06-10 11:44:10 +02:00
StyxX65	7c1c2b390d	Keep selected card in view when opening preview Opening the preview panel narrows .grid-area and reflows the auto-fill grid to fewer columns, moving the clicked card to a new row. The single-frame scrollIntoView ran while the browser's scroll-anchoring re-adjusted scrollTop mid-reflow, so the card scrolled out of view. Disable scroll anchoring on .grid-area (overflow-anchor:none) and defer the scroll by two animation frames against the settled layout, centring the card (block:'center'). Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>	2026-06-10 11:35:04 +02:00
StyxX65	d82a0d6004	Keep redacted cards in grid until next scan Redacting a card (✏) previously removed it from the grid and from S.flaggedData/S.filteredData immediately. Now the item is marked _redacted and kept: greyed via card-resolved styling, shown with a "✏ Redacted" badge, and its delete/redact buttons hidden so it can't be re-processed. The grid is rebuilt on the next scan run, which clears the markers. results.js only — no server change. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>	2026-06-10 11:30:41 +02:00
StyxX65	1b3d7f5698	Fix card action buttons clipped in grid view (missing position:relative) The real cause behind the invisible redact/delete buttons: .card lacked position:relative, so the position:absolute action buttons (delete, redact) and the bulk-select checkbox anchored to the viewport instead of the card and were clipped by .card overflow:hidden. They only showed in list view, where those elements are position:static. Add position:relative to .card so all three position within each card. Keep the 0.35 baseline opacity on the redact button for discoverability. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>	2026-06-10 11:24:00 +02:00
StyxX65	39500edfbc	Changelog: note redact button visibility fix Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>	2026-06-10 11:21:37 +02:00
StyxX65	35fd00437f	Fix redact button invisible in grid view .card-redact-btn had opacity:0 at rest (only opacity:1 on .card:hover), so the ✏ redact button was completely invisible in the default grid/thumbnail view — it only showed in list view, which forces opacity:1. Give it the same 0.35 baseline opacity as .card-delete-btn so it's discoverable at rest and brightens on hover. The button was always rendered in the DOM; this is a pure visibility fix. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>	2026-06-10 11:20:06 +02:00
StyxX65	c39d68ca19	Document XSS escaping + secret-encryption hardening - CHANGELOG: add Unreleased ### Security section covering the stored XSS in the results grid, the reflected XSS in /api/thumb, and the Claude API key now being encrypted at rest. - CLAUDE.md / static/js/CLAUDE.md: add the esc() / _html_esc escaping rule for scan-derived strings and the onclick-JSON &quot; pattern. - CLAUDE.md / routes/CLAUDE.md: note that secret config fields use the machine-keyed Fernet and must be read via a decrypting accessor (get_claude_api_key()), never config.json directly. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>	2026-06-10 11:15:39 +02:00
StyxX65	b6d2915d49	Harden XSS escaping and encrypt Claude API key at rest - results.js: add esc() helper and apply to all scan-derived fields (name, account_name, folder, source, modified, label, img alt) across card/list/preview/subject-lookup/related views. Scan-derived strings can carry attacker-controlled markup (e.g. a OneDrive file named with HTML), so they must be escaped before innerHTML/attribute embedding. Also escape the related-docs onclick JSON to match the delete/redact &quot; pattern. - cpr_detector._placeholder_svg: escape label/name before embedding (served as image/svg+xml via /api/thumb?name=, so an unescaped value was a reflected-XSS vector when the URL is opened directly). - cpr_detector: remove 44-line unreachable duplicate of the face-detection body left inside _extract_audio_metadata after its return. - app_config: encrypt claude_api_key at rest with the machine-keyed Fernet (same as the SMTP password); add get_claude_api_key() for decryption. Legacy plaintext keys still read and are re-encrypted on next save. Update readers in document_scanner.py and routes/app_routes.py. 201 tests pass. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>	2026-06-10 11:06:36 +02:00
StyxX65	1903115e02	CLAUDE.md restructured	2026-06-08 14:44:37 +02:00
StyxX65	f845a2f686	### Fixed - Cards not shown after browser refresh — when the browser reconnected to the SSE stream after a completed scan, the `scan_phase` events in the replay buffer temporarily set `S._m365ScanRunning = true` (all running flags start at `false` after a page reload). The watchdog's `loadHistorySession` call fired in this window and bailed on the stale flag; once `scan_done` cleared the flag, `_initialStatusChecked` was already `true` so `loadHistorySession` was never retried. Fixed by having the `sse_replay_done` handler retry `loadHistorySession(null)` when no scan is running and `S._historyRefScanId` is still `null` after replay.	2026-06-08 14:28:24 +02:00
StyxX65	79e589b525	Bugfix in Scheduler	2026-06-04 14:47:01 +02:00
StyxX65	fa6601ffdd	Bugfixes	2026-06-01 15:15:43 +02:00
StyxX65	4e5a8934d7	Fix Google scan not stopping cleanly before a new scan starts	2026-05-29 04:53:42 +02:00
StyxX65	66986a16f9	※ recap: Extended in-place CPR redaction to Google Drive, SFTP, SMB, and local PDFs, then updated CLAUDE.md and both manuals. Everything is committed and all 201 tests pass. (disable recaps in /config)	2026-05-28 17:53:53 +02:00
StyxX65	034ced943e	Extended document redaction to Google Drive, SFTP, SMB, and local PDFs Extends the ✂ in-place redaction feature beyond local DOCX/XLSX/CSV/TXT files to cover all remaining file source types and adds PDF support for local files.	2026-05-28 17:47:02 +02:00
StyxX65	6ce7583b26	Added NER/AI integration	2026-05-28 11:50:10 +02:00
StyxX65	6e0dc8ee92	Minor changes to layout in Manuals	2026-05-28 11:23:20 +02:00
StyxX65	26c45165b9	v1.6.28 — Scheduled report-only jobs, compliance audit log, and documentation update - Scheduled jobs can now run in report-only mode (skip scan, email latest DB results) - Compliance audit log records all significant admin actions in an immutable DB table - VERSION bumped to 1.6.28; CHANGELOG [Unreleased] sealed as [1.6.28] — 2026-05-28 - Both manuals updated: CPR-only mode, OCR language, file redaction, related documents, date-range token scoping, report-only jobs, audit log tab, two new FAQ entries - TODO.md updated with all completed tasks Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-28 11:08:52 +02:00
StyxX65	744813f4ac	Add compliance audit log Immutable audit_log table in the scanner DB records every significant admin action (profile save/delete, token create/revoke, PIN changes, source add/update/delete, scheduler job changes, scan start/stop, SMTP save, dispositions, item delete/redact). GET /api/audit_log exposes entries newest-first. New Audit Log tab in the Settings modal renders the table on demand. Settings modal widened 540→640 px and tab labels set to white-space:nowrap so the six-tab row fits on one line. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-28 10:51:23 +02:00
StyxX65	4ef2dfb352	Date-range scoping for viewer tokens	2026-05-28 10:34:55 +02:00
StyxX65	c820d6f6db	Two bugs in the abort mechanism: 1. POST /api/scan/stop only set state._scan_abort (M365/file abort event) but never touched state._google_scan_abort. Now sets both. 2. _check_abort() inside _run_google_scan imported gdpr_scanner._scan_abort (= state._scan_abort, the M365 event) instead of using the module-level _scan_abort alias (= state._google_scan_abort). This meant the dedicated /api/google/scan/cancel endpoint — which correctly sets _google_scan_abort — was silently ignored by the scan loop. Fixed to use the module-level alias consistently. Also aligned the end-of-scan checkpoint-clear check.	2026-05-28 10:20:22 +02:00
StyxX65	7ffd8370f4	Fix Stop button not halting Google Workspace scan Two bugs in the abort mechanism: 1. POST /api/scan/stop only set state._scan_abort (M365/file abort event) but never touched state._google_scan_abort. Now sets both. 2. _check_abort() inside _run_google_scan imported gdpr_scanner._scan_abort (= state._scan_abort, the M365 event) instead of using the module-level _scan_abort alias (= state._google_scan_abort). This meant the dedicated /api/google/scan/cancel endpoint — which correctly sets _google_scan_abort — was silently ignored by the scan loop. Fixed to use the module-level alias consistently. Also aligned the end-of-scan checkpoint-clear check. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-28 10:19:54 +02:00
StyxX65	2c5f5d3283	Add OCR language override setting Operators can now choose Tesseract language pack(s) per profile via a sidebar select (#optOcrLang) and profile editor (#peOptOcrLang). Presets: dan+eng (default), dan, eng, dan+eng+deu, dan+eng+swe, dan+eng+fra. The ocr_lang option flows from the UI through all three scan engines (M365 files/attachments, Google Drive, Gmail) down to document_scanner.scan_pdf and scan_image — including the spawned PDF-OCR subprocess worker. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-28 09:59:40 +02:00
StyxX65	23b9555dcf	Built-in file redaction for local files	2026-05-27 14:49:06 +02:00
StyxX65	c490b3d76a	Merge remote CHANGELOG entries and add Preview section to CLAUDE.md Resolved conflict in CHANGELOG.md: combined the two bug fixes from the remote branch (stale history results, selected card scroll) with the local Gmail/Drive preview fix under a single [1.6.26] — 2026-04-29 entry. Added Preview dispatch documentation to CLAUDE.md. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-27 13:43:59 +02:00
StyxX65	051a53ae85	Update CHANGELOG.md	2026-05-27 13:40:21 +02:00
Henrik Højmark	99157e6fd7	Update CHANGELOG for version 1.6.26 Updated release date for version 1.6.26 and added detailed fixes related to scan history, card visibility, and Google Drive/Gmail previews.	2026-05-27 13:38:40 +02:00
StyxX65	78fb406422	Fixed two bugs: selected cards staying visible after preview opens, and stale history results showing when a new scan starts.	2026-04-29 15:18:58 +02:00
StyxX65	a76df463e8	Changelog updated	2026-04-27 18:47:43 +02:00
StyxX65	ce5a5f1cbb	Fixed Gmail and Google Drive preview: items were being sent to the Microsoft Graph API instead of handled correctly.	2026-04-26 11:04:05 +02:00
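
Commit d6bf80a68a describes probing the requested port with SO_REUSEADDR and giving it a 10-second grace period before auto-incrementing. The sketch below illustrates that idea with the standard library only; it is not the project's code. The names `port_is_free` and `pick_port`, and the `poll_interval` and `max_hops` parameters, are assumptions made for illustration.

```python
import socket
import time


def port_is_free(port, host="127.0.0.1"):
    """Return True if a listener could bind (host, port) right now."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        # Probe with SO_REUSEADDR so TIME_WAIT connections left by a
        # previous instance do not make the port look occupied.
        probe.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        try:
            probe.bind((host, port))
        except OSError:
            return False
        return True


def pick_port(requested, grace_seconds=10.0, poll_interval=0.25,
              max_hops=20, host="127.0.0.1"):
    """Prefer the requested port; hop to the next ones only after the grace period."""
    deadline = time.monotonic() + grace_seconds
    while True:
        if port_is_free(requested, host):
            return requested
        if time.monotonic() >= deadline:
            break
        time.sleep(poll_interval)
    # Grace period expired: try the following ports (hypothetical hop limit).
    for candidate in range(requested + 1, min(requested + 1 + max_hops, 65536)):
        if port_is_free(candidate, host):
            return candidate
    raise RuntimeError(f"no free port found starting at {requested}")
```

A caller might use `pick_port(5100)` at startup. SO_REUSEADDR semantics differ between operating systems (notably on Windows), so the probe should be read as a POSIX-oriented sketch.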


90 Commits
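
Commit dd19be8bbf describes marking all file descriptors above stderr close-on-exec before the os.execv restart, so the old listening socket is not carried into the new process. A minimal sketch of that technique follows, assuming a POSIX system; the helper name `mark_fds_close_on_exec` and the fallback range of 256 descriptors are assumptions, not taken from the repository.

```python
import os


def mark_fds_close_on_exec(min_fd=3):
    """Mark every open fd >= min_fd non-inheritable; return the fds that were marked."""
    # Enumerate open descriptors where the OS exposes them as a directory.
    for fd_dir in ("/proc/self/fd", "/dev/fd"):
        try:
            candidates = [int(name) for name in os.listdir(fd_dir)]
            break
        except (OSError, ValueError):
            continue
    else:
        # Fallback (assumption): scan a fixed range of descriptor numbers.
        candidates = list(range(min_fd, 256))

    marked = []
    for fd in candidates:
        if fd < min_fd:
            continue  # leave stdin, stdout, and stderr inheritable
        try:
            # On Unix this sets the close-on-exec flag on the descriptor.
            os.set_inheritable(fd, False)
        except OSError:
            continue  # fd was already closed, e.g. the one used by listdir
        marked.append(fd)
    return marked
```

In a restart path this would be called immediately before something like `os.execv(sys.executable, [sys.executable] + sys.argv)`, so sockets opened by the old server are closed by the kernel at exec time and the new process can rebind the original port.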