9 Commits

Author SHA1 Message Date
StyxX65
8b55e9d933 Extended the M365 checkpoint/resume mechanism to all three scan engines. Each engine writes its own file (checkpoint_m365.json, checkpoint_google.json, checkpoint_file_{source_id}.json) every 25 items. 2026-04-25 20:30:59 +02:00
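The checkpoint/resume behaviour described in this commit could look roughly like the sketch below. It is not the project's code: the names `save_checkpoint`, `load_checkpoint`, `scan`, and the `{"done": [...]}` file layout are assumptions made for illustration. Only the 25-item interval and the checkpoint file name come from the commit message.

```python
import json
import os
import tempfile

CHECKPOINT_EVERY = 25  # interval stated in the commit message


def save_checkpoint(path, done_ids):
    """Write the processed item ids to `path` via a temp file and rename."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as fh:
        json.dump({"done": sorted(done_ids)}, fh)  # hypothetical layout
    os.replace(tmp_path, path)  # a crash mid-write never leaves a partial file


def load_checkpoint(path):
    """Return the set of ids recorded in `path`, or an empty set."""
    if not os.path.exists(path):
        return set()
    with open(path, encoding="utf-8") as fh:
        return set(json.load(fh)["done"])


def scan(item_ids, path, process):
    """Process items, skipping those already recorded in the checkpoint."""
    done = load_checkpoint(path)
    since_save = 0
    for item_id in item_ids:
        if item_id in done:
            continue  # handled in an earlier, interrupted run
        process(item_id)
        done.add(item_id)
        since_save += 1
        if since_save >= CHECKPOINT_EVERY:
            save_checkpoint(path, done)
            since_save = 0
    if os.path.exists(path):
        os.remove(path)  # scan completed, nothing left to resume
    return done
```

With this shape, an interruption loses at most the items processed since the last write, and a completed scan leaves no checkpoint file behind.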
StyxX65
2254e00481 recap: Added email and phone number detection as opt-in scan options across all three engines, plus translation fixes. Both CHANGELOG and SUGGESTIONS are updated — everything is committed and ready to test. 2026-04-25 19:33:28 +02:00
StyxX65
e35bbe78a5 Added SFTP to sources 2026-04-25 08:48:54 +02:00
StyxX65
d42518dc81 Added tests for Video & Audio
feat: video/audio metadata scanning, profile rename fix, route tests

  - Scan .mp4/.mov/.avi/.mkv and .mp3/.flac/.ogg/.m4a/.wma (+ 7 more)
    for GPS coordinates, artist/author, title, comment — metadata only,
    no frame or audio analysis. Uses mutagen (added to requirements.txt).
    GPS-tagged phone recordings now flag with gps_location like photos.

  - Fix _extract_audio_metadata silently returning empty results:
    the first positional arg of mutagen.File() is `filename`, not `fileobj`,
    so the BytesIO was being passed as the filename. Now passed by keyword.

  - Fix profile copy rename not reflected in left column until modal
    reopen: _pmgmtSaveFullEdit called loadProfiles() but never
    _renderProfileMgmt(). Added re-render and active-row highlight.

  - Add TestProfileRoutes (10 tests) covering all profile API endpoints
    including a rename regression test. Total: 182 tests.

  - generate_fixtures.py now produces 6 audio/video fixtures (14–19):
    2 MP3, 2 FLAC, 2 MP4 — 4 flagged, 2 negative cases.
2026-04-21 21:26:58 +02:00
StyxX65
c350014b16 fix: scan button stuck, CPR dedup crash, role scope filter, profile race conditions; add auto-email toggle and route integration tests 2026-04-21 18:43:25 +02:00
StyxX65
c9aab19a97 feat: scan history browser, user-scoped viewer tokens, export fixes, email fixes (v1.6.20)
- Scan history browser (history.js, GET /api/db/sessions, get_sessions(),
  get_session_items(ref_scan_id)) — review any past session without rescanning
- User-scoped viewer tokens (#34) — scope by individual employee across M365
  and GWS; autocomplete from Accounts list; dual-email support
- Fix: GWS scan never marked finished (end_scan → finish_scan) and emitted
  wrong SSE event (scan_done → google_scan_done), excluding GWS items from all
  exports
- Fix: file scan begin_scan called with wrong keyword args (TypeError swallowed),
  so local/SMB items were never written to DB
- Fix: Graph sendMail reported failure on success — _post() now returns {} on
  empty 202 response instead of raising JSONDecodeError
- Fix: Graph error hidden behind generic "No SMTP host" message when both Graph
  and SMTP were unavailable
- Fix: Gmail vs Google Workspace SMTP error messages distinguished by username
  domain; Workspace errors point to admin console, not personal security settings
- Docs: update README, MANUAL-EN, MANUAL-DA, CLAUDE.md, TODO.md, CHANGELOG.md
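
The sendMail fix in the list above (an empty 202 response treated as success) can be sketched as follows. The helper name `decode_body` is hypothetical, and the project's actual `_post()` is not shown in this log, so this only illustrates the idea of returning {} when there is no body to parse.

```python
import json


def decode_body(text):
    """Return the parsed JSON body, or {} when the response body is empty.

    An endpoint that answers 202 Accepted with no body would otherwise
    make json.loads("") raise json.JSONDecodeError.
    """
    if not text or not text.strip():
        return {}
    return json.loads(text)
```

A caller can then treat any 2xx status as success without caring whether a body was returned.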

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-18 13:57:54 +02:00
StyxX65
4dfbae49a4 fix: suppress OneDrive 404 errors during delta scans as non-provisioned
Add M365DriveNotFound(M365Error) exception raised by _get() on HTTP 404.
Catch it explicitly in _scan_user_onedrive before the generic handler,
broadcasting a scan_phase ("not provisioned — skipped") instead of a red
scan_error card. Full-scan path is unaffected (bare except Exception: return
in _iter_drive_folder_for already silenced the same 404).
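
A minimal sketch of the handler ordering described above, assuming simplified stand-ins: `get` and `scan_user_onedrive` here take a status code and a `broadcast` callback purely for illustration, and the phase text is a placeholder. Only the exception names and the scan_phase / scan_error event names come from the commit message.

```python
class M365Error(Exception):
    """Base error for Graph calls (name taken from the commit message)."""


class M365DriveNotFound(M365Error):
    """Raised on HTTP 404, i.e. the user's drive does not exist."""


def get(status_code):
    """Stand-in for the real request helper: maps a status to a result."""
    if status_code == 404:
        raise M365DriveNotFound("drive not found")
    if status_code >= 400:
        raise M365Error(f"HTTP {status_code}")
    return {"value": []}


def scan_user_onedrive(status_code, broadcast):
    """Report a 404 as an informational phase, anything else as an error."""
    try:
        get(status_code)
    except M365DriveNotFound:
        # Must precede the generic handler, which would also match it.
        broadcast("scan_phase", "not provisioned (skipped)")
    except Exception as exc:
        broadcast("scan_error", str(exc))
```

Because M365DriveNotFound subclasses M365Error, existing handlers that catch the base class keep working unchanged.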

Root cause: _get() fell through to raise_for_status() on 404, caught by
the generic except Exception handler and broadcast as scan_error. The
asymmetry with full scans (which silently skipped 404s) was confusing.

Common causes of OneDrive 404: no licence assigned, service plan disabled,
drive never provisioned (account never signed in), account suspended.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-12 14:05:59 +02:00
StyxX65
28c9effd17 feat: student scan filters — skip GPS images and min CPR threshold
New profile options to reduce noise when scanning student accounts:

- skip_gps_images: images flagged solely by GPS coordinates are suppressed.
  GPS data is still extracted and shown in the detail card when the item
  is flagged by another signal (faces, EXIF author/comment).

- min_cpr_count (default 1): only flag a file if it contains at least N
  distinct CPR numbers. Deduplication is by value. Faces and EXIF PII
  still trigger flags regardless of CPR count.

Both options apply to M365, Google, and file scan paths. Saved in profiles
and editable in the Profile Manager editor. Docs, manuals, i18n (DA/EN/DE),
CHANGELOG, and VERSION (1.6.14 → 1.6.15) updated.
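
The two filters might combine as in the sketch below. It rests on assumptions: the CPR pattern (ten digits with an optional hyphen after the sixth), the signal names other than gps_location, and the function names are all illustrative, and the sketch does not check that the file is an image.

```python
import re

# Assumed pattern: DDMMYY-XXXX or DDMMYYXXXX, with no date validation.
CPR_RE = re.compile(r"\b\d{6}-?\d{4}\b")


def distinct_cprs(text):
    """Return the set of CPR-like values in `text`, deduplicated by value."""
    return {match.replace("-", "") for match in CPR_RE.findall(text)}


def should_flag(text, other_signals, min_cpr_count=1, skip_gps_images=False):
    """Decide whether an item is flagged under the two profile options."""
    signals = set(other_signals)
    # CPR counts as a signal only at or above the threshold.
    if len(distinct_cprs(text)) >= max(1, min_cpr_count):
        signals.add("cpr")
    # GPS alone is suppressed; GPS plus any other signal still flags.
    if skip_gps_images and signals == {"gps_location"}:
        return False
    return bool(signals)
```

For example, a file repeating one CPR number twice does not reach min_cpr_count=2, while a face match still flags the item regardless of the CPR count.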

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-12 08:48:12 +02:00
Henrik Højmark
9c7df76fbd Initial commit 2026-04-11 04:38:11 +02:00