10 Commits

Author SHA1 Message Date
StyxX65
efbbeb7306 Restore M365Connector.delete_message (was an orphaned method body)
The def line for delete_message had been lost, leaving its body as
unreachable dead code at the end of _delete() and no delete_message
attribute on the connector. Deleting an Outlook message therefore failed
with "'M365Connector' object has no attribute 'delete_message'". Restored
the method (soft-delete: move to Deleted Items, fall back to DELETE).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-22 15:43:46 +02:00
StyxX65
54f8848e30 Document renderGrid landing-card hiding in static/js/CLAUDE.md
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-22 14:49:43 +02:00
StyxX65
8a446509c6 Hide landing/last-scan card whenever results render
The live scan_file_flagged handler showed the grid but never hid
#emptyState / #lastScanSummary, so when a scan ran with the landing
card visible, results appeared underneath it until a manual refresh
(which re-ran loadOpenItems and cleared it). Hide both panels in
renderGrid whenever files are present, covering every render path
(live SSE, open-items load, history, filters).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-22 14:45:50 +02:00
StyxX65
d55778ab35 Release 1.7.9: changelog + manual updates
Document this cycle's changes: open-items default results view,
interrupted-scan recovery, restored user/group badges, the SMTP
username-key fix, and the new "always send via SMTP" toggle. Stamp
manuals (EN/DA) to 1.7.9.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-22 11:36:41 +02:00
StyxX65
874c3ccec1 Add "prefer SMTP" toggle to skip Microsoft Graph for email
When the M365 connector is connected the app always tries Graph first,
and a Graph 202 ends the send — so report mail to recipients Exchange
silently drops (Google-hosted subdomains of the O365 domain) never
reaches them, even with working SMTP configured.

New prefer_smtp flag gates all three Graph branches (smtp_test,
send_report, _maybe_send_auto_email) so they go straight to SMTP. UI
toggle #st-smtpPreferSmtp in Settings → E-mailrapport, saved/loaded by
scheduler.js, with da/de/en strings.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-22 11:30:45 +02:00
StyxX65
526e2b0b78 Fix SMTP auth: settings tab saved wrong config keys
The Settings → E-mailrapport tab (scheduler.js) saved the SMTP username
as `user` and TLS flag as `starttls`, but every backend reader expects
`username`/`use_tls` (routes/email.py). Result: username was always
empty, server.login() was skipped, and the SMTP server rejected the
send — surfacing as a misleading "authentication failed" message even
with a valid App Password. The bug was latent because Graph is preferred
whenever M365 is connected, so the SMTP path was rarely exercised.

- scheduler.js: send/load canonical keys (username, use_tls). The
  send-report modal (scan.js) already used these.
- _load_smtp_config(): normalise legacy user→username / starttls→use_tls
  so configs saved before the fix work without re-entry.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-22 11:25:15 +02:00
StyxX65
b661a94f98 Restore user/group badges on DB-loaded result cards
The card badge only rendered when f.account_name was set, and the
group (role) badge was nested inside that same check. But save_item
never persisted account_name — only account_id (a GUID) and user_role.
Live SSE cards carried account_name so badges showed during a scan;
now that the grid loads finalized scans from the DB, the gap is exposed
and both badges vanish for earlier scans.

- Persist account_name (migration 11 + save_item) so future scans show
  the user badge. Both M365 and Google cards already carry it.
- _accountPill() in results.js drives the group badge off user_role
  alone (shows for legacy rows) and resolves a best-effort user label:
  account_name → S._allUsers (id/email) → email-style account_id → omit.
  Both card layouts share the one helper.

Legacy rows still lack account_name (never captured), but now show the
group badge and a resolved/email user label where possible.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-22 10:15:19 +02:00
StyxX65
29d9168643 Recover unfinished scans so their items aren't stranded
get_session_items / get_open_items / latest_scan_id all require
finished_at IS NOT NULL, but the M365 and Google engines return early
on abort (skipping finish_scan) and a process kill mid-scan (deploy,
OOM, crash) never reaches it either. Result on prod: 41/42 scans had
finished_at NULL, so 291 already-saved flagged items were invisible —
the grid showed nothing.

- finalize_orphan_scans(): finalises every finished_at-NULL scan; runs
  once at startup before the scheduler (nothing is scanning at boot, so
  any unfinished scan is dead). Recovers existing stranded items and
  guards against future mid-scan restarts.
- run_scan: finalise the DB scan on the abort early-return too, so a
  stopped scan's items stay visible without waiting for a restart.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-22 09:51:22 +02:00
StyxX65
7bf589bf7a Update ZORAXY_SETUP.md
2026-06-22 09:21:08 +02:00
StyxX65
68076eba52 Show all open (unactioned) items by default, not just the last scan
The default results view loaded only the latest scan session (±300s
window), so items dropped out of sight once a newer scan started — and
a long scheduled scan could show little or nothing on browser open.

Add get_open_items(): every flagged item with no disposition (or status
'unreviewed') across all scans, deduped by id to the latest finished
scan. GET /api/db/flagged now serves it when no ?ref is given; ?ref=N
still loads a specific past session. Frontend loadHistorySession(null)
routes to a new loadOpenItems() loader. Rename the banner button to
"Open items" (da/de/en).

get_session_items() default is unchanged — export.py and
scan_scheduler.py still rely on latest-session for the current scan's
report/email.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-22 09:19:55 +02:00
26 changed files with 446 additions and 40 deletions

View File

@ -11,6 +11,24 @@ Version numbers follow [Semantic Versioning](https://semver.org/spec/v2.0.0.html
---
## [1.7.9] — 2026-06-22
### Added
- **"Always send via SMTP" option for email reports** — new toggle in **Settings → E-mailrapport**. When the scanner is signed in to Microsoft 365 it normally sends email through Microsoft Graph; Graph reports "accepted" the instant a message is queued, which hides the case where Exchange Online later silently drops it (e.g. a recipient on a Google-hosted subdomain of your Microsoft 365 domain — the message is treated as internal, finds no mailbox, and is discarded, with no delivery and no bounce). Enabling this option makes the manual report, the test email, and the after-scan auto-email all go straight through your configured SMTP server (e.g. Google Workspace `smtp.gmail.com` / `smtp-relay.gmail.com`), bypassing the Graph routing entirely.
### Changed
- **The results grid now shows every open item by default, not just the last scan** — when you open the app (or refresh after a scheduled or manual scan), the grid loads *all* flagged items that still need action — i.e. those with no disposition — across every scan, instead of only the most recent scan session. Items you have already tagged (kept, redacted, deleted, false positive, …) drop out of the view. Re-scans are de-duplicated so each item appears once, showing its most recent state. The session picker still loads any individual past scan, and the history banner button (formerly "Latest scan") is now **"Open items"** and returns to this default view.
### Fixed
- **Interrupted scans no longer lose their results** — a scan only became visible once it was *finalised*, but the Microsoft 365 and Google scan engines skipped finalisation when a scan was stopped, and any scan cut short by a server restart, crash, or out-of-memory kill never finalised at all. Its already-found items were then stranded in the database and invisible in the grid (this is what caused "scan finished but no results shown", especially after the in-app self-update restarts). Unfinished scans are now finalised automatically on startup (nothing is scanning at boot, so any unfinished scan is known to be dead), and a manually stopped Microsoft 365 scan finalises immediately so its partial results stay visible.
- **User and group badges were missing on result cards loaded from the database** — the reviewer's display name was shown live during a scan but never saved, so cards loaded from a past scan (now the default view) lost both the person badge and the Elev/Ansat group badge. The display name is now stored with each item, and the group badge is shown from the saved role even for older items that predate this fix (where a name can't be recovered, the group badge and a resolved e-mail still appear).
- **Email reports sent via SMTP failed with "authentication failed"** — the **Settings → E-mailrapport** tab saved the SMTP username under the wrong field name, so the username never reached the mail server and sign-in was skipped — the server then rejected the unauthenticated message, which surfaced as a misleading authentication error even with a correct password or app password. The setting is now saved correctly, and configurations saved before the fix are migrated automatically.
---
## [1.7.8] — 2026-06-16
### Fixed

View File

@ -93,7 +93,10 @@ All options live in the profile `options` dict and apply to **all three scan eng
- **`get_sessions(limit=50, window_seconds=300)`** — groups `scans` rows by 300 s window. Groups built ascending, returned descending. `ref_scan_id` is the highest `scan_id` in each group. Do not change window size independently of `get_session_items`.
- **`get_session_items(ref_scan_id=N)`** — anchors 300 s window to that scan's `started_at`. Window is **symmetric**: `started_at BETWEEN ref.started_at - 300 AND ref.started_at + 300`. Do not revert to a one-sided lower bound.
- **`get_related_items(item_id, ref_scan_id, window_seconds=300)`** — self-joins `cpr_index` to find items sharing ≥1 CPR hash. Uses same 300 s symmetric window — do not change independently.
-- **`GET /api/db/flagged?ref=N`** — passes `ref_scan_id` to `get_session_items`; viewer scope enforcement still applies.
+- **`account_name` (display name) is persisted** (migration 11) so DB-loaded cards show the user badge. Legacy rows predating it have `account_name=''` — the frontend `_accountPill` resolves a fallback and still shows the group badge from `user_role`. `save_item` must keep writing `card["account_name"]` (both M365 and Google cards carry it).
- **Scans must be finalised or their items are invisible**: `get_session_items`, `get_open_items`, and `latest_scan_id` all filter on `finished_at IS NOT NULL`. The file scan finalises in a `finally`; M365 (`run_scan`) and Google (`_run_google_scan`) `return` early on abort, so each now calls `finish_scan` before that abort-return. A process kill (deploy/OOM/crash) mid-scan still strands a scan → **`finalize_orphan_scans()`** runs once at server startup (`gdpr_scanner.py` `__main__`, before the scheduler) and finalises every `finished_at IS NULL` scan (safe because nothing is scanning at boot). Do not add a scan-results query that ignores `finished_at` instead of fixing finalisation.
- **`get_open_items()`** — returns every flagged item with **no action taken**, across **all** scans (not just the latest session window). "Open" = no `dispositions` row, or one whose `status='unreviewed'`. Because `flagged_items` PK is `(id, scan_id)`, the same item recurs per scan; the query dedupes by `id`, keeping the row from the highest finished `scan_id`. This powers the **default landing view** so items don't drop out of sight once a newer scan opens a fresh session.
- **`GET /api/db/flagged`**: **with `?ref=N`** → `get_session_items(ref_scan_id=N)` (history mode); **without ref** → `get_open_items()` (default + viewer). Viewer scope enforcement applies to both. Do not change the no-ref `get_session_items()` default elsewhere (`export.py`, `scan_scheduler.py` still rely on latest-session for the current scan's report/email).
- See `static/js/CLAUDE.md` for the frontend history browser behaviour and `sse_replay_done` retry fix.
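
The symmetric 300 s window described for `get_session_items` can be illustrated with a minimal Python sketch. The helper name `in_session_window` and the sample timestamps are hypothetical, and it assumes `started_at` is stored as epoch seconds; the real filter is applied in SQL, not in Python.

```python
def in_session_window(started_at: float, ref_started_at: float,
                      window_seconds: int = 300) -> bool:
    """True if a scan started within +/- window_seconds of the reference scan."""
    lower = ref_started_at - window_seconds
    upper = ref_started_at + window_seconds
    return lower <= started_at <= upper


# Hypothetical scans around a reference scan that started at t = 1_000_000.
ref = 1_000_000.0
starts = {"before": ref - 200, "ref": ref, "after": ref + 250, "outside": ref + 301}

# Scans that fall in the same session as the reference scan.
session = sorted(name for name, ts in starts.items() if in_session_window(ts, ref))
print(session)
```

With a one-sided lower bound (`started_at >= ref.started_at`) the `before` scan would fall out of the session, which is the regression the note above warns against.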
## Global gotchas

View File

@ -1 +1 @@
-1.7.8
+1.7.9

View File

@ -878,6 +878,13 @@ def _load_smtp_config() -> dict:
        cfg = json.loads(_SMTP_CONFIG_PATH.read_text(encoding="utf-8"))
        if cfg.get("password"):
            cfg["password"] = _decrypt_password(cfg["password"])
        # Normalise legacy key names written by an older settings-tab UI
        # (`user`/`starttls`) to the canonical keys every reader expects
        # (`username`/`use_tls`), so configs saved before the fix still work.
        if "username" not in cfg and "user" in cfg:
            cfg["username"] = cfg["user"]
        if "use_tls" not in cfg and "starttls" in cfg:
            cfg["use_tls"] = cfg["starttls"]
        return cfg
    except Exception:
        pass
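
The effect of the normalisation in this hunk can be checked in isolation. The sketch below is a standalone approximation: `normalise_smtp_keys` is a hypothetical helper that repeats only the two key checks, without the file read or password decryption, and the sample configs are made up.

```python
def normalise_smtp_keys(cfg: dict) -> dict:
    """Copy legacy `user`/`starttls` values to `username`/`use_tls` when missing."""
    out = dict(cfg)  # work on a copy so the caller's dict is left untouched
    if "username" not in out and "user" in out:
        out["username"] = out["user"]
    if "use_tls" not in out and "starttls" in out:
        out["use_tls"] = out["starttls"]
    return out


# A config saved by the old settings tab: only legacy keys are present.
legacy = normalise_smtp_keys({"host": "smtp.example.com", "user": "alice", "starttls": True})

# A config holding both keys: the canonical key is never overwritten.
mixed = normalise_smtp_keys({"user": "old", "username": "new"})

print(legacy["username"], legacy["use_tls"], mixed["username"])
```

Because the legacy value is copied only when the canonical key is absent, a config re-saved after the fix keeps its new value even if the stale `user` key is still in the file.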

View File

@ -1,6 +1,6 @@
# GDPR Scanner — Brugermanual
-Version 1.7.8
+Version 1.7.9
---
@ -200,6 +200,8 @@ Klik på **▶ Genoptag** for at fortsætte fra det sted, scanningen slap. Klik
## 5. Forstå resultaterne
Når du åbner appen, viser gitteret **alle åbne fund** — alle markerede elementer, der stadig kræver handling (dvs. uden disposition), på tværs af alle dine scanninger og ikke kun den seneste. Efterhånden som du mærker elementer (behold, anonymisér, slet, falsk positiv …), forsvinder de fra denne visning, så det, der står tilbage, er dit udestående arbejde. Hvert element vises én gang med sin nyeste tilstand. Vil du i stedet se en enkelt tidligere scanning, så brug sessionsvælgeren (se *Gennemse tidligere scanningssessioner* nedenfor).
Hvert fundet element vises som et kort. Her er forklaringen på mærker og labels:
### Kildemærker
@ -256,7 +258,7 @@ Når en scanning er afsluttet, kan du gennemse resultaterne fra en tidligere sca
- Klik på **Sessioner**-knappen i historikbanneret (der vises over resultatgitteret, når en scanning er afsluttet) for at åbne sessionsvælgeren.
- Hver række viser dato og tidspunkt, hvilke kilder der blev scannet, og hvor mange elementer der blev fundet. Et **Δ**-mærkat angiver delta-scanninger; **Seneste** markerer den nyeste session.
- Klik på en række for at indlæse den pågældende sessions resultater i gitteret. Et historikbanner erstatter statuslinjen med sessionens oplysninger.
-- Klik på **Seneste scanning** i banneret for at vende tilbage til den nyeste session.
+- Klik på **Åbne fund** i banneret for at forlade den tidligere session og vende tilbage til standardvisningen med alle elementer, der stadig kræver handling.
- Start af en ny scanning afslutter automatisk historiktilstanden og skifter til live-resultater.
Alle filtre, eksporter og dispositionsmærkning fungerer normalt, mens du gennemser tidligere sessioner.
@ -526,7 +528,17 @@ Klik på **Gem** for at gemme, og klik derefter på **Test** for at sende en tes
> Hvis din konto har MFA (to-faktor-godkendelse) aktiveret, kan du ikke bruge din almindelige adgangskode. Du skal oprette en **app-adgangskode** i din kontos sikkerhedsindstillinger:
> - **Personlig Microsoft-konto**: account.microsoft.com/security → App-adgangskoder
-> - **Gmail**: myaccount.google.com → Sikkerhed → 2-trinsbekræftelse → App-adgangskoder
+> - **Gmail / Google Workspace**: myaccount.google.com → Sikkerhed → 2-trinsbekræftelse → App-adgangskoder (for Google Workspace-konti skal din administrator først tillade app-adgangskoder eller opsætte et SMTP-relay)
### Send altid via SMTP (spring Microsoft Graph over)
Når scanneren er logget på Microsoft 365, sender den normalt e-mail gennem Microsoft 365 direkte, uden at bruge SMTP-indstillingerne ovenfor. Det er praktisk, men det kan ikke levere til visse adresser — især en adresse på et Google-hostet underdomæne af dit Microsoft 365-domæne, som Microsoft 365 opfatter som intern og kasserer i stilhed (ingen levering, ingen fejl).
Slå **Send altid via SMTP (spring Microsoft Graph over)** til for at tvinge al e-mail — test-e-mails, manuelle rapporter og automatisk e-mail efter scanning — gennem den SMTP-server, du har konfigureret ovenfor. Brug dette, når dine rapporter sendes til en postkasse, som Microsoft 365 ikke kan levere til (f.eks. en Google Workspace-adresse), med `smtp.gmail.com` / `smtp-relay.gmail.com` som SMTP-vært.
### Send rapport efter manuel scanning
Slå **Send rapport efter manuel scanning** til for automatisk at sende rapporten pr. e-mail til dine konfigurerede modtagere, hver gang en manuel scanning er færdig.
### Send en rapport manuelt
@ -671,4 +683,4 @@ For en typisk skole- eller kommunescanning er omkostningen ubetydelig — Claude
---
-*GDPR Scanner v1.7.8 — teknisk opsætning og konfiguration: se README.md*
+*GDPR Scanner v1.7.9 — teknisk opsætning og konfiguration: se README.md*

View File

@ -1,6 +1,6 @@
# GDPR Scanner — User Manual
-Version 1.7.8
+Version 1.7.9
---
@ -200,6 +200,8 @@ Click **▶ Genoptag** to continue from where the scan left off. Click **Start f
## 5. Understanding the Results
When you open the app, the grid shows **all open items** — every flagged item that still needs action (i.e. has no disposition), across all of your scans, not just the most recent one. As you tag items (kept, redacted, deleted, false positive, …) they drop out of this view, so what remains is your outstanding work. Each item appears once, showing its most recent state. To look at a single past scan instead, use the session picker (see *Browsing past scan sessions* below).
Each flagged item appears as a card. Here is what the badges and labels mean:
### Source badges
@ -256,7 +258,7 @@ Once a scan has completed, you can review results from any earlier scan session
- Click the **Sessions** button in the history banner (which appears above the results grid after a scan completes) to open the session picker.
- Each row shows the date and time, which sources were scanned, and how many items were flagged. A **Δ** badge marks delta scans; **Latest** marks the most recent session.
- Click any row to load that session's results into the grid. A history banner replaces the progress bar, showing the session details.
-- Click **Latest scan** in the banner to jump back to the most recent session.
+- Click **Open items** in the banner to leave the past session and return to the default view of all items still needing action.
- Starting a new scan automatically exits history mode and switches back to live results.
All filters, exports, and disposition tagging work normally while browsing past sessions.
@ -526,7 +528,17 @@ Click **Gem** to save, then click **Test** to send a test email and verify the c
> If your account has MFA (two-factor authentication) enabled, you cannot use your regular password. You need to create an **App Password** in your account security settings:
> - **Microsoft personal account**: account.microsoft.com/security → App passwords
-> - **Gmail**: myaccount.google.com → Security → 2-Step Verification → App passwords
+> - **Gmail / Google Workspace**: myaccount.google.com → Security → 2-Step Verification → App passwords (for Google Workspace accounts your administrator must first allow App Passwords, or set up an SMTP relay)
### Always send via SMTP (skip Microsoft Graph)
When the scanner is signed in to Microsoft 365, it normally sends email through Microsoft 365 directly, without using the SMTP settings above. This is convenient, but it cannot deliver to some addresses — most notably an address on a Google-hosted subdomain of your Microsoft 365 domain, which Microsoft 365 treats as internal and silently discards (no delivery, no error).
Turn on **Send altid via SMTP (spring Microsoft Graph over)** to force all email — test emails, manual reports, and the after-scan auto-email — through the SMTP server you configured above. Use this when your reports go to a mailbox Microsoft 365 won't deliver to (for example a Google Workspace address), with `smtp.gmail.com` / `smtp-relay.gmail.com` as the SMTP host.
### Email report after manual scan
Turn on **Send rapport efter manuel scanning** to automatically email the report to your configured recipients every time a manual scan finishes.
### Sending a report manually
@ -671,4 +683,4 @@ For a typical school or municipality scan the cost is negligible — Claude Haik
---
-*GDPR Scanner v1.7.8 — for technical setup and configuration see README.md*
+*GDPR Scanner v1.7.9 — for technical setup and configuration see README.md*

View File

@ -111,7 +111,25 @@ Optional hardening:
---
-## 7. Verify the scanner-specific behaviour
+## 7. Firewall / perimeter checklist
The Zoraxy whitelist (step 6) is an **application-layer** control — a rejected request has still completed the TCP and TLS handshake against your box, and any proxy host you forget to tag is fully exposed. The firewall is the real perimeter. Work this checklist whenever you stand up or replace the edge firewall:
- [ ] **No inbound port-forward unless a service is intentionally public.** A LAN-only deployment needs *zero* inbound forwards — DNS-01 (step 4) is outbound-only, so certificates issue and renew with the firewall fully closed.
- [ ] **If any service is intentionally public** (e.g. a media server), forward **443 only to the Zoraxy host** — never to individual app hosts. Everything then enters through Zoraxy, where the per-host Access Rule decides public vs. private.
- [ ] **The per-host whitelist stays your public/private boundary even with the firewall in place** — it is not made redundant by the firewall. Public hosts use the `default` rule; every internal-only host gets **Local Access Only**.
- [ ] **New proxy hosts default to public.** Zoraxy applies the `default` rule to any host with no rule set, so a freshly-added internal service is reachable the moment it exists. Set its Access Rule to **Local Access Only** *at creation time*.
- [ ] **Management ports are LAN-only.** Zoraxy admin (`:8000`) and any app admin UI must never be forwarded; tag them **Local Access Only** as well.
- [ ] **Verify from off-network.** From a connection outside the LAN (e.g. a phone on mobile data), confirm private hostnames are blocked and only the intentionally-public ones respond:
```bash
curl -v https://gdprscanner.example.dk # should fail/refuse from outside
nmap -Pn -p 80,443,5100 <your-public-IP> # only intentionally-open ports listed
```
---
## 8. Verify the scanner-specific behaviour
1. `https://gdprscanner.example.dk` loads with a valid padlock; `http://` redirects.
2. **Run a scan and watch result cards stream in live** — that is the Server-Sent Events connection (`/api/scan/stream`) passing through the proxy. If progress stalls while the scan log advances, look at proxy buffering/timeout settings.

View File

@ -29,11 +29,14 @@ Usage (from gdpr_scanner.py)
import hashlib
import json
import logging
import sqlite3
import time
from pathlib import Path
from typing import Iterator
logger = logging.getLogger(__name__)
from pathlib import Path as _P
_DATA_DIR = _P.home() / ".gdprscanner"
_DATA_DIR.mkdir(exist_ok=True)
@ -225,6 +228,7 @@ _MIGRATIONS: list[tuple[int, str]] = [
emailed INTEGER NOT NULL DEFAULT 0,
error TEXT NOT NULL DEFAULT ''
)"""),
(11, "ALTER TABLE flagged_items ADD COLUMN account_name TEXT NOT NULL DEFAULT ''"),
]
@ -326,8 +330,8 @@ class ScanDB:
url, drive_id, size_kb, modified, cpr_count, risk,
thumb_b64, thumb_mime, attachments, user_role, transfer_risk,
special_category, face_count, exif_json, full_path,
-email_count, phone_count, body_excerpt, scanned_at)
-VALUES (?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?)""",
+email_count, phone_count, body_excerpt, account_name, scanned_at)
+VALUES (?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?)""",
(
card.get("id", ""),
scan_id,
@ -354,6 +358,7 @@ class ScanDB:
card.get("email_count", 0),
card.get("phone_count", 0),
card.get("body_excerpt", ""),
card.get("account_name", ""),
now,
),
)
@ -432,6 +437,33 @@ class ScanDB:
        c.commit()

    def finalize_orphan_scans(self) -> int:
        """Finalise scans left unfinished by a crash, kill, or mid-scan restart.

        After a fresh process start nothing is scanning, so any scan still
        carrying finished_at IS NULL is dead: the process that owned it is gone.
        Its already-saved flagged_items were stranded: both get_session_items
        and get_open_items require finished_at, so those items are invisible and
        effectively lost. Finalising the orphans on startup makes them show up
        and prevents permanent data loss from interrupted scans (the M365 and
        Google engines return early on abort and never reach finish_scan; only
        the file scan finalises in a finally block).

        Safe to call only when no scan is running (i.e. at startup). Returns the
        number of scans finalised.
        """
        rows = self._connect().execute(
            "SELECT id, total_scanned FROM scans WHERE finished_at IS NULL"
        ).fetchall()
        count = 0
        for sid, total in rows:
            try:
                self.finish_scan(sid, total or 0)
                count += 1
            except Exception as e:
                logger.warning("[db] finalize_orphan_scans: scan %s failed: %s", sid, e)
        return count

    # ── Query helpers ─────────────────────────────────────────────────────────
    def latest_scan_id(self) -> int | None:
@ -536,6 +568,40 @@ class ScanDB:
            result.append(d)
        return result

    def get_open_items(self) -> list[dict]:
        """Return every flagged item across all scans that has no action taken.

        "Open" means the item has no disposition row (or a row whose status is
        still 'unreviewed'). Unlike get_session_items this is NOT limited to the
        latest scan window; it surfaces all outstanding items so nothing slips
        out of view once a newer scan starts a fresh session.

        flagged_items has a composite PK of (id, scan_id), so the same logical
        item appears once per scan that flagged it. We deduplicate by id, keeping
        the row from the most recent finished scan, so each open item shows once.
        """
        rows = self._connect().execute(
            """SELECT fi.*, COALESCE(d.status, 'unreviewed') AS disposition
               FROM flagged_items fi
               JOIN scans s ON fi.scan_id = s.id
               LEFT JOIN dispositions d ON d.item_id = fi.id
               WHERE s.finished_at IS NOT NULL
                 AND (d.item_id IS NULL OR d.status = 'unreviewed')
                 AND fi.scan_id = (
                     SELECT MAX(fi2.scan_id)
                     FROM flagged_items fi2
                     JOIN scans s2 ON fi2.scan_id = s2.id
                     WHERE fi2.id = fi.id AND s2.finished_at IS NOT NULL
                 )
               ORDER BY fi.cpr_count DESC""",
        ).fetchall()
        result = []
        for r in rows:
            d = dict(r)
            d["attachments"] = json.loads(d.get("attachments") or "[]")
            result.append(d)
        return result

    def get_related_items(self, item_id: str, ref_scan_id: int | None = None,
                          window_seconds: int = 300) -> list[dict]:
        """Return flagged items from the same session that share at least one CPR
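
The dedupe behaviour of the `get_open_items` query can be tried against an in-memory SQLite database. This is a sketch under assumptions: the three tables are cut down to a hypothetical minimal schema (the real `flagged_items` has many more columns), the rows are invented, and the query selects only three columns instead of `fi.*`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE scans (id INTEGER PRIMARY KEY, finished_at REAL);
CREATE TABLE flagged_items (id TEXT, scan_id INTEGER, cpr_count INTEGER,
                            PRIMARY KEY (id, scan_id));
CREATE TABLE dispositions (item_id TEXT PRIMARY KEY, status TEXT);

-- Scans 1 and 2 are finished; scan 3 was interrupted (finished_at IS NULL).
INSERT INTO scans VALUES (1, 100.0), (2, 200.0), (3, NULL);

-- Item 'a' was flagged by all three scans, 'b' and 'c' by one each.
INSERT INTO flagged_items VALUES ('a', 1, 1), ('a', 2, 3), ('a', 3, 5),
                                 ('b', 1, 2), ('c', 2, 1);

-- 'b' has been actioned; 'c' has a row but is still unreviewed.
INSERT INTO dispositions VALUES ('b', 'kept'), ('c', 'unreviewed');
""")

open_items = conn.execute(
    """SELECT fi.id, fi.scan_id, COALESCE(d.status, 'unreviewed') AS disposition
       FROM flagged_items fi
       JOIN scans s ON fi.scan_id = s.id
       LEFT JOIN dispositions d ON d.item_id = fi.id
       WHERE s.finished_at IS NOT NULL
         AND (d.item_id IS NULL OR d.status = 'unreviewed')
         AND fi.scan_id = (
             SELECT MAX(fi2.scan_id)
             FROM flagged_items fi2
             JOIN scans s2 ON fi2.scan_id = s2.id
             WHERE fi2.id = fi.id AND s2.finished_at IS NOT NULL
         )
       ORDER BY fi.cpr_count DESC"""
).fetchall()

for row in open_items:
    print(row)
```

In this sample data item `a` comes back once, from scan 2 (its latest finished scan), item `b` is excluded because it has been actioned, and the row saved by the unfinished scan 3 stays hidden until that scan is finalised.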

View File

@@ -2305,6 +2305,19 @@ Example --settings file with SMTP:
     print(f"\n GDPRScanner\n ──────────────────────────────")
     print(f" Open: http://{args.host}:{args.port}")

+    # Recover scans left unfinished by a crash / kill / mid-scan restart.
+    # Nothing is scanning at startup, so any scan with finished_at IS NULL is
+    # dead; finalising it makes its already-saved items visible again instead
+    # of stranding them (both get_session_items and get_open_items require a
+    # finished scan). Must run before the scheduler can start a new scan.
+    try:
+        if DB_OK:
+            _recovered = _get_db().finalize_orphan_scans()
+            if _recovered:
+                print(f" Recovered {_recovered} unfinished scan(s) from a prior restart")
+    except Exception as _orphan_err:
+        print(f" Orphan-scan recovery: failed ({_orphan_err})")
+
     # Start in-process scheduler (#19)
     try:
         import scan_scheduler as _sched_mod
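
The body of `finalize_orphan_scans()` is not part of this diff. As a rough sketch of what such a recovery step could look like, assuming a `scans` table with a nullable `finished_at` column (as the queries in this change imply) and a hypothetical free-function signature that takes the connection directly:

```python
import sqlite3
import time


def finalize_orphan_scans(conn, now=None):
    """Stamp finished_at on every scan that never finished; return the count.

    Hypothetical sketch: the real method lives on the DB wrapper and may
    record more than a timestamp.
    """
    stamp = time.time() if now is None else now
    cur = conn.execute(
        "UPDATE scans SET finished_at = ? WHERE finished_at IS NULL",
        (stamp,),
    )
    conn.commit()
    # rowcount is the number of rows the UPDATE touched, i.e. recovered scans.
    return cur.rowcount
```

Running it a second time returns 0, so calling it on every startup is harmless in this sketch.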

@@ -106,7 +106,7 @@
   "history_lbl": "Historik",
   "history_items": "fund",
   "history_btn_sessions": "Sessioner",
-  "history_btn_latest": "Seneste scanning",
+  "history_btn_latest": "Åbne fund",
   "history_picker_empty": "Ingen tidligere scanninger",
   "history_delta_badge": "Delta",
   "history_latest_badge": "Seneste",
@@ -366,6 +366,7 @@
   "m365_smtp_recipients_hint": "Adskil med komma eller semikolon",
   "m365_smtp_save": "Gem",
   "m365_smtp_auto_email_manual": "Send rapport efter manuel scanning",
+  "m365_smtp_prefer_smtp": "Send altid via SMTP (spring Microsoft Graph over)",
   "m365_smtp_send": "Send nu",
   "m365_smtp_saved": "Indstillinger gemt.",
   "m365_smtp_sending": "Sender…",

@@ -167,7 +167,7 @@
   "history_lbl": "Verlauf",
   "history_items": "Treffer",
   "history_btn_sessions": "Sessionen",
-  "history_btn_latest": "Letzter Scan",
+  "history_btn_latest": "Offene Einträge",
   "history_picker_empty": "Keine früheren Scans",
   "history_delta_badge": "Delta",
   "history_latest_badge": "Aktuell",
@@ -366,6 +366,7 @@
   "m365_smtp_recipients_hint": "Komma- oder semikolongetrennt",
   "m365_smtp_save": "Speichern",
   "m365_smtp_auto_email_manual": "Bericht nach manueller Suche senden",
+  "m365_smtp_prefer_smtp": "Immer via SMTP senden (Microsoft Graph überspringen)",
   "m365_smtp_send": "Jetzt senden",
   "m365_smtp_saved": "Einstellungen gespeichert.",
   "m365_smtp_sending": "Senden…",

@@ -106,7 +106,7 @@
   "history_lbl": "History",
   "history_items": "items",
   "history_btn_sessions": "Sessions",
-  "history_btn_latest": "Latest scan",
+  "history_btn_latest": "Open items",
   "history_picker_empty": "No past scans",
   "history_delta_badge": "Delta",
   "history_latest_badge": "Latest",
@@ -366,6 +366,7 @@
   "m365_smtp_recipients_hint": "Comma or semicolon separated",
   "m365_smtp_save": "Save",
   "m365_smtp_auto_email_manual": "Email report after manual scan",
+  "m365_smtp_prefer_smtp": "Always send via SMTP (skip Microsoft Graph)",
   "m365_smtp_send": "Send now",
   "m365_smtp_saved": "Settings saved.",
   "m365_smtp_sending": "Sending…",

@@ -552,6 +552,8 @@ class M365Connector:
                 r.raise_for_status()
                 return True  # 204 No Content = success
         raise _requests.exceptions.RetryError(f"Gave up after {self._MAX_RETRIES} attempts: {url}")
+
+    def delete_message(self, user_id: str, message_id: str) -> bool:
         """Move an email to Deleted Items (soft delete)."""
         base = "/me" if (not user_id or user_id == "me") else f"/users/{user_id}"
         try:

@@ -68,6 +68,9 @@ Exception hierarchy (all inherit `M365Error(Exception)`):
 - **Graph preferred over SMTP** — `smtp_test` and `send_report` try `_send_email_graph()` first; fall back to SMTP only if Graph raises. If Graph fails and no SMTP host saved, the Graph exception surfaces directly.
 - **Auto-email after manual scan** — `_maybe_send_auto_email()` in `routes/scan.py` called from the `_run()` thread after `run_scan()` returns. Reads `smtp_cfg.get("auto_email_manual")`; no-ops if false, no flagged items, or no recipients.
 - **Gmail vs Google Workspace** — auth error handlers check if SMTP username ends in `@gmail.com`/`@googlemail.com`; custom domains are treated as Google Workspace and error message points to the Workspace admin console.
+- **Canonical SMTP config keys are `username` and `use_tls`** — all backend readers (`smtp_test`, `_send_report_email`, `_send_email_graph`) use these. The Settings → E-mailrapport tab (`scheduler.js`) historically saved `user`/`starttls`, which left `username` empty so `server.login()` was skipped and the server rejected the send. Frontend now sends the canonical keys, and `_load_smtp_config()` normalises legacy `user` → `username` / `starttls` → `use_tls` for already-saved configs. The send-report modal (`scan.js`) already used the canonical keys. Keep both UIs and the backend on `username`/`use_tls`.
+- **Graph 202 ≠ delivered** — `_send_email_graph` returns on Graph's HTTP 202 (queued), and `smtp_test`/`send_report` treat that as success and never fall back to SMTP. A recipient on a domain Exchange Online considers an accepted/internal domain (e.g. a Google-hosted subdomain of the O365 domain) is silently dropped after the 202. There is no in-app fix for that routing; reaching such recipients requires SMTP (e.g. Google Workspace `smtp.gmail.com`/`smtp-relay.gmail.com`) or fixing Exchange Accepted Domains.
+- **`prefer_smtp` config flag** — when truthy, `smtp_test`, `send_report`, and `_maybe_send_auto_email` (routes/scan.py) skip the Graph path entirely and send via SMTP. This is the in-app escape hatch for the Graph-202 routing trap above. The gate is `... and not smtp_cfg.get("prefer_smtp")` on each Graph branch — keep all three in sync. UI: `#st-smtpPreferSmtp` toggle (key `m365_smtp_prefer_smtp`), saved/loaded by `scheduler.js`.
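
The two SMTP rules in the bullets above (legacy-key normalisation and the `prefer_smtp` gate) can be sketched as follows. This is an illustration only: `normalise_smtp_config` and `should_try_graph` are hypothetical names, and the real `_load_smtp_config()` and route handlers may differ in detail.

```python
def normalise_smtp_config(cfg):
    """Map legacy keys (user, starttls) onto the canonical ones.

    Hypothetical sketch of the normalisation the notes describe; canonical
    values win when both a legacy and a canonical key are present.
    """
    out = dict(cfg)
    legacy_user = out.pop("user", None)
    legacy_tls = out.pop("starttls", None)
    if not out.get("username") and legacy_user:
        out["username"] = legacy_user
    if "use_tls" not in out and legacy_tls is not None:
        out["use_tls"] = bool(legacy_tls)
    return out


def should_try_graph(connector_authenticated, smtp_cfg):
    """The gate shared by all three send paths: Graph only if not prefer_smtp."""
    return bool(connector_authenticated) and not smtp_cfg.get("prefer_smtp")
```

Keeping the gate in one helper like this would be one way to keep the three call sites in sync, though the diff itself repeats the condition inline.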
## Scheduler — scan_scheduler.py + routes/scheduler.py

@@ -180,7 +180,11 @@ def db_get_disposition(item_id):
 @bp.route("/api/db/flagged")
 def db_flagged_items():
-    """Return flagged items from the most recent completed scan session.
+    """Return flagged items for the results grid.
+
+    With ?ref=N, returns the items from that specific past scan session (history
+    mode). Without ref, returns every item still awaiting action across all
+    scans (the default landing view), not just the latest session window.

     Used by the read-only viewer to load results without an active SSE connection.
     Respects viewer_scope.role stored in the session for scoped tokens.
     """
@@ -197,7 +201,13 @@ def db_flagged_items():
     else:
         user_filt = {raw_user.lower()} if raw_user else set()
     ref_scan_id = request.args.get("ref", type=int)
-    items = _get_db().get_session_items(ref_scan_id=ref_scan_id)
+    if ref_scan_id:
+        # History mode — a specific past session was requested.
+        items = _get_db().get_session_items(ref_scan_id=ref_scan_id)
+    else:
+        # Default landing / viewer — show every item still awaiting action,
+        # across all scans, not just the latest session window.
+        items = _get_db().get_open_items()
     # Normalise JSON-encoded columns the same way scan_engine does for SSE cards
     import json as _json
     out = []

@@ -148,8 +148,12 @@ def smtp_test():
         "</body></html>"
     )

-    # Try Graph API first
-    if state.connector and state.connector.is_authenticated():
+    # Try Graph API first — unless the user opted to always use SMTP. Graph
+    # returns 202 (queued) even for recipients Exchange later silently drops
+    # (e.g. a Google-hosted subdomain of the O365 domain), so SMTP is the only
+    # reliable path for those; prefer_smtp forces it.
+    prefer_smtp = bool(saved.get("prefer_smtp"))
+    if state.connector and state.connector.is_authenticated() and not prefer_smtp:
         try:
             _send_email_graph(subject, body_html, recipients)
             return jsonify({"ok": True, "method": "graph", "recipients": recipients})
@@ -285,8 +289,8 @@ def send_report():
         "</body></html>"
     )

-    # Try Graph API first
-    if state.connector and state.connector.is_authenticated():
+    # Try Graph API first — unless prefer_smtp is set (see smtp_test for why).
+    if state.connector and state.connector.is_authenticated() and not smtp_cfg.get("prefer_smtp"):
         try:
             _send_email_graph(subject, body_html, recipients,
                               attachment_bytes=xl_bytes, attachment_name=fname)

@@ -54,7 +54,7 @@ def _maybe_send_auto_email():
         "</body></html>"
     )

-    if state.connector and state.connector.is_authenticated():
+    if state.connector and state.connector.is_authenticated() and not smtp_cfg.get("prefer_smtp"):
         try:
             _send_email_graph(subject, body_html, recipients,
                               attachment_bytes=xl_bytes, attachment_name=fname)

@@ -1078,6 +1078,14 @@ def run_scan(options: dict):
         if _check_abort():
             # Save checkpoint so scan can be resumed later
             _save_checkpoint(ck_key, scanned_ids, _state.flagged_items, _state.scan_meta)
+            # Finalise the DB scan record so items found before the stop stay
+            # visible — this early return otherwise skips finish_scan below,
+            # stranding them (invisible to get_session_items / get_open_items).
+            if _db and _db_scan_id:
+                try:
+                    _db.finish_scan(_db_scan_id, resumed_count + idx + 1)
+                except Exception as _e:
+                    logger.error("[db] finish_scan (aborted) failed: %s", _e)
             return
         idx += 1
         kind, meta, _ = _work_q.popleft()  # releases this item from the deque immediately

@@ -40,13 +40,19 @@ Never revert to `!!window._googleConnected` / `_fileSources.length > 0` — thos
 ## Scan history browser — history.js + results.js

-- **`S._historyRefScanId`** — `null` = live/SSE mode; positive int = viewing a past session. Set by `loadHistorySession()`; cleared by `exitHistoryMode()`.
+- **`S._historyRefScanId`** — `null` = live/SSE mode **or** the default open-items view; positive int = viewing a past session. Set by `loadHistorySession()`; cleared by `exitHistoryMode()`.
+- **`loadHistorySession(null)` → `loadOpenItems()`** — passing `null` no longer resolves to the latest session. It now loads **all open (unactioned) items across every scan** via `GET /api/db/flagged` (no `ref`), leaves `_historyRefScanId` null, and shows no history banner. The "Open items" banner button (`onclick="loadHistorySession(null)"`, key `history_btn_latest`) therefore returns to this open-items view. Specific sessions are still loaded with a positive `ref`, which keeps the re-scan resolved-diff. Do not revert `null` to "resolve latest ref" — that reintroduces the "only the last scan is shown" complaint.
 - **Auto-load on page load** — `_sseWatchdog()` in `results.js` calls `window.loadHistorySession?.(null)` whenever `/api/scan/status` reports neither `running` (M365 + file lock) nor `google_running` (Google lock) **and** nothing is shown yet (`!S._historyRefScanId && !S.flaggedData.length`). This is **not one-shot** — it retries on every 4s poll until a session is restored, because (a) the replay buffer is empty after a server restart so `sse_replay_done` never fires, and (b) a completed scan's replayed `scan_phase` can leave a running flag set that would otherwise block the load forever. Because both locks are confirmed free, the watchdog clears the stale `_m365/_google/_fileScanRunning` flags before calling. Do not revert to a one-shot `_initialStatusChecked` gate — that reintroduces the "blank grid after refresh/restart" bug. `/api/scan/status` **must** report `google_running` separately; `running` alone misses live Google scans. The `sse_replay_done` handler in `scan.js` still retries for the non-empty-buffer (no-restart) case.
 - **History banner** (`#historyBanner`) — shown when `S._historyRefScanId` is set. Do not hide/show from outside `history.js`.
 - **Session picker** (`#historyDropdown`) — rendered inside `[data-history-wrap]` so the outside-click handler works correctly. Do not move the picker outside this wrapper.
 - **Cache invalidation** — `invalidateHistoryCache()` clears `_sessions` and `_latestRefScanId`. All three `*_done` SSE handlers call `window.invalidateHistoryCache?.()`.
 - **Re-scan diff** — items present in the previous session but absent from the current one are tagged `_resolved: true`, rendered with `.card-resolved` and a green ✓ badge, and NOT added to `S.flaggedData` (grid-only, cannot be bulk-selected or exported).
 - **Mode transitions** — `startScan()` calls `window.exitHistoryMode?.()` before clearing the grid.
+- **`renderGrid(files)` hides the landing cards** — whenever `files.length > 0` it hides `#emptyState` and `#lastScanSummary` and shows `#grid`. This is centralised here because the live `scan_file_flagged` handler (`scan.js`) shows the grid but does NOT clear those panels, so results would render *underneath* a still-visible landing/last-scan card until a manual refresh. Do not move this hiding back into individual callers — every render path (live SSE, `loadOpenItems`, history, filters) must clear the landing. The empty case (`files.length === 0`) is left untouched so callers still control the empty/landing state.
+
+## Card user/group badge — results.js
+
+- **`_accountPill(f)`** builds the account/role pill for both card layouts (list + grid). The **group badge is driven by `f.user_role`** (`student`/`staff`) alone, so it renders even with no display name — items from scans saved before `account_name` was persisted (DB migration 11) have only `user_role` + `account_id`. The user label resolves best-effort: `f.account_name` → `S._allUsers` match (by `id` or `email`) → email-style `account_id` → omit. Do not re-nest the role badge inside an `account_name` check (the old bug) — that hides the group badge for legacy items. Both layouts call `_accountPill(f)`; keep them sharing the one helper.

 ## CPR cross-referencing — results.js

@@ -38,20 +38,50 @@ function invalidateHistoryCache() {
 // ── Load a session into the results grid ──────────────────────────────────────

-async function loadHistorySession(refScanId) {
-  // refScanId: null → latest session, positive int → specific session
-  let resolvedRef = refScanId;
-  if (resolvedRef === null) {
-    const sessions = _sessions !== null ? _sessions : await _fetchSessions();
-    // Bail if a scan started while we were fetching sessions
+// Default landing view: every flagged item still awaiting action, across all
+// scans (not just the latest session). Leaves S._historyRefScanId null (live
+// mode) and shows no history banner — this is "now", not a past session.
+async function loadOpenItems() {
+  // Bail if a scan is running — live SSE owns the grid then.
+  if (S._m365ScanRunning || S._googleScanRunning || S._fileScanRunning) return;
+  try {
+    const r = await fetch('/api/db/flagged');
+    const items = await r.json();
     if (S._m365ScanRunning || S._googleScanRunning || S._fileScanRunning) return;
-    if (!sessions.length) {
-      // No scans in DB — nothing to show
+    closeHistoryPicker();
+
+    if (!Array.isArray(items) || items.length === 0) {
+      S._historyRefScanId = null;
+      _setHistoryBanner(false);
       window.loadLastScanSummary?.();
       return;
     }
-    resolvedRef = sessions[0].ref_scan_id;
+
+    S._historyRefScanId = null;
+    S.flaggedData = items;
+    S.filteredData = [];
+    const grid = document.getElementById('grid');
+    const emptyState = document.getElementById('emptyState');
+    const lastScan = document.getElementById('lastScanSummary');
+    if (emptyState) emptyState.style.display = 'none';
+    if (lastScan) lastScan.style.display = 'none';
+    if (grid) { grid.innerHTML = ''; grid.style.display = 'grid'; }
+    window.renderGrid(items);
+    try { window.markOverdueCards(); } catch(_) {}
+    try { window.loadTrend(); } catch(_) {}
+    _setHistoryBanner(false);
+  } catch(e) {
+    console.error('[history] failed to load open items:', e);
   }
+}
+
+async function loadHistorySession(refScanId) {
+  // refScanId: null → all open (unreviewed) items across every scan,
+  // positive int → a specific past session
+  if (refScanId === null) return loadOpenItems();
+  const resolvedRef = refScanId;

   try {
     const r = await fetch('/api/db/flagged?ref=' + resolvedRef);

@@ -25,6 +25,31 @@ const SOURCE_BADGES = {
   smb: ['🌐', 'badge-smb', 'Network'],
 };

+// Build the user/group pill for a card. The group (role) badge is driven by
+// user_role alone so it shows even when no display name is available — e.g.
+// items from earlier scans saved before account_name was persisted. For those
+// the user label is resolved best-effort from the loaded user list (by id or
+// email), falling back to an email-style account_id. Returns '' when there is
+// neither a label nor a role to show.
+function _accountPill(f) {
+  const roleBadge =
+    f.user_role === 'student' ? '<span class="role-badge">' + t('role_student', 'Elev') + '</span>' :
+    f.user_role === 'staff' ? '<span class="role-badge">' + t('role_staff', 'Ansat') + '</span>' : '';
+  let label = f.account_name || '';
+  if (!label && f.account_id) {
+    const aid = String(f.account_id);
+    const u = (S._allUsers || []).find(function(u) {
+      return u.id === f.account_id ||
+             (u.email && u.email.toLowerCase() === aid.toLowerCase());
+    });
+    if (u) label = u.displayName || '';
+    else if (aid.includes('@')) label = aid; // an email is already human-readable
+  }
+  if (!label && !roleBadge) return '';
+  const title = label || f.user_role || '';
+  return '<span class="account-pill" title="' + esc(title) + '">' + roleBadge + (label ? esc(label) : '') + '</span>';
+}
+
 function appendCard(f) {
   const search = document.getElementById('filterSearch').value.trim().toLowerCase();
   const srcVal = document.getElementById('filterSource').value;
@@ -61,6 +86,7 @@ function appendCard(f) {
     (f.source_type === 'smb' || f.source_type === 'sftp') ? _redactExts.has(_fileExt) : false
   );
   const redactBtn = _redactable ? `<button class="card-redact-btn" title="${t('redact_btn','Redact CPR')}" onclick="event.stopPropagation();redactItem(${JSON.stringify(f).replace(/"/g,'&quot;')},this.closest('.card'))">✏</button>` : '';
+  const acctPill = _accountPill(f);

   if (S.isListView) {
     card.innerHTML = `
@@ -68,7 +94,7 @@ function appendCard(f) {
       <div class="card-info list-info">
         <div class="card-name" title="${esc(f.name)}">${esc(f.name)}</div>
         <div class="card-meta">${f.size_kb} KB · ${esc(f.modified || '')}${f.folder ? ' · 📂 ' + esc(f.folder) : ''}</div>
-        <div class="card-source"><span class="source-badge ${badgeCls}">${esc(label)}</span> ${esc(f.source || '')}${f.account_name ? ' · <span class="account-pill" title="' + esc(f.account_name) + '">' + (f.user_role === 'student' ? '<span class="role-badge">' + t('role_student','Elev') + '</span>' : f.user_role === 'staff' ? '<span class="role-badge">' + t('role_staff','Ansat') + '</span>' : '') + esc(f.account_name) + '</span>' : ''}${f.transfer_risk === 'external-recipient' ? ' <span class="role-pill" style="background:#7B2D00;color:#FFD0B0"> Ext.</span>' : f.transfer_risk ? ' <span class="role-pill" style="background:#003D7B;color:#B0D4FF">🔗</span>' : ''}</div>
+        <div class="card-source"><span class="source-badge ${badgeCls}">${esc(label)}</span> ${esc(f.source || '')}${acctPill ? ' · ' + acctPill : ''}${f.transfer_risk === 'external-recipient' ? ' <span class="role-pill" style="background:#7B2D00;color:#FFD0B0"> Ext.</span>' : f.transfer_risk ? ' <span class="role-pill" style="background:#003D7B;color:#B0D4FF">🔗</span>' : ''}</div>
       </div>
       <span class="cpr-badge">${f.cpr_count} CPR</span>
       ${f.email_count > 0 ? '<span class="email-badge">' + f.email_count + ' ' + t('m365_badge_emails', 'e-mail') + '</span> ' : ''}
@@ -84,7 +110,7 @@ function appendCard(f) {
         <div class="card-name" title="${esc(f.name)}">${esc(f.name)}</div>
         <div class="card-meta">${f.size_kb} KB · ${esc(f.modified || '')}</div>
         ${f.folder ? `<div class="card-meta" style="font-size:10px" title="${esc(f.folder)}">📂 ${esc(f.folder)}</div>` : ''}
-        <div class="card-source"><span class="source-badge ${badgeCls}">${esc(label)}</span>${f.account_name ? ' <span class="account-pill" title="' + esc(f.account_name) + '">' + (f.user_role === "student" ? '<span class="role-badge">' + t("role_student","Elev") + "</span>" : f.user_role === "staff" ? '<span class="role-badge">' + t("role_staff","Ansat") + "</span>" : "") + esc(f.account_name) + '</span>' : ''}${f.transfer_risk === "external-recipient" ? ' <span class="role-pill" style="background:#7B2D00;color:#FFD0B0"> Ext.</span>' : f.transfer_risk ? ' <span class="role-pill" style="background:#003D7B;color:#B0D4FF">🔗</span>' : ''}</div>
+        <div class="card-source"><span class="source-badge ${badgeCls}">${esc(label)}</span>${acctPill ? ' ' + acctPill : ''}${f.transfer_risk === "external-recipient" ? ' <span class="role-pill" style="background:#7B2D00;color:#FFD0B0"> Ext.</span>' : f.transfer_risk ? ' <span class="role-pill" style="background:#003D7B;color:#B0D4FF">🔗</span>' : ''}</div>
         <span class="cpr-badge">${f.cpr_count} CPR</span>${f.email_count > 0 ? ' <span class="email-badge">' + f.email_count + ' ' + t('m365_badge_emails', 'e-mail') + '</span>' : ''}${f.phone_count > 0 ? ' <span class="phone-badge">' + f.phone_count + ' ' + t('m365_badge_phones', 'tlf.') + '</span>' : ''}${f.face_count > 0 ? ' <span class="photo-face-badge">' + f.face_count + ' ' + t('m365_badge_faces', f.face_count === 1 ? 'face' : 'faces') + '</span>' : ''}${f.exif && f.exif.gps ? ' <span class="photo-face-badge" style="background:#0a3a5a;color:#7ec8d0">🌍 GPS</span>' : ''}${f._deleted ? ' <span class="resolved-badge" style="background:#3a1a1a;color:#ff9b9b">🗑 ' + t('delete_badge', 'Deleted') + '</span>' : ''}${f._redacted ? ' <span class="resolved-badge"> ' + t('redact_badge', 'Redacted') + '</span>' : ''}${f._resolved ? ' <span class="resolved-badge"> ' + t('history_resolved_badge', 'Resolved') + '</span>' : ''}${f.overdue ? ' <span class="overdue-badge">🗓 Overdue</span>' : ''}
       </div>
       ${delBtn}${redactBtn}`;
@@ -96,6 +122,17 @@ function renderGrid(files) {
   const grid = document.getElementById('grid');
   grid.innerHTML = '';
   files.forEach(f => appendCard(f));
+  // Whenever results are rendered, the landing/last-scan cards must be hidden —
+  // the live scan_file_flagged path shows the grid but does not clear them, so
+  // results would otherwise appear underneath the still-visible landing page
+  // until a manual refresh. Centralised here so every render path is covered.
+  if (files && files.length) {
+    const es = document.getElementById('emptyState');
+    if (es) es.style.display = 'none';
+    const ls = document.getElementById('lastScanSummary');
+    if (ls) ls.style.display = 'none';
+    if (grid) grid.style.display = S.isListView ? 'block' : 'grid';
+  }
   _updateBulkBar();
   updateDispositionStats();
 }

@@ -314,15 +314,17 @@ function stLoadSmtp() {
     const set = function(id, val) { const el=document.getElementById(id); if(el) el.value=val||''; };
     set('st-smtpHost', d.host);
     set('st-smtpPort', d.port || 587);
-    set('st-smtpUser', d.user);
+    set('st-smtpUser', d.username);
     set('st-smtpFrom', d.from_addr);
     set('st-smtpTo', Array.isArray(d.recipients) ? d.recipients.join(', ') : (d.recipients||''));
     const tls = document.getElementById('st-smtpTls');
-    if (tls) tls.checked = d.starttls !== false;
+    if (tls) tls.checked = d.use_tls !== false;
     const pw = document.getElementById('st-smtpPw');
     if (pw) pw.value = d.has_password ? '\u2022\u2022\u2022\u2022\u2022\u2022\u2022\u2022' : '';
     const ae = document.getElementById('st-smtpAutoEmail');
     if (ae) ae.checked = !!d.auto_email_manual;
+    const ps = document.getElementById('st-smtpPreferSmtp');
+    if (ps) ps.checked = !!d.prefer_smtp;
   }).catch(function(){});
 }
@@ -333,11 +335,15 @@ async function stSmtpSave() {
   const body = {
     host: document.getElementById('st-smtpHost').value.trim(),
     port: parseInt(document.getElementById('st-smtpPort').value) || 587,
-    user: document.getElementById('st-smtpUser').value.trim(),
+    // Backend (routes/email.py) reads these exact keys — `username`/`use_tls`,
+    // not `user`/`starttls`. Sending the wrong keys leaves username empty so
+    // server.login() is skipped and the SMTP server rejects the send.
+    username: document.getElementById('st-smtpUser').value.trim(),
     from_addr: document.getElementById('st-smtpFrom').value.trim(),
     recipients: document.getElementById('st-smtpTo').value.split(/[,;]/).map(function(s){return s.trim();}).filter(Boolean),
-    starttls: document.getElementById('st-smtpTls').checked,
+    use_tls: document.getElementById('st-smtpTls').checked,
     auto_email_manual: !!(document.getElementById('st-smtpAutoEmail') || {}).checked,
+    prefer_smtp: !!(document.getElementById('st-smtpPreferSmtp') || {}).checked,
   };
   if (pw !== null) body.password = pw;
   st.style.color = 'var(--muted)'; st.textContent = t('m365_smtp_saving','Saving...');


@@ -375,7 +375,7 @@ document.addEventListener('DOMContentLoaded', applyI18n);
<button id="historyPickerBtn" type="button" onclick="openHistoryPicker()" style="height:24px;padding:0 10px;background:none;border:1px solid var(--border);color:var(--muted);border-radius:4px;font-size:11px;cursor:pointer" data-i18n="history_btn_sessions">Sessions</button> <button id="historyPickerBtn" type="button" onclick="openHistoryPicker()" style="height:24px;padding:0 10px;background:none;border:1px solid var(--border);color:var(--muted);border-radius:4px;font-size:11px;cursor:pointer" data-i18n="history_btn_sessions">Sessions</button>
<div id="historyDropdown" style="display:none;position:absolute;right:0;top:calc(100% + 4px);background:var(--surface);border:1px solid var(--border);border-radius:6px;z-index:9999;width:300px;max-height:260px;overflow-y:auto;box-shadow:0 4px 12px rgba(0,0,0,.25)"></div> <div id="historyDropdown" style="display:none;position:absolute;right:0;top:calc(100% + 4px);background:var(--surface);border:1px solid var(--border);border-radius:6px;z-index:9999;width:300px;max-height:260px;overflow-y:auto;box-shadow:0 4px 12px rgba(0,0,0,.25)"></div>
</div> </div>
<button id="historyLatestBtn" type="button" onclick="loadHistorySession(null)" style="display:none;height:24px;padding:0 10px;background:none;border:1px solid var(--accent);color:var(--accent);border-radius:4px;font-size:11px;cursor:pointer;flex-shrink:0" data-i18n="history_btn_latest">Latest scan</button> <button id="historyLatestBtn" type="button" onclick="loadHistorySession(null)" style="display:none;height:24px;padding:0 10px;background:none;border:1px solid var(--accent);color:var(--accent);border-radius:4px;font-size:11px;cursor:pointer;flex-shrink:0" data-i18n="history_btn_latest">Open items</button>
</div> </div>
<!-- Filter bar — full width, above grid + preview --> <!-- Filter bar — full width, above grid + preview -->
@@ -845,6 +845,10 @@ document.addEventListener('DOMContentLoaded', applyI18n);
<label data-i18n="m365_smtp_auto_email_manual">Email report after manual scan</label> <label data-i18n="m365_smtp_auto_email_manual">Email report after manual scan</label>
<label class="toggle" style="flex:unset"><input type="checkbox" id="st-smtpAutoEmail"><span class="toggle-slider"></span></label> <label class="toggle" style="flex:unset"><input type="checkbox" id="st-smtpAutoEmail"><span class="toggle-slider"></span></label>
</div> </div>
<div class="settings-row">
<label data-i18n="m365_smtp_prefer_smtp">Always send via SMTP (skip Microsoft Graph)</label>
<label class="toggle" style="flex:unset"><input type="checkbox" id="st-smtpPreferSmtp"><span class="toggle-slider"></span></label>
</div>
<div style="display:flex;justify-content:flex-end;gap:8px;margin-top:4px"> <div style="display:flex;justify-content:flex-end;gap:8px;margin-top:4px">
<div id="st-smtpStatus" style="flex:1;font-size:11px;color:var(--muted);align-self:center"></div> <div id="st-smtpStatus" style="flex:1;font-size:11px;color:var(--muted);align-self:center"></div>
<button onclick="stSmtpSave()" style="background:none;border:1px solid var(--border);color:var(--muted);height:26px;padding:0 12px;border-radius:6px;font-size:12px;cursor:pointer;box-sizing:border-box" data-i18n="btn_save">Save</button> <button onclick="stSmtpSave()" style="background:none;border:1px solid var(--border);color:var(--muted);height:26px;padding:0 12px;border-radius:6px;font-size:12px;cursor:pointer;box-sizing:border-box" data-i18n="btn_save">Save</button>


@@ -252,3 +252,36 @@ class TestFernet:
def test_decrypt_empty_returns_empty(self): def test_decrypt_empty_returns_empty(self):
result = app_config._decrypt_password("") result = app_config._decrypt_password("")
assert result == "" assert result == ""
class TestSmtpConfigLegacyKeys:
"""SMTP config saved by the older settings tab used `user`/`starttls`;
readers expect `username`/`use_tls`. _load_smtp_config must normalise them."""
def test_legacy_keys_normalised_on_load(self, tmp_path, monkeypatch):
import json
p = tmp_path / "smtp.json"
p.write_text(json.dumps({
"host": "smtp.gmail.com", "port": 587,
"user": "netadmin@adm.example.dk", # legacy key
"starttls": True, # legacy key
"from_addr": "netadmin@adm.example.dk",
"recipients": ["a@example.dk"],
}), encoding="utf-8")
monkeypatch.setattr(app_config, "_SMTP_CONFIG_PATH", p)
cfg = app_config._load_smtp_config()
assert cfg["username"] == "netadmin@adm.example.dk"
assert cfg["use_tls"] is True
def test_canonical_keys_take_precedence(self, tmp_path, monkeypatch):
import json
p = tmp_path / "smtp.json"
p.write_text(json.dumps({
"username": "canonical@example.dk",
"user": "legacy@example.dk",
}), encoding="utf-8")
monkeypatch.setattr(app_config, "_SMTP_CONFIG_PATH", p)
cfg = app_config._load_smtp_config()
assert cfg["username"] == "canonical@example.dk"
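
The two tests above pin down the behaviour expected of `_load_smtp_config`: legacy `user`/`starttls` keys are mapped to `username`/`use_tls`, and a canonical key wins when both are present. A minimal sketch of such a normalisation step follows. The helper name `normalise_smtp_keys`, the mapping table, and the choice to drop the legacy key after mapping are assumptions for illustration, not the project's actual code.

```python
from typing import Any

# Hypothetical legacy -> canonical mapping; only the two pairs named in the
# test docstring are covered.
_LEGACY_SMTP_KEYS = {"user": "username", "starttls": "use_tls"}


def normalise_smtp_keys(raw: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of raw with legacy SMTP keys renamed to canonical ones."""
    cfg = dict(raw)  # never mutate the caller's dict
    for legacy, canonical in _LEGACY_SMTP_KEYS.items():
        if legacy in cfg:
            value = cfg.pop(legacy)
            # setdefault keeps an existing canonical value, so the
            # canonical key takes precedence over the legacy one.
            cfg.setdefault(canonical, value)
    return cfg
```

Using `setdefault` rather than plain assignment is what makes the second test pass: a file containing both `username` and `user` keeps the `username` value.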


@@ -265,3 +265,71 @@ class TestExportImport:
tgt.import_db(str(export_path), mode="replace") tgt.import_db(str(export_path), mode="replace")
results = tgt.lookup_data_subject("290472-1234") results = tgt.lookup_data_subject("290472-1234")
assert len(results) >= 1 assert len(results) >= 1
# ─────────────────────────────────────────────────────────────────────────────
# Orphan-scan recovery (crash / kill / mid-scan restart)
# ─────────────────────────────────────────────────────────────────────────────
class TestOrphanScanRecovery:
def _start_unfinished_scan(self, db, item_id):
"""Begin a scan and save an item but never call finish_scan."""
sid = db.begin_scan({"sources": ["email"], "user_ids": []})
db.save_item(sid, _make_card(item_id=item_id))
return sid
def test_unfinished_scan_items_hidden_until_recovery(self, tmp_db):
self._start_unfinished_scan(tmp_db, "orphan-1")
# Not finalised → invisible to the open-items view
assert tmp_db.get_open_items() == []
def test_recovery_finalises_and_reveals_items(self, tmp_db):
self._start_unfinished_scan(tmp_db, "orphan-1")
self._start_unfinished_scan(tmp_db, "orphan-2")
recovered = tmp_db.finalize_orphan_scans()
assert recovered == 2
ids = {row["id"] for row in tmp_db.get_open_items()}
assert ids == {"orphan-1", "orphan-2"}
def test_recovery_leaves_finished_scans_untouched(self, tmp_db):
sid = tmp_db.begin_scan({"sources": ["email"], "user_ids": []})
tmp_db.save_item(sid, _make_card(item_id="done-1"))
tmp_db.finish_scan(sid, total_scanned=1)
before = tmp_db._connect().execute(
"SELECT finished_at FROM scans WHERE id=?", (sid,)
).fetchone()[0]
assert tmp_db.finalize_orphan_scans() == 0 # nothing to recover
after = tmp_db._connect().execute(
"SELECT finished_at FROM scans WHERE id=?", (sid,)
).fetchone()[0]
assert after == before # finished_at not rewritten
def test_recovery_is_idempotent(self, tmp_db):
self._start_unfinished_scan(tmp_db, "orphan-1")
assert tmp_db.finalize_orphan_scans() == 1
assert tmp_db.finalize_orphan_scans() == 0
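
The recovery tests above describe three properties: unfinished scans get finalised, finished scans keep their original `finished_at`, and a second call recovers nothing. One way to get all three is a single conditional `UPDATE`, sketched below. The `scans.finished_at` column appears in the tests. Representing an unfinished scan as `finished_at IS NULL`, and the standalone function signature, are assumptions.

```python
import sqlite3
import time


def finalize_orphan_scans(conn: sqlite3.Connection) -> int:
    """Stamp every unfinished scan as finished; return how many were recovered.

    Assumes an unfinished scan is a row whose finished_at is NULL.
    """
    cur = conn.execute(
        "UPDATE scans SET finished_at = ? WHERE finished_at IS NULL",
        (time.time(),),
    )
    conn.commit()
    # rowcount is the number of rows the UPDATE modified.
    return cur.rowcount
```

Because the `WHERE` clause only matches rows that are still NULL, the operation is idempotent and never rewrites a timestamp that `finish_scan` already set.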
# ─────────────────────────────────────────────────────────────────────────────
# account_name persistence (user/group badge data)
# ─────────────────────────────────────────────────────────────────────────────
class TestAccountNamePersistence:
def test_account_name_round_trips(self, tmp_db):
sid = tmp_db.begin_scan({"sources": ["email"], "user_ids": []})
tmp_db.save_item(sid, _make_card(item_id="an-1")) # account_name="Test User"
tmp_db.finish_scan(sid, total_scanned=1)
row = [r for r in tmp_db.get_open_items() if r["id"] == "an-1"][0]
assert row.get("account_name") == "Test User"
def test_account_name_column_exists(self, tmp_db):
cols = [r[1] for r in tmp_db._connect().execute(
"PRAGMA table_info(flagged_items)").fetchall()]
assert "account_name" in cols
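
`test_account_name_column_exists` checks the schema through `PRAGMA table_info`. The same pragma can drive an idempotent migration that adds the column to databases created before it existed. The sketch below is illustrative only: `ensure_column` is a hypothetical helper, and how the project actually migrates `flagged_items` is not shown in this diff.

```python
import sqlite3


def ensure_column(conn: sqlite3.Connection, table: str, column: str, decl: str) -> bool:
    """Add column to table if it is missing; return True when it was added.

    table, column and decl are interpolated into SQL, so they must come from
    trusted code, never from user input.
    """
    # Each PRAGMA table_info row is (cid, name, type, notnull, dflt_value, pk).
    cols = [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]
    if column in cols:
        return False
    conn.execute(f"ALTER TABLE {table} ADD COLUMN {column} {decl}")
    conn.commit()
    return True
```

A call such as `ensure_column(conn, "flagged_items", "account_name", "TEXT")` can run on every startup, since it is a no-op once the column exists.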


@@ -270,6 +270,49 @@ class TestFlaggedScopeEnforcement:
ids = {row["id"] for row in r.get_json()} ids = {row["id"] for row in r.get_json()}
assert "ci1" in ids assert "ci1" in ids
def test_no_ref_returns_open_items_across_all_sessions(self, client, db_patch):
# Two scans in separate session windows. The default (no-ref) view must
# surface unactioned items from BOTH, not just the latest session.
old_id = _seed_scan(db_patch, [_item("o1")])
db_patch._connect().execute(
"UPDATE scans SET started_at = started_at - 400 WHERE id = ?", (old_id,)
)
db_patch._connect().commit()
_seed_scan(db_patch, [_item("o2")])
r = client.get("/api/db/flagged")
ids = {row["id"] for row in r.get_json()}
assert ids == {"o1", "o2"}
def test_no_ref_excludes_items_with_a_disposition(self, client, db_patch):
_seed_scan(db_patch, [_item("d1"), _item("d2")])
db_patch.set_disposition("d1", "kept")
r = client.get("/api/db/flagged")
ids = {row["id"] for row in r.get_json()}
assert "d2" in ids # untouched → still open
assert "d1" not in ids # action taken → hidden
def test_no_ref_unreviewed_disposition_stays_open(self, client, db_patch):
_seed_scan(db_patch, [_item("u1")])
db_patch.set_disposition("u1", "unreviewed")
r = client.get("/api/db/flagged")
ids = {row["id"] for row in r.get_json()}
assert "u1" in ids # 'unreviewed' status is not an action
def test_no_ref_dedupes_rescanned_item_to_latest(self, client, db_patch):
# Same item flagged by two scans → appears once.
old_id = _seed_scan(db_patch, [_item("k1")])
db_patch._connect().execute(
"UPDATE scans SET started_at = started_at - 400 WHERE id = ?", (old_id,)
)
db_patch._connect().commit()
_seed_scan(db_patch, [_item("k1")])
rows = [row for row in client.get("/api/db/flagged").get_json() if row["id"] == "k1"]
assert len(rows) == 1
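
Taken together, the four no-ref tests above define the open-items view: items from every finished scan, one row per item even when rescanned, and nothing that carries a disposition other than `unreviewed`. A rough sketch of a query with those properties follows. The table layout (`flagged_items(id, scan_id)`, `dispositions(item_id, status)`), the `last_seen` column and the function signature are assumptions made for the example, so the real `get_open_items` may be structured quite differently.

```python
import sqlite3

# Hypothetical schema: flagged_items(id, scan_id), scans(id, started_at,
# finished_at), dispositions(item_id PRIMARY KEY, status).
OPEN_ITEMS_SQL = """
SELECT f.id, MAX(s.started_at) AS last_seen
FROM flagged_items AS f
JOIN scans AS s ON s.id = f.scan_id
LEFT JOIN dispositions AS d ON d.item_id = f.id
WHERE s.finished_at IS NOT NULL                       -- skip unfinished scans
  AND (d.status IS NULL OR d.status = 'unreviewed')   -- no action taken yet
GROUP BY f.id                                         -- one row per item
ORDER BY last_seen DESC
"""


def get_open_items(conn: sqlite3.Connection) -> list[dict]:
    """Return unactioned items across all finished scans, newest first."""
    rows = conn.execute(OPEN_ITEMS_SQL).fetchall()
    return [{"id": r[0], "last_seen": r[1]} for r in rows]
```

Grouping by item id collapses a rescanned item into a single row, and `MAX(s.started_at)` records the most recent scan that flagged it.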
def test_ref_param_loads_historical_session(self, client, db_patch): def test_ref_param_loads_historical_session(self, client, db_patch):
# Push first scan >300 s into the past so it occupies its own session window. # Push first scan >300 s into the past so it occupies its own session window.
old_id = _seed_scan(db_patch, [_item("h1")]) old_id = _seed_scan(db_patch, [_item("h1")])