No commits in common. "efbbeb73062cc0bee44d5ce615a323e5df79b5b9" and "67f66c844157ded83fba282e0e8622494b001e8d" have entirely different histories.

26 changed files with 40 additions and 446 deletions

@ -11,24 +11,6 @@ Version numbers follow [Semantic Versioning](https://semver.org/spec/v2.0.0.html
---
## [1.7.9] — 2026-06-22
### Added
- **"Always send via SMTP" option for email reports** — new toggle in **Settings → E-mailrapport**. When the scanner is signed in to Microsoft 365 it normally sends email through Microsoft Graph; Graph reports "accepted" the instant a message is queued, which hides the case where Exchange Online later silently drops it (e.g. a recipient on a Google-hosted subdomain of your Microsoft 365 domain — the message is treated as internal, finds no mailbox, and is discarded, with no delivery and no bounce). Enabling this option makes the manual report, the test email, and the after-scan auto-email all go straight through your configured SMTP server (e.g. Google Workspace `smtp.gmail.com` / `smtp-relay.gmail.com`), bypassing the Graph routing entirely.
### Changed
- **The results grid now shows every open item by default, not just the last scan** — when you open the app (or refresh after a scheduled or manual scan), the grid loads *all* flagged items that still need action — i.e. those with no disposition — across every scan, instead of only the most recent scan session. Items you have already tagged (kept, redacted, deleted, false positive, …) drop out of the view. Re-scans are de-duplicated so each item appears once, showing its most recent state. The session picker still loads any individual past scan, and the history banner button (formerly "Latest scan") is now **"Open items"** and returns to this default view.
### Fixed
- **Interrupted scans no longer lose their results** — a scan only became visible once it was *finalised*, but the Microsoft 365 and Google scan engines skipped finalisation when a scan was stopped, and any scan cut short by a server restart, crash, or out-of-memory kill never finalised at all. Its already-found items were then stranded in the database and invisible in the grid (this is what caused "scan finished but no results shown", especially after the in-app self-update restarts). Unfinished scans are now finalised automatically on startup (nothing is scanning at boot, so any unfinished scan is known to be dead), and a manually stopped Microsoft 365 scan finalises immediately so its partial results stay visible.
- **User and group badges were missing on result cards loaded from the database** — the reviewer's display name was shown live during a scan but never saved, so cards loaded from a past scan (now the default view) lost both the person badge and the Elev/Ansat group badge. The display name is now stored with each item, and the group badge is shown from the saved role even for older items that predate this fix (where a name can't be recovered, the group badge and a resolved e-mail still appear).
- **Email reports sent via SMTP failed with "authentication failed"** — the **Settings → E-mailrapport** tab saved the SMTP username under the wrong field name, so the username never reached the mail server and sign-in was skipped — the server then rejected the unauthenticated message, which surfaced as a misleading authentication error even with a correct password or app password. The setting is now saved correctly, and configurations saved before the fix are migrated automatically.
---
## [1.7.8] — 2026-06-16
### Fixed

@ -93,10 +93,7 @@ All options live in the profile `options` dict and apply to **all three scan eng
- **`get_sessions(limit=50, window_seconds=300)`** — groups `scans` rows by 300 s window. Groups built ascending, returned descending. `ref_scan_id` is the highest `scan_id` in each group. Do not change window size independently of `get_session_items`.
- **`get_session_items(ref_scan_id=N)`** — anchors 300 s window to that scan's `started_at`. Window is **symmetric**: `started_at BETWEEN ref.started_at - 300 AND ref.started_at + 300`. Do not revert to a one-sided lower bound.
- **`get_related_items(item_id, ref_scan_id, window_seconds=300)`** — self-joins `cpr_index` to find items sharing ≥1 CPR hash. Uses same 300 s symmetric window — do not change independently.
- **`account_name` (display name) is persisted** (migration 11) so DB-loaded cards show the user badge. Legacy rows predating it have `account_name=''` — the frontend `_accountPill` resolves a fallback and still shows the group badge from `user_role`. `save_item` must keep writing `card["account_name"]` (both M365 and Google cards carry it).
- **Scans must be finalised or their items are invisible**: `get_session_items`, `get_open_items`, and `latest_scan_id` all filter on `finished_at IS NOT NULL`. The file scan finalises in a `finally`; M365 (`run_scan`) and Google (`_run_google_scan`) `return` early on abort, so each now calls `finish_scan` before that abort-return. A process kill (deploy/OOM/crash) mid-scan still strands a scan → **`finalize_orphan_scans()`** runs once at server startup (`gdpr_scanner.py` `__main__`, before the scheduler) and finalises every `finished_at IS NULL` scan (safe because nothing is scanning at boot). Do not add a scan-results query that ignores `finished_at` instead of fixing finalisation.
- **`get_open_items()`** — returns every flagged item with **no action taken**, across **all** scans (not just the latest session window). "Open" = no `dispositions` row, or one whose `status='unreviewed'`. Because `flagged_items` PK is `(id, scan_id)`, the same item recurs per scan; the query dedupes by `id`, keeping the row from the highest finished `scan_id`. This powers the **default landing view** so items don't drop out of sight once a newer scan opens a fresh session.
- **`GET /api/db/flagged`**: **with `?ref=N`** → `get_session_items(ref_scan_id=N)` (history mode); **without ref** → `get_open_items()` (default + viewer). Viewer scope enforcement applies to both. Do not change the no-ref `get_session_items()` default elsewhere (`export.py`, `scan_scheduler.py` still rely on latest-session for the current scan's report/email).
- **`GET /api/db/flagged?ref=N`** — passes `ref_scan_id` to `get_session_items`; viewer scope enforcement still applies.
- See `static/js/CLAUDE.md` for the frontend history browser behaviour and `sse_replay_done` retry fix.
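
The symmetric 300 s window described above for `get_session_items` can be illustrated with a minimal sketch. It assumes a simplified `scans` table holding only `id`, `started_at` (Unix seconds), and `finished_at`; the helper name `session_scan_ids` and the demo data are hypothetical and not part of `ScanDB`.

```python
import sqlite3

WINDOW_SECONDS = 300  # the window size the notes above say must stay in sync


def session_scan_ids(conn: sqlite3.Connection, ref_scan_id: int,
                     window_seconds: int = WINDOW_SECONDS) -> list[int]:
    """Return ids of finished scans started within +/- window_seconds of the reference scan."""
    rows = conn.execute(
        """SELECT s.id
             FROM scans s, scans ref
            WHERE ref.id = ?
              AND s.finished_at IS NOT NULL
              AND s.started_at BETWEEN ref.started_at - ? AND ref.started_at + ?
            ORDER BY s.id""",
        (ref_scan_id, window_seconds, window_seconds),
    ).fetchall()
    return [row[0] for row in rows]


def demo() -> list[int]:
    # Simplified, assumed schema: the real table has more columns.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE scans (id INTEGER PRIMARY KEY, started_at REAL, finished_at REAL)"
    )
    conn.executemany(
        "INSERT INTO scans VALUES (?, ?, ?)",
        [
            (1, 1000.0, 1050.0),  # 200 s before the reference: inside the window
            (2, 1200.0, 1300.0),  # the reference scan itself
            (3, 1450.0, 1500.0),  # 250 s after the reference: inside the window
            (4, 1400.0, None),    # inside the window but unfinished: excluded
            (5, 2000.0, 2100.0),  # 800 s after the reference: outside the window
        ],
    )
    return session_scan_ids(conn, ref_scan_id=2)
```

A one-sided lower bound would drop scan 1 from this session even though it started only 200 s before the reference, which is the regression the note warns against.
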
## Global gotchas

@ -1 +1 @@
1.7.9
1.7.8

@ -878,13 +878,6 @@ def _load_smtp_config() -> dict:
cfg = json.loads(_SMTP_CONFIG_PATH.read_text(encoding="utf-8"))
if cfg.get("password"):
cfg["password"] = _decrypt_password(cfg["password"])
# Normalise legacy key names written by an older settings-tab UI
# (`user`/`starttls`) to the canonical keys every reader expects
# (`username`/`use_tls`), so configs saved before the fix still work.
if "username" not in cfg and "user" in cfg:
cfg["username"] = cfg["user"]
if "use_tls" not in cfg and "starttls" in cfg:
cfg["use_tls"] = cfg["starttls"]
return cfg
except Exception:
pass
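
The legacy-key normalisation shown in this hunk can be tried in isolation. The following is a minimal sketch, not the project's function: `normalise_smtp_keys` is a hypothetical name, and it works on a plain dict rather than reading and decrypting the saved config file.

```python
def normalise_smtp_keys(cfg: dict) -> dict:
    """Copy legacy `user`/`starttls` values onto the canonical keys when those are missing."""
    out = dict(cfg)  # work on a copy so the caller's dict is left untouched
    if "username" not in out and "user" in out:
        out["username"] = out["user"]
    if "use_tls" not in out and "starttls" in out:
        out["use_tls"] = out["starttls"]
    return out
```

Because each canonical key is only filled when absent, a config that already carries `username` or `use_tls` keeps those values even if a stale legacy key is still present.
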

@ -1,6 +1,6 @@
# GDPR Scanner — Brugermanual
Version 1.7.9
Version 1.7.8
---
@ -200,8 +200,6 @@ Klik på **▶ Genoptag** for at fortsætte fra det sted, scanningen slap. Klik
## 5. Understanding the results
When you open the app, the grid shows **all open items**: every flagged item that still requires action (i.e. without a disposition), across all of your scans and not only the most recent one. As you tag items (keep, anonymise, delete, false positive, …) they disappear from this view, so what remains is your outstanding work. Each item is shown once with its most recent state. To see a single earlier scan instead, use the session picker (see *Browse earlier scan sessions* below).
Each flagged item is shown as a card. Here is what the badges and labels mean:
### Source badges
@ -258,7 +256,7 @@ Når en scanning er afsluttet, kan du gennemse resultaterne fra en tidligere sca
- Click the **Sessioner** button in the history banner (shown above the results grid once a scan has finished) to open the session picker.
- Each row shows the date and time, which sources were scanned, and how many items were found. A **Δ** badge marks delta scans; **Seneste** marks the most recent session.
- Click a row to load that session's results into the grid. A history banner replaces the status bar with the session's details.
- Click **Åbne fund** in the banner to leave the earlier session and return to the default view of all items that still require action.
- Click **Seneste scanning** in the banner to return to the most recent session.
- Starting a new scan automatically ends history mode and switches to live results.
All filters, exports, and disposition tagging work normally while you browse earlier sessions.
@ -528,17 +526,7 @@ Klik på **Gem** for at gemme, og klik derefter på **Test** for at sende en tes
> If your account has MFA (two-factor authentication) enabled, you cannot use your regular password. You must create an **app password** in your account's security settings:
> - **Personal Microsoft account**: account.microsoft.com/security → App-adgangskoder
> - **Gmail / Google Workspace**: myaccount.google.com → Sikkerhed → 2-trinsbekræftelse → App-adgangskoder (for Google Workspace accounts your administrator must first allow app passwords or set up an SMTP relay)
### Always send via SMTP (skip Microsoft Graph)
When the scanner is signed in to Microsoft 365, it normally sends email through Microsoft 365 directly, without using the SMTP settings above. This is convenient, but it cannot deliver to certain addresses, in particular an address on a Google-hosted subdomain of your Microsoft 365 domain, which Microsoft 365 treats as internal and silently discards (no delivery, no error).
Turn on **Send altid via SMTP (spring Microsoft Graph over)** to force all email (test emails, manual reports, and the automatic email after a scan) through the SMTP server you configured above. Use this when your reports are sent to a mailbox that Microsoft 365 cannot deliver to (e.g. a Google Workspace address), with `smtp.gmail.com` / `smtp-relay.gmail.com` as the SMTP host.
### Send a report after a manual scan
Turn on **Send rapport efter manuel scanning** to automatically email the report to your configured recipients every time a manual scan finishes.
> - **Gmail**: myaccount.google.com → Sikkerhed → 2-trinsbekræftelse → App-adgangskoder
### Send a report manually
@ -683,4 +671,4 @@ For en typisk skole- eller kommunescanning er omkostningen ubetydelig — Claude
---
*GDPR Scanner v1.7.9: for technical setup and configuration, see README.md*
*GDPR Scanner v1.7.8: for technical setup and configuration, see README.md*

@ -1,6 +1,6 @@
# GDPR Scanner — User Manual
Version 1.7.9
Version 1.7.8
---
@ -200,8 +200,6 @@ Click **▶ Genoptag** to continue from where the scan left off. Click **Start f
## 5. Understanding the Results
When you open the app, the grid shows **all open items** — every flagged item that still needs action (i.e. has no disposition), across all of your scans, not just the most recent one. As you tag items (kept, redacted, deleted, false positive, …) they drop out of this view, so what remains is your outstanding work. Each item appears once, showing its most recent state. To look at a single past scan instead, use the session picker (see *Browsing past scan sessions* below).
Each flagged item appears as a card. Here is what the badges and labels mean:
### Source badges
@ -258,7 +256,7 @@ Once a scan has completed, you can review results from any earlier scan session
- Click the **Sessions** button in the history banner (which appears above the results grid after a scan completes) to open the session picker.
- Each row shows the date and time, which sources were scanned, and how many items were flagged. A **Δ** badge marks delta scans; **Latest** marks the most recent session.
- Click any row to load that session's results into the grid. A history banner replaces the progress bar, showing the session details.
- Click **Open items** in the banner to leave the past session and return to the default view of all items still needing action.
- Click **Latest scan** in the banner to jump back to the most recent session.
- Starting a new scan automatically exits history mode and switches back to live results.
All filters, exports, and disposition tagging work normally while browsing past sessions.
@ -528,17 +526,7 @@ Click **Gem** to save, then click **Test** to send a test email and verify the c
> If your account has MFA (two-factor authentication) enabled, you cannot use your regular password. You need to create an **App Password** in your account security settings:
> - **Microsoft personal account**: account.microsoft.com/security → App passwords
> - **Gmail / Google Workspace**: myaccount.google.com → Security → 2-Step Verification → App passwords (for Google Workspace accounts your administrator must first allow App Passwords, or set up an SMTP relay)
### Always send via SMTP (skip Microsoft Graph)
When the scanner is signed in to Microsoft 365, it normally sends email through Microsoft 365 directly, without using the SMTP settings above. This is convenient, but it cannot deliver to some addresses — most notably an address on a Google-hosted subdomain of your Microsoft 365 domain, which Microsoft 365 treats as internal and silently discards (no delivery, no error).
Turn on **Send altid via SMTP (spring Microsoft Graph over)** to force all email — test emails, manual reports, and the after-scan auto-email — through the SMTP server you configured above. Use this when your reports go to a mailbox Microsoft 365 won't deliver to (for example a Google Workspace address), with `smtp.gmail.com` / `smtp-relay.gmail.com` as the SMTP host.
### Email report after manual scan
Turn on **Send rapport efter manuel scanning** to automatically email the report to your configured recipients every time a manual scan finishes.
> - **Gmail**: myaccount.google.com → Security → 2-Step Verification → App passwords
### Sending a report manually
@ -683,4 +671,4 @@ For a typical school or municipality scan the cost is negligible — Claude Haik
---
*GDPR Scanner v1.7.9 — for technical setup and configuration see README.md*
*GDPR Scanner v1.7.8 — for technical setup and configuration see README.md*

@ -111,25 +111,7 @@ Optional hardening:
---
## 7. Firewall / perimeter checklist
The Zoraxy whitelist (step 6) is an **application-layer** control — a rejected request has still completed the TCP and TLS handshake against your box, and any proxy host you forget to tag is fully exposed. The firewall is the real perimeter. Work this checklist whenever you stand up or replace the edge firewall:
- [ ] **No inbound port-forward unless a service is intentionally public.** A LAN-only deployment needs *zero* inbound forwards — DNS-01 (step 4) is outbound-only, so certificates issue and renew with the firewall fully closed.
- [ ] **If any service is intentionally public** (e.g. a media server), forward **443 only to the Zoraxy host** — never to individual app hosts. Everything then enters through Zoraxy, where the per-host Access Rule decides public vs. private.
- [ ] **The per-host whitelist stays your public/private boundary even with the firewall in place** — it is not made redundant by the firewall. Public hosts use the `default` rule; every internal-only host gets **Local Access Only**.
- [ ] **New proxy hosts default to public.** Zoraxy applies the `default` rule to any host with no rule set, so a freshly-added internal service is reachable the moment it exists. Set its Access Rule to **Local Access Only** *at creation time*.
- [ ] **Management ports are LAN-only.** Zoraxy admin (`:8000`) and any app admin UI must never be forwarded; tag them **Local Access Only** as well.
- [ ] **Verify from off-network.** From a connection outside the LAN (e.g. a phone on mobile data), confirm private hostnames are blocked and only the intentionally-public ones respond:
```bash
curl -v https://gdprscanner.example.dk # should fail/refuse from outside
nmap -Pn -p 80,443,5100 <your-public-IP> # only intentionally-open ports listed
```
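
The port check in the last item could also be scripted with the standard library. This is a rough sketch under stated assumptions: `probe_ports` is a hypothetical helper, and the host and port values you pass in are your own. Like `nmap`, it only tells you whether a TCP connection can be opened, so it exercises the firewall layer; it says nothing about whether Zoraxy's per-host Access Rule would then reject the request.

```python
import socket


def probe_ports(host: str, ports: list[int], timeout: float = 3.0) -> dict[int, bool]:
    """Map each port to True if a TCP connection could be opened, else False."""
    results: dict[int, bool] = {}
    for port in ports:
        try:
            # create_connection raises OSError on refusal, timeout, or DNS failure
            with socket.create_connection((host, port), timeout=timeout):
                results[port] = True
        except OSError:
            results[port] = False
    return results
```

Run it from outside the LAN against your public IP; any port that maps to `True` but is not intentionally public points at a stray forward.
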
---
## 8. Verify the scanner-specific behaviour
## 7. Verify the scanner-specific behaviour
1. `https://gdprscanner.example.dk` loads with a valid padlock; `http://` redirects.
2. **Run a scan and watch result cards stream in live** — that is the Server-Sent Events connection (`/api/scan/stream`) passing through the proxy. If progress stalls while the scan log advances, look at proxy buffering/timeout settings.

@ -29,14 +29,11 @@ Usage (from gdpr_scanner.py)
import hashlib
import json
import logging
import sqlite3
import time
from pathlib import Path
from typing import Iterator
logger = logging.getLogger(__name__)
from pathlib import Path as _P
_DATA_DIR = _P.home() / ".gdprscanner"
_DATA_DIR.mkdir(exist_ok=True)
@ -228,7 +225,6 @@ _MIGRATIONS: list[tuple[int, str]] = [
emailed INTEGER NOT NULL DEFAULT 0,
error TEXT NOT NULL DEFAULT ''
)"""),
(11, "ALTER TABLE flagged_items ADD COLUMN account_name TEXT NOT NULL DEFAULT ''"),
]
@ -330,8 +326,8 @@ class ScanDB:
url, drive_id, size_kb, modified, cpr_count, risk,
thumb_b64, thumb_mime, attachments, user_role, transfer_risk,
special_category, face_count, exif_json, full_path,
email_count, phone_count, body_excerpt, account_name, scanned_at)
VALUES (?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?)""",
email_count, phone_count, body_excerpt, scanned_at)
VALUES (?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?)""",
(
card.get("id", ""),
scan_id,
@ -358,7 +354,6 @@ class ScanDB:
card.get("email_count", 0),
card.get("phone_count", 0),
card.get("body_excerpt", ""),
card.get("account_name", ""),
now,
),
)
@ -437,33 +432,6 @@ class ScanDB:
c.commit()
def finalize_orphan_scans(self) -> int:
"""Finalise scans left unfinished by a crash, kill, or mid-scan restart.
After a fresh process start nothing is scanning, so any scan still
carrying finished_at IS NULL is dead: the process that owned it is gone.
Its already-saved flagged_items were stranded: both get_session_items
and get_open_items require finished_at, so those items are invisible and
effectively lost. Finalising the orphans on startup makes them show up
and prevents permanent data loss from interrupted scans (the M365 and
Google engines return early on abort and never reach finish_scan; only
the file scan finalises in a finally block).
Safe to call only when no scan is running (i.e. at startup). Returns the
number of scans finalised.
"""
rows = self._connect().execute(
"SELECT id, total_scanned FROM scans WHERE finished_at IS NULL"
).fetchall()
count = 0
for sid, total in rows:
try:
self.finish_scan(sid, total or 0)
count += 1
except Exception as e:
logger.warning("[db] finalize_orphan_scans: scan %s failed: %s", sid, e)
return count
# ── Query helpers ─────────────────────────────────────────────────────────
def latest_scan_id(self) -> int | None:
@ -568,40 +536,6 @@ class ScanDB:
result.append(d)
return result
def get_open_items(self) -> list[dict]:
"""Return every flagged item across all scans that has no action taken.
"Open" means the item has no disposition row (or a row whose status is
still 'unreviewed'). Unlike get_session_items this is NOT limited to the
latest scan window: it surfaces all outstanding items so nothing slips
out of view once a newer scan starts a fresh session.
flagged_items has a composite PK of (id, scan_id), so the same logical
item appears once per scan that flagged it. We deduplicate by id, keeping
the row from the most recent finished scan, so each open item shows once.
"""
rows = self._connect().execute(
"""SELECT fi.*, COALESCE(d.status, 'unreviewed') AS disposition
FROM flagged_items fi
JOIN scans s ON fi.scan_id = s.id
LEFT JOIN dispositions d ON d.item_id = fi.id
WHERE s.finished_at IS NOT NULL
AND (d.item_id IS NULL OR d.status = 'unreviewed')
AND fi.scan_id = (
SELECT MAX(fi2.scan_id)
FROM flagged_items fi2
JOIN scans s2 ON fi2.scan_id = s2.id
WHERE fi2.id = fi.id AND s2.finished_at IS NOT NULL
)
ORDER BY fi.cpr_count DESC""",
).fetchall()
result = []
for r in rows:
d = dict(r)
d["attachments"] = json.loads(d.get("attachments") or "[]")
result.append(d)
return result
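
The de-duplication in `get_open_items` can be exercised against an in-memory database. The sketch below reuses the shape of the query above but assumes heavily simplified tables (only the columns the query touches) and selects just `id` and `scan_id`; `demo_open_items` and its sample rows are hypothetical.

```python
import sqlite3

OPEN_ITEMS_SQL = """
SELECT fi.id, fi.scan_id
  FROM flagged_items fi
  JOIN scans s ON fi.scan_id = s.id
  LEFT JOIN dispositions d ON d.item_id = fi.id
 WHERE s.finished_at IS NOT NULL
   AND (d.item_id IS NULL OR d.status = 'unreviewed')
   AND fi.scan_id = (
        SELECT MAX(fi2.scan_id)
          FROM flagged_items fi2
          JOIN scans s2 ON fi2.scan_id = s2.id
         WHERE fi2.id = fi.id AND s2.finished_at IS NOT NULL
   )
 ORDER BY fi.id
"""


def demo_open_items() -> list[tuple[str, int]]:
    conn = sqlite3.connect(":memory:")
    # Assumed minimal schema; the real tables carry many more columns.
    conn.executescript(
        """
        CREATE TABLE scans (id INTEGER PRIMARY KEY, finished_at REAL);
        CREATE TABLE flagged_items (id TEXT, scan_id INTEGER, PRIMARY KEY (id, scan_id));
        CREATE TABLE dispositions (item_id TEXT PRIMARY KEY, status TEXT);
        """
    )
    conn.executemany("INSERT INTO scans VALUES (?, ?)",
                     [(1, 100.0), (2, 200.0), (3, None)])  # scan 3 never finished
    conn.executemany(
        "INSERT INTO flagged_items VALUES (?, ?)",
        [("a", 1), ("a", 2), ("a", 3),  # same item flagged by three scans
         ("b", 1),                      # already tagged 'kept'
         ("c", 2),                      # explicitly 'unreviewed'
         ("d", 3)],                     # only seen by the unfinished scan
    )
    conn.executemany("INSERT INTO dispositions VALUES (?, ?)",
                     [("b", "kept"), ("c", "unreviewed")])
    return conn.execute(OPEN_ITEMS_SQL).fetchall()
```

Item `a` comes back once, from scan 2 (its highest finished scan); `b` is filtered out by its disposition, and `d` stays hidden until scan 3 is finalised.
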
def get_related_items(self, item_id: str, ref_scan_id: int | None = None,
window_seconds: int = 300) -> list[dict]:
"""Return flagged items from the same session that share at least one CPR

@ -2305,19 +2305,6 @@ Example --settings file with SMTP:
print(f"\n GDPRScanner\n ──────────────────────────────")
print(f" Open: http://{args.host}:{args.port}")
# Recover scans left unfinished by a crash / kill / mid-scan restart.
# Nothing is scanning at startup, so any scan with finished_at IS NULL is
# dead; finalising it makes its already-saved items visible again instead
# of stranding them (both get_session_items and get_open_items require a
# finished scan). Must run before the scheduler can start a new scan.
try:
if DB_OK:
_recovered = _get_db().finalize_orphan_scans()
if _recovered:
print(f" Recovered {_recovered} unfinished scan(s) from a prior restart")
except Exception as _orphan_err:
print(f" Orphan-scan recovery: failed ({_orphan_err})")
# Start in-process scheduler (#19)
try:
import scan_scheduler as _sched_mod

@ -106,7 +106,7 @@
"history_lbl": "Historik",
"history_items": "fund",
"history_btn_sessions": "Sessioner",
"history_btn_latest": "Åbne fund",
"history_btn_latest": "Seneste scanning",
"history_picker_empty": "Ingen tidligere scanninger",
"history_delta_badge": "Delta",
"history_latest_badge": "Seneste",
@ -366,7 +366,6 @@
"m365_smtp_recipients_hint": "Adskil med komma eller semikolon",
"m365_smtp_save": "Gem",
"m365_smtp_auto_email_manual": "Send rapport efter manuel scanning",
"m365_smtp_prefer_smtp": "Send altid via SMTP (spring Microsoft Graph over)",
"m365_smtp_send": "Send nu",
"m365_smtp_saved": "Indstillinger gemt.",
"m365_smtp_sending": "Sender…",

@ -167,7 +167,7 @@
"history_lbl": "Verlauf",
"history_items": "Treffer",
"history_btn_sessions": "Sessionen",
"history_btn_latest": "Offene Einträge",
"history_btn_latest": "Letzter Scan",
"history_picker_empty": "Keine früheren Scans",
"history_delta_badge": "Delta",
"history_latest_badge": "Aktuell",
@ -366,7 +366,6 @@
"m365_smtp_recipients_hint": "Komma- oder semikolongetrennt",
"m365_smtp_save": "Speichern",
"m365_smtp_auto_email_manual": "Bericht nach manueller Suche senden",
"m365_smtp_prefer_smtp": "Immer via SMTP senden (Microsoft Graph überspringen)",
"m365_smtp_send": "Jetzt senden",
"m365_smtp_saved": "Einstellungen gespeichert.",
"m365_smtp_sending": "Senden…",

@ -106,7 +106,7 @@
"history_lbl": "History",
"history_items": "items",
"history_btn_sessions": "Sessions",
"history_btn_latest": "Open items",
"history_btn_latest": "Latest scan",
"history_picker_empty": "No past scans",
"history_delta_badge": "Delta",
"history_latest_badge": "Latest",
@ -366,7 +366,6 @@
"m365_smtp_recipients_hint": "Comma or semicolon separated",
"m365_smtp_save": "Save",
"m365_smtp_auto_email_manual": "Email report after manual scan",
"m365_smtp_prefer_smtp": "Always send via SMTP (skip Microsoft Graph)",
"m365_smtp_send": "Send now",
"m365_smtp_saved": "Settings saved.",
"m365_smtp_sending": "Sending…",

@ -552,8 +552,6 @@ class M365Connector:
r.raise_for_status()
return True # 204 No Content = success
raise _requests.exceptions.RetryError(f"Gave up after {self._MAX_RETRIES} attempts: {url}")
def delete_message(self, user_id: str, message_id: str) -> bool:
"""Move an email to Deleted Items (soft delete)."""
base = "/me" if (not user_id or user_id == "me") else f"/users/{user_id}"
try:

@ -68,9 +68,6 @@ Exception hierarchy (all inherit `M365Error(Exception)`):
- **Graph preferred over SMTP**: `smtp_test` and `send_report` try `_send_email_graph()` first and fall back to SMTP only if Graph raises. If Graph fails and no SMTP host is saved, the Graph exception surfaces directly.
- **Auto-email after manual scan**: `_maybe_send_auto_email()` in `routes/scan.py` is called from the `_run()` thread after `run_scan()` returns. It reads `smtp_cfg.get("auto_email_manual")` and no-ops if that is false, there are no flagged items, or there are no recipients.
- **Gmail vs Google Workspace** — auth error handlers check if SMTP username ends in `@gmail.com`/`@googlemail.com`; custom domains are treated as Google Workspace and error message points to the Workspace admin console.
- **Canonical SMTP config keys are `username` and `use_tls`**: all backend readers (`smtp_test`, `_send_report_email`, `_send_email_graph`) use these. The Settings → E-mailrapport tab (`scheduler.js`) historically saved `user`/`starttls`, which left `username` empty so `server.login()` was skipped and the server rejected the send. Frontend now sends the canonical keys, and `_load_smtp_config()` normalises legacy `user` → `username` / `starttls` → `use_tls` for already-saved configs. The send-report modal (`scan.js`) already used the canonical keys. Keep both UIs and the backend on `username`/`use_tls`.
- **Graph 202 ≠ delivered**: `_send_email_graph` returns on Graph's HTTP 202 (queued), and `smtp_test`/`send_report` treat that as success and never fall back to SMTP. A recipient on a domain Exchange Online considers an accepted/internal domain (e.g. a Google-hosted subdomain of the O365 domain) is silently dropped after the 202. There is no in-app fix for that routing; reaching such recipients requires SMTP (e.g. Google Workspace `smtp.gmail.com`/`smtp-relay.gmail.com`) or fixing Exchange Accepted Domains.
- **`prefer_smtp` config flag** — when truthy, `smtp_test`, `send_report`, and `_maybe_send_auto_email` (routes/scan.py) skip the Graph path entirely and send via SMTP. This is the in-app escape hatch for the Graph-202 routing trap above. The gate is `... and not smtp_cfg.get("prefer_smtp")` on each Graph branch — keep all three in sync. UI: `#st-smtpPreferSmtp` toggle (key `m365_smtp_prefer_smtp`), saved/loaded by `scheduler.js`.
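
The gate described in the last bullet can be reduced to a small sketch. `choose_transport` is a hypothetical helper (the routes inline the condition rather than calling a shared function), and it covers only the initial choice, not the fall-back to SMTP when Graph raises.

```python
def choose_transport(graph_authenticated: bool, smtp_cfg: dict) -> str:
    """Return "graph" or "smtp" following the prefer_smtp gate described above."""
    if graph_authenticated and not smtp_cfg.get("prefer_smtp"):
        return "graph"
    return "smtp"
```

Keeping the condition in one place like this would be one way to keep the three call sites in sync, though the diff shows it written out at each site.
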
## Scheduler — scan_scheduler.py + routes/scheduler.py

@ -180,11 +180,7 @@ def db_get_disposition(item_id):
@bp.route("/api/db/flagged")
def db_flagged_items():
"""Return flagged items for the results grid.
With ?ref=N, returns the items from that specific past scan session (history
mode). Without ref, returns every item still awaiting action across all
scans (the default landing view), not just the latest session window.
"""Return flagged items from the most recent completed scan session.
Used by the read-only viewer to load results without an active SSE connection.
Respects viewer_scope.role stored in the session for scoped tokens.
"""
@ -201,13 +197,7 @@ def db_flagged_items():
else:
user_filt = {raw_user.lower()} if raw_user else set()
ref_scan_id = request.args.get("ref", type=int)
if ref_scan_id:
# History mode — a specific past session was requested.
items = _get_db().get_session_items(ref_scan_id=ref_scan_id)
else:
# Default landing / viewer — show every item still awaiting action,
# across all scans, not just the latest session window.
items = _get_db().get_open_items()
# Normalise JSON-encoded columns the same way scan_engine does for SSE cards
import json as _json
out = []

@ -148,12 +148,8 @@ def smtp_test():
"</body></html>"
)
# Try Graph API first — unless the user opted to always use SMTP. Graph
# returns 202 (queued) even for recipients Exchange later silently drops
# (e.g. a Google-hosted subdomain of the O365 domain), so SMTP is the only
# reliable path for those; prefer_smtp forces it.
prefer_smtp = bool(saved.get("prefer_smtp"))
if state.connector and state.connector.is_authenticated() and not prefer_smtp:
# Try Graph API first
if state.connector and state.connector.is_authenticated():
try:
_send_email_graph(subject, body_html, recipients)
return jsonify({"ok": True, "method": "graph", "recipients": recipients})
@ -289,8 +285,8 @@ def send_report():
"</body></html>"
)
# Try Graph API first — unless prefer_smtp is set (see smtp_test for why).
if state.connector and state.connector.is_authenticated() and not smtp_cfg.get("prefer_smtp"):
# Try Graph API first
if state.connector and state.connector.is_authenticated():
try:
_send_email_graph(subject, body_html, recipients,
attachment_bytes=xl_bytes, attachment_name=fname)

@ -54,7 +54,7 @@ def _maybe_send_auto_email():
"</body></html>"
)
if state.connector and state.connector.is_authenticated() and not smtp_cfg.get("prefer_smtp"):
if state.connector and state.connector.is_authenticated():
try:
_send_email_graph(subject, body_html, recipients,
attachment_bytes=xl_bytes, attachment_name=fname)

@ -1078,14 +1078,6 @@ def run_scan(options: dict):
if _check_abort():
# Save checkpoint so scan can be resumed later
_save_checkpoint(ck_key, scanned_ids, _state.flagged_items, _state.scan_meta)
# Finalise the DB scan record so items found before the stop stay
# visible — this early return otherwise skips finish_scan below,
# stranding them (invisible to get_session_items / get_open_items).
if _db and _db_scan_id:
try:
_db.finish_scan(_db_scan_id, resumed_count + idx + 1)
except Exception as _e:
logger.error("[db] finish_scan (aborted) failed: %s", _e)
return
idx += 1
kind, meta, _ = _work_q.popleft() # releases this item from the deque immediately

@ -40,19 +40,13 @@ Never revert to `!!window._googleConnected` / `_fileSources.length > 0` — thos
## Scan history browser — history.js + results.js
- **`S._historyRefScanId`** — `null` = live/SSE mode **or** the default open-items view; positive int = viewing a past session. Set by `loadHistorySession()`; cleared by `exitHistoryMode()`.
- **`loadHistorySession(null)` → `loadOpenItems()`**: passing `null` no longer resolves to the latest session. It now loads **all open (unactioned) items across every scan** via `GET /api/db/flagged` (no `ref`), leaves `_historyRefScanId` null, and shows no history banner. The "Open items" banner button (`onclick="loadHistorySession(null)"`, key `history_btn_latest`) therefore returns to this open-items view. Specific sessions are still loaded with a positive `ref`, which keeps the re-scan resolved-diff. Do not revert `null` to "resolve latest ref"; that reintroduces the "only the last scan is shown" complaint.
- **`S._historyRefScanId`** — `null` = live/SSE mode; positive int = viewing a past session. Set by `loadHistorySession()`; cleared by `exitHistoryMode()`.
- **Auto-load on page load**: `_sseWatchdog()` in `results.js` calls `window.loadHistorySession?.(null)` whenever `/api/scan/status` reports neither `running` (M365 + file lock) nor `google_running` (Google lock) **and** nothing is shown yet (`!S._historyRefScanId && !S.flaggedData.length`). This is **not one-shot**: it retries on every 4 s poll until a session is restored, because (a) the replay buffer is empty after a server restart so `sse_replay_done` never fires, and (b) a completed scan's replayed `scan_phase` can leave a running flag set that would otherwise block the load forever. Because both locks are confirmed free, the watchdog clears the stale `_m365/_google/_fileScanRunning` flags before calling. Do not revert to a one-shot `_initialStatusChecked` gate; that reintroduces the "blank grid after refresh/restart" bug. `/api/scan/status` **must** report `google_running` separately; `running` alone misses live Google scans. The `sse_replay_done` handler in `scan.js` still retries for the non-empty-buffer (no-restart) case.
- **History banner** (`#historyBanner`) — shown when `S._historyRefScanId` is set. Do not hide/show from outside `history.js`.
- **Session picker** (`#historyDropdown`) — rendered inside `[data-history-wrap]` so the outside-click handler works correctly. Do not move the picker outside this wrapper.
- **Cache invalidation** — `invalidateHistoryCache()` clears `_sessions` and `_latestRefScanId`. All three `*_done` SSE handlers call `window.invalidateHistoryCache?.()`.
- **Re-scan diff** — items present in the previous session but absent from the current one are tagged `_resolved: true`, rendered with `.card-resolved` and a green ✓ badge, and NOT added to `S.flaggedData` (grid-only, cannot be bulk-selected or exported).
- **Mode transitions** — `startScan()` calls `window.exitHistoryMode?.()` before clearing the grid.
- **`renderGrid(files)` hides the landing cards** — whenever `files.length > 0` it hides `#emptyState` and `#lastScanSummary` and shows `#grid`. This is centralised here because the live `scan_file_flagged` handler (`scan.js`) shows the grid but does NOT clear those panels, so results would render *underneath* a still-visible landing/last-scan card until a manual refresh. Do not move this hiding back into individual callers — every render path (live SSE, `loadOpenItems`, history, filters) must clear the landing. The empty case (`files.length === 0`) is left untouched so callers still control the empty/landing state.
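
The auto-load rule in the "Auto-load on page load" note can be sketched as a small pure function. This is only an illustration under stated assumptions: `shouldAutoLoadOpenItems`, `clearStaleRunningFlags` and `watchdogTick` are hypothetical names, and the real `_sseWatchdog()` in `results.js` is not shown in this diff. The field names (`running`, `google_running`, `_historyRefScanId`, `flaggedData`, the three `*ScanRunning` flags) are taken from the notes above.

```javascript
// Hypothetical helper: decides whether the watchdog should restore the
// open-items view. Both server locks must be free and nothing may be shown.
function shouldAutoLoadOpenItems(status, state) {
  const serverIdle = !status.running && !status.google_running;
  const nothingShown =
    !state._historyRefScanId && !(state.flaggedData || []).length;
  return serverIdle && nothingShown;
}

// Hypothetical helper: with both locks confirmed free, any client-side
// running flag left over from a replayed scan_phase event is stale.
function clearStaleRunningFlags(state) {
  state._m365ScanRunning = false;
  state._googleScanRunning = false;
  state._fileScanRunning = false;
}

// One poll of the (assumed) watchdog. It is deliberately not one-shot:
// every tick re-evaluates the condition, so a later poll can still succeed.
function watchdogTick(status, state, loadHistorySession) {
  if (!shouldAutoLoadOpenItems(status, state)) return false;
  clearStaleRunningFlags(state);
  loadHistorySession(null); // null = all open items across every scan
  return true;
}
```

Keeping the decision in a pure function makes the "not one-shot" requirement easy to check: calling `watchdogTick` repeatedly with an idle status keeps returning `true` until the state reports that something is shown.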
## Card user/group badge — results.js
- **`_accountPill(f)`** builds the account/role pill for both card layouts (list + grid). The **group badge is driven by `f.user_role`** (`student`/`staff`) alone, so it renders even with no display name — items from scans saved before `account_name` was persisted (DB migration 11) have only `user_role` + `account_id`. The user label resolves best-effort: `f.account_name` → `S._allUsers` match (by `id` or `email`) → email-style `account_id` → omit. Do not re-nest the role badge inside an `account_name` check (the old bug) — that hides the group badge for legacy items. Both layouts call `_accountPill(f)`; keep them sharing the one helper.
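
The label fallback chain described above can be isolated from the HTML building, which makes it easier to test. The sketch below is a hedged extraction, not the project's code: `resolveAccountLabel` is a hypothetical name, and the user list is passed in as a parameter instead of being read from `S._allUsers`. The lookup order follows the note: `account_name`, then a user match by `id` or `email`, then an email-style `account_id`, otherwise no label.

```javascript
// Hypothetical pure helper mirroring the label resolution used by the pill.
// Returns '' when no human-readable label can be found.
function resolveAccountLabel(f, allUsers) {
  // 1. A persisted display name always wins.
  if (f.account_name) return f.account_name;
  if (!f.account_id) return '';

  // 2. Best-effort match against the loaded user list, by id or by
  //    case-insensitive email.
  const aid = String(f.account_id);
  const user = (allUsers || []).find(function (u) {
    return u.id === f.account_id ||
      (u.email && u.email.toLowerCase() === aid.toLowerCase());
  });
  if (user) return user.displayName || '';

  // 3. An email-style account_id is already readable; anything else is omitted.
  return aid.includes('@') ? aid : '';
}
```

Because the role badge depends only on `f.user_role`, a caller would compute it separately and render the pill whenever either the badge or this label is non-empty.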
## CPR cross-referencing — results.js

@ -38,50 +38,20 @@ function invalidateHistoryCache() {
// ── Load a session into the results grid ──────────────────────────────────────
// Default landing view: every flagged item still awaiting action, across all
// scans (not just the latest session). Leaves S._historyRefScanId null (live
// mode) and shows no history banner — this is "now", not a past session.
async function loadOpenItems() {
// Bail if a scan is running — live SSE owns the grid then.
async function loadHistorySession(refScanId) {
// refScanId: null → latest session, positive int → specific session
let resolvedRef = refScanId;
if (resolvedRef === null) {
const sessions = _sessions !== null ? _sessions : await _fetchSessions();
// Bail if a scan started while we were fetching sessions
if (S._m365ScanRunning || S._googleScanRunning || S._fileScanRunning) return;
try {
const r = await fetch('/api/db/flagged');
const items = await r.json();
if (S._m365ScanRunning || S._googleScanRunning || S._fileScanRunning) return;
closeHistoryPicker();
if (!Array.isArray(items) || items.length === 0) {
S._historyRefScanId = null;
_setHistoryBanner(false);
if (!sessions.length) {
// No scans in DB — nothing to show
window.loadLastScanSummary?.();
return;
}
S._historyRefScanId = null;
S.flaggedData = items;
S.filteredData = [];
const grid = document.getElementById('grid');
const emptyState = document.getElementById('emptyState');
const lastScan = document.getElementById('lastScanSummary');
if (emptyState) emptyState.style.display = 'none';
if (lastScan) lastScan.style.display = 'none';
if (grid) { grid.innerHTML = ''; grid.style.display = 'grid'; }
window.renderGrid(items);
try { window.markOverdueCards(); } catch(_) {}
try { window.loadTrend(); } catch(_) {}
_setHistoryBanner(false);
} catch(e) {
console.error('[history] failed to load open items:', e);
resolvedRef = sessions[0].ref_scan_id;
}
}
async function loadHistorySession(refScanId) {
// refScanId: null → all open (unreviewed) items across every scan,
// positive int → a specific past session
if (refScanId === null) return loadOpenItems();
const resolvedRef = refScanId;
try {
const r = await fetch('/api/db/flagged?ref=' + resolvedRef);

@ -25,31 +25,6 @@ const SOURCE_BADGES = {
smb: ['🌐', 'badge-smb', 'Network'],
};
// Build the user/group pill for a card. The group (role) badge is driven by
// user_role alone so it shows even when no display name is available — e.g.
// items from earlier scans saved before account_name was persisted. For those
// the user label is resolved best-effort from the loaded user list (by id or
// email), falling back to an email-style account_id. Returns '' when there is
// neither a label nor a role to show.
function _accountPill(f) {
const roleBadge =
f.user_role === 'student' ? '<span class="role-badge">' + t('role_student', 'Elev') + '</span>' :
f.user_role === 'staff' ? '<span class="role-badge">' + t('role_staff', 'Ansat') + '</span>' : '';
let label = f.account_name || '';
if (!label && f.account_id) {
const aid = String(f.account_id);
const u = (S._allUsers || []).find(function(u) {
return u.id === f.account_id ||
(u.email && u.email.toLowerCase() === aid.toLowerCase());
});
if (u) label = u.displayName || '';
else if (aid.includes('@')) label = aid; // an email is already human-readable
}
if (!label && !roleBadge) return '';
const title = label || f.user_role || '';
return '<span class="account-pill" title="' + esc(title) + '">' + roleBadge + (label ? esc(label) : '') + '</span>';
}
function appendCard(f) {
const search = document.getElementById('filterSearch').value.trim().toLowerCase();
const srcVal = document.getElementById('filterSource').value;
@ -86,7 +61,6 @@ function appendCard(f) {
(f.source_type === 'smb' || f.source_type === 'sftp') ? _redactExts.has(_fileExt) : false
);
const redactBtn = _redactable ? `<button class="card-redact-btn" title="${t('redact_btn','Redact CPR')}" onclick="event.stopPropagation();redactItem(${JSON.stringify(f).replace(/"/g,'&quot;')},this.closest('.card'))">✏</button>` : '';
const acctPill = _accountPill(f);
if (S.isListView) {
card.innerHTML = `
@ -94,7 +68,7 @@ function appendCard(f) {
<div class="card-info list-info">
<div class="card-name" title="${esc(f.name)}">${esc(f.name)}</div>
<div class="card-meta">${f.size_kb} KB · ${esc(f.modified || '')}${f.folder ? ' · 📂 ' + esc(f.folder) : ''}</div>
<div class="card-source"><span class="source-badge ${badgeCls}">${esc(label)}</span> ${esc(f.source || '')}${acctPill ? ' · ' + acctPill : ''}${f.transfer_risk === 'external-recipient' ? ' <span class="role-pill" style="background:#7B2D00;color:#FFD0B0"> Ext.</span>' : f.transfer_risk ? ' <span class="role-pill" style="background:#003D7B;color:#B0D4FF">🔗</span>' : ''}</div>
<div class="card-source"><span class="source-badge ${badgeCls}">${esc(label)}</span> ${esc(f.source || '')}${f.account_name ? ' · <span class="account-pill" title="' + esc(f.account_name) + '">' + (f.user_role === 'student' ? '<span class="role-badge">' + t('role_student','Elev') + '</span>' : f.user_role === 'staff' ? '<span class="role-badge">' + t('role_staff','Ansat') + '</span>' : '') + esc(f.account_name) + '</span>' : ''}${f.transfer_risk === 'external-recipient' ? ' <span class="role-pill" style="background:#7B2D00;color:#FFD0B0"> Ext.</span>' : f.transfer_risk ? ' <span class="role-pill" style="background:#003D7B;color:#B0D4FF">🔗</span>' : ''}</div>
</div>
<span class="cpr-badge">${f.cpr_count} CPR</span>
${f.email_count > 0 ? '<span class="email-badge">' + f.email_count + ' ' + t('m365_badge_emails', 'e-mail') + '</span> ' : ''}
@ -110,7 +84,7 @@ function appendCard(f) {
<div class="card-name" title="${esc(f.name)}">${esc(f.name)}</div>
<div class="card-meta">${f.size_kb} KB · ${esc(f.modified || '')}</div>
${f.folder ? `<div class="card-meta" style="font-size:10px" title="${esc(f.folder)}">📂 ${esc(f.folder)}</div>` : ''}
<div class="card-source"><span class="source-badge ${badgeCls}">${esc(label)}</span>${acctPill ? ' ' + acctPill : ''}${f.transfer_risk === "external-recipient" ? ' <span class="role-pill" style="background:#7B2D00;color:#FFD0B0"> Ext.</span>' : f.transfer_risk ? ' <span class="role-pill" style="background:#003D7B;color:#B0D4FF">🔗</span>' : ''}</div>
<div class="card-source"><span class="source-badge ${badgeCls}">${esc(label)}</span>${f.account_name ? ' <span class="account-pill" title="' + esc(f.account_name) + '">' + (f.user_role === "student" ? '<span class="role-badge">' + t("role_student","Elev") + "</span>" : f.user_role === "staff" ? '<span class="role-badge">' + t("role_staff","Ansat") + "</span>" : "") + esc(f.account_name) + '</span>' : ''}${f.transfer_risk === "external-recipient" ? ' <span class="role-pill" style="background:#7B2D00;color:#FFD0B0"> Ext.</span>' : f.transfer_risk ? ' <span class="role-pill" style="background:#003D7B;color:#B0D4FF">🔗</span>' : ''}</div>
<span class="cpr-badge">${f.cpr_count} CPR</span>${f.email_count > 0 ? ' <span class="email-badge">' + f.email_count + ' ' + t('m365_badge_emails', 'e-mail') + '</span>' : ''}${f.phone_count > 0 ? ' <span class="phone-badge">' + f.phone_count + ' ' + t('m365_badge_phones', 'tlf.') + '</span>' : ''}${f.face_count > 0 ? ' <span class="photo-face-badge">' + f.face_count + ' ' + t('m365_badge_faces', f.face_count === 1 ? 'face' : 'faces') + '</span>' : ''}${f.exif && f.exif.gps ? ' <span class="photo-face-badge" style="background:#0a3a5a;color:#7ec8d0">🌍 GPS</span>' : ''}${f._deleted ? ' <span class="resolved-badge" style="background:#3a1a1a;color:#ff9b9b">🗑 ' + t('delete_badge', 'Deleted') + '</span>' : ''}${f._redacted ? ' <span class="resolved-badge"> ' + t('redact_badge', 'Redacted') + '</span>' : ''}${f._resolved ? ' <span class="resolved-badge"> ' + t('history_resolved_badge', 'Resolved') + '</span>' : ''}${f.overdue ? ' <span class="overdue-badge">🗓 Overdue</span>' : ''}
</div>
${delBtn}${redactBtn}`;
@ -122,17 +96,6 @@ function renderGrid(files) {
const grid = document.getElementById('grid');
grid.innerHTML = '';
files.forEach(f => appendCard(f));
// Whenever results are rendered, the landing/last-scan cards must be hidden —
// the live scan_file_flagged path shows the grid but does not clear them, so
// results would otherwise appear underneath the still-visible landing page
// until a manual refresh. Centralised here so every render path is covered.
if (files && files.length) {
const es = document.getElementById('emptyState');
if (es) es.style.display = 'none';
const ls = document.getElementById('lastScanSummary');
if (ls) ls.style.display = 'none';
if (grid) grid.style.display = S.isListView ? 'block' : 'grid';
}
_updateBulkBar();
updateDispositionStats();
}

@ -314,17 +314,15 @@ function stLoadSmtp() {
const set = function(id, val) { const el=document.getElementById(id); if(el) el.value=val||''; };
set('st-smtpHost', d.host);
set('st-smtpPort', d.port || 587);
set('st-smtpUser', d.username);
set('st-smtpUser', d.user);
set('st-smtpFrom', d.from_addr);
set('st-smtpTo', Array.isArray(d.recipients) ? d.recipients.join(', ') : (d.recipients||''));
const tls = document.getElementById('st-smtpTls');
if (tls) tls.checked = d.use_tls !== false;
if (tls) tls.checked = d.starttls !== false;
const pw = document.getElementById('st-smtpPw');
if (pw) pw.value = d.has_password ? '\u2022\u2022\u2022\u2022\u2022\u2022\u2022\u2022' : '';
const ae = document.getElementById('st-smtpAutoEmail');
if (ae) ae.checked = !!d.auto_email_manual;
const ps = document.getElementById('st-smtpPreferSmtp');
if (ps) ps.checked = !!d.prefer_smtp;
}).catch(function(){});
}
@ -335,15 +333,11 @@ async function stSmtpSave() {
const body = {
host: document.getElementById('st-smtpHost').value.trim(),
port: parseInt(document.getElementById('st-smtpPort').value) || 587,
// Backend (routes/email.py) reads these exact keys — `username`/`use_tls`,
// not `user`/`starttls`. Sending the wrong keys leaves username empty so
// server.login() is skipped and the SMTP server rejects the send.
username: document.getElementById('st-smtpUser').value.trim(),
user: document.getElementById('st-smtpUser').value.trim(),
from_addr: document.getElementById('st-smtpFrom').value.trim(),
recipients: document.getElementById('st-smtpTo').value.split(/[,;]/).map(function(s){return s.trim();}).filter(Boolean),
use_tls: document.getElementById('st-smtpTls').checked,
starttls: document.getElementById('st-smtpTls').checked,
auto_email_manual: !!(document.getElementById('st-smtpAutoEmail') || {}).checked,
prefer_smtp: !!(document.getElementById('st-smtpPreferSmtp') || {}).checked,
};
if (pw !== null) body.password = pw;
st.style.color = 'var(--muted)'; st.textContent = t('m365_smtp_saving','Saving...');

@ -375,7 +375,7 @@ document.addEventListener('DOMContentLoaded', applyI18n);
<button id="historyPickerBtn" type="button" onclick="openHistoryPicker()" style="height:24px;padding:0 10px;background:none;border:1px solid var(--border);color:var(--muted);border-radius:4px;font-size:11px;cursor:pointer" data-i18n="history_btn_sessions">Sessions</button>
<div id="historyDropdown" style="display:none;position:absolute;right:0;top:calc(100% + 4px);background:var(--surface);border:1px solid var(--border);border-radius:6px;z-index:9999;width:300px;max-height:260px;overflow-y:auto;box-shadow:0 4px 12px rgba(0,0,0,.25)"></div>
</div>
<button id="historyLatestBtn" type="button" onclick="loadHistorySession(null)" style="display:none;height:24px;padding:0 10px;background:none;border:1px solid var(--accent);color:var(--accent);border-radius:4px;font-size:11px;cursor:pointer;flex-shrink:0" data-i18n="history_btn_latest">Open items</button>
<button id="historyLatestBtn" type="button" onclick="loadHistorySession(null)" style="display:none;height:24px;padding:0 10px;background:none;border:1px solid var(--accent);color:var(--accent);border-radius:4px;font-size:11px;cursor:pointer;flex-shrink:0" data-i18n="history_btn_latest">Latest scan</button>
</div>
<!-- Filter bar — full width, above grid + preview -->
@ -845,10 +845,6 @@ document.addEventListener('DOMContentLoaded', applyI18n);
<label data-i18n="m365_smtp_auto_email_manual">Email report after manual scan</label>
<label class="toggle" style="flex:unset"><input type="checkbox" id="st-smtpAutoEmail"><span class="toggle-slider"></span></label>
</div>
<div class="settings-row">
<label data-i18n="m365_smtp_prefer_smtp">Always send via SMTP (skip Microsoft Graph)</label>
<label class="toggle" style="flex:unset"><input type="checkbox" id="st-smtpPreferSmtp"><span class="toggle-slider"></span></label>
</div>
<div style="display:flex;justify-content:flex-end;gap:8px;margin-top:4px">
<div id="st-smtpStatus" style="flex:1;font-size:11px;color:var(--muted);align-self:center"></div>
<button onclick="stSmtpSave()" style="background:none;border:1px solid var(--border);color:var(--muted);height:26px;padding:0 12px;border-radius:6px;font-size:12px;cursor:pointer;box-sizing:border-box" data-i18n="btn_save">Save</button>

@ -252,36 +252,3 @@ class TestFernet:
def test_decrypt_empty_returns_empty(self):
result = app_config._decrypt_password("")
assert result == ""
class TestSmtpConfigLegacyKeys:
"""SMTP config saved by the older settings tab used `user`/`starttls`;
readers expect `username`/`use_tls`. _load_smtp_config must normalise them."""
def test_legacy_keys_normalised_on_load(self, tmp_path, monkeypatch):
import json
p = tmp_path / "smtp.json"
p.write_text(json.dumps({
"host": "smtp.gmail.com", "port": 587,
"user": "netadmin@adm.example.dk", # legacy key
"starttls": True, # legacy key
"from_addr": "netadmin@adm.example.dk",
"recipients": ["a@example.dk"],
}), encoding="utf-8")
monkeypatch.setattr(app_config, "_SMTP_CONFIG_PATH", p)
cfg = app_config._load_smtp_config()
assert cfg["username"] == "netadmin@adm.example.dk"
assert cfg["use_tls"] is True
def test_canonical_keys_take_precedence(self, tmp_path, monkeypatch):
import json
p = tmp_path / "smtp.json"
p.write_text(json.dumps({
"username": "canonical@example.dk",
"user": "legacy@example.dk",
}), encoding="utf-8")
monkeypatch.setattr(app_config, "_SMTP_CONFIG_PATH", p)
cfg = app_config._load_smtp_config()
assert cfg["username"] == "canonical@example.dk"

@ -265,71 +265,3 @@ class TestExportImport:
tgt.import_db(str(export_path), mode="replace")
results = tgt.lookup_data_subject("290472-1234")
assert len(results) >= 1
# ─────────────────────────────────────────────────────────────────────────────
# Orphan-scan recovery (crash / kill / mid-scan restart)
# ─────────────────────────────────────────────────────────────────────────────
class TestOrphanScanRecovery:
def _start_unfinished_scan(self, db, item_id):
"""Begin a scan and save an item but never call finish_scan."""
sid = db.begin_scan({"sources": ["email"], "user_ids": []})
db.save_item(sid, _make_card(item_id=item_id))
return sid
def test_unfinished_scan_items_hidden_until_recovery(self, tmp_db):
self._start_unfinished_scan(tmp_db, "orphan-1")
# Not finalised → invisible to the open-items view
assert tmp_db.get_open_items() == []
def test_recovery_finalises_and_reveals_items(self, tmp_db):
self._start_unfinished_scan(tmp_db, "orphan-1")
self._start_unfinished_scan(tmp_db, "orphan-2")
recovered = tmp_db.finalize_orphan_scans()
assert recovered == 2
ids = {row["id"] for row in tmp_db.get_open_items()}
assert ids == {"orphan-1", "orphan-2"}
def test_recovery_leaves_finished_scans_untouched(self, tmp_db):
sid = tmp_db.begin_scan({"sources": ["email"], "user_ids": []})
tmp_db.save_item(sid, _make_card(item_id="done-1"))
tmp_db.finish_scan(sid, total_scanned=1)
before = tmp_db._connect().execute(
"SELECT finished_at FROM scans WHERE id=?", (sid,)
).fetchone()[0]
assert tmp_db.finalize_orphan_scans() == 0 # nothing to recover
after = tmp_db._connect().execute(
"SELECT finished_at FROM scans WHERE id=?", (sid,)
).fetchone()[0]
assert after == before # finished_at not rewritten
def test_recovery_is_idempotent(self, tmp_db):
self._start_unfinished_scan(tmp_db, "orphan-1")
assert tmp_db.finalize_orphan_scans() == 1
assert tmp_db.finalize_orphan_scans() == 0
# ─────────────────────────────────────────────────────────────────────────────
# account_name persistence (user/group badge data)
# ─────────────────────────────────────────────────────────────────────────────
class TestAccountNamePersistence:
def test_account_name_round_trips(self, tmp_db):
sid = tmp_db.begin_scan({"sources": ["email"], "user_ids": []})
tmp_db.save_item(sid, _make_card(item_id="an-1")) # account_name="Test User"
tmp_db.finish_scan(sid, total_scanned=1)
row = [r for r in tmp_db.get_open_items() if r["id"] == "an-1"][0]
assert row.get("account_name") == "Test User"
def test_account_name_column_exists(self, tmp_db):
cols = [r[1] for r in tmp_db._connect().execute(
"PRAGMA table_info(flagged_items)").fetchall()]
assert "account_name" in cols

@ -270,49 +270,6 @@ class TestFlaggedScopeEnforcement:
ids = {row["id"] for row in r.get_json()}
assert "ci1" in ids
def test_no_ref_returns_open_items_across_all_sessions(self, client, db_patch):
# Two scans in separate session windows. The default (no-ref) view must
# surface unactioned items from BOTH, not just the latest session.
old_id = _seed_scan(db_patch, [_item("o1")])
db_patch._connect().execute(
"UPDATE scans SET started_at = started_at - 400 WHERE id = ?", (old_id,)
)
db_patch._connect().commit()
_seed_scan(db_patch, [_item("o2")])
r = client.get("/api/db/flagged")
ids = {row["id"] for row in r.get_json()}
assert ids == {"o1", "o2"}
def test_no_ref_excludes_items_with_a_disposition(self, client, db_patch):
_seed_scan(db_patch, [_item("d1"), _item("d2")])
db_patch.set_disposition("d1", "kept")
r = client.get("/api/db/flagged")
ids = {row["id"] for row in r.get_json()}
assert "d2" in ids # untouched → still open
assert "d1" not in ids # action taken → hidden
def test_no_ref_unreviewed_disposition_stays_open(self, client, db_patch):
_seed_scan(db_patch, [_item("u1")])
db_patch.set_disposition("u1", "unreviewed")
r = client.get("/api/db/flagged")
ids = {row["id"] for row in r.get_json()}
assert "u1" in ids # 'unreviewed' status is not an action
def test_no_ref_dedupes_rescanned_item_to_latest(self, client, db_patch):
# Same item flagged by two scans → appears once.
old_id = _seed_scan(db_patch, [_item("k1")])
db_patch._connect().execute(
"UPDATE scans SET started_at = started_at - 400 WHERE id = ?", (old_id,)
)
db_patch._connect().commit()
_seed_scan(db_patch, [_item("k1")])
rows = [row for row in client.get("/api/db/flagged").get_json() if row["id"] == "k1"]
assert len(rows) == 1
def test_ref_param_loads_historical_session(self, client, db_patch):
# Push first scan >300 s into the past so it occupies its own session window.
old_id = _seed_scan(db_patch, [_item("h1")])