static/js — JS Rules
Profile dropdown — loader model
Profiles are loaders, not persistent modes. Selecting one pushes settings into the sidebar; the sidebar is always the live state.
- _setProfileClearBtn(visible) must be called alongside every assignment to S._activeProfileId.
- Do not re-add a selectable value="" option to #profileSelect; it was deliberately removed in v1.6.6.
Profile editor source panel race condition
_pmgmtSaveFullEdit detects whether Google/file checkboxes have rendered by querying the DOM directly:
const googleRendered = !!document.querySelector('#peSourcesPanel input[data-source-type="google"]');
const fileRendered = !!document.querySelector('#peSourcesPanel input[data-source-type="file"]');
Never revert to !!window._googleConnected / _fileSources.length > 0 — those async proxies can be true before the panel has rendered, silently clearing the user's source selection on save.
Progress bar phase parsing
_setProgressPhase(phase) in scan.js parses the phase string against _PHASE_SOURCE_MAP:
- Source found and — (em-dash) present → split, resolve via _resolveDisplayName(), update S._progressCurrentUser.
- Source found but no dash → show pill + S._progressCurrentUser (handles sub-phases like folder counts).
- No source match → plain text fallback.
_PHASE_SOURCE_MAP ordering matters — Google Workspace must appear before Gmail in the map. The email regex uses /iu flags — do not drop the i.
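The three branches above can be sketched as a small pure function. This is only an illustration: PHASE_SOURCE_MAP, resolveDisplayName, the case-insensitive substring match, and the returned object shape are assumptions, not the real scan.js code. The map is modelled as an ordered array where the first match wins, which is one way the ordering of Google Workspace before Gmail could matter.

```javascript
// Hypothetical stand-in for _PHASE_SOURCE_MAP: checked top to bottom, first match wins.
// "Google Workspace" is listed before "Gmail", as the rule above requires.
const PHASE_SOURCE_MAP = [
  { keyword: "Google Workspace", source: "google" },
  { keyword: "Gmail", source: "gmail" },
  { keyword: "OneDrive", source: "onedrive" },
];

// Hypothetical stand-in for _resolveDisplayName(); here it only trims the raw name.
function resolveDisplayName(raw) {
  return raw.trim();
}

// state.currentUser plays the role of S._progressCurrentUser.
function parsePhase(phase, state) {
  const lower = phase.toLowerCase();
  const entry = PHASE_SOURCE_MAP.find((e) => lower.includes(e.keyword.toLowerCase()));
  if (!entry) {
    // No source match: plain text fallback.
    return { kind: "text", text: phase };
  }
  const dash = phase.indexOf("\u2014"); // em-dash separator
  if (dash !== -1) {
    // Source plus dash: everything after the dash names the current user.
    state.currentUser = resolveDisplayName(phase.slice(dash + 1));
  }
  // Source without dash: keep the previously stored user (sub-phases such as folder counts).
  return { kind: "pill", source: entry.source, user: state.currentUser };
}
```

In this sketch a dash-less phase reuses the user stored by the previous phase, so a sub-phase such as a folder count still shows who is being scanned.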
Profile startup race conditions — profiles.js + users.js
loadProfiles() (fast, local file) resolves before loadUsers() (slow, Graph API). The user can select a profile before S._allUsers or the sources panel is populated.
- user_ids = "all" must be deferred: if S._allUsers is empty when _applyProfile() runs, set window._pendingProfileAllUsers = true instead of calling .forEach() on an empty array. loadUsers() checks this flag after populating S._allUsers and selects everyone. Do not remove this; reverting will silently leave all accounts unchecked whenever a profile is chosen on a fast machine before the user list loads.
- Source checkboxes may not exist yet: _applyProfile() calls renderSourcesPanel() first if #sourcesPanel contains no input[data-source-id] nodes. The same guard is used in loadUsers(). Without it, querySelectorAll returns nothing and the profile's source selection is discarded; the next renderSourcesPanel() call re-renders all sources as checked (their default).
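A minimal sketch of the deferred "all" selection, with the DOM left out. The profileState object, applyProfile, and onUsersLoaded are hypothetical stand-ins for S, _applyProfile(), and the tail of loadUsers(); resetting the flag after use is an assumption.

```javascript
// Hypothetical minimal state; the real app keeps these on S and window.
const profileState = {
  allUsers: [],
  selected: new Set(),
  pendingProfileAllUsers: false,
};

function applyProfile(profile) {
  if (profile.user_ids === "all") {
    if (profileState.allUsers.length === 0) {
      // User list not loaded yet: record the intent instead of iterating nothing.
      profileState.pendingProfileAllUsers = true;
      return;
    }
    profileState.allUsers.forEach((u) => profileState.selected.add(u.id));
    return;
  }
  profileState.selected = new Set(profile.user_ids);
}

function onUsersLoaded(users) {
  profileState.allUsers = users;
  if (profileState.pendingProfileAllUsers) {
    // Honour the selection that was requested before the list arrived.
    profileState.pendingProfileAllUsers = false;
    users.forEach((u) => profileState.selected.add(u.id));
  }
}
```

Calling applyProfile({ user_ids: "all" }) before onUsersLoaded(...) leaves the selection empty until the users arrive, then selects all of them.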
SSE teardown — scan.js
- Do not close S.es in scan_done if other scans are still running: M365 (scan_done), Google (google_scan_done), and File (file_scan_done) each emit their own done event. Close S.es only when all concurrent scans have finished: scan_done checks !S._googleScanRunning && !S._fileScanRunning; google_scan_done checks !S._m365ScanRunning && !S._fileScanRunning; file_scan_done checks !S._m365ScanRunning && !S._googleScanRunning.
- Scheduled scans: S._userStartedScan is false for scheduler-triggered runs, so SSE is never closed and future scheduler events continue to arrive.
- Two separate abort events: state._scan_abort (M365 + file) and state._google_scan_abort (Google). POST /api/scan/stop sets both. _check_abort() inside _run_google_scan must use the module-level _scan_abort alias (= state._google_scan_abort), not gdpr_scanner._scan_abort.
- _check_abort() emits google_scan_done, not scan_cancelled: scan_cancelled unconditionally closes the SSE; google_scan_done checks whether other scans are still running before closing.
- scan_phase replay sets running flags (handled by sse_replay_done): the scan_phase handler sets running flags to true whenever all flags are false and a source keyword is found in the phase text. On page refresh this fires during SSE replay of a completed scan, temporarily making the scan appear running. The sse_replay_done handler retries loadHistorySession(null) if no scan is running and S._historyRefScanId is still null after replay. Do not remove either the flag-setting logic or the retry.
- Google Drive uses a lazy generator, not list(): iter_drive_files() is iterated directly so _check_abort() fires between items. Wrapping it in list() blocks the thread for the entire enumeration.
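The close-only-when-everything-is-done rule from the first two bullets can be expressed as one predicate. This is a sketch under assumptions: shouldCloseStream, DONE_EVENT_TO_FLAG, and the shape of the running object are hypothetical, and the real handlers inline these checks against S._m365ScanRunning, S._googleScanRunning, and S._fileScanRunning.

```javascript
// Maps each done event to the scan type it finishes (names are illustrative).
const DONE_EVENT_TO_FLAG = {
  scan_done: "m365",
  google_scan_done: "google",
  file_scan_done: "file",
};

// running: { m365: boolean, google: boolean, file: boolean }
// userStartedScan: false for scheduler-triggered runs, which keep the stream open.
function shouldCloseStream(eventName, running, userStartedScan) {
  const finished = DONE_EVENT_TO_FLAG[eventName];
  if (!finished) return false; // not a done event
  if (!userStartedScan) return false; // keep listening for future scheduler events
  // Close only if every *other* scan type has already stopped.
  return Object.keys(running).every((k) => k === finished || !running[k]);
}
```

For example, scan_done arriving while a Google scan is still running leaves the stream open, and the later google_scan_done is the event that closes it.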
Scan history browser — history.js + results.js
- S._historyRefScanId: null = live/SSE mode or the default open-items view; positive int = viewing a past session. Set by loadHistorySession(); cleared by exitHistoryMode().
- loadHistorySession(null) → loadOpenItems(): passing null no longer resolves to the latest session. It now loads all open (unactioned) items across every scan via GET /api/db/flagged (no ref), leaves _historyRefScanId null, and shows no history banner. The "Open items" banner button (onclick="loadHistorySession(null)", key history_btn_latest) therefore returns to this open-items view. Specific sessions are still loaded with a positive ref, which keeps the re-scan resolved-diff. Do not revert null to "resolve latest ref"; that reintroduces the "only the last scan is shown" complaint.
- Auto-load on page load: _sseWatchdog() in results.js calls window.loadHistorySession?.(null) whenever /api/scan/status reports neither running (M365 + file lock) nor google_running (Google lock) and nothing is shown yet (!S._historyRefScanId && !S.flaggedData.length). This is not one-shot: it retries on every 4s poll until a session is restored, because (a) the replay buffer is empty after a server restart so sse_replay_done never fires, and (b) a completed scan's replayed scan_phase can leave a running flag set that would otherwise block the load forever. Because both locks are confirmed free, the watchdog clears the stale _m365/_google/_fileScanRunning flags before calling. Do not revert to a one-shot _initialStatusChecked gate; that reintroduces the "blank grid after refresh/restart" bug. /api/scan/status must report google_running separately; running alone misses live Google scans. The sse_replay_done handler in scan.js still retries for the non-empty-buffer (no-restart) case.
- History banner (#historyBanner): shown when S._historyRefScanId is set. Do not hide/show from outside history.js.
- Session picker (#historyDropdown): rendered inside [data-history-wrap] so the outside-click handler works correctly. Do not move the picker outside this wrapper.
- Cache invalidation: invalidateHistoryCache() clears _sessions and _latestRefScanId. All three *_done SSE handlers call window.invalidateHistoryCache?.().
- Re-scan diff: items present in the previous session but absent from the current one are tagged _resolved: true, rendered with .card-resolved and a green ✓ badge, and NOT added to S.flaggedData (grid-only, cannot be bulk-selected or exported).
- Mode transitions: startScan() calls window.exitHistoryMode?.() before clearing the grid.
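The re-scan diff bullet describes a set difference keyed on item id. A minimal sketch follows, assuming each item carries an id field; diffSessions and the returned gridItems/flaggedData pair are hypothetical names, not the real history.js API.

```javascript
// previousItems / currentItems: arrays of flagged items, each with an `id`.
function diffSessions(previousItems, currentItems) {
  const currentIds = new Set(currentItems.map((item) => item.id));
  // Items that were flagged before but are gone now count as resolved.
  const resolved = previousItems
    .filter((item) => !currentIds.has(item.id))
    .map((item) => ({ ...item, _resolved: true }));
  return {
    // Selectable and exportable data: current items only.
    flaggedData: currentItems,
    // What the grid renders: current items plus the resolved ones.
    gridItems: [...currentItems, ...resolved],
  };
}
```

Keeping resolved items out of flaggedData is what makes them grid-only: anything that iterates flaggedData for bulk selection or export never sees them.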
CPR cross-referencing — results.js
- _loadRelated(f): async; hides #previewRelated if f.cpr_count is 0, otherwise fetches /api/db/related/<id>?ref=N and renders a clickable list with per-item shared-CPR badge. Called from openPreview.
- window._openRelated(id, itemData): looks up id in S.flaggedData first, falls back to itemData from the API response for items not yet in the grid.
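The two decisions described above, whether to show the related panel and which copy of an item to open, can be sketched without the DOM or the network. shouldShowRelated and resolveRelatedItem are hypothetical helper names; treating a missing cpr_count like 0 is an assumption.

```javascript
// Show the related panel only when the item has at least one CPR hit.
function shouldShowRelated(f) {
  return Number(f.cpr_count) > 0;
}

// Prefer the copy already in the grid; otherwise use the row from the API response.
function resolveRelatedItem(id, itemData, flaggedData) {
  return flaggedData.find((f) => f.id === id) ?? itemData;
}
```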
Sources panel resize — log.js + sources.js
- _fitSourcesPanel(): called at the end of every renderSourcesPanel(). Clears inline height, reads scrollHeight, then restores a saved preference from localStorage (gdpr_sources_h) or pins to scrollHeight.
- _initSourcesResize(): attaches pointer-drag to #sourcesResizeHandle. Captures scrollHeight as hard max on pointerdown; saves to localStorage on release.
- Do not add a fixed max-height or height to #sourcesPanel in HTML; height is controlled entirely by _fitSourcesPanel() at runtime.
- Do not call _fitSourcesPanel() before the panel has rendered; scrollHeight will be 0.
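The height choice in _fitSourcesPanel() can be isolated as a pure function. This is a sketch only: resolvePanelHeight is a hypothetical name, the saved value is assumed to be stored as a pixel count string, and clamping the saved value to scrollHeight is an assumption borrowed from the drag handler's hard max.

```javascript
// scrollHeight: natural content height of the panel, read after clearing inline height.
// savedValue: raw string from localStorage (key gdpr_sources_h), or null if unset.
function resolvePanelHeight(scrollHeight, savedValue) {
  const saved = Number.parseInt(savedValue ?? "", 10);
  if (Number.isFinite(saved) && saved > 0) {
    // Assumption: never grow the panel beyond its content.
    return Math.min(saved, scrollHeight);
  }
  // No usable preference: pin to the content height.
  return scrollHeight;
}
```

With a scrollHeight of 0, which is what an unrendered panel reports, the function also returns 0, which illustrates why the call must wait until the panel has rendered.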
Viewer mode — viewer.js
- window.VIEWER_MODE: injected by Jinja2. auth.js adds the viewer-mode class to <body>; all hide rules are CSS (body.viewer-mode …) except delBtn, which is also guarded in JS.
- window.VIEWER_SCOPE: injected alongside VIEWER_MODE. If VIEWER_SCOPE.role is set, auth.js pre-sets #filterRole and hides the dropdown.
- Token onclick attributes: Copy/Revoke buttons pass the token as a single-quoted JS string literal, never via JSON.stringify (which produces double-quoted strings that break onclick="…" attributes).
- Share link base URL: _getShareBaseUrl() uses window.location.origin whenever the page is served over HTTPS or from a non-localhost host (a reverse-proxied hostname or LAN IP is already routable, and rewriting it to http://<LAN-IP> would bypass the proxy's TLS). Only when browsing at localhost/127.0.0.1 over HTTP does it fetch /api/local_ip (LAN IP via UDP probe to 8.8.8.8) so copied links work from other machines. The result is cached in _shareBaseUrl so Copy buttons stay within the click gesture. Both createShareLink and copyTokenLink are async. Do not make it return bare window.location.origin unconditionally; that reintroduces unusable 127.0.0.1 links.
- Settings Security pane: Admin PIN and Viewer PIN groups live in stPaneSecurity. switchSettingsTab('security') triggers both stLoadPinStatus() and stLoadViewerPinStatus().
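The branch inside _getShareBaseUrl() comes down to one test on the current location. The sketch below leaves out the fetch and the cache so it stays synchronous. needsLanLookup and buildLanBase are hypothetical names, and reusing the page's port for the LAN URL is an assumption.

```javascript
// loc: a Location-like object (window.location in the browser, a URL in Node).
function needsLanLookup(loc) {
  const isLoopback = loc.hostname === "localhost" || loc.hostname === "127.0.0.1";
  // Only plain-HTTP loopback pages produce links that other machines cannot open.
  return loc.protocol === "http:" && isLoopback;
}

// lanIp: the address that /api/local_ip would report (response shape not shown here).
function buildLanBase(loc, lanIp) {
  const port = loc.port ? `:${loc.port}` : "";
  return `http://${lanIp}${port}`;
}

function shareBaseFor(loc, lanIp) {
  return needsLanLookup(loc) ? buildLanBase(loc, lanIp) : loc.origin;
}
```

A reverse-proxied HTTPS origin or a LAN IP passes through unchanged, so the proxy's TLS is never bypassed. Only a loopback HTTP origin is rewritten.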
Gotchas
- navigator.clipboard is undefined over plain HTTP: the app is normally reached at http://<LAN-IP>:5100, a non-secure context where the Clipboard API does not exist, so calling navigator.clipboard.writeText(...) throws synchronously (a .catch() on it never runs). Always copy via window._copyText(text, btn) (defined in viewer.js); it feature-detects the API and falls back to document.execCommand('copy'), then to a prompt(). Because execCommand needs a user gesture, don't await network calls between the click and the copy; _getShareBaseUrl() caches its result for this reason.
- scheduler.js strings must use t(): frequency labels, "Next", "Running...", "Disabled", empty-job text, and empty-history text all have translation keys. Do not hard-code English strings in schedLoad() or schedRenderJobs().
- Scheduler UI, schedToggleReportOnly(): dims the Profile row, shows/hides #schedReportOnlyHint, and forces #schedAutoEmail checked. Called from the checkbox onchange handler and at the start of schedAddJob()/schedEditJob().
- Profile editor accounts: default to unchecked. Only explicitly saved user_ids are checked.
- Date presets: stored as years * 365 (integer days). Do not use * 365.25.
- copyTokenLink is async: called from onclick as fire-and-forget. Do not make it synchronous.
- Escape scan-derived strings with esc(): results.js defines esc() (escapes & < > " '). Every value that originates from scanned content (f.name, f.account_name, f.folder, f.source, f.modified, label, image alt, and the same fields on item/related rows) must pass through esc() before going into innerHTML or a title=/alt= attribute. These are attacker-influenceable (e.g. a file named with markup), so an unescaped interpolation is a stored XSS, including in shared read-only viewer sessions. Numeric counts (cpr_count, size_kb) don't need it. When embedding an object in an onclick payload, also .replace(/"/g,'&quot;') the JSON.stringify(...).
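To make the escaping rule concrete, here is a minimal sketch of an esc() that covers the five characters listed above, plus the onclick payload step. It is not the real results.js implementation: the entity chosen for the single quote, onclickPayload, cardHtml, and the openItem handler name are assumptions for illustration.

```javascript
const ESC_MAP = { "&": "&amp;", "<": "&lt;", ">": "&gt;", '"': "&quot;", "'": "&#39;" };

// Escapes the five HTML-significant characters in any scan-derived value.
function esc(value) {
  return String(value).replace(/[&<>"']/g, (ch) => ESC_MAP[ch]);
}

// JSON.stringify emits double quotes, which would end an onclick="..." attribute early.
function onclickPayload(obj) {
  return JSON.stringify(obj).replace(/"/g, "&quot;");
}

// Hypothetical card template: scan-derived text is escaped, numeric counts are not.
function cardHtml(f) {
  const payload = onclickPayload({ id: f.id });
  return `<div class="card" title="${esc(f.name)}" onclick="openItem(${payload})">` +
    `${esc(f.name)} (${f.cpr_count})</div>`;
}
```

A file named `<img src=x onerror=alert(1)>` then renders as inert text in both the title attribute and the card body.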