diff --git a/CHANGELOG.md b/CHANGELOG.md
index b71234e..b7009f7 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -24,6 +24,8 @@ Version numbers follow [Semantic Versioning](https://semver.org/spec/v2.0.0.html

 ### Fixed

+- **OneDrive 404 errors during delta scans** — `GET /users/{id}/drive/root/delta` returns 404 for users with no OneDrive licence, a disabled service plan, a drive that was never provisioned (account never signed in), or a suspended account. Previously these 404s fell through to `requests.raise_for_status()` and were caught by the generic `except Exception` handler in `_scan_user_onedrive`, broadcasting a red `scan_error` card. Full scans never showed the error because `_iter_drive_folder_for` has a broad `except Exception: return`. Fixed by adding `M365DriveNotFound(M365Error)` to `m365_connector.py`, raising it from `_get()` on HTTP 404, and handling it explicitly in `_scan_user_onedrive` with a `scan_phase` broadcast ("OneDrive (user): not provisioned — skipped") before the generic exception handler.
+
 - **CI — Windows artifact never uploaded** — PyInstaller `--onedir` puts the exe inside `dist/GDPRScanner/`, not at `dist/*.exe`. The artifact glob never matched, so no Windows build appeared in releases. A PowerShell packaging step now zips `dist\GDPRScanner\` into `GDPRScanner_windows_x64.zip` (mirroring the existing Linux step).
 - **`EFFORT_ESTIMATE.md`** — build effort estimate document covering component-by-component hour breakdowns and complexity drivers for the project.
 - **Settings → Security tab** — new dedicated pane in the Settings modal. Admin PIN and Viewer PIN groups moved here from the General tab, which now contains only Appearance and About. The Share modal's **Configure** button navigates directly to the Security tab.
diff --git a/CLAUDE.md b/CLAUDE.md
index 66d0617..c8e4860 100644
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -73,6 +73,20 @@ Both options live in the profile `options` dict and apply to **all three scan en
 - **File scan** reads both from `source` dict keys (passed directly from the `/api/file_scan/start` payload). **M365 scan** reads both from `scan_opts = options.get("options", {})`. Both paths apply the same `_cpr_qualifies` / `_exif_has_pii` logic before the flagging gate.
 - **UI:** sidebar controls `#optSkipGps` (toggle) and `#optMinCpr` (number); profile editor controls `#peOptSkipGps` and `#peOptMinCpr`. Both are saved/loaded by `profiles.js`.

+## M365 connector exceptions — m365_connector.py
+
+Exception hierarchy (all inherit `M365Error(Exception)`):
+
+| Exception | Trigger | Handler |
+|---|---|---|
+| `M365PermissionError` | 403 Forbidden | `scan_error` broadcast with human-readable permission hint |
+| `M365DeltaTokenExpired` | 410 Gone on delta endpoint | Caller clears token and falls back to full scan |
+| `M365DriveNotFound` | 404 Not Found on any path | `scan_phase` broadcast ("not provisioned — skipped") in `_scan_user_onedrive`; full-scan path's `except Exception: return` also silences it |
+
+**`M365DriveNotFound` — why it exists:** `_get()` previously fell through to `raise_for_status()` on 404, which was caught by the generic `except Exception` handler in `_scan_user_onedrive` and broadcast as a red `scan_error`. The full-scan path (`_iter_drive_folder_for`) silently swallowed the same 404 via `except Exception: return`. Adding the specific exception makes the delta path consistent with the full-scan path: a user without a provisioned OneDrive is skipped without an error card. Common causes: no OneDrive licence, service plan disabled, drive never initialised (account never signed in), account suspended.
+
+**Do not add a 404 handler to `_get()` that returns a fallback value** — that would silently mask genuine path bugs elsewhere. Raising `M365DriveNotFound` keeps the error visible to callers that need to act on it.
+
 ## Memory management — scan_engine.py

 Large M365 tenants can generate enormous memory pressure. Key rules to preserve:
diff --git a/TODO.md b/TODO.md
index 8490e07..3ce5c35 100644
--- a/TODO.md
+++ b/TODO.md
@@ -41,11 +41,10 @@ Full spec in SUGGESTIONS.md §29.
 A shareable URL (token-protected) or numeric PIN that gives a DPO, school principal, or compliance coordinator read-only access to the results grid — with disposition tagging but without scan controls, credentials, or delete access. Full spec in SUGGESTIONS.md §33.
 **Size:** Medium · **Priority:** Medium

-### OneDrive 404 errors — investigate and handle appropriately
-Every student is supposed to have a OneDrive licence, so 404s on `drive/root/delta` are unexpected. A 404 can mean: no licence assigned, licence assigned but OneDrive service plan disabled, drive not yet provisioned (account never signed in), or account suspended/deleted. Currently broadcast as red `scan_error` in the log.
+### OneDrive 404 errors — investigate and handle appropriately ✅
+404 on `drive/root/delta` during delta scans was being broadcast as a red `scan_error`. Root cause: `_get()` hit `raise_for_status()` for 404s, which fell through to the generic `except Exception` handler in `_scan_user_onedrive`. The full-scan path silently swallowed the same 404 via `except Exception: return` in `_iter_drive_folder_for`.

-**Action:** Check affected users in the M365 admin centre (Licences + OneDrive status). Once root cause is confirmed, decide whether to suppress, log at lower severity, or show a specific "OneDrive not provisioned" message instead of the raw HTTP error.
-**Size:** Small · **Priority:** Medium
+Fixed by adding the `M365DriveNotFound(M365Error)` exception, raising it from `_get()` on 404, and catching it explicitly in `_scan_user_onedrive` with a lower-severity `scan_phase` broadcast ("OneDrive (user): not provisioned — skipped") instead of a red error card.

 ---

diff --git a/m365_connector.py b/m365_connector.py
index 275a249..6690cca 100644
--- a/m365_connector.py
+++ b/m365_connector.py
@@ -93,6 +93,17 @@ class M365DeltaTokenExpired(M365Error):
     pass


+class M365DriveNotFound(M365Error):
+    """Raised when the Graph API returns 404, typically for a drive/root path.
+
+    Common causes: OneDrive licence not assigned, service plan disabled,
+    drive not yet provisioned (user has never signed in), or account
+    suspended/deleted. Not a scan error — callers should skip the user
+    and log at a lower severity.
+    """
+    pass
+
+
 class M365Connector:
     def __init__(self, client_id: str, tenant_id: str, client_secret: str = ""):
         if not MSAL_OK:
@@ -425,6 +436,8 @@ class M365Connector:
                 except Exception:
                     msg = r.text[:200]
                 raise M365PermissionError(path, msg)
+            if r.status_code == 404:
+                raise M365DriveNotFound(f"404 Not Found: {path}")
             r.raise_for_status()
             return r.json()
         raise _requests.exceptions.RetryError(f"Gave up after {self._MAX_RETRIES} attempts: {url}")
diff --git a/scan_engine.py b/scan_engine.py
index f61169d..dd06fc4 100644
--- a/scan_engine.py
+++ b/scan_engine.py
@@ -54,6 +54,7 @@ def _get_scan_meta():
 try:
     from m365_connector import (
         M365Connector, M365Error, M365PermissionError, M365DeltaTokenExpired,
+        M365DriveNotFound,
         MSAL_OK, REQUESTS_OK,
     )
     CONNECTOR_OK = True
@@ -62,6 +63,7 @@ except ImportError:
     M365Error = Exception
     M365PermissionError = Exception
     M365DeltaTokenExpired = Exception
+    M365DriveNotFound = Exception
     MSAL_OK = False
     REQUESTS_OK = False
     CONNECTOR_OK = False
@@ -768,6 +770,10 @@ def run_scan(options: dict):
                         work_items.append(("file", item, None))
                 except M365PermissionError:
                     broadcast("scan_error", {"file": f"OneDrive ({uname})", "error": _permission_msg("OneDrive", uname)})
+                except M365DriveNotFound:
+                    # OneDrive not provisioned for this user (no licence, service plan
+                    # disabled, or drive never initialised). Not a scan error: skip the user.
+                    broadcast("scan_phase", {"phase": f"OneDrive ({uname}): not provisioned — skipped"})
                 except Exception as e:
                     broadcast("scan_error", {"file": f"OneDrive ({uname})", "error": str(e)})
             else:
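
Reviewer note: the handler ordering this change relies on can be sketched in isolation. The snippet below is a minimal illustration, not the project's code. `fake_get` and `scan_user_onedrive_sketch` are hypothetical stand-ins for `_get()` and `_scan_user_onedrive`. The status code is injected instead of coming from a real Graph call, the exception constructors are simplified, and the function returns the `(event, payload)` pair instead of calling `broadcast()`.

```python
class M365Error(Exception):
    """Base class, mirroring the hierarchy described in the diff."""


class M365PermissionError(M365Error):
    """Simplified: the real class takes (path, msg)."""


class M365DriveNotFound(M365Error):
    """404 from the Graph API; callers should skip the user."""


def fake_get(path: str, status_code: int) -> dict:
    """Hypothetical stand-in for M365Connector._get(); no network involved."""
    if status_code == 403:
        raise M365PermissionError(path)
    if status_code == 404:
        # Raise a specific exception rather than returning a fallback value,
        # so callers decide how to treat a missing drive.
        raise M365DriveNotFound(f"404 Not Found: {path}")
    if status_code >= 400:
        # Stands in for requests' raise_for_status() on other errors.
        raise RuntimeError(f"HTTP {status_code}: {path}")
    return {"value": []}


def scan_user_onedrive_sketch(uname: str, status_code: int) -> tuple[str, dict]:
    """Return the (event, payload) the real code would hand to broadcast()."""
    try:
        fake_get(f"/users/{uname}/drive/root/delta", status_code)
    except M365PermissionError:
        return ("scan_error", {"file": f"OneDrive ({uname})", "error": "permission denied"})
    except M365DriveNotFound:
        # Must come before the generic handler below, otherwise the 404
        # would still surface as a red scan_error.
        return ("scan_phase", {"phase": f"OneDrive ({uname}): not provisioned, skipped"})
    except Exception as e:
        return ("scan_error", {"file": f"OneDrive ({uname})", "error": str(e)})
    return ("scan_phase", {"phase": f"OneDrive ({uname}): ok"})


if __name__ == "__main__":
    for code in (200, 403, 404, 500):
        print(code, scan_user_onedrive_sketch("alice", code))
```

Because `M365DriveNotFound` is a subclass of `Exception`, Python picks the first matching `except` clause, so the specific handler only takes effect when it is listed ahead of `except Exception`. That is the ordering the `scan_engine.py` hunk uses.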