fix: suppress OneDrive 404 errors during delta scans as non-provisioned

Add M365DriveNotFound(M365Error) exception raised by _get() on HTTP 404.
Catch it explicitly in _scan_user_onedrive before the generic handler,
broadcasting a scan_phase ("not provisioned — skipped") instead of a red
scan_error card. Full-scan path is unaffected (the broad except Exception:
return in _iter_drive_folder_for already silenced the same 404).

Root cause: _get() fell through to raise_for_status() on 404. The resulting
HTTPError was caught by the generic except Exception handler in
_scan_user_onedrive and broadcast as scan_error. The asymmetry with full
scans (which silently skipped 404s) was confusing.

Common causes of OneDrive 404: no licence assigned, service plan disabled,
drive never provisioned (account never signed in), account suspended.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
StyxX65 2026-04-12 14:05:59 +02:00
parent 1aaf400771
commit 4dfbae49a4
5 changed files with 38 additions and 4 deletions


@@ -24,6 +24,8 @@ Version numbers follow [Semantic Versioning](https://semver.org/spec/v2.0.0.html
### Fixed
- **OneDrive 404 errors during delta scans** — `GET /users/{id}/drive/root/delta` returns 404 for users with no OneDrive licence, a disabled service plan, a drive that was never provisioned (account never signed in), or a suspended account. Previously these 404s fell through to `raise_for_status()` in `_get()`, and the resulting error was caught by the generic `except Exception` handler in `_scan_user_onedrive`, which broadcast a red `scan_error` card. Full scans never showed the error because `_iter_drive_folder_for` has a broad `except Exception: return`. Fixed by adding `M365DriveNotFound(M365Error)` to `m365_connector.py`, raising it from `_get()` on HTTP 404, and handling it explicitly in `_scan_user_onedrive` with a `scan_phase` broadcast ("OneDrive (user): not provisioned — skipped") before the generic exception handler.
- **CI — Windows artifact never uploaded** — PyInstaller `--onedir` puts the exe inside `dist/GDPRScanner/`, not at `dist/*.exe`. The artifact glob never matched, so no Windows build appeared in releases. A PowerShell packaging step now zips `dist\GDPRScanner\` into `GDPRScanner_windows_x64.zip` (mirroring the existing Linux step).
- **`EFFORT_ESTIMATE.md`** — build effort estimate document covering component-by-component hour breakdowns and complexity drivers for the project.
- **Settings → Security tab** — new dedicated pane in the Settings modal. Admin PIN and Viewer PIN groups moved here from the General tab, which now contains only Appearance and About. The Share modal's **Configure** button navigates directly to the Security tab.


@@ -73,6 +73,20 @@ Both options live in the profile `options` dict and apply to **all three scan en
- **File scan** reads both from `source` dict keys (passed directly from the `/api/file_scan/start` payload). **M365 scan** reads both from `scan_opts = options.get("options", {})`. Both paths apply the same `_cpr_qualifies` / `_exif_has_pii` logic before the flagging gate.
- **UI:** sidebar controls `#optSkipGps` (toggle) and `#optMinCpr` (number); profile editor controls `#peOptSkipGps` and `#peOptMinCpr`. Both are saved/loaded by `profiles.js`.
## M365 connector exceptions — m365_connector.py

Exception hierarchy (all inherit `M365Error(Exception)`):

| Exception | Trigger | Handler |
|---|---|---|
| `M365PermissionError` | 403 Forbidden | `scan_error` broadcast with human-readable permission hint |
| `M365DeltaTokenExpired` | 410 Gone on delta endpoint | Caller clears token and falls back to full scan |
| `M365DriveNotFound` | 404 Not Found on any path | `scan_phase` broadcast ("not provisioned — skipped") in `_scan_user_onedrive`; full-scan path's `except Exception: return` also silences it |

**`M365DriveNotFound` — why it exists:** `_get()` previously fell through to `raise_for_status()` on 404, which was caught by the generic `except Exception` handler in `_scan_user_onedrive` and broadcast as a red `scan_error`. The full-scan path (`_iter_drive_folder_for`) silently swallowed the same 404 via `except Exception: return`. Adding the specific exception makes the delta path consistent with the full-scan path: a user without a provisioned OneDrive is skipped without an error card. Common causes: no OneDrive licence, service plan disabled, drive never initialised (account never signed in), account suspended.

**Do not add a 404 handler to `_get()` that returns a fallback value** — that would silently mask genuine path bugs elsewhere. Raising `M365DriveNotFound` keeps the error visible to callers that need to act on it.
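
A minimal sketch of the handler ordering described above may help. It is not the project's code: the exception classes are local stand-ins, and `scan_user_drive`, `fetch`, `record`, the user names, and the simplified message strings are all hypothetical. It only illustrates that the specific `except M365DriveNotFound` clause has to come before the generic one, and that other exceptions still surface as `scan_error`.

```python
from typing import Callable, List, Tuple


class M365Error(Exception):
    """Stand-in for the connector's base exception."""


class M365PermissionError(M365Error):
    """Stand-in for the 403 case."""


class M365DriveNotFound(M365Error):
    """Stand-in for the 404 case."""


def scan_user_drive(
    uname: str,
    fetch: Callable[[], list],
    broadcast: Callable[[str, dict], None],
) -> list:
    """Hypothetical, simplified caller: specific handlers precede the generic one."""
    try:
        return fetch()
    except M365PermissionError:
        broadcast("scan_error", {"file": f"OneDrive ({uname})", "error": "permission denied"})
    except M365DriveNotFound:
        # Not a scan error: report a phase update and move on to the next user.
        broadcast("scan_phase", {"phase": f"OneDrive ({uname}): not provisioned, skipped"})
    except Exception as e:
        # Anything else (including a genuine path bug) still surfaces as an error.
        broadcast("scan_error", {"file": f"OneDrive ({uname})", "error": str(e)})
    return []


events: List[Tuple[str, dict]] = []


def record(event: str, payload: dict) -> None:
    """Collect broadcasts in a list instead of sending them anywhere."""
    events.append((event, payload))


def missing_drive() -> list:
    raise M365DriveNotFound("404 Not Found: users/alice/drive/root/delta")


def broken_path() -> list:
    raise ValueError("unexpected response shape")


scan_user_drive("alice", missing_drive, record)
scan_user_drive("bob", broken_path, record)
print([name for name, _ in events])  # ['scan_phase', 'scan_error']
```

Because Python tries `except` clauses in order and takes the first match, moving the generic `except Exception` above the specific clause would turn the 404 back into a `scan_error`.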
## Memory management — scan_engine.py
Large M365 tenants can generate enormous memory pressure. Key rules to preserve:


@@ -41,11 +41,10 @@ Full spec in SUGGESTIONS.md §29.
A shareable URL (token-protected) or numeric PIN that gives a DPO, school principal, or compliance coordinator read-only access to the results grid — with disposition tagging but without scan controls, credentials, or delete access. Full spec in SUGGESTIONS.md §33.
**Size:** Medium · **Priority:** Medium
### OneDrive 404 errors — investigate and handle appropriately
Every student is supposed to have a OneDrive licence, so 404s on `drive/root/delta` are unexpected. A 404 can mean: no licence assigned, licence assigned but OneDrive service plan disabled, drive not yet provisioned (account never signed in), or account suspended/deleted. Currently broadcast as red `scan_error` in the log.
### OneDrive 404 errors — investigate and handle appropriately
404 on `drive/root/delta` during delta scans was being broadcast as a red `scan_error`. Root cause: `_get()` hit `raise_for_status()` for 404s, which fell through to the generic `except Exception` handler in `_scan_user_onedrive`. The full-scan path silently swallowed the same 404 via `except Exception: return` in `_iter_drive_folder_for`.
**Action:** Check affected users in the M365 admin centre (Licences + OneDrive status). Once root cause is confirmed, decide whether to suppress, log at lower severity, or show a specific "OneDrive not provisioned" message instead of the raw HTTP error.
**Size:** Small · **Priority:** Medium
Fixed by adding `M365DriveNotFound(M365Error)` exception, raising it from `_get()` on 404, and catching it explicitly in `_scan_user_onedrive` with a lower-severity `scan_phase` broadcast ("OneDrive (user): not provisioned — skipped") instead of a red error card.
---


@@ -93,6 +93,17 @@ class M365DeltaTokenExpired(M365Error):
    pass


class M365DriveNotFound(M365Error):
    """Raised when the Graph API returns 404 for a drive/root path.

    Common causes: OneDrive licence not assigned, service plan disabled,
    drive not yet provisioned (user has never signed in), or account
    suspended/deleted. Not a scan error: callers should skip the user
    and log at a lower severity.
    """
    pass


class M365Connector:
    def __init__(self, client_id: str, tenant_id: str, client_secret: str = ""):
        if not MSAL_OK:
@@ -425,6 +436,8 @@ class M365Connector:
                except Exception:
                    msg = r.text[:200]
                raise M365PermissionError(path, msg)
            if r.status_code == 404:
                raise M365DriveNotFound(f"404 Not Found: {path}")
            r.raise_for_status()
            return r.json()
        raise _requests.exceptions.RetryError(f"Gave up after {self._MAX_RETRIES} attempts: {url}")
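
The effect of checking for 404 before `raise_for_status()` can be sketched in isolation. This is an illustration under stated assumptions, not the connector's `_get()`: `FakeResponse` is a hypothetical stand-in for a `requests.Response` (its `raise_for_status()` raises a plain `RuntimeError` rather than `HTTPError`), and `handle_response`, `outcome`, and the example path are invented for the demo.

```python
from typing import Optional


class M365Error(Exception):
    """Stand-in for the connector's base exception."""


class M365DriveNotFound(M365Error):
    """Stand-in for the 404 exception added in this commit."""


class FakeResponse:
    """Hypothetical stand-in for a requests.Response; only what the sketch needs."""

    def __init__(self, status_code: int, payload: Optional[dict] = None):
        self.status_code = status_code
        self._payload = payload or {}

    def raise_for_status(self) -> None:
        # Loosely mimics the real method: raise on any 4xx/5xx status.
        if self.status_code >= 400:
            raise RuntimeError(f"HTTP {self.status_code}")

    def json(self) -> dict:
        return self._payload


def handle_response(r: FakeResponse, path: str) -> dict:
    """Hypothetical extract of the status handling at the end of a _get()-style method."""
    if r.status_code == 404:
        # Checked before raise_for_status() so callers receive a specific type.
        raise M365DriveNotFound(f"404 Not Found: {path}")
    r.raise_for_status()
    return r.json()


def outcome(status: int) -> str:
    """Classify what a caller would observe for a given status code."""
    try:
        handle_response(FakeResponse(status, {"value": []}), "users/alice/drive/root/delta")
        return "ok"
    except M365DriveNotFound:
        return "drive-not-found"
    except Exception:
        return "generic-error"


print([outcome(s) for s in (200, 404, 500)])  # ['ok', 'drive-not-found', 'generic-error']
```

Other error statuses still reach `raise_for_status()`, so only the 404 case changes type; nothing is swallowed or replaced with a fallback value.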


@@ -54,6 +54,7 @@ def _get_scan_meta():
try:
    from m365_connector import (
        M365Connector, M365Error, M365PermissionError, M365DeltaTokenExpired,
        M365DriveNotFound,
        MSAL_OK, REQUESTS_OK,
    )
    CONNECTOR_OK = True
@@ -62,6 +63,7 @@ except ImportError:
    M365Error = Exception
    M365PermissionError = Exception
    M365DeltaTokenExpired = Exception
    M365DriveNotFound = Exception
    MSAL_OK = False
    REQUESTS_OK = False
    CONNECTOR_OK = False
@@ -768,6 +770,10 @@ def run_scan(options: dict):
work_items.append(("file", item, None))
except M365PermissionError:
broadcast("scan_error", {"file": f"OneDrive ({uname})", "error": _permission_msg("OneDrive", uname)})
except M365DriveNotFound:
# OneDrive not provisioned for this user (no licence, service plan
# disabled, or drive never initialised). Not a scan error — skip silently.
broadcast("scan_phase", {"phase": f"OneDrive ({uname}): not provisioned — skipped"})
except Exception as e:
broadcast("scan_error", {"file": f"OneDrive ({uname})", "error": str(e)})
else: