Added SFTP to sources

StyxX65 2026-04-25 08:48:54 +02:00
parent 360eb1caed
commit e35bbe78a5
20 changed files with 826 additions and 120 deletions

@ -7,6 +7,26 @@ Version numbers follow [Semantic Versioning](https://semver.org/spec/v2.0.0.html
---
## [1.6.25] — 2026-04-25
### Added
- **SFTP as a 4th file connector** — SFTP servers can now be added as file sources alongside local folders, SMB shares, and cloud sources. A new `SFTPScanner` class in `sftp_connector.py` implements the same `iter_files()` interface as `FileScanner`, so `run_file_scan()`, SSE broadcasting, DB persistence, card building, scheduled scans, and exports work without changes. It supports password auth and SSH private key auth (RSA, Ed25519, ECDSA, DSS); passwords and key passphrases are stored in the OS keychain. Key files are uploaded via `POST /api/file_sources/upload_key` and stored in `~/.gdprscanner/sftp_keys/` with `chmod 600`. SFTP sources appear with a 🔒 icon in the sources panel. Requires `paramiko>=3.4`, which is optional: the scanner degrades gracefully when it is not installed. A new source-type selector (Local / Network (SMB) / SFTP) replaces the SMB path-prefix auto-detection in the add-source form.
- **`POST /api/file_sources/upload_key`** — new endpoint that validates and stores an SSH private key file, returning a `key_path` for use in the source definition.
- **SFTP entry in export SOURCE_MAP** — Excel and Article 30 exports render SFTP sources as "🔒 SFTP" with a purple tint (`EDE9F7`), consistent with the existing per-source tab and summary table logic.
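As a rough illustration of the shared `iter_files()` contract described in the entries above, the sketch below walks a remote tree and yields `(path, bytes, metadata)` tuples. It is not the project's `SFTPScanner`: the function name `iter_sftp_files`, the metadata keys, and the extension filter are assumptions made for the example. It only calls `listdir_attr()` and `open()`, both of which paramiko's `SFTPClient` provides, so any object exposing those two methods can be passed in.

```python
import posixpath
import stat
from typing import Iterator, Tuple


def iter_sftp_files(sftp, root: str, extensions: frozenset) -> Iterator[Tuple[str, bytes, dict]]:
    """Walk `root` and yield (path, content, metadata) for each matching file.

    `sftp` is expected to behave like paramiko.SFTPClient: listdir_attr(path)
    returns entries with filename/st_mode/st_size/st_mtime, and open(path, "rb")
    returns a readable file object usable as a context manager.
    """
    stack = [root]
    while stack:
        current = stack.pop()
        for entry in sftp.listdir_attr(current):
            path = posixpath.join(current, entry.filename)
            if stat.S_ISDIR(entry.st_mode):
                # Assumption: hidden directories are skipped, as for local scans.
                if not entry.filename.startswith("."):
                    stack.append(path)
                continue
            ext = posixpath.splitext(entry.filename)[1].lower()
            if ext not in extensions:
                continue
            with sftp.open(path, "rb") as fh:
                content = fh.read()
            # Hypothetical metadata keys; the real scanner may use others.
            yield path, content, {"size": entry.st_size, "mtime": entry.st_mtime}
```

Because the generator depends only on two duck-typed methods, it can be exercised with an in-memory fake instead of a live SSH server.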
---
## [1.6.24] — 2026-04-25
### Fixed
- **Scheduler UI showed untranslated English strings** — frequency labels ("Daily", "Weekly", "Monthly"), "Next:", "Running...", "Disabled", and both empty-state messages ("No scheduled scans yet." / "No scheduled runs yet") were hardcoded English strings in `scheduler.js` instead of using `t()`. All six call sites in `schedLoad()`, `schedRenderJobs()`, and `schedLoadHistory()` now call `t()` with the appropriate key. Three new translation keys added to `en.json`, `da.json`, and `de.json`: `m365_sched_no_jobs`, `m365_sched_running`, `m365_sched_disabled`.
---
## [1.6.23] — 2026-04-21
### Added

@ -18,6 +18,8 @@ python -m pytest tests/ -q
**Google Drive delta scan** — `routes/google_scan.py` reads `scan_opts.get("delta", False)` (same flag as M365). Per user, delta key is `f"gdrive:{user_email}"` stored in `~/.gdprscanner/delta.json` alongside M365 tokens. First delta-enabled scan fetches all files then records a Changes API start page token via `conn.get_drive_start_token(user_email)`. Subsequent scans call `conn.get_drive_changes(user_email, token)` (Changes API) and update the token. Token save loads the current file fresh before writing (`{**current_tokens, **_new_drive_tokens}`) to avoid overwriting M365 tokens written by a concurrent scan thread. Invalid/expired tokens fall back to full scan automatically. `google_scan_done` now includes `"delta": bool` and `"delta_sources": int`.
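The "load fresh, then merge" token save can be sketched as follows. This is an illustration only: `save_delta_tokens` is a hypothetical helper, and the write-to-temp-then-`os.replace` step is an addition of this sketch rather than something the notes above describe. Re-reading the file narrows the window in which another thread's tokens could be lost, but without a lock it does not close it entirely.

```python
import json
import os
import tempfile
from pathlib import Path


def save_delta_tokens(path: Path, new_tokens: dict) -> dict:
    """Merge `new_tokens` into the JSON token file at `path`; return the merged dict."""
    try:
        # Re-read the file right before writing so keys saved by another
        # scan thread (e.g. M365 tokens) are carried over.
        current = json.loads(path.read_text(encoding="utf-8"))
    except (FileNotFoundError, json.JSONDecodeError):
        current = {}
    merged = {**current, **new_tokens}
    # Assumption of this sketch: write to a temp file in the same directory,
    # then replace the target in one step so readers never see a partial file.
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as fh:
        json.dump(merged, fh)
    os.replace(tmp_name, path)
    return merged
```

Keys already in the file are preserved unless `new_tokens` carries the same key, in which case the new value wins.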
**SFTP connector** — `sftp_connector.py` provides `SFTPScanner` with the same `iter_files()` interface as `FileScanner`. `run_file_scan()` in `scan_engine.py` checks `source.get("source_type") == "sftp"` and instantiates `SFTPScanner`; all other file-scan code (SSE, DB, cards) is unchanged. Auth: `"password"` stores credential via `store_sftp_password()` in OS keychain; `"key"` loads the private key from `~/.gdprscanner/sftp_keys/<uuid>` with an optional keychain passphrase. Key files are uploaded via `POST /api/file_sources/upload_key` (paramiko validates format). `SFTP_OK` flag guards graceful degradation if `paramiko` is not installed. Do not add `source_type="sftp"` handling anywhere except `scan_engine.py` — the rest of the pipeline is source-agnostic.
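A minimal sketch of the optional-dependency guard and the single dispatch point described above. Only the `SFTP_OK` flag and the `source_type == "sftp"` check come from the notes; `pick_scanner_kind` and `SFTPUnavailable` are hypothetical names, and raising an error is just one possible form of graceful degradation, since the notes do not say how the real code reacts.

```python
try:
    import paramiko  # noqa: F401  optional dependency
    SFTP_OK = True
except ImportError:
    SFTP_OK = False


class SFTPUnavailable(RuntimeError):
    """Hypothetical error for an SFTP source that cannot be scanned."""


def pick_scanner_kind(source: dict, sftp_available: bool = SFTP_OK) -> str:
    """Return "sftp" or "file" for a source definition.

    Keeping this check in one function mirrors the rule that only the scan
    engine should know about source types.
    """
    if source.get("source_type") == "sftp":
        if not sftp_available:
            raise SFTPUnavailable("paramiko is not installed; SFTP sources are disabled")
        return "sftp"
    return "file"
```

A caller could catch `SFTPUnavailable` and report the source as skipped while local and SMB sources continue to scan.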
**Shared content processing** — all three scan engines (M365, Google, file) funnel downloaded bytes through a single function: `cpr_detector._scan_bytes(content, filename)`. It dispatches to the correct parser by file extension. `scan_engine.py` uses the `_scan_bytes_timeout` wrapper for PDFs (subprocess + hard timeout). `routes/google_scan.py` uses `_scan_bytes` directly. Do not duplicate file-type handling in per-source code.
**`cpr_detector.SUPPORTED_EXTS` is the single source of truth** for which file extensions are scanned across all sources. `file_scanner.py` imports it as `DEFAULT_EXTENSIONS` so local/SMB scans stay in sync automatically. `scan_engine.py` uses it to gate M365/SharePoint/Teams file downloads. Do not maintain a separate extension list anywhere else.
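One way to keep extension dispatch and the supported-extension list from drifting apart is to derive the list from the parser table, as in the sketch below. This is not how `cpr_detector` is necessarily written: `PARSERS`, `scan_bytes`, and the toy text parser are assumptions used only to show the idea.

```python
import os
from typing import Callable, Dict, Optional


def _parse_text(content: bytes) -> str:
    """Toy parser: decode bytes as UTF-8, replacing undecodable bytes."""
    return content.decode("utf-8", errors="replace")


# Hypothetical parser table; a real one would map many more extensions.
PARSERS: Dict[str, Callable[[bytes], str]] = {
    ".txt": _parse_text,
    ".csv": _parse_text,
}

# Derived from the table, so adding a parser automatically extends the list.
SUPPORTED_EXTS = frozenset(PARSERS)


def scan_bytes(content: bytes, filename: str) -> Optional[str]:
    """Dispatch to a parser by file extension; return None if unsupported."""
    ext = os.path.splitext(filename)[1].lower()
    parser = PARSERS.get(ext)
    if parser is None:
        return None
    return parser(content)
```

Every connector can then filter on `SUPPORTED_EXTS` before downloading and hand the bytes to the same entry point afterwards.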

@ -1,8 +1,8 @@
# GDPRScanner
-Scans Microsoft 365, Google Workspace, and local/network file systems for Danish
-CPR numbers and personal data (PII). Produces GDPR compliance reports and supports
-Article 30 record-keeping obligations.
+Scans Microsoft 365, Google Workspace, local/network file systems, and SFTP servers
+for Danish CPR numbers and personal data (PII). Produces GDPR compliance reports and
+supports Article 30 record-keeping obligations.
---
@ -32,7 +32,7 @@ an IDE with intelligent completion. The result is the author's work.
- **Folder path in results** — each email result shows its full folder path (e.g. `Inbox / Ansøgninger pædagog SFO`) in the card and in Excel export
- **Delete items** — flagged results can be deleted directly from the UI, individually or in bulk
- **CPR false-positive reduction** — strict CPR validation
-- **Excel export** — multi-tab `.xlsx` report with per-source breakdown, auto-filters, and URL hyperlinks. Columns include: Name, CPR Hits, Face count, GPS (✔ if GPS in EXIF), Special category, EXIF author, Folder, Account, Role, Disposition, Date Modified, Size (KB), URL. A dedicated **GPS locations** sheet lists all items with GPS coordinates including a Google Maps link. Separate tabs for Outlook (Exchange), OneDrive, SharePoint, Teams, Gmail, Google Drive, local folders, and SMB/network shares. Summary sheet shows counts by source and GPS item total. When M365, Google Workspace, and file scans run concurrently, all results are captured in the export — not just the last completed scan
+- **Excel export** — multi-tab `.xlsx` report with per-source breakdown, auto-filters, and URL hyperlinks. Columns include: Name, CPR Hits, Face count, GPS (✔ if GPS in EXIF), Special category, EXIF author, Folder, Account, Role, Disposition, Date Modified, Size (KB), URL. A dedicated **GPS locations** sheet lists all items with GPS coordinates including a Google Maps link. Separate tabs for Outlook (Exchange), OneDrive, SharePoint, Teams, Gmail, Google Drive, local folders, SMB/network shares, and SFTP. Summary sheet shows counts by source and GPS item total. When M365, Google Workspace, and file scans run concurrently, all results are captured in the export — not just the last completed scan
- **Progressive streaming** — results stream card-by-card via Server-Sent Events as the scan runs
- **Token auto-refresh** — expired tokens are detected and silently refreshed mid-scan without interrupting the UI
- **Incremental / resumable scans** — interrupted scans save a checkpoint; the next run resumes from where it stopped rather than starting over
@ -79,7 +79,7 @@ The sidebar sources panel lists all configured scan sources. Click **Sources** t
**Google Workspace tab** — Two authentication modes: **Workspace** (service account with domain-wide delegation — scans all users) and **Personal account** (OAuth 2.0 device-code flow — scans the signed-in account only). Once connected, per-source toggles control whether Gmail and/or Google Drive appear in the sidebar panel and are included in scans. See [GOOGLE_SETUP.md](docs/setup/GOOGLE_SETUP.md) for setup instructions.
-**File sources tab** — Add local folder paths or SMB/CIFS network shares with a name, path, and optional SMB credentials. Each saved source appears as a checkbox in the sidebar panel (local, SMB/network). Use the **Edit** button on each row to update credentials or rename a source without deleting it.
+**File sources tab** — Add local folder paths, SMB/CIFS network shares, or SFTP servers. A pill selector (Local / Network / SFTP) switches the form fields. SFTP sources require host, port, username, remote path, and auth type (password or private key). SSH private keys are uploaded via the UI, validated with paramiko, and stored in `~/.gdprscanner/sftp_keys/` with `600` permissions; passwords and passphrases are stored in the OS keychain. Each saved source appears as a checkbox in the sidebar panel. Use the **Edit** button on each row to update credentials or rename a source without deleting it.
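Storing an uploaded key with owner-only permissions could look roughly like the sketch below. It is a sketch under assumptions: `store_private_key` is a hypothetical helper, the UUID-based file name is only one reading of the `sftp_keys/<uuid>` layout, and the paramiko format validation mentioned above is left out. The mode check is meaningful on POSIX systems only.

```python
import os
import uuid
from pathlib import Path


def store_private_key(key_bytes: bytes, keys_dir: Path) -> Path:
    """Write `key_bytes` to a new file under `keys_dir` with mode 0o600."""
    keys_dir.mkdir(parents=True, exist_ok=True)
    key_path = keys_dir / uuid.uuid4().hex
    # O_EXCL refuses to overwrite an existing file; the 0o600 mode is applied
    # at creation so the key is never briefly readable by other users.
    fd = os.open(key_path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    with os.fdopen(fd, "wb") as fh:
        fh.write(key_bytes)
    # Set the mode explicitly as well, in case the creation mode was altered.
    os.chmod(key_path, 0o600)
    return key_path
```

The returned path is what a source definition would reference as its `key_path`.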
**Skipped automatically:** `.recycle`, `.sync`, `.btsync`, `.trash`, `.git`, `node_modules`, `System Volume Information`, and other system/sync folders. Hidden directories (`.` prefix) are skipped too.
@ -207,6 +207,11 @@ The **⬇ Excel** button exports all current results to a `.xlsx` file (`m365_sc
| OneDrive | Flagged OneDrive files |
| SharePoint | Flagged SharePoint files |
| Teams | Flagged Teams files |
| Gmail | Flagged Gmail messages |
| Google Drive | Flagged Google Drive files |
| Local | Flagged local-folder files |
| Network | Flagged SMB/NAS files |
| SFTP | Flagged SFTP server files |
In macOS app builds, the export opens a native Save dialog instead of a browser download.
@ -654,7 +659,7 @@ See [SUGGESTIONS.md](SUGGESTIONS.md) for the full feature roadmap with implement
| File | Description |
|---|---|
| `gdpr_scanner.py` | Flask entry point — scan orchestration, SSE route (`/api/scan/stream`), root route |
-| `scan_engine.py` | M365 and local/SMB scan logic — `run_scan()`, `run_file_scan()` |
+| `scan_engine.py` | M365 and local/SMB/SFTP scan logic — `run_scan()`, `run_file_scan()` |
| `app_config.py` | All persistence — profiles, settings, SMTP config, lang loading, Fernet encryption |
| `sse.py` | SSE broadcast queue and `_current_scan_id` |
| `checkpoint.py` | Mid-scan checkpoint save/load, `_checkpoint_key()` |
@ -664,6 +669,7 @@ See [SUGGESTIONS.md](SUGGESTIONS.md) for the full feature roadmap with implement
| `m365_connector.py` | Microsoft Graph API client — auth, token refresh, email/OneDrive/SharePoint/Teams fetchers, delete methods |
| `google_connector.py` | Google Workspace API client — Gmail, Drive, Admin SDK |
| `file_scanner.py` | Unified local + SMB/CIFS file iterator — `FileScanner.iter_files()` yields `(path, bytes, metadata)`. SMB reads use a 1-slot sliding-window `ThreadPoolExecutor` (`PREFETCH_WINDOW=1`) with a 60-second per-file timeout. `DEFAULT_EXTENSIONS` is imported from `cpr_detector.SUPPORTED_EXTS` (not a local hardcoded set) so the scannable extension list stays in sync automatically. |
| `sftp_connector.py` | SFTP file iterator — `SFTPScanner.iter_files()` yields the same `(path, bytes, metadata)` tuple as `FileScanner`. Uses paramiko (`AutoAddPolicy`); supports password auth and private-key auth (RSA / Ed25519 / ECDSA / DSS). Passwords and key passphrases are stored in the OS keychain; key files live in `~/.gdprscanner/sftp_keys/`. Gracefully degrades when paramiko is not installed (`SFTP_OK` flag). |
| `scan_scheduler.py` | In-process APScheduler wrapper — multi-job scheduled scan engine |
| `templates/index.html` | Single-page HTML shell — Jinja2 template. Two variables: `app_version`, `lang_json`. |
| `static/style.css` | All application CSS — custom properties, layout, components, light/dark themes |

@ -111,6 +111,14 @@ Optional session-level authentication gate for the main scanner interface. Set i
---
### SFTP as a 4th file connector 🔄 In progress
Scan SFTP servers (SSH File Transfer Protocol) alongside local, SMB, and cloud sources. A new `SFTPScanner` class in `sftp_connector.py` implements the same `iter_files()` interface as `FileScanner`, so `run_file_scan()` and everything downstream (SSE, DB, export, scheduling) is unchanged. Auth supports password and SSH private key (+ optional passphrase). Key files stored in `~/.gdprscanner/sftp_keys/`. SFTP sources appear in the file sources panel with a 🔒 icon, are profile-aware, and are included in scheduled scans automatically.
**Files changed:** `sftp_connector.py` (new), `scan_engine.py`, `routes/sources.py`, `app_config.py`, `static/js/sources.js`, `templates/index.html`, `lang/en|da|de.json`, `routes/export.py`, `requirements.txt`
**Size:** Medium · **Priority:** Medium
---
### #32 — Windowed mode for Profiles, Sources, and Settings ✗ Won't do
The workflow is sequential (configure → scan → review), not parallel — there is no realistic scenario where a modal and the results grid need to be open simultaneously. The Sources panel is already visible in the sidebar. Option A (the least-work path) still loads the full 3800-line JS stack twice. Closed.

@ -544,6 +544,8 @@ def _save_role_overrides(overrides: dict) -> None:
# ── File source settings (#8) ─────────────────────────────────────────────────
_FILE_SOURCES_PATH = _DATA_DIR / "file_sources.json"
_SFTP_KEYS_DIR = _DATA_DIR / "sftp_keys"
_SFTP_KEYS_DIR.mkdir(exist_ok=True)

def _load_file_sources() -> list:
@ -568,6 +570,32 @@ def _save_file_sources(sources: list) -> None:
    except Exception as e:
        logger.error("[file_sources] write failed: %s", e)


def _resolve_sftp_credentials(source: dict) -> dict:
    """Return a copy of source with password/passphrase resolved from keychain.

    Callers (run_file_scan, upload_key endpoint) should use this rather than
    reading keychain credentials themselves, so the lookup logic stays in one place.
    """
    try:
        from sftp_connector import get_sftp_password
    except ImportError:
        return source
    resolved = dict(source)
    keychain_key = source.get("keychain_key") or None
    host = source.get("sftp_host", "")
    user = source.get("sftp_user", "")
    if not resolved.get("sftp_password"):
        resolved["sftp_password"] = get_sftp_password(host, user, keychain_key)
    if not resolved.get("sftp_passphrase"):
        # Passphrase stored under a distinct account name
        passphrase_key = (keychain_key + ":passphrase") if keychain_key else None
        resolved["sftp_passphrase"] = get_sftp_password(host, user, passphrase_key)
    return resolved


# ── Viewer tokens ────────────────────────────────────────────────────────────
# Read-only viewer tokens allow sharing scan results with a DPO or compliance
# officer without exposing scan controls or credentials. Each token is a

@ -1,6 +1,6 @@
# GDPR Scanner — Brugermanual
-Version 1.6.20
+Version 1.6.25
---
@ -33,7 +33,7 @@ Når der er fundet elementer, kan du gennemgå dem, beslutte hvad der skal ske m
**Hvad scanneren gennemgår:**
- Microsoft 365: Exchange e-mail, OneDrive, SharePoint, Teams
- Google Workspace: Gmail, Google Drev
-- Lokale og netværksbaserede filmapper (herunder SMB/NAS-drev)
+- Lokale og netværksbaserede filmapper (herunder SMB/NAS-drev og SFTP-servere)
**Hvad den finder:**
- CPR-numre
@ -104,17 +104,33 @@ Fanen Google Workspace lader dig forbinde en Google Workspace-konto (tidligere G
| Gmail | Alle e-mails i den enkelte brugers indbakke og labels |
| Google Drev | Alle filer ejet af eller delt med den enkelte bruger |
-### 3.3 Lokale og netværksbaserede filer
+### 3.3 Lokale, netværksbaserede og SFTP-filkilder
-Fanen **Filkilder** viser de lokale mapper og netværksdrev, du har konfigureret.
+Fanen **Filkilder** viser de lokale mapper, netværksdrev og SFTP-servere, du har konfigureret.
**Sådan tilføjer du en ny filkilde:**
1. Indtast en **Betegnelse** — et navn du kan genkende (f.eks. "Skolens Fællesmappe").
-2. Indtast **Stien**:
-- Lokal mappe: `~/Dokumenter` eller `/Volumes/Drev`
-- Netværksdrev: `//nas-server/delt` eller `\\server\delt`
-3. Hvis det er et netværksdrev, udfyldes felterne **SMB-vært**, **Brugernavn** og **Adgangskode** automatisk. Adgangskoden gemmes sikkert i systemets nøglering.
-4. Klik på **Tilføj**.
+2. Vælg **kildetype** med pillerne øverst i formularen:
+**Lokal**
+- Indtast **Stien** til mappen: `~/Dokumenter` eller `/Volumes/Drev`.
+- Klik på **Tilføj**.
+**Netværk (SMB)**
+- Indtast **Stien** i UNC-format: `//nas-server/delt` eller `\\server\delt`.
+- Udfyld **SMB-vært**, **Brugernavn** og **Adgangskode**. Adgangskoden gemmes sikkert i systemets nøglering.
+- Klik på **Tilføj**.
+**SFTP**
+- Indtast **Vært** (værtsnavn eller IP-adresse på SSH/SFTP-serveren).
+- Indtast **Port** (standard 22).
+- Indtast **Brugernavn**.
+- Indtast **Fjernsti**, der skal scannes (f.eks. `/home/delt` eller `/`).
+- Vælg **Godkendelsestype**:
+  - **Adgangskode** — indtast adgangskoden. Den gemmes sikkert i systemets nøglering.
+  - **Privat nøgle** — klik på **Upload nøglefil** og vælg din SSH-privatnøgle (OpenSSH- eller PEM-format). Hvis nøglen er beskyttet med en adgangssætning, skal du indtaste den. Nøglefilen gemmes i scannerens datamappe med `600`-rettigheder.
+- Klik på **Tilføj**.
Du kan tilføje så mange filkilder, du har brug for. De vil fremgå som valgbare kilder i venstre panel, når du er klar til at scanne.
@ -192,7 +208,8 @@ Hvert fundet element vises som et kort. Her er forklaringen på mærker og label
| Teams | Fundet i en Teams-kanal |
| Gmail | Fundet i en Gmail-postkasse |
| Google Drev | Fundet i Google Drev |
-| Lokal / Netværk | Fundet på et filshare |
+| Lokal / Netværk | Fundet på et lokalt eller SMB-filshare |
+| 🔒 SFTP | Fundet på en SFTP-server |
### Risikoniveau
@ -352,7 +369,7 @@ Klik på **Profiler** for at åbne profil­administrations­panelet. Her kan du:
Klik på **Excel** i filterbjælken for at downloade de aktuelle resultater som en Excel-projektmappe. Projektmappen indeholder:
- Et oversigtsfaneblad med scanningsdato, antal elementer og kildefordeling.
-- Et separat faneblad for hver kildetype (Outlook, OneDrive, SharePoint, Teams, Gmail, Google Drive, Lokal, Netværk).
+- Et separat faneblad for hver kildetype (Outlook, OneDrive, SharePoint, Teams, Gmail, Google Drive, Lokal, Netværk, SFTP).
- Alle fundne elementer, herunder kilde, konto, CPR-antal, risikoniveau, delingsstatus og disposition.
Knapperne **Excel** og **Art.30** er altid tilgængelige — også efter genstart af programmet — og eksporterer resultaterne fra den seneste afsluttede scanningssession uden at kræve en ny scanning.
@ -556,7 +573,7 @@ Nej. CPR-numre fundet under en scanning gemmes kun som et antal (f.eks. "3 CPR-n
E-mails flyttes til brugerens **Slettet post**-mappe i Exchange — de slettes ikke permanent og kan gendannes af brugeren eller en administrator. Filer flyttes til **papirkurven** i den pågældende tjeneste (OneDrive, SharePoint, filsystem). Permanent sletning kræver en efterfølgende handling af brugeren eller administrator.
**Kan jeg scanne uden at forbinde til Microsoft 365?**
-Ja. Du kan scanne lokale og SMB-filshares uden nogen M365- eller Google-forbindelse. Åbn **Kilder**, gå til fanen **Filkilder**, og tilføj dine filstier.
+Ja. Du kan scanne lokale mapper, SMB/NAS-drev og SFTP-servere uden nogen M365- eller Google-forbindelse. Åbn **Kilder**, gå til fanen **Filkilder**, og tilføj dine filstier eller SFTP-serveroplysninger.
**Hvad er delta-scanning, og hvornår skal jeg bruge det?**
Delta-scanning bruger Microsoft Graphs ændringstokens (for M365) og Google Drive Changes API (for Google Workspace) til kun at hente elementer ændret siden den seneste scanning. Det er ideelt til regelmæssige (f.eks. ugentlige) compliance-tjek efter, at du har gennemført en fuld basisscan. Aktiver det i afsnittet Indstillinger i venstre panel.
@ -584,4 +601,4 @@ Ja. Brug **🔗 Del**-knappen til at oprette et skrivebeskyttet viewer-link elle
---
-*GDPR Scanner v1.6.20 — teknisk opsætning og konfiguration: se README.md*
+*GDPR Scanner v1.6.25 — teknisk opsætning og konfiguration: se README.md*

@ -1,6 +1,6 @@
# GDPR Scanner — User Manual
-Version 1.6.20
+Version 1.6.25
---
@ -33,7 +33,7 @@ When items are found, you can review them, decide what to do with each one (keep
**What it scans:**
- Microsoft 365: Exchange email, OneDrive, SharePoint, Teams
- Google Workspace: Gmail, Google Drive
-- Local and network file shares (including SMB/NAS drives)
+- Local and network file shares (including SMB/NAS drives and SFTP servers)
**What it finds:**
- CPR numbers (Danish civil registration numbers)
@ -104,17 +104,33 @@ The Google Workspace tab lets you connect a Google Workspace (formerly G Suite)
| Gmail | All emails in each user's inbox and labels |
| Google Drive | All files owned by or shared with each user |
-### 3.3 Local and Network File Shares
+### 3.3 Local, Network, and SFTP File Sources
-The **Filkilder** (File Sources) tab lists any local folders or network drives you have configured.
+The **Filkilder** (File Sources) tab lists any local folders, network drives, or SFTP servers you have configured.
**To add a new file source:**
1. Enter a **Label** — a friendly name you will recognise (e.g. "Skolens Fællesmappe").
-2. Enter the **Path**:
-- Local folder: `~/Documents` or `/Volumes/Share`
-- Network share: `//nas-server/shared` or `\\server\share`
-3. If it is a network share, fill in the **SMB Host**, **Username**, and **Password** that appear automatically. The password is stored securely in your system keychain.
-4. Click **Tilføj** (Add).
+2. Select the **source type** using the pill selector at the top of the form:
+**Local**
+- Enter the **Path** to the folder: `~/Documents` or `/Volumes/Share`.
+- Click **Tilføj** (Add).
+**Network (SMB)**
+- Enter the **Path** in UNC format: `//nas-server/shared` or `\\server\share`.
+- Fill in the **SMB Host**, **Username**, and **Password** that appear. The password is stored securely in your system keychain.
+- Click **Tilføj** (Add).
+**SFTP**
+- Enter the **Host** (hostname or IP address of the SSH/SFTP server).
+- Enter the **Port** (default 22).
+- Enter the **Username**.
+- Enter the **Remote path** to scan (e.g. `/home/shared` or `/`).
+- Choose the **Authentication type**:
+  - **Password** — enter the password. It is stored securely in your system keychain.
+  - **Private key** — click **Upload key file** and select your SSH private key (OpenSSH or PEM format). If the key is passphrase-protected, enter the passphrase. The key file is stored in the scanner's data directory with `600` permissions.
+- Click **Tilføj** (Add).
You can add as many file sources as you need. Each one will appear as a selectable source in the main sidebar when you are ready to scan.
@ -192,7 +208,8 @@ Each flagged item appears as a card. Here is what the badges and labels mean:
| Teams | Found in a Teams channel |
| Gmail | Found in a Gmail mailbox |
| Google Drive | Found in Google Drive |
-| Local / Network | Found on a file share |
+| Local / Network | Found on a local or SMB file share |
+| 🔒 SFTP | Found on an SFTP server |
### Risk level
@ -352,7 +369,7 @@ Click **Profiles** to open the profile management panel. Here you can:
Click **Excel** in the filter bar to download the current results as an Excel workbook. The workbook contains:
- A summary tab with scan date, item counts, and source breakdown.
-- A separate tab for each source type (Outlook, OneDrive, SharePoint, Teams, Gmail, Google Drive, Local, Network).
+- A separate tab for each source type (Outlook, OneDrive, SharePoint, Teams, Gmail, Google Drive, Local, Network, SFTP).
- Every flagged item, including source, account, CPR count, risk level, sharing status, and disposition.
The **Excel** and **Art.30** buttons are always available — even after restarting the application — and will export the results from the most recent completed scan session without requiring a new scan.
@ -556,7 +573,7 @@ No. CPR numbers found during a scan are stored only as a count (e.g. "3 CPR numb
Emails are moved to the user's **Deleted Items** folder in Exchange — they are not permanently deleted and can be recovered by the user or an administrator. Files are moved to the **recycle bin** of the relevant service (OneDrive, SharePoint, file system). A permanent deletion requires a second action by the user or admin.
**Can I scan without connecting to Microsoft 365?**
-Yes. You can scan local and SMB file shares without any M365 or Google connection. Open **Sources**, go to the **Filkilder** tab, and add your file paths.
+Yes. You can scan local folders, SMB/NAS drives, and SFTP servers without any M365 or Google connection. Open **Sources**, go to the **Filkilder** tab, and add your file paths or SFTP server details.
**What is delta scanning and when should I use it?**
Delta scanning uses Microsoft Graph change tokens (for M365) and the Google Drive Changes API (for Google Workspace) to fetch only items modified since the last scan. It is ideal for regular (e.g. weekly) compliance checks after you have done a full baseline scan. Enable it in the Options section of the sidebar.
@ -584,4 +601,4 @@ Yes. Use the **🔗 Share** button to create a read-only viewer link or set a Vi
--- ---
*GDPR Scanner v1.6.20 — for technical setup and configuration see README.md* *GDPR Scanner v1.6.25 — for technical setup and configuration see README.md*


@ -608,6 +608,25 @@
"m365_fsrc_saved": "Kilde gemt", "m365_fsrc_saved": "Kilde gemt",
"m365_fsrc_saving": "Gemmer...", "m365_fsrc_saving": "Gemmer...",
"m365_fsrc_path_required": "Sti er påkrævet.", "m365_fsrc_path_required": "Sti er påkrævet.",
"m365_fsrc_type_local": "Lokal mappe",
"m365_fsrc_type_smb": "Netværksdrev (SMB)",
"m365_fsrc_type_sftp": "SFTP-server",
"m365_fsrc_sftp_host": "SFTP-host",
"m365_fsrc_sftp_port": "Port",
"m365_fsrc_sftp_user": "Brugernavn",
"m365_fsrc_sftp_remote_path": "Fjernsti",
"m365_fsrc_sftp_auth_password": "Adgangskode",
"m365_fsrc_sftp_auth_key": "SSH-nøgle",
"m365_fsrc_sftp_pw": "Adgangskode",
"m365_fsrc_sftp_pw_hint": "Adgangskoden gemmes i OS-nøgleringe — aldrig i en fil.",
"m365_fsrc_sftp_key_upload": "Privat nøglefil",
"m365_fsrc_sftp_key_btn": "Upload nøgle",
"m365_fsrc_sftp_key_uploaded": "Nøgle uploadet",
"m365_fsrc_sftp_passphrase": "Adgangssætning (hvis nøglen er krypteret)",
"m365_fsrc_sftp_passphrase_hint": "Adgangssætningen gemmes i OS-nøgleringe — aldrig i en fil.",
"m365_fsrc_sftp_not_installed": "paramiko er ikke installeret — kør: pip install paramiko",
"m365_fsrc_sftp_host_required": "SFTP-host er påkrævet.",
"m365_fsrc_sftp_user_required": "SFTP-brugernavn er påkrævet.",
"m365_fsrc_scan_btn": "Scan", "m365_fsrc_scan_btn": "Scan",
"m365_fsrc_scan_start": "Starter filscanning", "m365_fsrc_scan_start": "Starter filscanning",
"m365_src_group_files": "Filkilder", "m365_src_group_files": "Filkilder",
@ -712,6 +731,9 @@
"m365_sched_editor_edit": "Rediger planlagt scanning", "m365_sched_editor_edit": "Rediger planlagt scanning",
"m365_sched_name_required": "Navn er påkrævet", "m365_sched_name_required": "Navn er påkrævet",
"m365_sched_no_runs": "Ingen planlagte kørsler endnu", "m365_sched_no_runs": "Ingen planlagte kørsler endnu",
"m365_sched_no_jobs": "Ingen planlagte scanninger endnu.",
"m365_sched_running": "Kører...",
"m365_sched_disabled": "Deaktiveret",
"m365_sched_freq_daily": "Dagligt", "m365_sched_freq_daily": "Dagligt",
"m365_sched_freq_weekly": "Ugentligt", "m365_sched_freq_weekly": "Ugentligt",
"m365_sched_freq_monthly": "Månedligt", "m365_sched_freq_monthly": "Månedligt",


@ -608,6 +608,25 @@
"m365_fsrc_saved": "Quelle gespeichert", "m365_fsrc_saved": "Quelle gespeichert",
"m365_fsrc_saving": "Speichern...", "m365_fsrc_saving": "Speichern...",
"m365_fsrc_path_required": "Pfad ist erforderlich.", "m365_fsrc_path_required": "Pfad ist erforderlich.",
"m365_fsrc_type_local": "Lokaler Ordner",
"m365_fsrc_type_smb": "Netzwerkfreigabe (SMB)",
"m365_fsrc_type_sftp": "SFTP-Server",
"m365_fsrc_sftp_host": "SFTP-Host",
"m365_fsrc_sftp_port": "Port",
"m365_fsrc_sftp_user": "Benutzername",
"m365_fsrc_sftp_remote_path": "Remote-Pfad",
"m365_fsrc_sftp_auth_password": "Passwort",
"m365_fsrc_sftp_auth_key": "SSH-Schlüssel",
"m365_fsrc_sftp_pw": "Passwort",
"m365_fsrc_sftp_pw_hint": "Passwort wird im OS-Schlüsselbund gespeichert — nie in einer Datei.",
"m365_fsrc_sftp_key_upload": "Private Schlüsseldatei",
"m365_fsrc_sftp_key_btn": "Schlüssel hochladen",
"m365_fsrc_sftp_key_uploaded": "Schlüssel hochgeladen",
"m365_fsrc_sftp_passphrase": "Passphrase (wenn Schlüssel verschlüsselt ist)",
"m365_fsrc_sftp_passphrase_hint": "Passphrase wird im OS-Schlüsselbund gespeichert — nie in einer Datei.",
"m365_fsrc_sftp_not_installed": "paramiko nicht installiert — ausführen: pip install paramiko",
"m365_fsrc_sftp_host_required": "SFTP-Host ist erforderlich.",
"m365_fsrc_sftp_user_required": "SFTP-Benutzername ist erforderlich.",
"m365_fsrc_scan_btn": "Scannen", "m365_fsrc_scan_btn": "Scannen",
"m365_fsrc_scan_start": "Datei-Scan wird gestartet", "m365_fsrc_scan_start": "Datei-Scan wird gestartet",
"m365_src_group_files": "Dateiquellen", "m365_src_group_files": "Dateiquellen",
@ -712,6 +731,9 @@
"m365_sched_editor_edit": "Geplante Suche bearbeiten", "m365_sched_editor_edit": "Geplante Suche bearbeiten",
"m365_sched_name_required": "Name ist erforderlich", "m365_sched_name_required": "Name ist erforderlich",
"m365_sched_no_runs": "Noch keine geplanten Läufe", "m365_sched_no_runs": "Noch keine geplanten Läufe",
"m365_sched_no_jobs": "Noch keine geplanten Scans.",
"m365_sched_running": "Läuft...",
"m365_sched_disabled": "Deaktiviert",
"m365_sched_freq_daily": "Täglich", "m365_sched_freq_daily": "Täglich",
"m365_sched_freq_weekly": "Wöchentlich", "m365_sched_freq_weekly": "Wöchentlich",
"m365_sched_freq_monthly": "Monatlich", "m365_sched_freq_monthly": "Monatlich",


@ -608,6 +608,25 @@
"m365_fsrc_saved": "Source saved", "m365_fsrc_saved": "Source saved",
"m365_fsrc_saving": "Saving...", "m365_fsrc_saving": "Saving...",
"m365_fsrc_path_required": "Path is required.", "m365_fsrc_path_required": "Path is required.",
"m365_fsrc_type_local": "Local folder",
"m365_fsrc_type_smb": "Network share (SMB)",
"m365_fsrc_type_sftp": "SFTP server",
"m365_fsrc_sftp_host": "SFTP host",
"m365_fsrc_sftp_port": "Port",
"m365_fsrc_sftp_user": "Username",
"m365_fsrc_sftp_remote_path": "Remote path",
"m365_fsrc_sftp_auth_password": "Password",
"m365_fsrc_sftp_auth_key": "SSH key",
"m365_fsrc_sftp_pw": "Password",
"m365_fsrc_sftp_pw_hint": "Password is saved to the OS keychain — never stored in a file.",
"m365_fsrc_sftp_key_upload": "Private key file",
"m365_fsrc_sftp_key_btn": "Upload key",
"m365_fsrc_sftp_key_uploaded": "Key uploaded",
"m365_fsrc_sftp_passphrase": "Passphrase (if key is encrypted)",
"m365_fsrc_sftp_passphrase_hint": "Passphrase is saved to the OS keychain — never stored in a file.",
"m365_fsrc_sftp_not_installed": "paramiko not installed — run: pip install paramiko",
"m365_fsrc_sftp_host_required": "SFTP host is required.",
"m365_fsrc_sftp_user_required": "SFTP username is required.",
"m365_fsrc_scan_btn": "Scan", "m365_fsrc_scan_btn": "Scan",
"m365_fsrc_scan_start": "Starting file scan", "m365_fsrc_scan_start": "Starting file scan",
"m365_src_group_files": "File sources", "m365_src_group_files": "File sources",
@ -712,6 +731,9 @@
"m365_sched_editor_edit": "Edit scheduled scan", "m365_sched_editor_edit": "Edit scheduled scan",
"m365_sched_name_required": "Name is required", "m365_sched_name_required": "Name is required",
"m365_sched_no_runs": "No scheduled runs yet", "m365_sched_no_runs": "No scheduled runs yet",
"m365_sched_no_jobs": "No scheduled scans yet.",
"m365_sched_running": "Running...",
"m365_sched_disabled": "Disabled",
"m365_sched_freq_daily": "Daily", "m365_sched_freq_daily": "Daily",
"m365_sched_freq_weekly": "Weekly", "m365_sched_freq_weekly": "Weekly",
"m365_sched_freq_monthly": "Monthly", "m365_sched_freq_monthly": "Monthly",


@ -37,7 +37,8 @@ pystray>=0.19 # System tray icon
# ── File system scanning (optional) ────────────────────────────────────────── # ── File system scanning (optional) ──────────────────────────────────────────
smbprotocol>=1.13 # SMB2/3 network share scanning without mounting smbprotocol>=1.13 # SMB2/3 network share scanning without mounting
keyring>=25.0 # OS keychain credential storage for SMB passwords paramiko>=3.4 # SFTP scanning over SSH
keyring>=25.0 # OS keychain credential storage for SMB/SFTP passwords
python-dotenv>=1.0 # .env file fallback for headless SMB credentials python-dotenv>=1.0 # .env file fallback for headless SMB credentials
# ── Scheduler (#19) ────────────────────────────────────────────────────────── # ── Scheduler (#19) ──────────────────────────────────────────────────────────


@ -44,6 +44,7 @@ def _build_excel_bytes(role: str = "") -> tuple[bytes, str]:
"gdrive": ("💾 Google Drive", "D5F5E3"), "gdrive": ("💾 Google Drive", "D5F5E3"),
"local": ("📁 Local", "E6F7E6"), "local": ("📁 Local", "E6F7E6"),
"smb": ("🌐 Network", "E0F0FA"), "smb": ("🌐 Network", "E0F0FA"),
"sftp": ("🔒 SFTP", "EDE9F7"),
} }
COLS = [ COLS = [
("Name / Subject", 45), ("Name / Subject", 45),
@ -403,6 +404,7 @@ def _build_article30_docx(role: str = "") -> tuple[bytes, str]:
"gdrive": "Google Drive", "gdrive": "Google Drive",
"local": "Local files", "local": "Local files",
"smb": "Network / SMB", "smb": "Network / SMB",
"sftp": "SFTP",
} }
# ── Colour palette ──────────────────────────────────────────────────────── # ── Colour palette ────────────────────────────────────────────────────────
@ -597,7 +599,7 @@ def _build_article30_docx(role: str = "") -> tuple[bytes, str]:
r = p.add_run(txt); r.bold = True r = p.add_run(txt); r.bold = True
r.font.size = Pt(10); r.font.color.rgb = WHITE r.font.size = Pt(10); r.font.color.rgb = WHITE
for src_key in ("email", "onedrive", "sharepoint", "teams", "gmail", "gdrive", "local", "smb"): for src_key in ("email", "onedrive", "sharepoint", "teams", "gmail", "gdrive", "local", "smb", "sftp"):
if src_key not in scanned_sources: if src_key not in scanned_sources:
continue continue
src_items = by_source.get(src_key, []) src_items = by_source.get(src_key, [])


@ -3,9 +3,11 @@ File sources and file scan
""" """
from __future__ import annotations from __future__ import annotations
import threading import threading
import uuid as _uuid
from pathlib import Path
from flask import Blueprint, jsonify, request from flask import Blueprint, jsonify, request
from routes import state from routes import state
from app_config import _load_file_sources, _save_file_sources from app_config import _load_file_sources, _save_file_sources, _SFTP_KEYS_DIR
try: try:
from file_scanner import store_smb_password, SMB_OK as _SMB_OK from file_scanner import store_smb_password, SMB_OK as _SMB_OK
@ -15,6 +17,12 @@ except ImportError:
_SMB_OK = False _SMB_OK = False
def store_smb_password(*a, **kw): return False # type: ignore[misc] def store_smb_password(*a, **kw): return False # type: ignore[misc]
try:
from sftp_connector import store_sftp_password, SFTP_OK as _SFTP_OK
except ImportError:
_SFTP_OK = False
def store_sftp_password(*a, **kw): return False # type: ignore[misc]
bp = Blueprint("sources", __name__) bp = Blueprint("sources", __name__)
@ -23,20 +31,31 @@ def file_sources_list():
"""Return all saved file source definitions.""" """Return all saved file source definitions."""
sources = _load_file_sources() sources = _load_file_sources()
return jsonify({ return jsonify({
"sources": sources, "sources": sources,
"smb_available": _SMB_OK, "smb_available": _SMB_OK,
"scanner_ok": _FILE_SCANNER_OK, "sftp_available": _SFTP_OK,
"scanner_ok": _FILE_SCANNER_OK,
}) })
@bp.route("/api/file_sources/save", methods=["POST"]) @bp.route("/api/file_sources/save", methods=["POST"])
def file_sources_save(): def file_sources_save():
"""Add or update a file source. Assigns a UUID if id is missing.""" """Add or update a file source. Assigns a UUID if id is missing."""
import uuid as _uuid
data = request.get_json() or {} data = request.get_json() or {}
path = data.get("path", "").strip() source_type = data.get("source_type", "")
if not path:
return jsonify({"error": "path required"}), 400 # Validate required fields per source type
if source_type == "sftp":
if not data.get("sftp_host", "").strip():
return jsonify({"error": "sftp_host required"}), 400
if not data.get("sftp_user", "").strip():
return jsonify({"error": "sftp_user required"}), 400
if not data.get("path", "").strip():
data["path"] = "/"
else:
if not data.get("path", "").strip():
return jsonify({"error": "path required"}), 400
sources = _load_file_sources() sources = _load_file_sources()
uid = data.get("id") or "" uid = data.get("id") or ""
for i, s in enumerate(sources): for i, s in enumerate(sources):
@ -52,41 +71,116 @@ def file_sources_save():
@bp.route("/api/file_sources/delete", methods=["POST"]) @bp.route("/api/file_sources/delete", methods=["POST"])
def file_sources_delete(): def file_sources_delete():
"""Remove a file source by id.""" """Remove a file source by id. Also deletes any associated SFTP key file."""
uid = (request.get_json() or {}).get("id", "") uid = (request.get_json() or {}).get("id", "")
if not uid: if not uid:
return jsonify({"error": "id required"}), 400 return jsonify({"error": "id required"}), 400
sources = [s for s in _load_file_sources() if s.get("id") != uid] sources = _load_file_sources()
deleted = next((s for s in sources if s.get("id") == uid), None)
sources = [s for s in sources if s.get("id") != uid]
_save_file_sources(sources) _save_file_sources(sources)
# Clean up key file if this was an SFTP key-auth source
if deleted and deleted.get("sftp_key_path"):
key_file = Path(deleted["sftp_key_path"])
if key_file.parent == _SFTP_KEYS_DIR and key_file.exists():
try:
key_file.unlink()
except OSError:
pass
return jsonify({"ok": True}) return jsonify({"ok": True})
@bp.route("/api/file_sources/store_creds", methods=["POST"]) @bp.route("/api/file_sources/store_creds", methods=["POST"])
def file_sources_store_creds(): def file_sources_store_creds():
"""Store SMB password in the OS keychain.""" """Store SMB or SFTP password/passphrase in the OS keychain."""
if not _FILE_SCANNER_OK: data = request.get_json() or {}
return jsonify({"error": "file_scanner not available"}), 503 source_type = data.get("source_type", "smb")
data = request.get_json() or {} password = data.get("password", "")
smb_host = data.get("smb_host", "")
smb_user = data.get("smb_user", "") if source_type == "sftp":
password = data.get("password", "") if not _SFTP_OK:
key = data.get("keychain_key") or smb_user return jsonify({"error": "paramiko not installed — run: pip install paramiko"}), 503
if not smb_user or not password: host = data.get("sftp_host", "")
return jsonify({"error": "smb_user and password required"}), 400 user = data.get("sftp_user", "")
ok = store_smb_password(smb_host, smb_user, password, key) if not user or not password:
if ok: return jsonify({"error": "sftp_user and password required"}), 400
return jsonify({"ok": True, "keychain_key": key}) key = data.get("keychain_key") or f"sftp:{user}@{host}"
return jsonify({"error": "keyring not available — install: pip install keyring"}), 500 ok = store_sftp_password(host, user, password, key)
if ok:
return jsonify({"ok": True, "keychain_key": key})
return jsonify({"error": "keyring not available — install: pip install keyring"}), 500
else:
if not _FILE_SCANNER_OK:
return jsonify({"error": "file_scanner not available"}), 503
smb_host = data.get("smb_host", "")
smb_user = data.get("smb_user", "")
if not smb_user or not password:
return jsonify({"error": "smb_user and password required"}), 400
key = data.get("keychain_key") or smb_user
ok = store_smb_password(smb_host, smb_user, password, key)
if ok:
return jsonify({"ok": True, "keychain_key": key})
return jsonify({"error": "keyring not available — install: pip install keyring"}), 500
@bp.route("/api/file_sources/upload_key", methods=["POST"])
def file_sources_upload_key():
"""Accept an SSH private key file upload and store it in the SFTP keys directory.
Validates the file is a recognised private key format before saving.
Returns {"key_id": uuid, "key_path": absolute_path}.
"""
if not _SFTP_OK:
return jsonify({"error": "paramiko not installed — run: pip install paramiko"}), 503
if "key_file" not in request.files:
return jsonify({"error": "key_file required"}), 400
file = request.files["key_file"]
raw = file.read(65536) # 64 KB is more than enough for any private key
# Validate before saving — try loading the key material with paramiko
import io
import paramiko
loaded = False
for cls in (paramiko.RSAKey, paramiko.Ed25519Key, paramiko.ECDSAKey, paramiko.DSSKey):
try:
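# from_private_key() reads text lines, so decode the uploaded bytes first
cls.from_private_key(io.StringIO(raw.decode("utf-8", errors="replace")))
loaded = True
break
except Exception:  # SSHException (wrong key type, encrypted key) or any parse error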
continue
if not loaded:
# Might be passphrase-protected — still accept it; validation will happen at connect time
if b"-----BEGIN" not in raw and b"OPENSSH PRIVATE KEY" not in raw:
return jsonify({"error": "File does not appear to be a private key"}), 400
key_id = str(_uuid.uuid4())
key_path = _SFTP_KEYS_DIR / key_id
key_path.write_bytes(raw)
key_path.chmod(0o600)
return jsonify({"ok": True, "key_id": key_id, "key_path": str(key_path)})
@bp.route("/api/file_scan/start", methods=["POST"]) @bp.route("/api/file_scan/start", methods=["POST"])
def file_scan_start(): def file_scan_start():
"""Start a file system scan for a single file source.""" """Start a file system scan for a single file source (local, SMB, or SFTP)."""
if not _FILE_SCANNER_OK: source = request.get_json() or {}
source_type = source.get("source_type", "")
if source_type == "sftp":
if not _SFTP_OK:
return jsonify({"error": "paramiko not installed — run: pip install paramiko"}), 503
elif not _FILE_SCANNER_OK:
return jsonify({"error": "file_scanner not available"}), 503 return jsonify({"error": "file_scanner not available"}), 503
if not state._scan_lock.acquire(blocking=False): if not state._scan_lock.acquire(blocking=False):
return jsonify({"error": "scan already running"}), 409 return jsonify({"error": "scan already running"}), 409
source = request.get_json() or {}
state._scan_abort.clear() state._scan_abort.clear()
def _run(): def _run():

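The per-type checks in `file_sources_save()` above can be restated as a small pure function, which makes the rules easy to exercise outside Flask. This is a minimal sketch rather than the project's code: `validate_source` is a hypothetical name, and it returns an error string where the route returns a JSON response with status 400.

```python
from __future__ import annotations


def validate_source(data: dict) -> tuple[dict, str | None]:
    """Return (normalised_copy, error_message_or_None) for a source definition."""
    out = dict(data)  # work on a copy so the caller's dict is left untouched
    if out.get("source_type", "") == "sftp":
        # SFTP sources need a host and a user; the remote path is optional
        if not out.get("sftp_host", "").strip():
            return out, "sftp_host required"
        if not out.get("sftp_user", "").strip():
            return out, "sftp_user required"
        if not out.get("path", "").strip():
            out["path"] = "/"  # an empty remote path falls back to the server root
    elif not out.get("path", "").strip():
        # Local and SMB sources are identified by their path alone
        return out, "path required"
    return out, None
```

Calling `validate_source({"source_type": "sftp", "sftp_host": "h", "sftp_user": "u"})` yields a copy with `path` set to `"/"` and no error, while a local source without a path yields `"path required"`.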

@ -75,6 +75,12 @@ except ImportError:
FileScanner = None # type: ignore[assignment,misc] FileScanner = None # type: ignore[assignment,misc]
FILE_SCANNER_OK = False FILE_SCANNER_OK = False
try:
from sftp_connector import SFTPScanner, SFTP_OK as _SFTP_OK
except ImportError:
SFTPScanner = None # type: ignore[assignment,misc]
_SFTP_OK = False
try: try:
import document_scanner as ds import document_scanner as ds
SCANNER_OK = True SCANNER_OK = True
@ -151,18 +157,21 @@ def _with_disposition(card: dict, db) -> dict:
def run_file_scan(source: dict): def run_file_scan(source: dict):
"""Scan a single local or SMB file source for CPR numbers and PII. """Scan a single local, SMB, or SFTP file source for CPR numbers and PII.
Reuses _scan_bytes, _broadcast_card, _check_special_category, Reuses _scan_bytes, _broadcast_card, _check_special_category,
_detect_photo_faces and all other existing scan helpers. _detect_photo_faces and all other existing scan helpers.
Args: Args:
source: file source dict with keys: source: file source dict with keys:
path, label, smb_host, smb_user, smb_domain, keychain_key, source_type ("local"|"smb"|"sftp"), path, label,
smb_host, smb_user, smb_domain, keychain_key,
sftp_host, sftp_port, sftp_user, sftp_auth, sftp_key_path,
scan_photos (bool), max_file_mb (int) scan_photos (bool), max_file_mb (int)
""" """
# state vars accessed via _state module # state vars accessed via _state module
source_kind = source.get("source_type", "")
path = source.get("path", "") path = source.get("path", "")
label = source.get("label") or path label = source.get("label") or path
smb_host = source.get("smb_host") or None smb_host = source.get("smb_host") or None
@ -175,7 +184,11 @@ def run_file_scan(source: dict):
min_cpr_count = max(1, int(source.get("min_cpr_count", 1))) min_cpr_count = max(1, int(source.get("min_cpr_count", 1)))
max_mb = int(source.get("max_file_mb", 50)) max_mb = int(source.get("max_file_mb", 50))
if not FILE_SCANNER_OK: if source_kind == "sftp":
if not _SFTP_OK:
broadcast("scan_error", {"file": label, "error": "paramiko not installed — run: pip install paramiko"})
return
elif not FILE_SCANNER_OK:
broadcast("scan_error", {"file": label, "error": "file_scanner.py not found"}) broadcast("scan_error", {"file": label, "error": "file_scanner.py not found"})
return return
@ -200,15 +213,30 @@ def run_file_scan(source: dict):
broadcast("scan_phase", {"phase": f"Files \u2014 {label}"}) broadcast("scan_phase", {"phase": f"Files \u2014 {label}"})
try: try:
fs = FileScanner( if source_kind == "sftp":
path=path, fs = SFTPScanner(
smb_host=smb_host, host=source.get("sftp_host", ""),
smb_user=smb_user, root_path=path,
smb_password=smb_password, username=source.get("sftp_user", ""),
smb_domain=smb_domain, port=int(source.get("sftp_port", 22)),
keychain_key=keychain_key, auth_type=source.get("sftp_auth", "password"),
max_file_bytes=max_mb * 1_048_576, password=source.get("sftp_password") or None,
) key_path=source.get("sftp_key_path") or None,
passphrase=source.get("sftp_passphrase") or None,
keychain_key=keychain_key,
max_file_bytes=max_mb * 1_048_576,
label=label,
)
else:
fs = FileScanner(
path=path,
smb_host=smb_host,
smb_user=smb_user,
smb_password=smb_password,
smb_domain=smb_domain,
keychain_key=keychain_key,
max_file_bytes=max_mb * 1_048_576,
)
def _progress(rel_path: str): def _progress(rel_path: str):
broadcast("scan_file", {"file": rel_path}) broadcast("scan_file", {"file": rel_path})

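The branch in `run_file_scan()` above only chooses which scanner to construct; everything after it relies on both classes yielding `(relative_path, content, metadata)` tuples, with `content=None` and `meta['skipped']=True` for files that were not read. A rough sketch of a consumer written against that contract follows. `StubScanner` and `summarise` are hypothetical names used for illustration and do not exist in the codebase.

```python
from __future__ import annotations

from typing import Iterator


class StubScanner:
    """Hypothetical stand-in with the same iter_files() shape as the real scanners."""

    source_type = "stub"

    def iter_files(
        self, extensions=None, progress_cb=None
    ) -> Iterator[tuple[str, bytes | None, dict]]:
        yield "a.txt", b"hello", {"skipped": False}
        # Oversized or unreadable files arrive as a sentinel without content
        yield "big.pdf", None, {"skipped": True}


def summarise(scanner) -> tuple[list[tuple[str, int]], list[str]]:
    """Split a scanner's output into (scanned files with byte counts, skipped paths)."""
    scanned: list[tuple[str, int]] = []
    skipped: list[str] = []
    for rel_path, content, meta in scanner.iter_files():
        if content is None or meta.get("skipped"):
            skipped.append(rel_path)
            continue
        scanned.append((rel_path, len(content)))
    return scanned, skipped
```

Because the consumer only touches `iter_files()`, swapping in an SFTP-backed scanner needs no changes on the consuming side, which is the design choice the commit leans on.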
sftp_connector.py Normal file

@ -0,0 +1,245 @@
"""
sftp_connector.py: SFTP file iterator for GDPR Scanner.

Provides SFTPScanner.iter_files() which yields (relative_path, bytes, metadata)
for files on an SFTP/SSH server, using the same interface as FileScanner so that
run_file_scan() in scan_engine.py works identically for all three source types.

Optional dependency:
    paramiko>=3.4    SSH/SFTP client (pip install paramiko)

If paramiko is not installed, SFTP_OK is False and callers must check before use.
"""
from __future__ import annotations

import stat
import time
from pathlib import PurePosixPath
from typing import Iterator

from file_scanner import SKIP_DIRS, MAX_FILE_BYTES, _skip, _error, KEYCHAIN_SERVICE

# ── Optional dependency ───────────────────────────────────────────────────────
try:
    import paramiko
    SFTP_OK = True
except ImportError:
    SFTP_OK = False

try:
    import keyring as _keyring
    _KEYRING_OK = True
except ImportError:
    _KEYRING_OK = False


# ── Credential helpers ────────────────────────────────────────────────────────
def get_sftp_password(host: str, user: str, keychain_key: str | None = None) -> str | None:
    """Return SFTP password or key passphrase from OS keychain."""
    if not _KEYRING_OK:
        return None
    account = keychain_key or f"sftp:{user}@{host}"
    try:
        return _keyring.get_password(KEYCHAIN_SERVICE, account) or None
    except Exception:
        return None


def store_sftp_password(host: str, user: str, password: str,
                        keychain_key: str | None = None) -> bool:
    """Store SFTP password or passphrase in the OS keychain. Returns True on success."""
    if not _KEYRING_OK:
        return False
    account = keychain_key or f"sftp:{user}@{host}"
    try:
        _keyring.set_password(KEYCHAIN_SERVICE, account, password)
        return True
    except Exception:
        return False


# ── SFTPScanner ───────────────────────────────────────────────────────────────
class SFTPScanner:
    """SFTP file iterator — identical iter_files() interface to FileScanner."""

    def __init__(
        self,
        host: str,
        root_path: str,
        username: str,
        port: int = 22,
        auth_type: str = "password",  # "password" | "key"
        password: str | None = None,
        key_path: str | None = None,
        passphrase: str | None = None,
        keychain_key: str | None = None,
        max_file_bytes: int = MAX_FILE_BYTES,
        label: str = "",
    ):
        self.host = host
        self.port = port
        self.root_path = root_path.rstrip("/") or "/"
        self.username = username
        self.auth_type = auth_type
        self.key_path = key_path
        self.keychain_key = keychain_key
        self.max_file_bytes = max_file_bytes
        self.label = label or f"{username}@{host}"

        # Resolve credentials from keychain if not provided directly
        self._password = password
        self._passphrase = passphrase
        if not self._password and auth_type == "password":
            self._password = get_sftp_password(host, username, keychain_key)
        if not self._passphrase and auth_type == "key" and key_path:
            self._passphrase = get_sftp_password(host, username, keychain_key)

    @staticmethod
    def sftp_available() -> bool:
        return SFTP_OK

    @property
    def source_type(self) -> str:
        return "sftp"

    # ── Public ────────────────────────────────────────────────────────────────
    def iter_files(
        self,
        extensions: set[str] | None = None,
        progress_cb=None,
    ) -> Iterator[tuple[str, bytes | None, dict]]:
        """Yield (relative_path, content_bytes, metadata) for every scannable file.

        Same contract as FileScanner.iter_files(): oversized and unreadable files
        yield a sentinel with content=None and meta['skipped']=True.
        """
        if not SFTP_OK:
            raise RuntimeError("paramiko not installed — run: pip install paramiko")

        from cpr_detector import SUPPORTED_EXTS as DEFAULT_EXTENSIONS
        exts = extensions or DEFAULT_EXTENSIONS

        ssh = paramiko.SSHClient()
        ssh.set_missing_host_key_policy(paramiko.AutoAddPolicy())
        connect_kwargs: dict = {
            "hostname": self.host,
            "port": self.port,
            "username": self.username,
            "timeout": 30,
        }
        if self.auth_type == "key" and self.key_path:
            pkey = _load_pkey(self.key_path, self._passphrase)
            connect_kwargs["pkey"] = pkey
        else:
            connect_kwargs["password"] = self._password or ""
            # Disable agent and key lookup when using password so paramiko doesn't
            # prompt interactively when the server advertises pubkey auth.
            connect_kwargs["look_for_keys"] = False
            connect_kwargs["allow_agent"] = False

        ssh.connect(**connect_kwargs)
        try:
            sftp = ssh.open_sftp()
            try:
                yield from self._walk(sftp, self.root_path, exts, progress_cb)
            finally:
                sftp.close()
        finally:
            ssh.close()

    # ── Private walker ────────────────────────────────────────────────────────
    def _walk(
        self,
        sftp,
        directory: str,
        exts: set[str],
        progress_cb,
    ) -> Iterator[tuple[str, bytes | None, dict]]:
        source_root = f"sftp://{self.username}@{self.host}{self.root_path}"
        try:
            entries = sftp.listdir_attr(directory)
        except OSError as e:
            rel = _rel(directory, self.root_path) or "."
            yield _error(rel, str(e), "sftp", source_root)
            return

        for attr in entries:
            name = attr.filename
            if name.startswith("."):
                continue
            if name.lower() in SKIP_DIRS:
                continue

            full_remote = f"{directory}/{name}".replace("//", "/")
            rel = _rel(full_remote, self.root_path)

            if attr.st_mode is not None and stat.S_ISDIR(attr.st_mode):
                yield from self._walk(sftp, full_remote, exts, progress_cb)
                continue

            ext = PurePosixPath(name).suffix.lower()
            if ext not in exts:
                continue

            size = attr.st_size or 0
            if size > self.max_file_bytes:
                yield _skip(rel, size, "sftp", source_root)
                continue

            if progress_cb:
                progress_cb(rel)

            modified = (
                time.strftime("%Y-%m-%d", time.gmtime(attr.st_mtime))
                if attr.st_mtime else ""
            )
            meta = {
                "size_kb": round(size / 1024, 1),
                "modified": modified,
                "source_type": "sftp",
                "source_root": source_root,
                "full_path": full_remote,
                "skipped": False,
            }
            try:
                with sftp.open(full_remote, "rb") as fh:
                    content = fh.read(self.max_file_bytes)
                yield rel, content, meta
            except OSError as e:
                yield _error(rel, str(e), "sftp", source_root)


# ── Helpers ───────────────────────────────────────────────────────────────────
def _rel(full_path: str, root: str) -> str:
    """Return path relative to root, stripping leading slash."""
    if full_path.startswith(root):
        return full_path[len(root):].lstrip("/")
    return full_path.lstrip("/")


def _load_pkey(key_path: str, passphrase: str | None):
    """Load a private key from disk, trying RSA → Ed25519 → ECDSA → DSS."""
    for cls in (
        paramiko.RSAKey,
        paramiko.Ed25519Key,
        paramiko.ECDSAKey,
        paramiko.DSSKey,
    ):
        try:
            return cls.from_private_key_file(key_path, password=passphrase)
        except paramiko.ssh_exception.SSHException:
            continue
        except FileNotFoundError:
            raise
    raise ValueError(f"Unrecognised private key format: {key_path}")

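The filtering done by `SFTPScanner._walk()` (hidden entries skipped, directories recursed, extensions matched case-insensitively, oversized files flagged) can be tried without a server by feeding the same logic an in-memory directory listing. The sketch below is a simplified re-implementation under stated assumptions: `FakeSFTP` and `walk` are hypothetical names, the `SKIP_DIRS` check is left out, and instead of reading file contents it yields only the relative path and an "oversized" flag.

```python
from __future__ import annotations

import stat
from pathlib import PurePosixPath
from types import SimpleNamespace
from typing import Iterator

DIR = stat.S_IFDIR | 0o755   # mode bits for a directory entry
FILE = stat.S_IFREG | 0o644  # mode bits for a regular file


class FakeSFTP:
    """Hypothetical test double that only mimics listdir_attr()."""

    def __init__(self, tree: dict[str, list[tuple[str, int, int]]]):
        self.tree = tree  # {directory: [(name, mode, size), ...]}

    def listdir_attr(self, directory: str):
        return [
            SimpleNamespace(filename=n, st_mode=m, st_size=s)
            for n, m, s in self.tree[directory]
        ]


def rel(full_path: str, root: str) -> str:
    """Path relative to root, mirroring the _rel() helper in the diff."""
    if full_path.startswith(root):
        return full_path[len(root):].lstrip("/")
    return full_path.lstrip("/")


def walk(sftp, directory: str, root: str, exts: set[str],
         max_bytes: int) -> Iterator[tuple[str, bool]]:
    """Yield (relative_path, oversized) for every file that passes the filters."""
    for attr in sftp.listdir_attr(directory):
        name = attr.filename
        if name.startswith("."):
            continue  # hidden files and directories are never visited
        full = f"{directory}/{name}".replace("//", "/")
        if stat.S_ISDIR(attr.st_mode):
            yield from walk(sftp, full, root, exts, max_bytes)
            continue
        if PurePosixPath(name).suffix.lower() not in exts:
            continue  # extension match is case-insensitive
        yield rel(full, root), attr.st_size > max_bytes


TREE = {
    "/data": [("hr", DIR, 0), (".ssh", DIR, 0),
              ("notes.TXT", FILE, 10), ("app.bin", FILE, 10)],
    "/data/hr": [("payroll.pdf", FILE, 5000)],
}
RESULT = list(walk(FakeSFTP(TREE), "/data", "/data", {".txt", ".pdf"}, 1000))
```

With this listing, `RESULT` contains `hr/payroll.pdf` flagged as oversized and `notes.TXT` as readable; `.ssh` is never listed and `app.bin` is dropped by the extension filter.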

@ -31,6 +31,7 @@ Never revert to `!!window._googleConnected` / `_fileSources.length > 0` — thos
## Gotchas ## Gotchas
- **`scheduler.js` strings must use `t()`** — frequency labels (`m365_sched_freq_daily/weekly/monthly`), "Next" (`m365_sched_next`), "Running..." (`m365_sched_running`), "Disabled" (`m365_sched_disabled`), empty-job text (`m365_sched_no_jobs`), and empty-history text (`m365_sched_no_runs`) all have translation keys. Do not hard-code English strings in `schedLoad()`, `schedRenderJobs()`, or `schedLoadHistory()`.
- **Profile editor accounts** — default to unchecked. Only explicitly saved `user_ids` are checked. - **Profile editor accounts** — default to unchecked. Only explicitly saved `user_ids` are checked.
- **Date presets** — stored as `years * 365` (integer days). Do not use `* 365.25`. - **Date presets** — stored as `years * 365` (integer days). Do not use `* 365.25`.
- **`copyTokenLink` is async** — called from `onclick` attributes as a fire-and-forget (the Promise is unhandled, which is fine). It `await`s `_getShareBaseUrl()` to get the machine's LAN IP before building the URL. Do not make it synchronous or revert to `window.location.origin` directly. - **`copyTokenLink` is async** — called from `onclick` attributes as a fire-and-forget (the Promise is unhandled, which is fine). It `await`s `_getShareBaseUrl()` to get the machine's LAN IP before building the URL. Do not make it synchronous or revert to `window.location.origin` directly.


@ -378,6 +378,19 @@ function getGoogleScanOptions() {
// ── File sources pane ───────────────────────────────────────────────────────── // ── File sources pane ─────────────────────────────────────────────────────────
function _srcIcon(s) {
if (s.source_type === 'sftp') return '\uD83D\uDD12';
const isSmb = s.path && (s.path.startsWith('//') || s.path.startsWith('\\\\'));
return isSmb ? '\uD83C\uDF10' : '\uD83D\uDCC1';
}
function _srcSubtitle(s) {
if (s.source_type === 'sftp') {
return _esc((s.sftp_user||'')+'@'+(s.sftp_host||'')+(s.path||'/'));
}
return _esc(s.path||'')+(s.smb_user?' \u00b7 \uD83D\uDC64 '+_esc(s.smb_user):'');
}
function srcFileRenderList() { function srcFileRenderList() {
const list = document.getElementById('srcFileList'); const list = document.getElementById('srcFileList');
if (!list) return; if (!list) return;
@ -386,9 +399,8 @@ function srcFileRenderList() {
return; return;
} }
list.innerHTML = S._fileSources.map(function(s) { list.innerHTML = S._fileSources.map(function(s) {
const isSmb = s.path && (s.path.startsWith('//') || s.path.startsWith('\\\\')); const icon = _srcIcon(s);
const icon = isSmb ? '\uD83C\uDF10' : '\uD83D\uDCC1'; const sid = _esc(s.id||'');
const sid = _esc(s.id||'');
const slabel = _esc(s.label||s.path||''); const slabel = _esc(s.label||s.path||'');
return '<div class="fsrc-row">' return '<div class="fsrc-row">'
+'<div class="fsrc-row-head">' +'<div class="fsrc-row-head">'
@ -398,11 +410,47 @@ function srcFileRenderList() {
+'<button class="btn-edit" onclick="srcFileEdit(\''+sid+'\')" style="background:none;border:1px solid var(--border);color:var(--muted);padding:2px 7px;border-radius:4px;font-size:10px;cursor:pointer">'+t('m365_fsrc_edit_btn','Edit')+'</button>' +'<button class="btn-edit" onclick="srcFileEdit(\''+sid+'\')" style="background:none;border:1px solid var(--border);color:var(--muted);padding:2px 7px;border-radius:4px;font-size:10px;cursor:pointer">'+t('m365_fsrc_edit_btn','Edit')+'</button>'
+'<button class="btn-del" onclick="srcFileDelete(\''+sid+'\',\''+slabel+'\')">'+t('m365_profile_delete','Delete')+'</button>' +'<button class="btn-del" onclick="srcFileDelete(\''+sid+'\',\''+slabel+'\')">'+t('m365_profile_delete','Delete')+'</button>'
+'</div></div>' +'</div></div>'
+'<div class="fsrc-row-path">'+_esc(s.path||'')+(s.smb_user?' \u00b7 \uD83D\uDC64 '+_esc(s.smb_user):'')+'</div>' +'<div class="fsrc-row-path">'+_srcSubtitle(s)+'</div>'
+'</div>'; +'</div>';
}).join(''); }).join('');
} }
function srcFileTypeSelect(type) {
document.getElementById('srcFileSourceType').value = type;
var pathRow = document.getElementById('srcFilePathRow');
var smbFields = document.getElementById('srcFileSmbFields');
var sftpFields= document.getElementById('srcFileSftpFields');
if (pathRow) pathRow.style.display = type === 'sftp' ? 'none' : '';
if (smbFields) smbFields.style.display = type === 'smb' ? 'flex' : 'none';
if (sftpFields)sftpFields.style.display= type === 'sftp' ? 'flex' : 'none';
['srcTypeLocal','srcTypeSmb','srcTypeSftp'].forEach(function(id) {
var btn = document.getElementById(id);
if (!btn) return;
var active = (id === 'srcType' + type.charAt(0).toUpperCase() + type.slice(1));
btn.style.background = active ? 'var(--accent)' : 'none';
btn.style.color = active ? '#fff' : 'var(--muted)';
});
}
function srcFileAutoNameSftp() {
var labelEl = document.getElementById('srcFileLabel');
if (labelEl && labelEl._userEdited) return;
var host = (document.getElementById('srcFileSftpHost')||{}).value || '';
if (labelEl && host) labelEl.value = host;
}
function srcFileSftpAuthSelect(authType) {
document.getElementById('srcFileSftpAuth').value = authType;
var pwFields = document.getElementById('srcSftpPwFields');
var keyFields = document.getElementById('srcSftpKeyFields');
var btnPw = document.getElementById('srcSftpAuthPw');
var btnKey = document.getElementById('srcSftpAuthKey');
if (pwFields) pwFields.style.display = authType === 'password' ? '' : 'none';
if (keyFields) keyFields.style.display = authType === 'key' ? 'flex' : 'none';
if (btnPw) { btnPw.style.background = authType==='password'?'var(--accent)':'none'; btnPw.style.color = authType==='password'?'#fff':'var(--muted)'; }
if (btnKey) { btnKey.style.background = authType==='key'?'var(--accent)':'none'; btnKey.style.color = authType==='key'?'#fff':'var(--muted)'; }
}
function srcFileDetectSmb() { function srcFileDetectSmb() {
const p = document.getElementById('srcFilePath').value; const p = document.getElementById('srcFilePath').value;
const isSmb = p.startsWith('//') || p.startsWith('\\\\'); const isSmb = p.startsWith('//') || p.startsWith('\\\\');
@ -427,30 +475,80 @@ function srcFileAutoName() {
} }
async function srcFileAdd() { async function srcFileAdd() {
const label = document.getElementById('srcFileLabel').value.trim(); const label = document.getElementById('srcFileLabel').value.trim();
const path = document.getElementById('srcFilePath').value.trim(); const sourceType = (document.getElementById('srcFileSourceType')||{}).value || 'local';
const smbHost = document.getElementById('srcFileSmbHost').value.trim(); const stat = document.getElementById('srcFileStatus');
const smbUser = document.getElementById('srcFileSmbUser').value.trim(); const editIdEl = document.getElementById('srcFileEditId');
const smbPw = document.getElementById('srcFileSmbPw').value; const existingId = editIdEl ? editIdEl.value : '';
const stat = document.getElementById('srcFileStatus');
if (!label) { stat.style.color='var(--danger)'; stat.textContent=t('m365_fsrc_name_required','Name is required.'); document.getElementById('srcFileLabel').focus(); return; } if (!label) { stat.style.color='var(--danger)'; stat.textContent=t('m365_fsrc_name_required','Name is required.'); document.getElementById('srcFileLabel').focus(); return; }
if (!path) { stat.style.color='var(--danger)'; stat.textContent=t('m365_fsrc_path_required','Path is required.'); return; }
stat.style.color='var(--muted)'; stat.textContent=t('m365_fsrc_saving','Saving...'); stat.style.color='var(--muted)'; stat.textContent=t('m365_fsrc_saving','Saving...');
if (smbPw && smbUser) {
try { await fetch('/api/file_sources/store_creds',{method:'POST',headers:{'Content-Type':'application/json'},body:JSON.stringify({smb_host:smbHost,smb_user:smbUser,password:smbPw})}); } catch(e){} var body = {label, source_type: sourceType};
if (existingId) body.id = existingId;
if (sourceType === 'sftp') {
const sftpHost = document.getElementById('srcFileSftpHost').value.trim();
const sftpUser = document.getElementById('srcFileSftpUser').value.trim();
const sftpPath = document.getElementById('srcFileSftpPath').value.trim() || '/';
const sftpPort = parseInt(document.getElementById('srcFileSftpPort').value) || 22;
const sftpAuth = document.getElementById('srcFileSftpAuth').value || 'password';
if (!sftpHost) { stat.style.color='var(--danger)'; stat.textContent=t('m365_fsrc_sftp_host_required','SFTP host is required.'); return; }
if (!sftpUser) { stat.style.color='var(--danger)'; stat.textContent=t('m365_fsrc_sftp_user_required','SFTP username is required.'); return; }
Object.assign(body, {sftp_host:sftpHost, sftp_port:sftpPort, sftp_user:sftpUser, sftp_auth:sftpAuth, path:sftpPath});
if (sftpAuth === 'password') {
const sftpPw = document.getElementById('srcFileSftpPw').value;
if (sftpPw) {
try { await fetch('/api/file_sources/store_creds',{method:'POST',headers:{'Content-Type':'application/json'},body:JSON.stringify({source_type:'sftp',sftp_host:sftpHost,sftp_user:sftpUser,password:sftpPw})}); } catch(e){}
}
} else {
// Upload key file if one is selected
const keyFileEl = document.getElementById('srcFileSftpKeyFile');
const keyStatusEl = document.getElementById('srcFileSftpKeyStatus');
const keyPathEl = document.getElementById('srcFileSftpKeyPath');
if (keyFileEl && keyFileEl.files.length && !keyPathEl.value) {
try {
const fd = new FormData(); fd.append('key_file', keyFileEl.files[0]);
const kr = await fetch('/api/file_sources/upload_key',{method:'POST',body:fd});
const kd = await kr.json();
if (kd.error) { stat.style.color='var(--danger)'; stat.textContent=kd.error; return; }
keyPathEl.value = kd.key_path;
if (keyStatusEl) keyStatusEl.textContent = t('m365_fsrc_sftp_key_uploaded','Key uploaded');
} catch(e){ stat.style.color='var(--danger)'; stat.textContent=e.message; return; }
}
body.sftp_key_path = keyPathEl ? keyPathEl.value : '';
const passphrase = (document.getElementById('srcFileSftpPassphrase')||{}).value || '';
if (passphrase) {
const passphraseKey = sftpHost+':'+sftpUser+':passphrase';
try { await fetch('/api/file_sources/store_creds',{method:'POST',headers:{'Content-Type':'application/json'},body:JSON.stringify({source_type:'sftp',sftp_host:sftpHost,sftp_user:sftpUser,password:passphrase,keychain_key:passphraseKey})}); } catch(e){}
body.keychain_key = passphraseKey;
}
}
} else {
const path = document.getElementById('srcFilePath').value.trim();
const smbHost = document.getElementById('srcFileSmbHost').value.trim();
const smbUser = document.getElementById('srcFileSmbUser').value.trim();
const smbPw = document.getElementById('srcFileSmbPw').value;
if (!path) { stat.style.color='var(--danger)'; stat.textContent=t('m365_fsrc_path_required','Path is required.'); return; }
Object.assign(body, {path, smb_host:smbHost, smb_user:smbUser});
if (smbPw && smbUser) {
try { await fetch('/api/file_sources/store_creds',{method:'POST',headers:{'Content-Type':'application/json'},body:JSON.stringify({source_type:'smb',smb_host:smbHost,smb_user:smbUser,password:smbPw})}); } catch(e){}
}
} }
try { try {
const editId = document.getElementById('srcFileEditId');
const existingId = editId ? editId.value : '';
const body = {label, path, smb_host:smbHost, smb_user:smbUser};
if (existingId) body.id = existingId;
const r = await fetch('/api/file_sources/save',{method:'POST',headers:{'Content-Type':'application/json'},body:JSON.stringify(body)}); const r = await fetch('/api/file_sources/save',{method:'POST',headers:{'Content-Type':'application/json'},body:JSON.stringify(body)});
const d = await r.json(); const d = await r.json();
if (d.error) { stat.style.color='var(--danger)'; stat.textContent=d.error; return; } if (d.error) { stat.style.color='var(--danger)'; stat.textContent=d.error; return; }
['srcFileLabel','srcFilePath','srcFileSmbHost','srcFileSmbUser','srcFileSmbPw'].forEach(function(id){const el=document.getElementById(id);if(el){el.value='';el._userEdited=false;}}); // Reset form
if (editId) editId.value=''; ['srcFileLabel','srcFilePath','srcFileSmbHost','srcFileSmbUser','srcFileSmbPw',
'srcFileSftpHost','srcFileSftpUser','srcFileSftpPw','srcFileSftpPassphrase','srcFileSftpKeyPath'].forEach(function(id){const el=document.getElementById(id);if(el){el.value='';if(el._userEdited!==undefined)el._userEdited=false;}});
var portEl = document.getElementById('srcFileSftpPort'); if(portEl) portEl.value='22';
if (editIdEl) editIdEl.value='';
const addBtn=document.getElementById('srcFileAddBtn'); if(addBtn) addBtn.textContent=t('m365_fsrc_add_btn','Add'); const addBtn=document.getElementById('srcFileAddBtn'); if(addBtn) addBtn.textContent=t('m365_fsrc_add_btn','Add');
document.getElementById('srcFileSmbFields').style.display='none'; srcFileTypeSelect('local');
stat.style.color='var(--accent)'; stat.textContent='\u2714 '+t('m365_fsrc_saved','Source saved'); stat.style.color='var(--accent)'; stat.textContent='\u2714 '+t('m365_fsrc_saved','Source saved');
await _loadFileSources(); await _loadFileSources();
srcFileRenderList(); srcFileRenderList();
@ -462,20 +560,28 @@ function srcFileEdit(id) {
const s = S._fileSources.find(function(x){return x.id===id;}); const s = S._fileSources.find(function(x){return x.id===id;});
if (!s) return; if (!s) return;
const labelEl = document.getElementById('srcFileLabel'); const labelEl = document.getElementById('srcFileLabel');
const pathEl = document.getElementById('srcFilePath');
const hostEl = document.getElementById('srcFileSmbHost');
const userEl = document.getElementById('srcFileSmbUser');
const pwEl = document.getElementById('srcFileSmbPw');
const editId = document.getElementById('srcFileEditId'); const editId = document.getElementById('srcFileEditId');
if (labelEl) { labelEl.value = s.label||''; labelEl._userEdited = true; } if (labelEl) { labelEl.value = s.label||''; labelEl._userEdited = true; }
if (pathEl) pathEl.value = s.path||'';
if (hostEl) hostEl.value = s.smb_host||'';
if (userEl) userEl.value = s.smb_user||'';
if (pwEl) pwEl.value = s.smb_user ? '\u2022\u2022\u2022\u2022\u2022\u2022\u2022\u2022' : '';
if (editId) editId.value = id; if (editId) editId.value = id;
const isSmb = (s.path||'').startsWith('//') || (s.path||'').startsWith('\\\\');
const smbFields = document.getElementById('srcFileSmbFields'); var sourceType = s.source_type || (((s.path||'').startsWith('//')||(s.path||'').startsWith('\\\\')) ? 'smb' : 'local');
if (smbFields) smbFields.style.display = isSmb ? 'flex' : 'none'; srcFileTypeSelect(sourceType);
if (sourceType === 'sftp') {
var hostEl = document.getElementById('srcFileSftpHost'); if(hostEl) hostEl.value = s.sftp_host||'';
var portEl = document.getElementById('srcFileSftpPort'); if(portEl) portEl.value = s.sftp_port||22;
var userEl = document.getElementById('srcFileSftpUser'); if(userEl) userEl.value = s.sftp_user||'';
var pathEl = document.getElementById('srcFileSftpPath'); if(pathEl) pathEl.value = s.path||'/';
var authEl = document.getElementById('srcFileSftpAuth'); if(authEl) authEl.value = s.sftp_auth||'password';
srcFileSftpAuthSelect(s.sftp_auth||'password');
if (s.sftp_key_path) { var kp = document.getElementById('srcFileSftpKeyPath'); if(kp) kp.value=s.sftp_key_path; }
} else {
var pathEl2 = document.getElementById('srcFilePath'); if(pathEl2) pathEl2.value = s.path||'';
var smbHostEl = document.getElementById('srcFileSmbHost'); if(smbHostEl) smbHostEl.value = s.smb_host||'';
var smbUserEl = document.getElementById('srcFileSmbUser'); if(smbUserEl) smbUserEl.value = s.smb_user||'';
var smbPwEl = document.getElementById('srcFileSmbPw'); if(smbPwEl) smbPwEl.value = s.smb_user ? '\u2022\u2022\u2022\u2022\u2022\u2022\u2022\u2022' : '';
}
const btn = document.getElementById('srcFileAddBtn'); const btn = document.getElementById('srcFileAddBtn');
if (btn) btn.textContent = t('m365_fsrc_save_changes','Save changes'); if (btn) btn.textContent = t('m365_fsrc_save_changes','Save changes');
const stat = document.getElementById('srcFileStatus'); const stat = document.getElementById('srcFileStatus');
@ -547,9 +653,7 @@ function _renderFileSources() {
return; return;
} }
list.innerHTML = S._fileSources.map(function(s) { list.innerHTML = S._fileSources.map(function(s) {
const isSmb = s.path && (s.path.startsWith('//') || s.path.startsWith('\\\\')); const icon = _srcIcon(s);
const icon = isSmb ? '\uD83C\uDF10' : '\uD83D\uDCC1';
const userPart = s.smb_user ? ' \u00b7 \uD83D\uDC64 ' + _esc(s.smb_user) : '';
const sid = _esc(s.id || ''); const sid = _esc(s.id || '');
const slabel = _esc(s.label || s.path || ''); const slabel = _esc(s.label || s.path || '');
return '<div class="fsrc-row">' return '<div class="fsrc-row">'
@ -559,7 +663,7 @@ function _renderFileSources() {
+ '<button class="btn-scan" onclick="fsrcScan(\'' + sid + '\')">&#9654; ' + t('m365_fsrc_scan_btn','Scan') + '</button>' + '<button class="btn-scan" onclick="fsrcScan(\'' + sid + '\')">&#9654; ' + t('m365_fsrc_scan_btn','Scan') + '</button>'
+ '<button class="btn-del" onclick="fsrcDelete(\'' + sid + '\',\'' + slabel + '\')">' + t('m365_profile_delete','Delete') + '</button>' + '<button class="btn-del" onclick="fsrcDelete(\'' + sid + '\',\'' + slabel + '\')">' + t('m365_profile_delete','Delete') + '</button>'
+ '</div></div>' + '</div></div>'
+ '<div class="fsrc-row-path">' + _esc(s.path || '') + userPart + '</div>' + '<div class="fsrc-row-path">' + _srcSubtitle(s) + '</div>'
+ '</div>'; + '</div>';
}).join(''); }).join('');
} }
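The renderers and `srcFileEdit()` above decide the kind of a source from `source_type`, falling back to the older path-prefix check for sources saved before that field existed. The sketch below gathers that decision into one pure function. It assumes only the field names visible in the diff (`source_type`, `path`); `resolveSourceType` and `sourceIcon` are hypothetical names and are not part of this commit.

```javascript
// Hypothetical helper (not in the commit): resolve the kind of a saved file source.
// Mirrors the fallback used in srcFileEdit(): an explicit source_type wins,
// otherwise a leading // or \\ marks the path as an SMB share.
function resolveSourceType(s) {
  if (s && s.source_type) return s.source_type;
  const path = (s && s.path) || '';
  const isSmb = path.startsWith('//') || path.startsWith('\\\\');
  return isSmb ? 'smb' : 'local';
}

// Hypothetical helper: pick the list icon from the resolved type.
// The code points match the ones used by _srcIcon() in the diff.
function sourceIcon(s) {
  const icons = {
    sftp: '\uD83D\uDD12',  // lock
    smb: '\uD83C\uDF10',   // globe
    local: '\uD83D\uDCC1'  // folder
  };
  return icons[resolveSourceType(s)] || icons.local;
}
```

One difference from `_srcIcon()` as committed: this sketch trusts an explicit `source_type` over the path prefix, so a source saved as `local` with a `//` path would get the folder icon rather than the globe.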
@ -667,6 +771,9 @@ window.getGoogleScanOptions = getGoogleScanOptions;
window.srcFileRenderList = srcFileRenderList; window.srcFileRenderList = srcFileRenderList;
window.srcFileDetectSmb = srcFileDetectSmb; window.srcFileDetectSmb = srcFileDetectSmb;
window.srcFileAutoName = srcFileAutoName; window.srcFileAutoName = srcFileAutoName;
window.srcFileAutoNameSftp = srcFileAutoNameSftp;
window.srcFileTypeSelect = srcFileTypeSelect;
window.srcFileSftpAuthSelect = srcFileSftpAuthSelect;
window.srcFileAdd = srcFileAdd; window.srcFileAdd = srcFileAdd;
window.srcFileEdit = srcFileEdit; window.srcFileEdit = srcFileEdit;
window.srcFileDelete = srcFileDelete; window.srcFileDelete = srcFileDelete;
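In `srcFileAdd()` above, the SFTP port is read with `parseInt(...) || 22`, which only falls back to the default when the field is empty, zero, or not a number. The form's port input declares `min="1"` and `max="65535"`, but nothing in the handler shown here enforces that range. A stricter normaliser might look like the sketch below; `normalizeSftpPort` is a hypothetical helper, and falling back silently (rather than showing a validation message) is an assumption.

```javascript
// Hypothetical helper (not in the commit): turn raw form input into a usable port.
// Returns the fallback when the value is missing, non-numeric, or outside 1..65535.
function normalizeSftpPort(raw, fallback = 22) {
  const n = Number.parseInt(String(raw).trim(), 10); // explicit radix avoids surprises
  if (!Number.isInteger(n) || n < 1 || n > 65535) return fallback;
  return n;
}
```

In the handler this would replace the inline expression, for example `const sftpPort = normalizeSftpPort(document.getElementById('srcFileSftpPort').value);`.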


@ -18,19 +18,19 @@ function schedLoad() {
var descEl = document.getElementById('schedDesc_' + js.id); var descEl = document.getElementById('schedDesc_' + js.id);
if (!descEl) return; if (!descEl) return;
var j2 = _schedJobs.find(function(x){ return x.id === js.id; }); var j2 = _schedJobs.find(function(x){ return x.id === js.id; });
var freqLabel = !j2 ? '' : (j2.frequency === 'weekly' ? 'Weekly' : j2.frequency === 'monthly' ? 'Monthly' : 'Daily'); var freqLabel = !j2 ? '' : (j2.frequency === 'weekly' ? t('m365_sched_freq_weekly','Weekly') : j2.frequency === 'monthly' ? t('m365_sched_freq_monthly','Monthly') : t('m365_sched_freq_daily','Daily'));
var timeStr = !j2 ? '' : String(j2.hour||0).padStart(2,'0') + ':' + String(j2.minute||0).padStart(2,'0'); var timeStr = !j2 ? '' : String(j2.hour||0).padStart(2,'0') + ':' + String(j2.minute||0).padStart(2,'0');
var base = freqLabel + ' ' + timeStr; var base = freqLabel + ' ' + timeStr;
var runBtn = document.getElementById('schedRunBtn_' + js.id); var runBtn = document.getElementById('schedRunBtn_' + js.id);
if (js.is_running) { if (js.is_running) {
descEl.textContent = base + ' \u00b7 Running...'; descEl.textContent = base + ' \u00b7 ' + t('m365_sched_running','Running...');
if (runBtn) { runBtn.style.borderColor='#22c55e'; runBtn.style.color='#22c55e'; } if (runBtn) { runBtn.style.borderColor='#22c55e'; runBtn.style.color='#22c55e'; }
} else if (js.next_run) { } else if (js.next_run) {
var dt = new Date(js.next_run); var dt = new Date(js.next_run);
descEl.textContent = base + ' \u00b7 Next: ' + dt.toLocaleString(undefined,{month:'short',day:'numeric',hour:'2-digit',minute:'2-digit'}); descEl.textContent = base + ' \u00b7 ' + t('m365_sched_next','Next') + ': ' + dt.toLocaleString(undefined,{month:'short',day:'numeric',hour:'2-digit',minute:'2-digit'});
if (runBtn) { runBtn.style.borderColor='var(--border)'; runBtn.style.color='var(--muted)'; } if (runBtn) { runBtn.style.borderColor='var(--border)'; runBtn.style.color='var(--muted)'; }
} else { } else {
descEl.textContent = base + (js.enabled ? '' : ' \u00b7 Disabled'); descEl.textContent = base + (js.enabled ? '' : ' \u00b7 ' + t('m365_sched_disabled','Disabled'));
if (runBtn) { runBtn.style.borderColor='var(--border)'; runBtn.style.color='var(--muted)'; } if (runBtn) { runBtn.style.borderColor='var(--border)'; runBtn.style.color='var(--muted)'; }
} }
}); });
@ -41,13 +41,13 @@ function schedRenderJobs() {
var list = document.getElementById('schedJobList'); var list = document.getElementById('schedJobList');
if (!list) return; if (!list) return;
if (!_schedJobs.length) { if (!_schedJobs.length) {
list.innerHTML = '<div style="font-size:11px;color:var(--muted);padding:4px 0">No scheduled scans yet.</div>'; list.innerHTML = '<div style="font-size:11px;color:var(--muted);padding:4px 0">' + t('m365_sched_no_jobs','No scheduled scans yet.') + '</div>';
return; return;
} }
list.innerHTML = _schedJobs.map(function(j) { list.innerHTML = _schedJobs.map(function(j) {
var sid = _esc(j.id); var sid = _esc(j.id);
var sname = _esc(j.name || 'Unnamed'); var sname = _esc(j.name || 'Unnamed');
var freqLabel = j.frequency === 'weekly' ? 'Weekly' : j.frequency === 'monthly' ? 'Monthly' : 'Daily'; var freqLabel = j.frequency === 'weekly' ? t('m365_sched_freq_weekly','Weekly') : j.frequency === 'monthly' ? t('m365_sched_freq_monthly','Monthly') : t('m365_sched_freq_daily','Daily');
var timeStr = String(j.hour||0).padStart(2,'0') + ':' + String(j.minute||0).padStart(2,'0'); var timeStr = String(j.hour||0).padStart(2,'0') + ':' + String(j.minute||0).padStart(2,'0');
var desc = freqLabel + ' ' + timeStr; var desc = freqLabel + ' ' + timeStr;
var chk = j.enabled ? ' checked' : ''; var chk = j.enabled ? ' checked' : '';
@ -217,7 +217,7 @@ function schedLoadHistory() {
if (!el) return; if (!el) return;
fetch('/api/scheduler/history?limit=10').then(function(r){ return r.json(); }).then(function(d) { fetch('/api/scheduler/history?limit=10').then(function(r){ return r.json(); }).then(function(d) {
var runs = d.runs || []; var runs = d.runs || [];
if (!runs.length) { el.innerHTML = '<em>No scheduled runs yet</em>'; return; } if (!runs.length) { el.innerHTML = '<em>' + t('m365_sched_no_runs','No scheduled runs yet') + '</em>'; return; }
var html = ''; var html = '';
runs.forEach(function(r) { runs.forEach(function(r) {
var ts = r.started_at ? new Date(r.started_at * 1000).toLocaleString() : '-'; var ts = r.started_at ? new Date(r.started_at * 1000).toLocaleString() : '-';
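The scheduler changes above repeat the same frequency-to-label ternary in `schedLoad()` and `schedRenderJobs()`. The sketch below shows one way that lookup could be shared. It assumes `t(key, fallback)` returns the fallback when a key is missing, which is how the calls in the diff are written; `translate`, `messages`, `freqLabel`, and `timeLabel` are hypothetical names, and the single entry in `messages` is illustrative data only.

```javascript
// Hypothetical stand-in for the app's t(): look up a key, else return the fallback.
const messages = { m365_sched_freq_weekly: 'Ugentlig' }; // illustrative entry only
function translate(key, fallback) {
  return Object.prototype.hasOwnProperty.call(messages, key) ? messages[key] : fallback;
}

// Map a job's frequency to its translated label.
// Anything other than weekly or monthly is treated as daily, as in the diff.
function freqLabel(frequency) {
  if (frequency === 'weekly') return translate('m365_sched_freq_weekly', 'Weekly');
  if (frequency === 'monthly') return translate('m365_sched_freq_monthly', 'Monthly');
  return translate('m365_sched_freq_daily', 'Daily');
}

// Format hour and minute as HH:MM, matching the padStart calls in the diff.
function timeLabel(hour, minute) {
  return String(hour || 0).padStart(2, '0') + ':' + String(minute || 0).padStart(2, '0');
}
```

With helpers like these, both call sites could build the description as `freqLabel(j.frequency) + ' ' + timeLabel(j.hour, j.minute)`.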


@ -62,14 +62,15 @@ function renderSourcesPanel() {
S._pendingGoogleSources = null; S._pendingGoogleSources = null;
} }
// File sources (local / SMB) — one entry per saved source // File sources (local / SMB / SFTP) — one entry per saved source
if (S._fileSources.length > 0) { if (S._fileSources.length > 0) {
html += '<div style="margin:6px 0 2px;font-size:10px;color:var(--muted);text-transform:uppercase;letter-spacing:.04em">' html += '<div style="margin:6px 0 2px;font-size:10px;color:var(--muted);text-transform:uppercase;letter-spacing:.04em">'
+ '<hr style="border:none;border-top:1px solid var(--border);margin:1px 0 2px">'; + '<hr style="border:none;border-top:1px solid var(--border);margin:1px 0 2px">';
S._fileSources.forEach(function(s) { S._fileSources.forEach(function(s) {
const isSmb = s.path && (s.path.startsWith('//') || s.path.startsWith('\\\\')); const isSftp = s.source_type === 'sftp';
const icon = isSmb ? '\uD83C\uDF10' : '\uD83D\uDCC1'; const isSmb = !isSftp && s.path && (s.path.startsWith('//') || s.path.startsWith('\\\\'));
const label = s.label || s.path || s.id; const icon = isSftp ? '\uD83D\uDD12' : (isSmb ? '\uD83C\uDF10' : '\uD83D\uDCC1');
const label = s.label || s.path || s.id;
const isChecked = (s.id in checked) ? checked[s.id] : true; const isChecked = (s.id in checked) ? checked[s.id] : true;
html += '<label class="source-check">' html += '<label class="source-check">'
+ '<input type="checkbox" data-source-id="' + _esc(s.id) + '" data-source-type="file"' + (isChecked ? ' checked' : '') + '>' + '<input type="checkbox" data-source-id="' + _esc(s.id) + '" data-source-type="file"' + (isChecked ? ' checked' : '') + '>'


@ -1219,11 +1219,22 @@ document.addEventListener('DOMContentLoaded', applyI18n);
<div class="srcmgmt-group"> <div class="srcmgmt-group">
<div class="srcmgmt-group-title" data-i18n="m365_file_sources_add">Add source</div> <div class="srcmgmt-group-title" data-i18n="m365_file_sources_add">Add source</div>
<div class="fsrc-form" style="border-color:var(--border)"> <div class="fsrc-form" style="border-color:var(--border)">
<!-- Source type selector -->
<div class="fsrc-form-row">
<label>Type</label>
<div style="display:flex;background:var(--bg);border:1px solid var(--border);border-radius:6px;overflow:hidden">
<button type="button" id="srcTypeLocal" onclick="srcFileTypeSelect('local')" style="flex:1;border:none;padding:3px 8px;font-size:11px;cursor:pointer;background:var(--accent);color:#fff" data-i18n="m365_fsrc_type_local">Local folder</button>
<button type="button" id="srcTypeSmb" onclick="srcFileTypeSelect('smb')" style="flex:1;border:none;border-left:1px solid var(--border);padding:3px 8px;font-size:11px;cursor:pointer;background:none;color:var(--muted)" data-i18n="m365_fsrc_type_smb">Network (SMB)</button>
<button type="button" id="srcTypeSftp" onclick="srcFileTypeSelect('sftp')" style="flex:1;border:none;border-left:1px solid var(--border);padding:3px 8px;font-size:11px;cursor:pointer;background:none;color:var(--muted)" data-i18n="m365_fsrc_type_sftp">SFTP</button>
</div>
</div>
<input type="hidden" id="srcFileSourceType" value="local">
<div class="fsrc-form-row"> <div class="fsrc-form-row">
<label>Name <span style="color:var(--accent)">*</span></label> <label>Name <span style="color:var(--accent)">*</span></label>
<input id="srcFileLabel" type="text" placeholder="e.g. Teacher files, NAS archive" maxlength="80" autocomplete="off"> <input id="srcFileLabel" type="text" placeholder="e.g. Teacher files, NAS archive" maxlength="80" autocomplete="off">
</div> </div>
<div class="fsrc-form-row"> <!-- Local / SMB path field -->
<div id="srcFilePathRow" class="fsrc-form-row">
<label data-i18n="m365_fsrc_path">Path</label> <label data-i18n="m365_fsrc_path">Path</label>
<input id="srcFilePath" type="text" placeholder="~/Documents or //nas/shares" oninput="srcFileDetectSmb(); srcFileAutoName()"> <input id="srcFilePath" type="text" placeholder="~/Documents or //nas/shares" oninput="srcFileDetectSmb(); srcFileAutoName()">
</div> </div>
@ -1243,6 +1254,58 @@ document.addEventListener('DOMContentLoaded', applyI18n);
</div> </div>
<div style="font-size:10px;color:var(--muted)" data-i18n="m365_fsrc_smb_pw_hint">Saved to OS keychain — never stored in a file.</div> <div style="font-size:10px;color:var(--muted)" data-i18n="m365_fsrc_smb_pw_hint">Saved to OS keychain — never stored in a file.</div>
</div> </div>
<!-- SFTP fields -->
<div id="srcFileSftpFields" style="display:none;flex-direction:column;gap:6px">
<div class="fsrc-form-row">
<label data-i18n="m365_fsrc_sftp_host">SFTP host</label>
<input id="srcFileSftpHost" type="text" placeholder="sftp.school.dk" oninput="srcFileAutoNameSftp()">
</div>
<div class="fsrc-form-row">
<label data-i18n="m365_fsrc_sftp_port">Port</label>
<input id="srcFileSftpPort" type="number" value="22" min="1" max="65535" style="width:70px">
</div>
<div class="fsrc-form-row">
<label data-i18n="m365_fsrc_sftp_user">Username</label>
<input id="srcFileSftpUser" type="text" placeholder="backup_user">
</div>
<div class="fsrc-form-row">
<label data-i18n="m365_fsrc_sftp_remote_path">Remote path</label>
<input id="srcFileSftpPath" type="text" placeholder="/var/data" value="/">
</div>
<!-- Auth type toggle -->
<div class="fsrc-form-row">
<label>Auth</label>
<div style="display:flex;background:var(--bg);border:1px solid var(--border);border-radius:6px;overflow:hidden">
<button type="button" id="srcSftpAuthPw" onclick="srcFileSftpAuthSelect('password')" style="flex:1;border:none;padding:3px 8px;font-size:11px;cursor:pointer;background:var(--accent);color:#fff" data-i18n="m365_fsrc_sftp_auth_password">Password</button>
<button type="button" id="srcSftpAuthKey" onclick="srcFileSftpAuthSelect('key')" style="flex:1;border:none;border-left:1px solid var(--border);padding:3px 8px;font-size:11px;cursor:pointer;background:none;color:var(--muted)" data-i18n="m365_fsrc_sftp_auth_key">SSH key</button>
</div>
</div>
<input type="hidden" id="srcFileSftpAuth" value="password">
<!-- Password auth -->
<div id="srcSftpPwFields">
<div class="fsrc-form-row">
<label data-i18n="m365_fsrc_sftp_pw">Password</label>
<input id="srcFileSftpPw" type="password" placeholder="Stored in OS keychain">
</div>
<div style="font-size:10px;color:var(--muted)" data-i18n="m365_fsrc_sftp_pw_hint">Password is saved to the OS keychain — never stored in a file.</div>
</div>
<!-- Key auth -->
<div id="srcSftpKeyFields" style="display:none;flex-direction:column;gap:6px">
<div class="fsrc-form-row">
<label data-i18n="m365_fsrc_sftp_key_upload">Private key</label>
<div style="display:flex;gap:6px;align-items:center">
<input id="srcFileSftpKeyFile" type="file" accept=".pem,.key,.pub,*" style="flex:1;font-size:11px">
<span id="srcFileSftpKeyStatus" style="font-size:10px;color:var(--muted)"></span>
</div>
</div>
<input type="hidden" id="srcFileSftpKeyPath" value="">
<div class="fsrc-form-row">
<label data-i18n="m365_fsrc_sftp_passphrase">Passphrase</label>
<input id="srcFileSftpPassphrase" type="password" placeholder="Leave blank if key has no passphrase">
</div>
<div style="font-size:10px;color:var(--muted)" data-i18n="m365_fsrc_sftp_passphrase_hint">Passphrase is saved to the OS keychain — never stored in a file.</div>
</div>
</div>
<div style="display:flex;align-items:center;gap:8px"> <div style="display:flex;align-items:center;gap:8px">
<input type="hidden" id="srcFileEditId" value=""> <input type="hidden" id="srcFileEditId" value="">
<div id="srcFileStatus" style="flex:1;font-size:11px;color:var(--muted)"></div> <div id="srcFileStatus" style="flex:1;font-size:11px;color:var(--muted)"></div>