GDPRScanner/docs/setup/M365_SETUP.md
2026-04-11 04:38:11 +02:00

4.9 KiB
Raw Blame History

Microsoft 365 Setup

Step-by-step guide for connecting GDPRScanner to Microsoft 365.


1. Register an app in Azure

Go to Azure Portal → Microsoft Entra ID → App registrations → New registration.

Field Value
Name GDPRScanner (or any name)
Supported account types Accounts in this organisational directory only
Redirect URI Leave blank

Click Register. Note the Application (client) ID and Directory (tenant) ID — you'll need both.


2. Choose an authentication mode

Mode How it works When to use
Application Client credentials — client ID + tenant ID + client secret. No user interaction. Automated / scheduled scans, all-user scans
Delegated OAuth device code flow — user signs in interactively. Single-user scans, testing

Application mode — create a client secret

In your app registration: Certificates & secrets → New client secret.

Set an expiry (24 months recommended) and copy the Value immediately — it is only shown once.

Delegated mode — no secret needed

The scanner will show a device code URL. Open it in a browser, sign in, and the scanner authenticates as that user.


3. Add API permissions

Go to API permissions → Add a permission → Microsoft Graph.

Scan only

Permission Type
Mail.Read Application or Delegated
Files.Read.All Application or Delegated
Sites.Read.All Application or Delegated
ChannelMessage.Read.All Application
Team.ReadBasic.All Application
User.Read.All Application

Scan + Delete

Add these in addition to the read permissions above:

Permission Type
Mail.ReadWrite Application or Delegated
Files.ReadWrite.All Application or Delegated
Sites.ReadWrite.All Application or Delegated

Email reports via Graph

If you want the scanner to send email reports via Microsoft 365 (not SMTP):

Permission Type
Mail.Send Application or Delegated

All Application permissions require admin consent. Click Grant admin consent for [your tenant] at the top of the API permissions page. Without this, scans will fail with 403 errors.


4. Connect in GDPRScanner

Open GDPRScanner → Source Management → Microsoft 365 tab.

Field Where to find it
Client ID App registration → Overview → Application (client) ID
Tenant ID App registration → Overview → Directory (tenant) ID
Client Secret The value you copied in step 2 (Application mode only)

Click Connect. In Application mode, the connection is immediate. In Delegated mode, a browser window opens for sign-in.


5. Verify

After connecting, the Sources panel shows:

  • Email — Exchange mailboxes
  • OneDrive — personal drives
  • SharePoint — site file libraries
  • Teams — Teams channel files

The Accounts panel lists all users in the tenant (Application mode) or just the signed-in user (Delegated mode).


Notes on deletion

Emails deleted via the scanner are moved to Deleted Items — recoverable for 1430 days depending on admin configuration. Files are sent to the OneDrive/SharePoint recycle bin — retained for 93 days across both recycle bin stages before permanent deletion. Nothing is permanently destroyed without a second manual step.


Headless / scheduled mode

Headless mode uses Application auth only. Credentials are read in priority order:

  1. --settings FILE — a JSON file you provide
  2. Environment variables: M365_CLIENT_ID, M365_TENANT_ID, M365_CLIENT_SECRET

Example settings file:

{
  "client_id":     "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx",
  "tenant_id":     "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx",
  "client_secret": "your-secret",
  "sources":       ["email", "onedrive"],
  "options": {
    "older_than_days": 365,
    "email_body":      true,
    "attachments":     true,
    "delta":           true
  }
}

Run:

python gdpr_scanner.py --headless --output ~/Reports/ --settings settings.json

See the full CLI flag reference in README.md.


Role classification (staff / student)

GDPRScanner classifies users as staff or student based on their Microsoft 365 licence SKU. The mapping is in classification/m365_skus.json. If users appear as "other", open Settings → SKU debug to see which SKU IDs are assigned in your tenant and add any missing ones to m365_skus.json.


Troubleshooting

Symptom Likely cause
403 on scan start Admin consent not granted, or wrong permissions added
AADSTS7000215 Invalid client secret — check it was copied correctly
No users listed User.Read.All permission missing or not consented
Teams files not appearing ChannelMessage.Read.All or Team.ReadBasic.All missing
Delta scan not working Delta tokens require at least one full scan first