GDPRScanner/docs/setup/GOOGLE_SETUP.md
2026-04-11 04:38:11 +02:00


# Google Workspace Setup
Step-by-step guide for connecting GDPRScanner to Google Workspace via a service account.
GDPRScanner connects using a **service account** with **domain-wide delegation** — this allows it to scan all users' Gmail and Drive without requiring each user to sign in individually.
---
## 1. Create a Google Cloud project
Go to [console.cloud.google.com](https://console.cloud.google.com) and create a new project (or use an existing one).
---
## 2. Enable the required APIs
In your project: **APIs & Services → Enable APIs and Services**. Enable:
- **Gmail API**
- **Google Drive API**
- **Admin SDK API**
---
## 3. Create a service account
Go to **IAM & Admin → Service accounts → Create service account**.
| Field | Value |
|---|---|
| Name | gdprscanner (or any name) |
| Description | GDPRScanner service account |
Click **Create and continue**. Skip the optional role and user access steps. Click **Done**.
### Create a key
Click on the service account → **Keys → Add key → Create new key → JSON**.
Download the JSON file. This is your service account key — treat it like a password.
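Before uploading the key, it can help to confirm that the downloaded file really is a service account key. The sketch below is not part of GDPRScanner; it checks the fields a standard Google service account JSON key contains. The function names are illustrative, and the path passed to `check_key_file` is whatever location you saved the download to.

```python
import json
from pathlib import Path

# Fields found in a standard Google service account JSON key.
REQUIRED_FIELDS = ("type", "client_email", "client_id", "private_key", "token_uri")


def key_problems(key: dict) -> list[str]:
    """Return human-readable problems; an empty list means the key looks usable."""
    problems = [f"missing field: {name}" for name in REQUIRED_FIELDS if not key.get(name)]
    # A present but different type usually means the wrong credential kind was downloaded.
    if key.get("type") and key["type"] != "service_account":
        problems.append(f"unexpected type: {key['type']!r}")
    return problems


def check_key_file(path: str) -> list[str]:
    """Load a key file from disk (path chosen by the caller) and check it."""
    return key_problems(json.loads(Path(path).read_text(encoding="utf-8")))
```

The `client_id` field in the key normally holds the same numeric Client ID that the Google Cloud console shows for the service account.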
---
## 4. Enable domain-wide delegation
Back on the service account page: **Show advanced settings → Domain-wide delegation → Enable**.
Note the **Client ID** (a long number) — you'll need it in the next step.
---
## 5. Authorise scopes in Google Admin Console
Go to [admin.google.com](https://admin.google.com) →
**Security → Access and data control → API controls → Manage domain-wide delegation → Add new**.
| Field | Value |
|---|---|
| Client ID | The numeric Client ID from the service account |
| OAuth scopes | See below |
Add all of these scopes (paste as a comma-separated list):
```
https://www.googleapis.com/auth/admin.directory.user.readonly,
https://www.googleapis.com/auth/gmail.readonly,
https://www.googleapis.com/auth/drive.readonly
```
Click **Authorise**. Changes can take a few minutes to propagate.
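If the Admin Console rejects the pasted value, line breaks in the list are a possible culprit. As a small convenience sketch (the variable names are illustrative), the three scopes can be joined into a single comma-separated line:

```python
# The three read-only scopes listed in this guide.
SCOPES = [
    "https://www.googleapis.com/auth/admin.directory.user.readonly",
    "https://www.googleapis.com/auth/gmail.readonly",
    "https://www.googleapis.com/auth/drive.readonly",
]

# One line, no spaces or line breaks, ready to paste into the OAuth scopes field.
scope_field = ",".join(SCOPES)
print(scope_field)
```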
---
## 6. Connect in GDPRScanner
Open GDPRScanner → **Source Management → Google Workspace** tab.
1. **Upload service account key** — select the JSON file you downloaded in step 3
2. **Admin email** — enter the email address of a Google Workspace admin user in your domain (e.g. `admin@skolen.dk`). The service account impersonates this user to call the Admin Directory API.
Click **Connect**. If successful, the status dot turns green and shows the service account email.
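If the connection fails, delegation can be tested outside GDPRScanner. The sketch below is not GDPRScanner's own code; it assumes the `google-auth` and `google-api-python-client` packages are installed, and `list_user_emails` is an illustrative name. It impersonates the admin user and lists a few accounts through the Admin Directory API.

```python
# Only the directory scope is needed to list users.
DIRECTORY_SCOPES = ["https://www.googleapis.com/auth/admin.directory.user.readonly"]


def list_user_emails(key_path: str, admin_email: str, limit: int = 10) -> list[str]:
    """Return up to `limit` primary email addresses from the Workspace domain."""
    # Imported here so the module loads even without the Google client libraries.
    from google.oauth2 import service_account
    from googleapiclient.discovery import build

    # `subject` makes the service account act as the admin user (domain-wide delegation).
    creds = service_account.Credentials.from_service_account_file(
        key_path, scopes=DIRECTORY_SCOPES, subject=admin_email
    )
    directory = build("admin", "directory_v1", credentials=creds)
    response = (
        directory.users()
        .list(customer="my_customer", maxResults=limit, orderBy="email")
        .execute()
    )
    return [user["primaryEmail"] for user in response.get("users", [])]
```

For example, `list_user_emails("service-account.json", "admin@skolen.dk")`, where the key file name is a placeholder. An `unauthorized_client` error here points at the delegation or scope setup rather than at GDPRScanner.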
---
## 7. User role classification
GDPRScanner classifies Google Workspace users as **staff** or **student** based on their **Organisational Unit (OU) path** in Google Admin.
The mapping is in `classification/google_ou_roles.json`. Edit it to match your school's OU structure — no code change required.
Default mapping:
| OU prefix | Role |
|---|---|
| `/Elever` | student |
| `/Personale` | staff |
| `/Admin` | staff |
To see your OU structure: **Google Admin → Directory → Organisational units** (shown as *Administrer organisationsenheder* in the Danish interface).
Example `classification/google_ou_roles.json` for a typical Danish school (Gudenaaskolen.dk structure):
```json
{
"student_ou_prefixes": ["/Elever"],
"staff_ou_prefixes": ["/Personale", "/Admin"]
}
```
After editing the file, restart GDPRScanner — no rebuild required.
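To check a mapping against your OU paths before restarting, the prefix rule can be approximated as below. This is a hypothetical sketch, not GDPRScanner's actual matching code: it assumes a prefix matches a whole path segment (so `/Elever` does not match `/Eleverne`) and that students are checked before staff. The function names and the `"other"` fallback label are illustrative.

```python
import json

# Same structure as classification/google_ou_roles.json in this guide.
DEFAULT_MAPPING = json.loads("""
{
  "student_ou_prefixes": ["/Elever"],
  "staff_ou_prefixes": ["/Personale", "/Admin"]
}
""")


def _matches(ou_path: str, prefix: str) -> bool:
    """True if ou_path equals the prefix or lies below it in the OU tree."""
    return ou_path == prefix or ou_path.startswith(prefix.rstrip("/") + "/")


def classify(ou_path: str, mapping: dict = DEFAULT_MAPPING) -> str:
    """Return 'student', 'staff', or 'other' for a Google Workspace OU path."""
    if any(_matches(ou_path, p) for p in mapping.get("student_ou_prefixes", [])):
        return "student"
    if any(_matches(ou_path, p) for p in mapping.get("staff_ou_prefixes", [])):
        return "staff"
    return "other"
```

Paths that come back as `"other"` here are good candidates for a missing prefix in the JSON file.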
---
## 8. Verify
After connecting:
- **Sources panel** shows Gmail and Google Drive checkboxes
- **Accounts panel** shows all Google Workspace users with `GWS` badges
- Users are classified as Elev (student) or Ansat (staff) based on their OU
Select one or more accounts, check Gmail and/or Google Drive, and click Scan.
---
## Notes on what is scanned
| Source | What is scanned |
|---|---|
| Gmail | Email bodies and attachments for all mail folders |
| Google Drive | My Drive files — Docs, Sheets, Slides are auto-exported to text for scanning |
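Google-native files have no downloadable binary, so they must be exported to another format before their text can be scanned. This guide does not say which export formats GDPRScanner requests; the mapping below is an assumption that uses text formats the Drive API supports for each file type, and `export_mime_for` is an illustrative helper.

```python
from typing import Optional

# Google-native MIME type -> assumed text export format.
EXPORT_MIME = {
    "application/vnd.google-apps.document": "text/plain",      # Docs
    "application/vnd.google-apps.spreadsheet": "text/csv",     # Sheets
    "application/vnd.google-apps.presentation": "text/plain",  # Slides
}


def export_mime_for(file_mime: str) -> Optional[str]:
    """Return the export format for a Google-native file, or None for regular files."""
    return EXPORT_MIME.get(file_mime)
```

Files for which this returns `None`, such as PDFs or Office documents, would be downloaded as-is rather than exported.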
---
## Troubleshooting
| Symptom | Likely cause |
|---|---|
| `unauthorized_client` on connect | Domain-wide delegation not enabled, or scopes not authorised in Admin Console |
| 0 users listed | `admin.directory.user.readonly` scope missing, or wrong admin email |
| Users show as "Anden" (other) | OU path not matched in `classification/google_ou_roles.json` — check OU paths in Google Admin and compare with the file |
| Gmail scan finds nothing | `gmail.readonly` scope not authorised |
| Drive scan finds nothing | `drive.readonly` scope not authorised |
| `RefreshError` on scan | Service account key expired or revoked — generate a new key |