
xlab.studio tenant — taxonomy and operational streamlining

Date: 2026-06-02
Scope: xlab.studio M365 E5 Developer Trial tenant (125 TB pooled storage, multi-user OneDrive)
Companion to: feedback_xlab_studio_tenant_naming_taxonomy_2026-06-02.md, project_xlab_studio_m365_dev_tier_keepalive_2026-06-02.md, feedback_gvr_setup_remotes_python_patch_fix_2026-06-02.md, feedback_microsoft_admin_app_permissions_gotchas_2026-06-02.md


Summary

A day spent turning the xlab.studio tenant from an “it works” prototype into a documented, scriptable substrate. Three categories of work: (1) a two-tier naming taxonomy registered for tenant users, (2) a token-refresh fix that eliminates a class of OAuth hangs, (3) three Microsoft-side gotchas captured so they don’t cost time again.

1. Two-tier naming taxonomy registered

The xlab.studio tenant now has explicit semantics for what kind of user belongs where:

Tier | Domain | Purpose | Naming pattern
Tier 1 | @gf.cx | Persons: people, roles humans inhabit | normal-looking addresses (admin@gf.cx, dan@gf.cx)
Tier 2 | @xlab.studio | Data ingest accounts: service identities that own data flows | <service>_<owner>[_<scope>]@xlab.studio

Registered Tier-2 service codes: icloud, google, dropbox, timemachine. Future services extend the list rather than improvising.

Concrete pattern:
- icloud_audrey_lam@xlab.studio: Audrey’s iCloud → her OneDrive
- google_smart_sellars@xlab.studio: smart.sellars Google Takeout → admin OneDrive (current overnight transfer)

Underscores go in the UPN itself; hyphens go in derived rclone remote names (rclone auto-translates _ to - for remote naming).
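The naming rule is mechanical enough to encode. Below is a minimal sketch, not the tenant's actual tooling: the function names tier2_upn and remote_name are hypothetical, and it assumes the derived remote name is simply the UPN's local part with underscores swapped for hyphens (any prefix the real scripts add is not shown here).

```python
import re

# Service codes registered in the taxonomy; extend this set rather than improvising.
REGISTERED_SERVICES = {"icloud", "google", "dropbox", "timemachine"}
TIER2_DOMAIN = "xlab.studio"

# Assumption: each part is lowercase alphanumerics, optionally joined by underscores.
_PART = re.compile(r"^[a-z0-9]+(?:_[a-z0-9]+)*$")


def tier2_upn(service, owner, scope=None):
    """Build a Tier-2 UPN of the form <service>_<owner>[_<scope>]@xlab.studio."""
    if service not in REGISTERED_SERVICES:
        raise ValueError(f"unregistered service code: {service!r}")
    parts = [service, owner] + ([scope] if scope else [])
    for part in parts:
        if not _PART.match(part):
            raise ValueError(f"invalid name part: {part!r}")
    return "_".join(parts) + "@" + TIER2_DOMAIN


def remote_name(upn):
    """Derive a hyphenated remote name from the UPN's local part (assumed rule)."""
    local_part = upn.split("@", 1)[0]
    return local_part.replace("_", "-")
```

For example, tier2_upn("icloud", "audrey_lam") gives icloud_audrey_lam@xlab.studio, and remote_name of that gives icloud-audrey-lam. Rejecting unregistered service codes up front is what keeps new accounts from drifting off-pattern.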

2. The gvr-setup-remotes.sh fix

The script that mints all GVR OneDrive remotes (refreshes OAuth tokens for each xlab.studio user) was hanging in parallel runs. Root cause was buried: rclone config create validates the token via a Microsoft Graph API call, and when that call fails it silently falls back to interactive OAuth — which sits forever waiting for a browser that doesn’t exist when running detached/parallel.

Fix: replaced the rclone config create block with a direct Python file patch on rclone.conf. The script now mints tokens via curl + writes them straight into the conf with configparser. No more validation step, no fallback path, no interactive prompt.
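A rough sketch of the conf-patch step, for orientation only. It is not the patched script itself: the function name patch_remote_token is hypothetical, and it assumes the token endpoint's JSON response (access_token, refresh_token, expires_in) has already been fetched, for instance by the curl call. The token layout written here follows the JSON shape rclone stores for OAuth remotes.

```python
import configparser
import json
from datetime import datetime, timedelta, timezone


def patch_remote_token(conf_path, remote, token_response, now=None):
    """Write a freshly minted token into one remote's section of rclone.conf.

    token_response is the parsed JSON from the OAuth token endpoint
    (assumed to be fetched elsewhere). Nothing here calls rclone or the network.
    """
    now = now or datetime.now(timezone.utc)
    expiry = now + timedelta(seconds=int(token_response["expires_in"]))
    token = {
        "access_token": token_response["access_token"],
        "token_type": token_response.get("token_type", "Bearer"),
        "refresh_token": token_response["refresh_token"],
        "expiry": expiry.isoformat(),  # RFC 3339 timestamp with UTC offset
    }

    # interpolation=None so "%" characters in values are left untouched.
    conf = configparser.ConfigParser(interpolation=None)
    conf.read(conf_path)
    if not conf.has_section(remote):
        # A brand-new OneDrive remote also needs drive_id and drive_type;
        # those are not known here and would have to be set by the caller.
        conf.add_section(remote)
        conf.set(remote, "type", "onedrive")
    conf.set(remote, "token", json.dumps(token))

    with open(conf_path, "w") as fh:
        conf.write(fh)
```

One caveat with this approach: configparser drops comments when it rewrites the file, so any hand-written notes in rclone.conf would not survive a patch run. Existing keys in the section (drive_id, drive_type) are preserved.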

Behaviour | Before | After
Solo run | Worked, took 5-10 sec | Works, takes 5-10 sec
Parallel run from launchd | Hung for hours | Works, same 5-10 sec
OAuth fallback risk | Always present | Eliminated

Backup at ~/bin/gvr-setup-remotes.sh.bak.2026-06-02. Tonight’s smart.sellars transfer running on the GCP VM uses the post-fix pattern (its own self-contained Python token mint, same approach as the patched script).

3. Microsoft-side gotchas captured

Three things Microsoft does that cost time if you don’t know:

Gotcha 1 — Two “Sites.FullControl.All” permissions, both required

Microsoft Graph has Sites.FullControl.All. The SharePoint REST API has its own Sites.FullControl.All. They are separate permissions on separate APIs. Granting one doesn’t grant the other. When provisioning an app for tenant-wide SharePoint/OneDrive admin, grant both.
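A quick pre-flight check can catch the one-sided grant. The sketch below is illustrative: the function name apis_missing_app_roles is hypothetical, and it inspects the requiredResourceAccess list from an app registration manifest (obtained however you like). It only confirms that each API has at least one application ("Role") permission requested; it does not verify which permission it is, nor that admin consent was granted.

```python
# Well-known resource app IDs for the two APIs.
GRAPH_APP_ID = "00000003-0000-0000-c000-000000000000"
SHAREPOINT_APP_ID = "00000003-0000-0ff1-ce00-000000000000"


def apis_missing_app_roles(required_resource_access):
    """Return the names of the APIs with no application permission requested.

    required_resource_access is the manifest's requiredResourceAccess list:
    [{"resourceAppId": ..., "resourceAccess": [{"id": ..., "type": ...}]}, ...]
    """
    wanted = {GRAPH_APP_ID: "Microsoft Graph", SHAREPOINT_APP_ID: "SharePoint"}
    present = set()
    for entry in required_resource_access:
        # "Role" marks application permissions; "Scope" marks delegated ones.
        roles = [
            access
            for access in entry.get("resourceAccess", [])
            if access.get("type") == "Role"
        ]
        if roles:
            present.add(entry.get("resourceAppId"))
    return [name for app_id, name in wanted.items() if app_id not in present]
```

An empty list means both APIs have application permissions on the registration; anything else names the side that was skipped.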

Gotcha 2 — Personal-site provisioning is delegated-auth only

OneDrive personal-site provisioning (creating a user’s OneDrive on first access) cannot be done by an app-only auth flow, regardless of permissions held. Microsoft enforces this server-side. Either provision via delegated auth (a real user signing in with admin consent) or have the user sign in to OneDrive once themselves.

Gotcha 3 — 1Password client_secret field must hold the “Value”, not the “Secret ID”

The Microsoft Entra admin portal shows BOTH “Secret ID” (a GUID identifier) and “Value” (the actual secret string) when you create a client secret. The Secret ID is what you’d default to copying because it’s labelled first. Use Value. If you store the Secret ID instead, every auth attempt returns AADSTS7000215 (“invalid client secret”), which reads like a credential typo even though the credential is structurally well-formed.
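Since the Secret ID is a GUID, a shape check catches the mix-up before any auth attempt. A minimal heuristic sketch, assuming real secret Values are never GUID-shaped; the function name looks_like_secret_id is hypothetical and reading the field out of 1Password is left out.

```python
import re

# Standard 8-4-4-4-12 hexadecimal GUID layout.
_GUID = re.compile(
    r"^[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}$"
)


def looks_like_secret_id(client_secret):
    """Return True when the stored string has the shape of a GUID.

    A GUID-shaped client_secret is almost certainly the Secret ID,
    not the Value (assumption: Values are not GUID-shaped).
    """
    return bool(_GUID.match(client_secret.strip()))
```

Running this over every stored client_secret field is one cheap way to carry out the audit without waiting for an AADSTS7000215 to surface the problem.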

Why this work compounds

xlab.studio is the storage spine for every data flow in this portfolio: iCloud photo migrations, Google Takeout archives, future Dropbox/Time Machine flows. Today’s work converts the spine from “Dan can operate it” to “documented enough that scripts and future-Dan can operate it without rediscovering each gotcha”. The 125 TB pool was never the limit. Operator clarity was.

Decisions logged

  1. Adopt the two-tier taxonomy now, not later. Every new ingest account follows the pattern from this point.
  2. Don’t retry rclone config create for OneDrive remotes. The Python conf-patch pattern is the canonical path forward.
  3. Audit existing client_secret items in 1P for the Value-vs-Secret-ID gotcha next time a refresh comes around.
  4. Track Microsoft’s permission split as a portable lesson — applies to any vendor where the “same name” permission exists on multiple APIs.

What’s next

Source: dare_strategy_xlab_studio_tenant_taxonomy_2026-06-02.md · Rendered 2026-06-02 18:38