SG VM ingest playbook · when to use gf-cx-singapore

SIGNAL · NETWORK PLAYBOOK · 5 JUNE 2026

Decision rule for routing data into xlab.studio (APAC) OneDrive, based on the same-day empirical bake-off + the post-bake-off conversation. Quick-reference for future-Dan/future-Claude when designing a new ingest path: WHEN does routing via gf-cx-singapore actually help, and when does it just add hops?

The rule in one sentence

Route through gf-cx-singapore (vm-asia-gcp) only when the source data already lives in a fast-network position. Otherwise the home WAN upload is the floor regardless of route, and the VM’s ~100 MiB/s pipe sits idle waiting for bytes.

Decision table

| Source location | Use SG VM? | Path | Throughput floor |
| --- | --- | --- | --- |
| Google Workspace export (gcs-*:) | ✅ Yes | Mac signs admin export → SG VM reads from gs:// bucket → OneDrive | ~100 MiB/s (per-app+tenant ceiling) |
| Dropbox tenant (gvr-dropbox-*:) | ✅ Yes | Mac authorises Dropbox source → SG VM reads from Dropbox APAC peering → OneDrive | ~100 MiB/s |
| iCloud Takeout zips | ✅ Yes | Mac retrieves Apple-hosted download URL → SG VM curls via Apple CDN → OneDrive | ~100 MiB/s |
| Google Photos personal Takeout | ✅ Yes | takeout.google.com → 50 GB zip parts → SG VM curls → OneDrive | ~100 MiB/s |
| Home Mac Mini RAID (Immich, NAS, Time Machine, etc.) | ❌ No | Mac → OneDrive direct, 5-8 destination lanes in parallel | ~36 MiB/s (home fiber upload ceiling) |
| Files already on the Mac | ❌ No | Same as above: Mac is the source, WAN is the floor | ~36 MiB/s |

Why the home-LAN case doesn’t benefit

The SG VM is fast because it sits 2 ms from Microsoft’s Singapore POP (sg2.ntwk.msn.net) via GCP’s APAC backbone, as confirmed by the 2026-06-05 traceroute. That advantage applies only to the SG VM → OneDrive leg.

When source data lives on Dan’s home Mac Mini:

- Mac → SG VM via Tailscale runs over the same 300 Mbps home fiber (~36 MiB/s ceiling), then a trans-Pacific leg to Singapore.
- Mac → OneDrive direct runs over the same 300 Mbps home fiber to Microsoft Newark NJ POP (~22 ms), then MS backbone to APAC.
- Either way, the home fiber is the floor. The SG VM doesn’t speed up the Mac upload; it just adds a hop on the way.

For home-LAN-resident workloads, multi-destination parallelism from the Mac itself (5-8 distinct gvr-* destination drives in parallel) saturates the home WAN at ~36 MiB/s. This is the same approach the bake-off validated on the VM; the difference is that the cap that binds first is now the WAN, not the Microsoft per-principal cap.
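The bottleneck argument above can be checked with a few lines of arithmetic. This is a minimal sketch, not part of the bake-off tooling: the function names are hypothetical, and the only inputs are the figures quoted in this playbook (300 Mbps home fiber, ~100 MiB/s VM pipe).

```python
MIB = 1024 * 1024

def mbps_to_mib_per_s(mbps: float) -> float:
    """Convert a link rate in decimal megabits/s to binary MiB/s."""
    return mbps * 1_000_000 / 8 / MIB

def end_to_end_mib_per_s(*leg_rates: float) -> float:
    """A serial path can move data no faster than its slowest leg."""
    return min(leg_rates)

home_fiber = mbps_to_mib_per_s(300)  # home upload, from the 300 Mbps figure
vm_pipe = 100.0                      # SG VM -> OneDrive, figure quoted above

direct = end_to_end_mib_per_s(home_fiber)           # Mac -> OneDrive
via_vm = end_to_end_mib_per_s(home_fiber, vm_pipe)  # Mac -> SG VM -> OneDrive

print(f"home fiber {home_fiber:.1f} MiB/s, direct {direct:.1f}, via VM {via_vm:.1f}")
```

Both routes come out at the home fiber rate, which is why the relay adds a hop without adding throughput when the Mac is the source.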

Empirical proof — the same-day bake-off

Single Azure AD app + single service principal + N concurrent destination drives, run from gf-cx-singapore:

| Lanes | VM RAM | Aggregate MiB/s | Speedup |
| --- | --- | --- | --- |
| 1 | 8 GiB | 14.03 | 1.0× (baseline) |
| 5 | 8 GiB | 68.27 | 4.87× |
| 8 | 16 GiB | 101.14 | 7.21× · 91% of MS per-app+tenant ceiling |
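As a sanity check on the speedup column, the sketch below recomputes speedup and scaling efficiency from the measured aggregates. The `scaling` helper is a hypothetical name, and "efficiency" here means speedup divided by lane count (how close to linear the scaling is), which is a different measure from the 91%-of-ceiling figure in the table.

```python
# (lanes, aggregate MiB/s) as reported in the bake-off table
results = [(1, 14.03), (5, 68.27), (8, 101.14)]

baseline = results[0][1]  # single-lane throughput

def scaling(lanes: int, aggregate: float, base: float = baseline):
    """Return (speedup over one lane, fraction of perfect linear scaling)."""
    speedup = aggregate / base
    efficiency = speedup / lanes
    return speedup, efficiency

for lanes, aggregate in results:
    speedup, efficiency = scaling(lanes, aggregate)
    print(f"{lanes} lanes: {speedup:.2f}x speedup, {efficiency:.0%} of linear")
```

On these numbers, scaling stays within roughly 10% of linear up to 8 lanes, consistent with the per-app+tenant ceiling only starting to bind near the top of the range.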

VM resize 8→16 GiB took 80 seconds (stop / set-machine-type / start) on the existing audreylam $300 trial. Same tailnet IP, same disk, same rclone.conf. The audreylam trial covers ~1.8 months at n2-standard-4 rates with $255 of remaining credit.

Full bake-off detail at the Parallel Ingest Identities · Working Hypothesis report.

Operational shortcuts

Routing a new source into OneDrive? Ask first:

  1. Is the source already in a cloud provider’s APAC backbone? (Google Workspace = yes via GCS; Dropbox = yes via NTT Tokyo; Apple = yes via Apple CDN; Microsoft Teams Recordings = yes) → Use SG VM. Mint a destination drive on xlab.studio (gvr-<service>-<owner>@xlab.studio), one per shard, run N concurrent rclones from gf-cx-singapore.

  2. Is the source on a home device or in a US/EU-only cloud? → Skip the VM. Mac → OneDrive direct, 5-8 destination drives in parallel. Home WAN is the floor.

  3. Hybrid (e.g., Dropbox account but with media that originated on a home Mac)? → Use the VM. Once the data has reached Dropbox, it’s in a cloud network position regardless of where it started. The “fast-network position” rule cares about CURRENT location, not historical.
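The three questions above reduce to a single check on where the data currently lives. The sketch below is one hypothetical way to encode that; the `Source` enum, the `choose_route` function, and the route labels are illustrative names, not existing tooling.

```python
from enum import Enum

class Source(Enum):
    """Where the data lives right now (hypothetical categories)."""
    APAC_CLOUD = "apac_cloud"              # e.g. GCS export, Dropbox, Apple CDN
    HOME_DEVICE = "home_device"            # e.g. Mac Mini RAID, files on the Mac
    US_EU_ONLY_CLOUD = "us_eu_only_cloud"  # cloud with no APAC backbone presence

def choose_route(current_location: Source) -> str:
    """Return 'sg-vm' or 'mac-direct' based on CURRENT location only."""
    if current_location is Source.APAC_CLOUD:
        return "sg-vm"       # relay via gf-cx-singapore, N concurrent lanes
    return "mac-direct"      # Mac -> OneDrive, 5-8 destination drives in parallel

# Hybrid case: media that originated on a home Mac but now sits in Dropbox
# is classified by where it is today, so it takes the VM route.
print(choose_route(Source.APAC_CLOUD))
print(choose_route(Source.HOME_DEVICE))
```

The function deliberately takes no argument for where the data originated, which mirrors rule 3: history does not affect the routing decision.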

What this means for upcoming workloads

| Workload | Plan |
| --- | --- |
| Immich on home Mac Mini RAID (planned) | Mac → OneDrive direct, 5-8 lanes. Initial 5 TB seed: ~40 hours at fiber ceiling. Daily deltas: trivial trickle. OneDrive = off-site dedup + originals copy; Mac Mini RAID = source of truth. |
| Ongoing Workspace exports (audreyinc, audreylam, etc.) | SG VM relay continues to be the right path |
| Future Dropbox bulk migrations | SG VM relay |
| Photo libraries shot in-camera + landed on Mac | Mac → OneDrive direct (home-LAN source) |
| Photo libraries already in iCloud / Google Photos | SG VM via Takeout zips |
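The ~40 hour seed estimate for Immich can be reproduced with the sketch below. The `transfer_hours` helper is a hypothetical name, and the calculation assumes a sustained 36 MiB/s with no throttling or retries. Whether "5 TB" means binary or decimal units is not stated in this playbook, so both readings are shown.

```python
def transfer_hours(size_bytes: float, rate_mib_per_s: float) -> float:
    """Hours needed to move size_bytes at a sustained rate in MiB/s."""
    seconds = size_bytes / (rate_mib_per_s * 1024**2)
    return seconds / 3600

FIBER_CEILING = 36  # MiB/s, home fiber upload ceiling quoted in this playbook

seed_binary = transfer_hours(5 * 1024**4, FIBER_CEILING)  # 5 TiB
seed_decimal = transfer_hours(5 * 10**12, FIBER_CEILING)  # 5 TB

print(f"5 TiB: {seed_binary:.1f} h, 5 TB: {seed_decimal:.1f} h")
```

The binary reading gives about 40.5 hours and the decimal reading about 36.8 hours, so the ~40 hour figure holds as a planning number either way.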

Cross-references

Naming convention

This report uses Dan’s canonical names:


Generated 2026-06-05 evening. Synthesises the day’s full thread: broken-thumbs incident → tripwires → snapshots fix → traceroute → parallel ingest hypothesis → bake-off → VM resize → home fiber empirical → Immich architecture decision. Filed as the operational playbook so future Dan/Claude don’t re-derive the “when to use the VM” rule from first principles each time.

Source: dare_sg_vm_ingest_playbook_2026-06-05.md · Rendered 2026-06-05 23:14