S3 → R2 stray-image promote — preview & decision
S3-ONLY STRAY-IMAGES · PROMOTE PREVIEW · 11 MAY 2026
TL;DR
- 25 stray basenames identified by yesterday’s dare_lost_files_audit_cdn-dare-co-uk_2026-05-10: referenced in repo HTML/CSS, present in S3, missing from R2, so they currently 404 for live visitors.
- 20 of 25 auto-slug cleanly via the SEO image-naming convention (lowercase, hyphenated, .jpeg → .jpg, resolution suffix stripped, stop words dropped).
- 5 of 25 are flagged for human judgment: numeric-only IDs, single-word slugs, typo-shape names. Listed below with options.
- Three execution modes available: basename (preserve refs, zero risk), slug with auto-proposed names, or slug after manual override of the 5 flagged.
- Recommended: basename mode tonight (closes the 404 gap immediately); revisit slug renames as a focused later pass when source images are in hand.
- Blocked on: aws s3 sync s3://cdn.dare.co.uk ~/var/dare/s3-archive/ completion (currently in flight). The promote script operates on the local archive once populated.
Background — why this exists
The legacy cdn.dare.co.uk S3 bucket holds 15,315 objects / 436 MB of WordPress-era image content. The live CDN at images.dare.co.uk is now backed by Cloudflare R2 (bucket dare-images). During the migration, 25 image basenames were left behind — referenced in repo HTML but never copied to R2. Visitors hit images.dare.co.uk/posts/<basename> and get a 404 because R2 doesn’t have them, even though the file IS still in S3.
The fix: promote those 25 from S3 to R2 so the existing repo references stop 404-ing. Per today’s session report, this is the highest-leverage, lowest-risk move out of the five S3-succession tasks (the others — mirror 97 R2-only, archive 3,541 orphans, triage 42 lost, retire bucket — chain off the local-archive snapshot we’re building now).
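The selection rule described here (referenced in the repo, present in S3, absent from R2) reduces to set arithmetic. A minimal sketch, assuming the three basename sets have already been collected; the function name and the sample data are hypothetical and are not taken from the audit script.

```python
def find_strays(referenced: set[str], in_s3: set[str], in_r2: set[str]) -> list[str]:
    """Return basenames that are referenced and still in S3 but missing from R2."""
    return sorted((referenced & in_s3) - in_r2)


# Illustrative inputs only; the real sets would come from the repo scan
# and the two bucket listings.
referenced = {"kayak.jpg", "evolve.jpg", "logo.png"}
in_s3 = {"kayak.jpg", "evolve.jpg", "old-banner.jpg"}
in_r2 = {"evolve.jpg", "logo.png"}

print(find_strays(referenced, in_s3, in_r2))  # ['kayak.jpg']
```

Anything referenced but absent from both buckets falls outside this set, which is presumably what the separate "lost" triage task covers.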
The 25 stray basenames
| # | Original basename | Proposed slug | Transformations | Flags |
|---|---|---|---|---|
| 1 | 005-300x292.jpg | 005.jpg | stripped -300x292 | ⚠ numeric-only, needs human rename for SEO value; ⚠ very short |
| 2 | 83emptycar.jpg | 83emptycar.jpg | none | none (no separators to parse) |
| 3 | Tuerkischer_schachspieler_windisch4.jpg | tuerkischer-schachspieler-windisch4.jpg | lowercase, underscores → hyphens | none |
| 4 | afternoon-tea.jpg | afternoon-tea.jpg | none | none |
| 5 | apple-5th-ave.jpeg | apple-5th-ave.jpg | .jpeg → .jpg | none |
| 6 | campervan.jpg | campervan.jpg | none | none |
| 7 | dolomites-italy.jpeg | dolomites-italy.jpg | .jpeg → .jpg | none |
| 8 | eugeniorecuenco.jpg | eugeniorecuenco.jpg | none | none |
| 9 | evolve.jpg | evolve.jpg | none | none |
| 10 | gauchos-in-patagonia.jpg | gauchos-patagonia.jpg | dropped stop word "in" | none |
| 11 | halle-berry-tie.jpg | halle-berry-tie.jpg | none | none |
| 12 | henri-cartier-bresson-300x191.jpg | henri-cartier-bresson.jpg | stripped -300x191 | none |
| 13 | kayak.jpg | kayak.jpg | none | ⚠ very short slug |
| 14 | lesmoking.jpg | lesmoking.jpg | none | none (could be le-smoking.jpg) |
| 15 | nigel-text.jpg | nigel-text.jpg | none | none |
| 16 | perla-large.jpeg | perla-large.jpg | .jpeg → .jpg | none |
| 17 | safe2tv-772039.jpg | safe2tv-772039.jpg | none | (opaque numeric ID; would benefit from a real noun) |
| 18 | staticmap.png | staticmap.png | none | (generic; <somewhere>-map.png would rank better) |
| 19 | steven-soderbergh-259x300.jpg | steven-soderbergh.jpg | stripped -259x300 | none |
| 20 | stunts-pylon.jpg | stunts-pylon.jpg | none | none |
| 21 | syd-mead-blade-runner.jpeg | syd-mead-blade-runner.jpg | .jpeg → .jpg | none |
| 22 | terroni.jpeg | terroni.jpg | .jpeg → .jpg | none |
| 23 | toyko.jpeg | toyko.jpg | .jpeg → .jpg | ⚠ possible typo: toyko → tokyo? |
| 24 | u2-no-line-on-the-horizon.jpg | u2-no-line-horizon.jpg | dropped stop words "on", "the" | none |
| 25 | ugg-close-up.jpeg | ugg-close-up.jpg | .jpeg → .jpg | none |
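The transformations in the table can be reproduced with a few string operations. The sketch below is an approximation of the naming convention, not the promote script's actual code: the function name, the regex and especially the stop-word list are assumptions, chosen only so that the rows above come out as shown.

```python
import re
from pathlib import PurePosixPath

# Assumed stop-word list; the convention's real list is not given in this report.
STOP_WORDS = {"a", "an", "and", "in", "of", "on", "the"}
# WordPress-style resolution suffix such as "-300x292" at the end of the stem.
RESOLUTION_SUFFIX = re.compile(r"-\d+x\d+$")


def propose_slug(basename: str) -> str:
    """Lowercase, normalise .jpeg, strip the resolution suffix, hyphenate, drop stop words."""
    path = PurePosixPath(basename)
    stem, ext = path.stem.lower(), path.suffix.lower()
    if ext == ".jpeg":
        ext = ".jpg"
    stem = RESOLUTION_SUFFIX.sub("", stem)
    stem = stem.replace("_", "-")
    words = [w for w in stem.split("-") if w and w not in STOP_WORDS]
    return "-".join(words) + ext


for name in ("005-300x292.jpg", "u2-no-line-on-the-horizon.jpg", "toyko.jpeg"):
    print(name, "->", propose_slug(name))
```

A mechanical pass like this cannot catch the flagged cases: it turns toyko.jpeg into toyko.jpg without noticing the likely typo, which is why those five rows need a human decision.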
The 5 flagged-for-human-judgment cases
These auto-proposed slugs are mechanically correct but SEO-suboptimal. Listed with what would improve each:
| Original | Auto slug | If you remember the image | Honest default if you don’t |
|---|---|---|---|
| 005-300x292.jpg | 005.jpg | Give it a real noun: <subject>-005.jpg | Ship as 005.jpg; SEO marginal anyway |
| kayak.jpg | kayak.jpg | <context>-kayak.jpg (e.g. wooden-kayak.jpg, patagonia-kayak.jpg) | Ship as-is |
| safe2tv-772039.jpg | safe2tv-772039.jpg | The “772039” looks Flickr-ID-shaped; drop it and name what’s IN the image | Ship as-is |
| staticmap.png | staticmap.png | <place>-map.png (looked like a city map in the article context) | Ship as-is |
| toyko.jpeg | toyko.jpg | Almost certainly a typo for tokyo.jpg | Fix to tokyo.jpg; typos rank worse than the canonical spelling |
The toyko → tokyo typo fix is the strongest single signal in the table.
Three execution modes
| Mode | What it does | Repo edits needed? | Risk |
|---|---|---|---|
| --mode=basename | Upload all 25 at existing basenames into dare-images/posts/<basename> | None | Lowest: closes the 404 gap immediately, no slug ambiguity, no repo touch |
| --mode=slug (auto-proposed) | Apply the 25-row table above verbatim; upload at proposed slugs; emit a repo-rewrite sed plan | Yes: repo HTML/CSS needs find + sed against the basename→slug mapping | Medium: 5 flagged slugs are functional but suboptimal for SEO |
| --mode=slug (manual override for the 5 flagged) | Same as above, but you override the 5 flagged basenames before running | Yes | Best SEO outcome; requires 5 quick decisions |
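To make the "repo-rewrite sed plan" concrete, here is one way such a plan could be generated from a basename→slug mapping. This is a hedged sketch: the function names are hypothetical, the posts/ prefix is inferred from the URL pattern in this report, and the output format of the real script is not known.

```python
def sed_expression(old: str, new: str, prefix: str = "posts/") -> str:
    """Build one sed substitution; '#' is the delimiter so slashes need no escaping."""
    # Basenames here contain only letters, digits, '-', '_' and '.';
    # the dot is the only one that is special in a basic regex.
    escaped = old.replace(".", r"\.")
    return f"s#{prefix}{escaped}#{prefix}{new}#g"


def rewrite_plan(mapping: dict[str, str]) -> list[str]:
    """Return substitutions only for names that actually change."""
    return [sed_expression(old, new) for old, new in sorted(mapping.items()) if old != new]


# Three rows from the table above, for illustration.
mapping = {
    "005-300x292.jpg": "005.jpg",
    "kayak.jpg": "kayak.jpg",
    "toyko.jpeg": "toyko.jpg",
}
for expr in rewrite_plan(mapping):
    print(f"sed -e '{expr}'")
```

Anchoring each pattern on the posts/ prefix keeps a short name such as kayak.jpg from matching inside a longer basename; unchanged names produce no substitution at all.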
Recommendation
--mode=basename tonight. Reasoning:
- Closes the 404 gap immediately — visitors stop hitting the dead links the moment uploads complete (modulo Cloudflare edge cache TTL, which is short for 404s anyway).
- Zero repo edits — current HTML references stop 404-ing without us touching dare-co-uk’s source.
- Defers the SEO rename as a separate intentional pass — when you have the source images in hand to make the human-judgment calls cleanly, not under “let me ship something tonight” pressure.
- The 5 flagged cases mostly benefit from human eyeballs that see the actual image — easier to do as a separate scoped pass than as a sub-task here.
Slug-mode is a later improvement, not a blocker on tonight’s deploy. The repo-rewrite plan it would emit (sed against basenames) is generic and can be run any time post-promote.
Status
- ⏳ aws s3 sync s3://cdn.dare.co.uk ~/var/dare/s3-archive/ is currently in flight. Last check: 2,560 of ~15,315 files / 10 MB of ~436 MB synced. Resume:

```bash
# Sync runs via the new ~/bin/aws wrapper (op-injected creds, no plaintext);
# completes in 5–15 min depending on bandwidth.
aws s3 sync s3://cdn.dare.co.uk ~/var/dare/s3-archive/
```

- ⏳ Dry-run pending sync completion; operates on the local archive once populated:

```bash
python3 ~/bin/dare_s3_to_r2_promote.py --mode=basename
```

- ⏳ Execute pending dry-run approval; wraps in op run for the R2 token; uploads with Content-Type: image/jpeg (or the matching image/png, image/gif or image/webp type by extension) and Cache-Control: public, max-age=31536000, immutable:

```bash
op run --env-file=- -- python3 ~/bin/dare_s3_to_r2_promote.py --execute --mode=basename <<EOF
R2_ACCESS_KEY_ID=op://Private/R2 dare-pipeline-thumbs/access_key_id
R2_SECRET_ACCESS_KEY=op://Private/R2 dare-pipeline-thumbs/secret_access_key
R2_ENDPOINT=op://Private/R2 dare-pipeline-thumbs/endpoint
EOF
```

- ⏳ Post-execute verification: the script HEAD-checks each uploaded URL at https://images.dare.co.uk/posts/<basename> and reports 200/non-200 per file. It writes a dated execution report to ~/Downloads/dare_s3_to_r2_promote_basename_2026-05-11.md.
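The upload headers and the HEAD verification can be illustrated with the standard library alone. This is a sketch under assumptions, not the promote script's code: the function names and the extension table are hypothetical, and network failures other than HTTP error statuses are left to propagate.

```python
import urllib.error
import urllib.request
from pathlib import PurePosixPath

BASE_URL = "https://images.dare.co.uk/posts/"
CACHE_CONTROL = "public, max-age=31536000, immutable"
# Assumed extension-to-MIME table covering the types named in this report.
CONTENT_TYPES = {
    ".jpg": "image/jpeg",
    ".jpeg": "image/jpeg",
    ".png": "image/png",
    ".gif": "image/gif",
    ".webp": "image/webp",
}


def upload_headers(basename: str) -> dict[str, str]:
    """Headers to attach to an upload, chosen by file extension."""
    ext = PurePosixPath(basename).suffix.lower()
    return {"Content-Type": CONTENT_TYPES[ext], "Cache-Control": CACHE_CONTROL}


def head_status(url: str, timeout: float = 10.0) -> int:
    """Issue a HEAD request and return the HTTP status, including 4xx/5xx codes."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code


def verify(basenames, status_of=head_status) -> dict[str, int]:
    """Map each basename to the status returned for its public URL."""
    return {name: status_of(BASE_URL + name) for name in basenames}
```

Calling verify(["kayak.jpg", "toyko.jpeg"]) after the upload would return a status per file; the status_of parameter exists so the check can be exercised with a stub instead of live requests.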
Linked artefacts
- dare_lost_files_audit_cdn-dare-co-uk_2026-05-10
- dare_s3_inventory_cdn-dare-co-uk_2026-05-10
- Session report 2026-05-11
Generated 2026-05-11 14:33. Sync still in flight; dry-run + execute pending.