dashboard.audreyinc.com — A/B testing + editorial + Google/Meta ads experimentation surface (parked sketch)

DARE.CO.UK · PARKED SKETCH · 2026-05-18

Mirrored from ~/.claude/.../memory/project_dashboard_audreyinc_com_ab_editorial_ads_parked.md. This is a design sketch parked for future build — read for context, not as a current deliverable.

2026-05-18 sketch. dashboard.audreyinc.com is reserved for commerce / experimentation (distinct from health.audreyinc.com which hosts Site Health metrics). This surface needs plumbing for A/B testing landing pages + editorial copy variants + Google/Meta ad-creative testing + organic social-pattern testing, with conversion attribution back to Shopify. Unparks the previously-parked dare A/B framework (dare lacked traffic; audrey has commerce). Resume on Dan’s first commerce-experimentation engagement, OR when stages 4-8 of the audrey commerce flywheel come online.

Dan, 2026-05-18 evening: “dashboard.audreyinc.com has a landing page, but we need to start and sketch the plumbing/schematics for how A/B testing, editorial and ways to test google and meta ad’s can be designed and deployed, to evaluate copy/ideas/social patterns.”

Subdomain delineation (clarified 2026-05-18)

health.audreyinc.com — Site Health metrics surface. Tripwires, GSC markers, content-quality signals. Pattern lifted from dare’s Site Health row + click-through-to-latest-authoritative-report.
dashboard.audreyinc.com — commerce / experimentation surface. A/B tests, editorial variants, ad attribution, social patterns. This sketch.
bookings.audreyinc.com — Phase A booking page (live; per project_booking_product_evolution.md).

The three surfaces talk to overlapping data (same Shopify backend, same GSC + GA4) but serve different mental modes: health = “is the site OK,” dashboard = “what’s working,” bookings = “transact.”

Architecture — 5-stage experimentation pipeline

[Pipeline diagram: DEFINE → ASSIGN → TRACK → AGGREGATE → REPORT]

Builds on project_ab_testing_cf_workers_parked.md (originally parked because dare lacked traffic — audrey has commerce, so the trigger is now met).

Stage 1: DEFINE — what’s being tested

Variants live in CF KV namespace audrey_experiments. Each variant entry:

[Variant entry schema block not preserved in this copy.]

Variants seeded via a small CLI helper (audrey_experiment_define.py) that writes to KV via Worker API. No UI for v1.
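
As a minimal sketch of what a helper in the spirit of audrey_experiment_define.py might do before writing to KV: validate the variant set and serialise it. The field names (experiment_id, surface, status, variants, weight), the requirement for a "control" variant, and the experiment:<id> key convention are all illustrative assumptions, not the parked schema. The actual write through the Worker API is deliberately left out.

```python
import json


def build_experiment_entry(experiment_id, surface, variants):
    """Validate a variant set and return (kv_key, kv_value_json).

    `variants` maps variant_id -> integer weight in percent.
    All field names here are illustrative assumptions.
    """
    if "control" not in variants:
        raise ValueError("variant set needs a 'control' entry")
    if any(weight < 0 for weight in variants.values()):
        raise ValueError("weights must be non-negative")
    if sum(variants.values()) != 100:
        raise ValueError("weights must sum to 100")

    entry = {
        "experiment_id": experiment_id,
        "surface": surface,
        "status": "active",
        "variants": [
            {"variant_id": vid, "weight": weight}
            for vid, weight in variants.items()
        ],
    }
    # Hypothetical key convention: one KV key per experiment.
    return f"experiment:{experiment_id}", json.dumps(entry, sort_keys=True)


key, value = build_experiment_entry(
    "homepage_hero_2026_q2", "/", {"control": 50, "luxury_craft_a": 50}
)
print(key)
```

Rejecting a bad weight split at define time keeps the Worker's assignment step simple, since it can then trust that weights sum to 100.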

Stage 2: ASSIGN — traffic split + stickiness

CF Worker bound to audreyinc.com/* routes. For requests matching an active experiment’s surface:

Read _audrey_exp cookie. If present and valid → serve that variant.
If absent → hash visitor ID (IP + UA → SHA-256, first 8 bytes) % 100; assign per weight; set sticky cookie (30-day TTL).
Rewrite the response HTML using HTMLRewriter (CF-native): swap hero copy / image / CTA based on variant.
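
The cookie check and hash-based split above can be sketched as follows. This is Python for illustration only; the real logic would run inside the Worker. The "|" separator between IP and UA, the big-endian reading of the first 8 bytes, and the function names are assumptions.

```python
import hashlib


def bucket_for(ip: str, user_agent: str) -> int:
    """Map a visitor to a stable bucket in [0, 100)."""
    digest = hashlib.sha256(f"{ip}|{user_agent}".encode("utf-8")).digest()
    # First 8 bytes of the SHA-256 digest, read as an unsigned integer.
    return int.from_bytes(digest[:8], "big") % 100


def assign_variant(bucket: int, weights: list[tuple[str, int]]) -> str:
    """Walk cumulative weights (percent, summing to 100) and pick a variant."""
    upper = 0
    for variant_id, weight in weights:
        upper += weight
        if bucket < upper:
            return variant_id
    raise ValueError("weights must sum to 100")


def variant_for_request(cookie_value, ip, user_agent, weights):
    """Return (variant_id, needs_cookie). A valid cookie always wins."""
    valid = {variant_id for variant_id, _ in weights}
    if cookie_value in valid:
        return cookie_value, False
    return assign_variant(bucket_for(ip, user_agent), weights), True
```

Because the bucket is derived from IP + UA, a visitor who loses the cookie but keeps the same network and browser lands in the same variant again; a change of either can reassign them, which is one reason the sticky cookie is still needed.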

For ad-creative tests: rewriting isn’t needed — the variant IS the ad creative; the Worker only needs to capture the gclid/fbclid and pass it through to Shopify’s add-to-cart endpoint with the variant tag.

For editorial tests (newsletter subject lines, social captions): the Worker isn’t involved — these are tested in the publishing tool itself (Mailchimp / Klaviyo / Buffer / whatever audrey uses). The pipeline just RECORDS the variant tag against the conversion outcome.

Stage 3: TRACK — conversion events

Per-event, fire to multiple destinations:

Shopify Analytics — native conversion attribution (add_to_cart, purchase). Inject _audrey_exp into Shopify cart attributes so it carries through the funnel.
GA4 — custom event with experiment_id + variant_id parameters. Enables Looker Studio reporting.
Meta Pixel — same shape; required for Meta’s audience targeting to learn off the variants.
Google Ads conversion endpoint — if the test surface is a Google Ad landing.
CF Analytics Engine — our own time-series store for fast / cheap queries on traffic-split health (per-variant request count, error rate, etc.). Free tier covers our volume.

A small audrey_event_logger.js library in the Worker handles the fan-out: one event in → five destinations out (the five listed above). Idempotent (same event ID → no duplicate fires).
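
The idempotent fan-out can be sketched as below. It is written in Python to illustrate the logic only; audrey_event_logger.js itself would be JavaScript. The EventLogger class, the event_id field, and the in-memory delivered-set are assumptions. In a Worker the delivered-set would likely need to live in shared storage rather than in memory, since memory is not guaranteed to persist between requests.

```python
class EventLogger:
    """Fan one event out to every destination, at most once per event ID."""

    def __init__(self, destinations):
        # destinations: name -> callable(event) that sends to that backend.
        self.destinations = destinations
        # (event_id, destination name) pairs that were delivered successfully.
        self._delivered = set()

    def log(self, event):
        """Send `event` to each destination; return the names reached this call."""
        sent = []
        for name, send in self.destinations.items():
            key = (event["event_id"], name)
            if key in self._delivered:
                continue  # already delivered for this event ID: no duplicate fire
            try:
                send(event)
            except Exception:
                continue  # leave unmarked so a retry can reach this destination
            self._delivered.add(key)
            sent.append(name)
        return sent


# Usage with stand-in destinations that just collect events in lists.
ga4_events, pixel_events = [], []
logger = EventLogger({"ga4": ga4_events.append, "meta_pixel": pixel_events.append})
event = {"event_id": "evt-1", "experiment_id": "homepage_hero_2026_q2",
         "variant_id": "control", "name": "add_to_cart"}
first = logger.log(event)
second = logger.log(event)  # same event ID: nothing is re-sent
```

Tracking delivery per destination, rather than per event, means one failing endpoint does not block the others and can be retried on its own.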

Stage 4: AGGREGATE — daily roll-ups

Logpush from CF Analytics Engine → BigQuery audrey.experiments.events_raw (free for CF, ~$0 GCS storage at our volume).

Daily Cloud Run job (sibling to dare-devreports per the existing pattern):
Reads raw events for the previous UTC day.
Pivots into audrey.experiments.daily table: (experiment_id, variant_id, date, impressions, conversions, conversion_rate, lower_95ci, upper_95ci).
Computes statistical significance per pairwise comparison (chi-square or two-proportion z-test). Outputs winner_recommendation field: null | "control" | "<variant_id>" | "no_winner_yet".
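
A minimal sketch of the per-pair statistics, using only the standard library. The sketch does not say how lower_95ci / upper_95ci are computed, so the Wilson score interval here is an assumption, as are the function names, the alpha of 0.05, and the reading of null as "an arm has no impressions yet".

```python
import math


def wilson_interval(conversions, impressions, z=1.96):
    """Wilson score interval for a conversion rate (z=1.96 gives ~95%)."""
    if impressions <= 0:
        raise ValueError("impressions must be positive")
    p = conversions / impressions
    denom = 1 + z * z / impressions
    centre = (p + z * z / (2 * impressions)) / denom
    half = z * math.sqrt(
        p * (1 - p) / impressions + z * z / (4 * impressions ** 2)
    ) / denom
    return max(0.0, centre - half), min(1.0, centre + half)


def recommend_winner(control, variant, variant_id, alpha=0.05):
    """Two-proportion z-test; control/variant are (conversions, impressions)."""
    (c1, n1), (c2, n2) = control, variant
    if n1 == 0 or n2 == 0:
        return None  # assumed meaning of null: not enough data to compare
    pooled = (c1 + c2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    if se == 0:
        return "no_winner_yet"  # both arms at 0% or 100%: nothing to separate
    z = (c2 / n2 - c1 / n1) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided p-value
    if p_value >= alpha:
        return "no_winner_yet"
    return variant_id if z > 0 else "control"
```

Re-running a fixed-alpha test every day inflates the false-positive rate, so a real job would likely also want a minimum sample size or a pre-set stopping rule before emitting a winner.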

Stage 5: REPORT — dashboard.audreyinc.com surface

Reuses every UI pattern shipped on dare’s dashboard today:

Card-row of running experiments — each card a single experiment. Headline number = current leading variant’s conversion lift. Pill verdict: RUNNING / CONCLUSIVE WINNER / INSUFFICIENT DATA / STOPPED. Sparkline = the time-series of conversion-rate delta over the test window. Click-through to per-experiment detail report.
Window toggle (24h / 7d / 30d / 90d) — same shape as Edge Health. Different windows reveal different stories: 24h surfaces today’s spend efficiency; 30d/90d show the engagement-arc Dan named in feedback_window_toggles_are_high_value.md.
Smoothed-area trend chart (per feedback_smoothed_area_chart_over_time.md) — overlay of total impressions × converted impressions per experiment. The visual idiom from NextDNS’s chart applied to audrey’s commerce data.
Editorial-test results table — newsletter / social / caption variant winners with confidence intervals.
Ad-attribution row — Google + Meta spend × conversions × CAC per variant. Highlights underperforming ad creative that can be paused.
Social-pattern panel — organic post variants per channel (Instagram / Threads / Twitter/X / Pinterest), engagement deltas. Cross-references with which posts drove site visits via UTM.
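
The headline lift on each card and the CAC figures in the ad-attribution row reduce to two small calculations, sketched here. The function names, the row fields, and the max_cac pause threshold are hypothetical; the sketch does not define how "underperforming" is decided.

```python
def relative_lift(control_rate, variant_rate):
    """Relative conversion lift of a variant over control, as a fraction."""
    if control_rate == 0:
        return None  # lift is undefined against a zero baseline
    return (variant_rate - control_rate) / control_rate


def cac(spend, conversions):
    """Customer acquisition cost; None when nothing has converted yet."""
    if conversions == 0:
        return None
    return spend / conversions


def pause_candidates(rows, max_cac):
    """Variant IDs whose CAC exceeds max_cac, or that spent without converting."""
    flagged = []
    for row in rows:
        cost = cac(row["spend"], row["conversions"])
        if cost is None:
            if row["spend"] > 0:
                flagged.append(row["variant_id"])
        elif cost > max_cac:
            flagged.append(row["variant_id"])
    return flagged


rows = [
    {"variant_id": "copy_a", "spend": 120.0, "conversions": 4},
    {"variant_id": "copy_b", "spend": 120.0, "conversions": 1},
    {"variant_id": "copy_c", "spend": 60.0, "conversions": 0},
]
flagged = pause_candidates(rows, max_cac=50.0)
```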

Authentication

dashboard.audreyinc.com should be CF-Access gated (same pattern as dashboard.dare.co.uk). Dan + collaborators by email; service token for CLI / agent access.

Surfaces being tested — initial scope

When the build unparks, these are the priority experiment surfaces (~6 worth shipping in v1):

Homepage hero copy variants — register/voice tests. The DARE-style 4-pillar framing won’t fit; audrey’s voice is more luxury-craft. Workshop the variant set with chat-Claude before defining.
Product page descriptions — short-vs-medium-vs-long form. Connects to the GSC crawled-not-indexed cohort (19 audrey pages).
Add-to-cart CTA microcopy — single-word changes (“Add to bag” vs “Order” vs “Get yours”). Small variants, big delta historically.
Newsletter subject lines — variant pairs per send, tracked via Klaviyo or similar.
Google Ads creative — same product, three copy variants. Win → scale spend; lose → kill.
Meta Ads creative — Reels vs static vs carousel formats. Same product, three creative shapes.

Integration with existing audrey work

Shopify UCP/llms.txt (per project_shopify_ucp_audrey_native.md) — agent-discoverability surface is independent of the experimentation pipeline, but both feed the broader commerce flywheel.
Judge.me reviews (per project_audrey_reviews_app_install.md) — review counts + ratings per product become an experiment variable (does adding reviews-snippet to a product card move conversion?).
Booking phase A (per project_booking_product_evolution.md) — bookings page is a candidate experiment surface once it has traffic.

Why this gets built now (vs deferred indefinitely)

Per user_invest_in_content_quality_not_more_tooling.md: don’t build more tooling unless it directly supports content investment. This framework directly supports content investment — it’s how we know whether new content (rewritten product descriptions, new homepage hero, new ad creative) actually moves the needle. Without it, content investment is gut-feel; with it, content investment is measurable + iterable.

So: this is tooling in service of editorial work. The exception that proves the rule.

Cost envelope (verified-or-best-guess)

Component	Cost shape
CF Workers requests	~$0 at our volume (under free tier)
CF KV reads/writes	~$0
CF Analytics Engine	~$0
Logpush → BigQuery	CF Logpush free; GCS storage ~$0.02/GB/mo (negligible at our volume)
BigQuery query cost	first 1 TB scanned/mo free; we’ll scan MB
Cloud Run daily aggregator	~$0 at our run frequency
Meta + Google Ads conversion endpoints	free (just API calls)
Shopify analytics native	included in Shopify plan
Total recurring	~$0/mo at audrey’s expected volume

Open design questions

HTMLRewriter vs JS-side variant rendering. Worker rewrite is faster + works without JS; JS-side allows richer client-side experiments (animation variants, interaction variants). v1 = HTMLRewriter; v2 = JS hook if needed.
Statistical significance method. Frequentist (z-test) vs Bayesian. Frequentist is simpler + matches GA4’s native reporting. v1 = frequentist; v2 = Bayesian if we want richer “probability of winner” reads.
Stickiness duration. 30 days is generous; 14 might be better for fast-cycling tests. Calibrate after v1.
Holdout group. Should we reserve 10% of traffic as a control-of-controls so we can measure baseline drift independent of any active experiment?
Editorial-test integration shape. Newsletter / social variants happen in OTHER tools (Klaviyo / Buffer). Do we ingest from their APIs, or do we just record variant tags + match conversions by UTM?

Sibling memories

project_ab_testing_cf_workers_parked.md — original A/B framework, parked for lack-of-traffic on dare. This sketch is the audrey-flavoured unpark.
project_audrey_commerce_flywheel.md — 8-stage chain; this dashboard is the instrumentation layer for stages 4-8 (reviews → JSON-LD → SERP → programmatic recommendation → conversion).
feedback_window_toggles_are_high_value.md — window-toggle pattern lifts into the experiment-status cards.
feedback_smoothed_area_chart_over_time.md — visual idiom for the per-experiment trend chart.
feedback_show_the_future_grey_it_out.md — pending-state cards for experiments not yet started.
feedback_internal_seo.md — experiment IDs follow descriptive-slug naming (homepage_hero_2026_q2 not exp_001).
user_invest_in_content_quality_not_more_tooling.md — this framework qualifies as tooling-in-service-of-content per the exception clause.

Resume conditions

✅ First commerce-experimentation engagement starts (Dan begins testing a homepage variant or ad creative).
✅ Stages 4-8 of project_audrey_commerce_flywheel.md come online (reviews → JSON-LD → SERP visibility creates A/B-testable surface).
✅ Client engagement asks for a commerce-experimentation dashboard (lifts the pattern as portfolio-portable, like Site Health did today).
Earliest qualifying trigger gets v1 build; subsequent triggers exercise the cross-site portability.

Source: parked_sketch_dashboard_audreyinc_com_ab_editorial_ads_2026-05-18.md · Rendered 2026-05-18 12:53