audrey dashboard — brief template + voice contract

Date: 2026-05-13 Phase: 1 spec — the contract between data and voice Companion: audrey_dashboard_proposal_2026-05-13 Status: Spec. The 45-minute spike that comes after this writes the queries, runs the voice once, eyeballs.


The voice

Direction: Growth partner over morning espresso. Confident, action-biased, lightly impatient with status-reporting. Speaks in shopping-path language: paths, leaks, thresholds, friction points. Diagnoses fast and prescribes faster. Every 06:00 brief ends with one shippable test the operator can launch in Shopify before the 18:00 brief. Every 18:00 brief reads yesterday’s test verdict and threads it into tomorrow’s hypothesis.

Register tests: - Don’t moralise about flat numbers — diagnose, prescribe. - Don’t use the word “engagement.” - Reference at least one prior test or pattern by date (memory is the voice’s competence signal). - Name products, collections, paths specifically — not “users” or “the site.” - Sound like someone who has been watching this data for weeks and remembers what was tried Tuesday.

Sample paragraph (chat-Claude’s existence proof):

47 sessions hit the cashmere PDP, 4 reached checkout, 1 paid. The leak isn’t traffic, it’s the page. Flat-lay creative beat lifestyle on every collection card last week — PDP still opens with lifestyle. Today’s test: swap the hero on the top 3 PDPs to flat-lay, 50/50, 72h.

That’s the lock — specific, references prior pattern, ends with one shippable move.

Full voice prompt (copy into narrator.py of audrey-pipeline):

You are a growth partner reviewing audreyinc.com's data over morning espresso
with Dan. He runs the shop and has committed to A/B testing daily through
2026-12-31 (~200 experiments). You've been watching this data for weeks and
remember the tests that have shipped.

Voice: confident, action-biased, lightly impatient with status-reporting.
You speak in shopping-path language: paths, leaks, thresholds, friction
points, drop-off, hero variants, collection performance.

You do NOT use the word "engagement." You do not moralise about flat days —
you diagnose and prescribe.

Write a brief paragraph (~60-second read, ~140 words) in five elements:
1. VITALS — revenue, sessions, conversion rate, AOV vs 7d/28d baseline,
   one sentence.
2. WHAT SHIFTED — one or two real anomalies in the shopping path
   (collection view → PDP → cart → checkout → paid).
3. OPPORTUNITY — the leak or underexploited segment, named specifically
   (which product, which referrer, which collection, which path).
4. YESTERDAY'S TEST — verdict if one was running: ship it, kill it, or
   extend. If nothing ran, skip this element silently.
5. TODAY'S TEST — one concrete A/B that ships in Shopify before next brief.
   Format: variant change + duration + success metric.

The brief structured data follows. The most recent prior test (if any) is
under "yesterday_test". Reference it if relevant.

The contract — field → voice job

Each brief section pairs structured input data with the voice’s specific instruction. This is the load-bearing artefact: it defines minimum-data-needed (no more, no less).

Element 1 — Vitals

Field Source Voice job
revenue_today_gbp Shopify Admin: orders(query: "created_at:>=$today AND financial_status:paid") → sum totalPriceSet.shopMoney.amount State the headline number
revenue_7d_avg_gbp Same, query window >=$today-7d One delta-sentence: “up £X on the 7-day avg” or “down £Y vs last Wed”
revenue_28d_avg_gbp Same, window >=$today-28d Quiet baseline — only mention if today is >1σ off
orders_today Same, count of order nodes State the count
aov_today_gbp revenue_today_gbp / orders_today Mention if it’s moved by >5% from 7d avg
sessions_today CDN, security layer, and DNS provider sitting in front of dare.co.uk.">CF Analytics GraphQL: httpRequests1dGroups.uniq.uniques filtered by audreyinc.com zone + date=$today State sessions
conversion_rate_pct orders_today / sessions_today * 100 The single most important diagnostic number — always include

Shopify query shape:

query DailyOrders($today: String!) {
  orders(first: 100, query: $today, sortKey: CREATED_AT, reverse: true) {
    edges {
      node {
        id
        createdAt
        totalPriceSet { shopMoney { amount currencyCode } }
        customer { id numberOfOrders }
        lineItems(first: 10) {
          edges { node { quantity product { handle title } variant { id title price } } }
        }
        sourceName
        referringSite
      }
    }
  }
}
# $today = "created_at:>=2026-05-13T00:00:00Z AND financial_status:paid"

Sample narrator line for this element:

£10,246 today across 47 paid orders — AOV £218, up £14 on the 7-day. Conversion held at 0.98%.


Element 2 — What shifted

Field Source Voice job
top_paths_today CDN, security layer, and DNS provider sitting in front of dare.co.uk.">CF Analytics: httpRequestsAdaptiveGroups ordered by sum_visits_DESC, dimension clientRequestPath, limit 20 Identify which paths gained/lost traffic vs 7d avg
top_paths_7d_avg Same query, last 7 days averaged Baseline for comparison
top_collections Derived: paths matching /collections/* from top_paths Collection-level shifts
top_pdps Derived: paths matching /products/* from top_paths PDP-level shifts
top_products_today Shopify: lineItems.product.handle aggregated from today’s orders Product velocity (units sold)
top_products_7d Same, last 7 days Baseline product velocity
top_referrers_today CDN, security layer, and DNS provider sitting in front of dare.co.uk.">CF Analytics: same query, dimension clientRequestReferer Which channels are driving today
referrer_share_shift Today’s top-3 share vs 7d avg Channel-shift anomalies

Voice job composite: identify one or two real anomalies — not a list. The voice picks the most diagnostic shift and narrates it.

Sample narrator line:

The cashmere collection page jumped 38% on traffic but PDP click-through dropped — visitors aren’t moving from /collections/cashmere into product pages.


Element 3 — Opportunity

This element is mostly inference, not direct field reads. The voice constructs the opportunity from the shifts identified in Element 2 + prior pattern memory.

Input Source Voice job
Funnel snapshot Derived from Elements 1 + 2: sessions → PDP views → checkout starts → paid Identify which stage is leaking
Cart abandonment count Shopify: abandonedCheckouts(query: "created_at:>=$today") count + total value Flag if abandoned > 2× paid (leak signal)
Top abandoned line items Shopify: abandonedCheckouts.lineItems.product.handle Identify product-specific friction
Customer split Shopify: from order data, customer.numberOfOrders field — new vs returning Identify which cohort is moving
Prior pattern memory Read from experiments.json (flat file, see below) Reference one prior test/pattern if relevant

Voice job: name one specific opportunity. “The leak isn’t traffic, it’s the page” is the canonical shape — locates the friction, sets up the test.

Abandoned checkouts query:

query AbandonedCheckouts($since: String!) {
  abandonedCheckouts(first: 50, query: $since) {
    edges {
      node {
        id
        createdAt
        totalPriceSet { shopMoney { amount } }
        lineItems(first: 5) { edges { node { product { handle } quantity } } }
      }
    }
  }
}
# $since = "created_at:>=2026-05-13T00:00:00Z"

Sample narrator line:

The leak isn’t traffic, it’s the cashmere PDP — 47 sessions, 4 carts, 1 paid. Same pattern as last Wednesday before the price-anchor test moved the needle.


Element 4 — Yesterday’s test verdict

Field Source Voice job
yesterday_test.hypothesis Read from experiments.json flat file (see Deferred section) One-sentence verdict reference
yesterday_test.variant Same Identify which variant won/lost
yesterday_test.metric Same Report lift or null result
yesterday_test.recommendation Voice generates: ship / kill / extend The actionable verdict

Voice job: if yesterday’s test has reached its sample-size or duration threshold, render the verdict in one sentence. If no test was running, skip this element silently — don’t write a placeholder.

Sample narrator line (when a test ran):

Tuesday’s flat-lay-vs-lifestyle test on the silk twill collection cards is in — flat-lay won click-through by 6.2 points. Ship it across the rest of the collection.


Element 5 — Today’s test (the lock)

This element is purely voice-generated — no input fields. The voice picks ONE test from the opportunity diagnosis.

Required format constraint: 1. Variant change — what’s swapping for what (must be a Shopify-changeable thing: theme block, copy, image, CTA, price, badge, recommendation list) 2. Duration — how long it runs (typically 48–72h depending on traffic volume) 3. Success metric — what number determines win/lose

Voice prompt for this element:

Pick ONE test that ships in Shopify before next brief. Must be:
- Specifically actionable (which page, which element, which variant)
- Bounded in time (24h to 72h typical)
- Measurable (a number that determines win/lose)

Do NOT prescribe tests outside Shopify control (paid acquisition, off-site SEO).
Do NOT prescribe more than one test per brief.

Sample narrator line:

Today’s test: swap the hero on the top 3 cashmere PDPs from lifestyle to flat-lay, 50/50, 72h. Watch PDP→cart rate; ship the winner.


Deferred — the experiment ledger

Chat-Claude’s catch: “by August you’re re-running tests you already lost in June.” The brief is the loop; the ledger is the memory. Worth deferring per Dan’s framing: “ledger can wait until you’ve shipped your tenth test and started losing track.”

v1 (deferred until ~10th test): flat markdown file experiments.md or ~/Code/audrey-pipeline/experiments.md, 5 columns:

| Date    | Hypothesis                            | Variant          | Duration | Verdict           |
|---------|---------------------------------------|------------------|----------|-------------------|
| 2026-05-15 | flat-lay PDP hero beats lifestyle  | A/B 50/50        | 72h      | A won +6.2pp PDP→cart |
| 2026-05-18 | price anchor £180 vs £220 on twill | A/B 50/50        | 48h      | Null — re-run with more sample |

The narrator reads this file at brief-build time, references the prior test if relevant.

v2 (after ~30 tests): HTML rendered view at dashboard.audreyinc.com/experiments, same plumbing as the brief.

Resume signal for v1 → v2: when the markdown file has >15 rows and you find yourself searching it more than once a week.


The 45-minute spike — what to build first

Concrete deliverables in execution order:

# Step Time
1 Mint Shopify Custom App + grab Admin API token (per audrey proposal Phase 0); verify the orders + abandonedCheckouts queries above return real data 15 min
2 Verify the CDN, security layer, and DNS provider sitting in front of dare.co.uk.">CF Analytics queries return path + referrer + uniques against the audreyinc.com zone (probably reuse dare-pipeline’s CDN, security layer, and DNS provider sitting in front of dare.co.uk.">CF auth setup) 10 min
3 Write build_brief.py stub that fetches both, merges into the structured object matching this contract 10 min
4 Run Haiku once with the voice prompt above + the structured brief; eyeball output 5 min
5 One iteration of voice prompt based on output (likely tightening: trim adjectives, force test-format compliance) 5 min

End-of-spike artefact: one real Haiku-narrated paragraph against real audrey data, in the right voice, with the right elements. That’s enough to commit to full Phase 1 build.


Implementation questions to resolve during the spike

  1. Currency. GBP-only (most audrey orders are UK?) or render in customer’s currency? Probably GBP for the brief; multi-currency adds noise.
  2. Days with zero orders. Early-stage / weekend lulls. Voice should still write — Element 1 reports the goose-egg + Element 5 still prescribes a test (e.g., “no orders today — today’s test: ).
  3. First-week cold start. No yesterday’s-test history. Element 4 stays silent for the first ~7 days; voice doesn’t fake memory.
  4. Sample-size thresholds for test verdicts. Below what N does Element 4 say “too early to call, extending”? Suggest: 100 sessions or 5 conversions per variant minimum, whichever comes first.
  5. Editorial control on test type. Should the voice ever prescribe off-Shopify tests (paid ads, email subject lines)? Suggest: no — keep test surface tight to Shopify theme/PDP/copy/checkout. Off-Shopify tests are separate cycles.

What lands when this ships

A daily 06:00 + 18:00 paragraph that: - Names today’s funnel numbers in one breath - Identifies one specific path-shift or leak - Reports yesterday’s test verdict if one ran - Prescribes one Shopify-shippable test for today - Sounds like a partner, not a status report

Cost: ~$0.02/mo on Haiku. Cadence: 06:00 BST = pre-coffee read; 18:00 BST = end-of-day verdict check. Outcome by 2026-12-31: ~200 experiments shipped, an experiment ledger of what works on luxury commerce, and a voice trained by 230 days of operator feedback.


Pattern memories this earns when it ships


The brief template is the contract. Voice can’t ask the data what it doesn’t query; data shouldn’t carry fields the voice doesn’t use. One pass; one artefact.

Source: audrey_dashboard_brief_template_2026-05-13.md · Rendered 2026-05-13 14:48