audrey dashboard — brief template + voice contract
Date: 2026-05-13 Phase: 1 spec — the contract between data and voice Companion: audrey_dashboard_proposal_2026-05-13 Status: Spec. The 45-minute spike that comes after this writes the queries, runs the voice once, eyeballs.
The voice
Direction: Growth partner over morning espresso. Confident, action-biased, lightly impatient with status-reporting. Speaks in shopping-path language: paths, leaks, thresholds, friction points. Diagnoses fast and prescribes faster. Every 06:00 brief ends with one shippable test the operator can launch in Shopify before the 18:00 brief. Every 18:00 brief reads yesterday’s test verdict and threads it into tomorrow’s hypothesis.
Register tests: - Don’t moralise about flat numbers — diagnose, prescribe. - Don’t use the word “engagement.” - Reference at least one prior test or pattern by date (memory is the voice’s competence signal). - Name products, collections, paths specifically — not “users” or “the site.” - Sound like someone who has been watching this data for weeks and remembers what was tried Tuesday.
Sample paragraph (chat-Claude’s existence proof):
47 sessions hit the cashmere PDP, 4 reached checkout, 1 paid. The leak isn’t traffic, it’s the page. Flat-lay creative beat lifestyle on every collection card last week — PDP still opens with lifestyle. Today’s test: swap the hero on the top 3 PDPs to flat-lay, 50/50, 72h.
That’s the lock — specific, references prior pattern, ends with one shippable move.
Full voice prompt (copy into narrator.py of audrey-pipeline):
You are a growth partner reviewing audreyinc.com's data over morning espresso
with Dan. He runs the shop and has committed to A/B testing daily through
2026-12-31 (~200 experiments). You've been watching this data for weeks and
remember the tests that have shipped.
Voice: confident, action-biased, lightly impatient with status-reporting.
You speak in shopping-path language: paths, leaks, thresholds, friction
points, drop-off, hero variants, collection performance.
You do NOT use the word "engagement." You do not moralise about flat days —
you diagnose and prescribe.
Write a brief paragraph (~60-second read, ~140 words) in five elements:
1. VITALS — revenue, sessions, conversion rate, AOV vs 7d/28d baseline,
one sentence.
2. WHAT SHIFTED — one or two real anomalies in the shopping path
(collection view → PDP → cart → checkout → paid).
3. OPPORTUNITY — the leak or underexploited segment, named specifically
(which product, which referrer, which collection, which path).
4. YESTERDAY'S TEST — verdict if one was running: ship it, kill it, or
extend. If nothing ran, skip this element silently.
5. TODAY'S TEST — one concrete A/B that ships in Shopify before next brief.
Format: variant change + duration + success metric.
The brief structured data follows. The most recent prior test (if any) is
under "yesterday_test". Reference it if relevant.
The contract — field → voice job
Each brief section pairs structured input data with the voice’s specific instruction. This is the load-bearing artefact: it defines minimum-data-needed (no more, no less).
Element 1 — Vitals
| Field | Source | Voice job |
|---|---|---|
revenue_today_gbp |
Shopify Admin: orders(query: "created_at:>=$today AND financial_status:paid") → sum totalPriceSet.shopMoney.amount |
State the headline number |
revenue_7d_avg_gbp |
Same, query window >=$today-7d |
One delta-sentence: “up £X on the 7-day avg” or “down £Y vs last Wed” |
revenue_28d_avg_gbp |
Same, window >=$today-28d |
Quiet baseline — only mention if today is >1σ off |
orders_today |
Same, count of order nodes | State the count |
aov_today_gbp |
revenue_today_gbp / orders_today |
Mention if it’s moved by >5% from 7d avg |
sessions_today |
CDN, security layer, and DNS provider sitting in front of dare.co.uk.">CF Analytics GraphQL: httpRequests1dGroups.uniq.uniques filtered by audreyinc.com zone + date=$today |
State sessions |
conversion_rate_pct |
orders_today / sessions_today * 100 |
The single most important diagnostic number — always include |
Shopify query shape:
query DailyOrders($today: String!) {
orders(first: 100, query: $today, sortKey: CREATED_AT, reverse: true) {
edges {
node {
id
createdAt
totalPriceSet { shopMoney { amount currencyCode } }
customer { id numberOfOrders }
lineItems(first: 10) {
edges { node { quantity product { handle title } variant { id title price } } }
}
sourceName
referringSite
}
}
}
}
# $today = "created_at:>=2026-05-13T00:00:00Z AND financial_status:paid"
Sample narrator line for this element:
£10,246 today across 47 paid orders — AOV £218, up £14 on the 7-day. Conversion held at 0.98%.
Element 2 — What shifted
| Field | Source | Voice job |
|---|---|---|
top_paths_today |
CDN, security layer, and DNS provider sitting in front of dare.co.uk.">CF Analytics: httpRequestsAdaptiveGroups ordered by sum_visits_DESC, dimension clientRequestPath, limit 20 |
Identify which paths gained/lost traffic vs 7d avg |
top_paths_7d_avg |
Same query, last 7 days averaged | Baseline for comparison |
top_collections |
Derived: paths matching /collections/* from top_paths |
Collection-level shifts |
top_pdps |
Derived: paths matching /products/* from top_paths |
PDP-level shifts |
top_products_today |
Shopify: lineItems.product.handle aggregated from today’s orders |
Product velocity (units sold) |
top_products_7d |
Same, last 7 days | Baseline product velocity |
top_referrers_today |
CDN, security layer, and DNS provider sitting in front of dare.co.uk.">CF Analytics: same query, dimension clientRequestReferer |
Which channels are driving today |
referrer_share_shift |
Today’s top-3 share vs 7d avg | Channel-shift anomalies |
Voice job composite: identify one or two real anomalies — not a list. The voice picks the most diagnostic shift and narrates it.
Sample narrator line:
The cashmere collection page jumped 38% on traffic but PDP click-through dropped — visitors aren’t moving from /collections/cashmere into product pages.
Element 3 — Opportunity
This element is mostly inference, not direct field reads. The voice constructs the opportunity from the shifts identified in Element 2 + prior pattern memory.
| Input | Source | Voice job |
|---|---|---|
| Funnel snapshot | Derived from Elements 1 + 2: sessions → PDP views → checkout starts → paid | Identify which stage is leaking |
| Cart abandonment count | Shopify: abandonedCheckouts(query: "created_at:>=$today") count + total value |
Flag if abandoned > 2× paid (leak signal) |
| Top abandoned line items | Shopify: abandonedCheckouts.lineItems.product.handle |
Identify product-specific friction |
| Customer split | Shopify: from order data, customer.numberOfOrders field — new vs returning |
Identify which cohort is moving |
| Prior pattern memory | Read from experiments.json (flat file, see below) |
Reference one prior test/pattern if relevant |
Voice job: name one specific opportunity. “The leak isn’t traffic, it’s the page” is the canonical shape — locates the friction, sets up the test.
Abandoned checkouts query:
query AbandonedCheckouts($since: String!) {
abandonedCheckouts(first: 50, query: $since) {
edges {
node {
id
createdAt
totalPriceSet { shopMoney { amount } }
lineItems(first: 5) { edges { node { product { handle } quantity } } }
}
}
}
}
# $since = "created_at:>=2026-05-13T00:00:00Z"
Sample narrator line:
The leak isn’t traffic, it’s the cashmere PDP — 47 sessions, 4 carts, 1 paid. Same pattern as last Wednesday before the price-anchor test moved the needle.
Element 4 — Yesterday’s test verdict
| Field | Source | Voice job |
|---|---|---|
yesterday_test.hypothesis |
Read from experiments.json flat file (see Deferred section) |
One-sentence verdict reference |
yesterday_test.variant |
Same | Identify which variant won/lost |
yesterday_test.metric |
Same | Report lift or null result |
yesterday_test.recommendation |
Voice generates: ship / kill / extend | The actionable verdict |
Voice job: if yesterday’s test has reached its sample-size or duration threshold, render the verdict in one sentence. If no test was running, skip this element silently — don’t write a placeholder.
Sample narrator line (when a test ran):
Tuesday’s flat-lay-vs-lifestyle test on the silk twill collection cards is in — flat-lay won click-through by 6.2 points. Ship it across the rest of the collection.
Element 5 — Today’s test (the lock)
This element is purely voice-generated — no input fields. The voice picks ONE test from the opportunity diagnosis.
Required format constraint: 1. Variant change — what’s swapping for what (must be a Shopify-changeable thing: theme block, copy, image, CTA, price, badge, recommendation list) 2. Duration — how long it runs (typically 48–72h depending on traffic volume) 3. Success metric — what number determines win/lose
Voice prompt for this element:
Pick ONE test that ships in Shopify before next brief. Must be:
- Specifically actionable (which page, which element, which variant)
- Bounded in time (24h to 72h typical)
- Measurable (a number that determines win/lose)
Do NOT prescribe tests outside Shopify control (paid acquisition, off-site SEO).
Do NOT prescribe more than one test per brief.
Sample narrator line:
Today’s test: swap the hero on the top 3 cashmere PDPs from lifestyle to flat-lay, 50/50, 72h. Watch PDP→cart rate; ship the winner.
Deferred — the experiment ledger
Chat-Claude’s catch: “by August you’re re-running tests you already lost in June.” The brief is the loop; the ledger is the memory. Worth deferring per Dan’s framing: “ledger can wait until you’ve shipped your tenth test and started losing track.”
v1 (deferred until ~10th test): flat markdown file experiments.md or ~/Code/audrey-pipeline/experiments.md, 5 columns:
| Date | Hypothesis | Variant | Duration | Verdict |
|---------|---------------------------------------|------------------|----------|-------------------|
| 2026-05-15 | flat-lay PDP hero beats lifestyle | A/B 50/50 | 72h | A won +6.2pp PDP→cart |
| 2026-05-18 | price anchor £180 vs £220 on twill | A/B 50/50 | 48h | Null — re-run with more sample |
The narrator reads this file at brief-build time, references the prior test if relevant.
v2 (after ~30 tests): HTML rendered view at dashboard.audreyinc.com/experiments, same plumbing as the brief.
Resume signal for v1 → v2: when the markdown file has >15 rows and you find yourself searching it more than once a week.
The 45-minute spike — what to build first
Concrete deliverables in execution order:
| # | Step | Time |
|---|---|---|
| 1 | Mint Shopify Custom App + grab Admin API token (per audrey proposal Phase 0); verify the orders + abandonedCheckouts queries above return real data |
15 min |
| 2 | Verify the CDN, security layer, and DNS provider sitting in front of dare.co.uk.">CF Analytics queries return path + referrer + uniques against the audreyinc.com zone (probably reuse dare-pipeline’s CDN, security layer, and DNS provider sitting in front of dare.co.uk.">CF auth setup) | 10 min |
| 3 | Write build_brief.py stub that fetches both, merges into the structured object matching this contract |
10 min |
| 4 | Run Haiku once with the voice prompt above + the structured brief; eyeball output | 5 min |
| 5 | One iteration of voice prompt based on output (likely tightening: trim adjectives, force test-format compliance) | 5 min |
End-of-spike artefact: one real Haiku-narrated paragraph against real audrey data, in the right voice, with the right elements. That’s enough to commit to full Phase 1 build.
Implementation questions to resolve during the spike
- Currency. GBP-only (most audrey orders are UK?) or render in customer’s currency? Probably GBP for the brief; multi-currency adds noise.
- Days with zero orders. Early-stage / weekend lulls. Voice should still write — Element 1 reports the goose-egg + Element 5 still prescribes a test (e.g., “no orders today — today’s test:
“ ). - First-week cold start. No yesterday’s-test history. Element 4 stays silent for the first ~7 days; voice doesn’t fake memory.
- Sample-size thresholds for test verdicts. Below what N does Element 4 say “too early to call, extending”? Suggest: 100 sessions or 5 conversions per variant minimum, whichever comes first.
- Editorial control on test type. Should the voice ever prescribe off-Shopify tests (paid ads, email subject lines)? Suggest: no — keep test surface tight to Shopify theme/PDP/copy/checkout. Off-Shopify tests are separate cycles.
What lands when this ships
A daily 06:00 + 18:00 paragraph that: - Names today’s funnel numbers in one breath - Identifies one specific path-shift or leak - Reports yesterday’s test verdict if one ran - Prescribes one Shopify-shippable test for today - Sounds like a partner, not a status report
Cost: ~$0.02/mo on Haiku. Cadence: 06:00 BST = pre-coffee read; 18:00 BST = end-of-day verdict check. Outcome by 2026-12-31: ~200 experiments shipped, an experiment ledger of what works on luxury commerce, and a voice trained by 230 days of operator feedback.
Pattern memories this earns when it ships
feedback_audrey_voice_growth_partner.md— the voice direction + register tests capturedfeedback_brief_template_field_voice_contract.md— the “one artefact, two columns” pattern (field → voice job) for any future LLM-narrator brief designfeedback_test_as_lock_in_brief.md— the design constraint that every brief ends with one shippable test transforms dashboards into design-thinking enginesproject_audrey_pipeline_operating_notes.md— once built, the operational specifics
The brief template is the contract. Voice can’t ask the data what it doesn’t query; data shouldn’t carry fields the voice doesn’t use. One pass; one artefact.