sharaf/product-experience-audit

name:: product-experience-audit
description:: Use when the user wants to audit a user journey, audit a signup/onboarding/checkout flow, do a UX audit, find the friction in a funnel, understand why users are dropping off or where they are being lost, or improve conversion in a web app — any diagnostic review of a multi-step, in-product flow. Use it whenever the user mentions drop-off, funnels, session replay, heatmaps, activation, time-to-value, cart or checkout abandonment, onboarding friction, or rage clicks, or wants to know where users struggle and what to fix first, even if they don't say "audit." Produces a severity-ranked, prioritized, experiment-validated improvement backlog via evidence-first intake, five parallel specialist lenses, and synthesis.
metadata:: {"version":"0.1.2","source_domain":"product-experience-auditing","source_sub_domains":"audit-methodology-and-scoping, journey-mapping-and-flow-modeling, heuristic-evaluation-and-usability-inspection, persuasion-trust-and-behavioral-design, funnel-and-drop-off-analysis, behavioral-analytics-and-session-replay, friction-and-frustration-signals, analytics-instrumentation-and-data-quality, qualitative-research-and-voice-of-customer, onboarding-and-activation, forms-checkout-and-conversion-flows, navigation-information-architecture-and-search, performance-and-core-web-vitals, accessibility-and-inclusive-journeys, mobile-cross-device-and-responsive-journeys, prioritization-roadmapping-and-experiment-validation","research_date":"2026-06-23"}

Product Experience Audit

Name: sharaf/product-experience-audit
Rating: 94.4 (1 review)
Author: sharaf

Reference files under references/ hold phase details; open only the files needed for the current phase or specialist lens.

Purpose

Audit a user's journey through a live web application — a multi-step, stateful, in-product flow such as signup→activation, onboarding, checkout, or a core task flow — and produce a severity-ranked, prioritized, experiment-validated improvement backlog. Follow one spine end to end: find where users leak, explain why, size the prize by recoverable conversion, prioritize explicitly, and validate the fix before claiming it worked. The discipline that holds it together is triangulation — expert inspection, behavioral quantification (funnels, replay, heatmaps, frustration signals), and qualitative evidence, used together.

For an acquisition or marketing landing page, defer to a landing-page audit; this skill audits the in-product journey after the click.

Input

Establish two things before any analysis: the journey under audit (entry point → each step → success state, its macro conversion goal and the micro conversions along the way, and the segments/devices in scope) and the evidence available. If the flow or its success state cannot be determined from the request or the artifacts, ask one targeted question rather than auditing the wrong flow. Evidence may include any of:

Live or staging URL plus credentials (enables a step-by-step walkthrough)
Source repo (routes, forms, validation, error/empty states, analytics, A/B)
Product analytics — GA4, Amplitude, Mixpanel, PostHog, Heap (funnels, rates)
Session replay / heatmaps — FullStory, Hotjar, Clarity, Contentsquare (the why)
Qualitative inputs — usability tests, surveys, NPS/CSAT verbatims, tickets

First Actions

Build a factual journey brief before any critique: observations, not judgments. Map the journey, walk the flow as both a new and a returning user if an app is available, and read the source if a repo is available. Gate any analytics on instrumentation trust before believing a single number: confirm that events fire once and that the funnel is ordered and windowed, and name the data loss that biases every rate (ad-blocker ~25–40%, consent decline ~50–55%; treat GA4 modeled figures as estimates). When local repo access exists, commands like these orient the scan:

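# List files that reveal the framework, entry points, and route layout
rg --files | rg '(^|/)(README|package\.json|next\.config|vite\.config|src|app|pages|components|routes)'
# Find analytics SDK calls and journey-related code, with line numbers
rg -n "gtag|GTM|amplitude|mixpanel|posthog|heap|hotjar|fullstory|clarity|analytics|funnel|onboarding|checkout|signup|activation" .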
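The instrumentation-trust gate described above (events fire once, the funnel is ordered) could be checked against a raw event export with something like the following minimal sketch. The event names in `FUNNEL_ORDER`, the tuple layout, and the function name are all hypothetical; a real export from GA4, Amplitude, or similar would need its own mapping. Whether a repeated event is a genuine double-fire or a legitimate retry still needs a human look.

```python
from collections import Counter

# Hypothetical funnel order; replace with the journey's real step events.
FUNNEL_ORDER = ["signup_started", "signup_completed", "activated"]


def instrumentation_issues(events, funnel_order=FUNNEL_ORDER):
    """Flag suspicious patterns in raw funnel events.

    events: iterable of (user_id, event_name, timestamp_seconds) tuples.
    Returns (duplicates, out_of_order):
      duplicates   - {(user_id, event_name): count} for step events seen more than once
      out_of_order - sorted user_ids who reached a later step without, or before,
                     an earlier one
    """
    counts = Counter()
    first_seen = {}
    for user_id, name, ts in events:
        if name not in funnel_order:
            continue  # ignore events that are not funnel steps
        key = (user_id, name)
        counts[key] += 1
        if key not in first_seen or ts < first_seen[key]:
            first_seen[key] = ts

    duplicates = {key: n for key, n in counts.items() if n > 1}

    out_of_order = set()
    for (user_id, name), ts in first_seen.items():
        for earlier in funnel_order[: funnel_order.index(name)]:
            earlier_ts = first_seen.get((user_id, earlier))
            # A later step with a missing or later-timestamped earlier step is suspect.
            if earlier_ts is None or earlier_ts > ts:
                out_of_order.add(user_id)
    return duplicates, sorted(out_of_order)
```

Any non-empty result belongs in the Data-trust notes field of the brief before per-step rates are quoted.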

Emit the brief as a labeled Journey Brief block with all nine fields before any findings; write "not provided" for absent evidence rather than inventing it. context-gathering.md explains how to populate each field.

Journey Brief
Product: [name, what it does]
Journey audited: [entry → steps → success state]
Conversion goal: [macro] / micro: [intermediate signals]
Segments & devices in scope: [new vs returning, plan, mobile/desktop]
Evidence available: [live / source / analytics / replay / qual — and what is MISSING]
Step inventory: [step 1…n with what happens at each]
Quantitative signal: [per-step rates, biggest absolute drop — or "not provided"]
Behavioral/qualitative signal: [frustration signals, replay themes, verbatims — or "not provided"]
Data-trust notes: [instrumentation caveats; consent / bot / ad-blocker bias]
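One way to keep the nine-field rule honest is to model the brief as a small record whose fields default to "not provided". This is only a sketch; the class, field names, and `render` helper are illustrative and not part of the skill.

```python
from dataclasses import dataclass, fields

NOT_PROVIDED = "not provided"


@dataclass
class JourneyBrief:
    """The nine Journey Brief fields; anything unset stays 'not provided'."""
    product: str = NOT_PROVIDED
    journey_audited: str = NOT_PROVIDED
    conversion_goal: str = NOT_PROVIDED
    segments_and_devices: str = NOT_PROVIDED
    evidence_available: str = NOT_PROVIDED
    step_inventory: str = NOT_PROVIDED
    quantitative_signal: str = NOT_PROVIDED
    behavioral_qualitative_signal: str = NOT_PROVIDED
    data_trust_notes: str = NOT_PROVIDED


# Labels as they appear in the Journey Brief block.
LABELS = {
    "product": "Product",
    "journey_audited": "Journey audited",
    "conversion_goal": "Conversion goal",
    "segments_and_devices": "Segments & devices in scope",
    "evidence_available": "Evidence available",
    "step_inventory": "Step inventory",
    "quantitative_signal": "Quantitative signal",
    "behavioral_qualitative_signal": "Behavioral/qualitative signal",
    "data_trust_notes": "Data-trust notes",
}


def render(brief: JourneyBrief) -> str:
    """Render all nine fields, in order, under a 'Journey Brief' heading."""
    lines = ["Journey Brief"]
    for f in fields(brief):
        value = getattr(brief, f.name).strip() or NOT_PROVIDED
        lines.append(f"{LABELS[f.name]}: {value}")
    return "\n".join(lines)
```

Because every field has a default, a brief built from thin evidence still renders all nine lines, and blank strings fall back to "not provided" rather than being left empty.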

Workflow

#	Phase	Use
1	Gather context and assemble the journey brief	context-gathering.md
2	Run the five specialist lenses over the shared brief	specialist-agents.md
3	Deduplicate, reconcile severity, and synthesize the backlog	synthesis-report.md
4	Apply guardrails, evidence/symptom routing, and success checks	guardrails-decision-logic.md

Dispatch the five lenses as parallel specialist agents when the host environment allows it, each receiving the full journey brief inline. If not, run them as separate passes and keep their findings distinct before synthesis. No intermediate files are required.
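The parallel-or-sequential dispatch can be sketched as follows. The lens names come from the Final Report Contract; `run_lens` is a hypothetical placeholder for whatever specialist-agent call the host environment actually offers, and a thread pool is just one assumed way to run the passes concurrently.

```python
from concurrent.futures import ThreadPoolExecutor

LENSES = [
    "funnel & behavioral",
    "usability",
    "friction & qualitative",
    "conversion & hotspots",
    "technical",
]


def run_lens(lens, brief):
    """Placeholder specialist pass; a real host would invoke its agent here."""
    return {"lens": lens, "findings": [], "checked": f"{lens}: read {len(brief)} chars of brief"}


def run_all_lenses(brief, parallel=True, runner=run_lens):
    """Run every lens over the same inline brief and keep results keyed by lens."""
    if parallel:
        with ThreadPoolExecutor(max_workers=len(LENSES)) as pool:
            # map() preserves input order, so results line up with LENSES.
            results = list(pool.map(lambda lens: runner(lens, brief), LENSES))
    else:
        # Fallback: separate sequential passes, findings still kept distinct.
        results = [runner(lens, brief) for lens in LENSES]
    return {r["lens"]: r for r in results}
```

Both paths return the same shape, so synthesis does not need to know which mode ran.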

Audit depth. Default to the full five-lens audit for any flow that matters to the business. For a quick triage — a single small flow, a fast turnaround, or thin evidence — run only the two or three relevant lenses inline (usually heuristic and funnel) and note that depth was traded for speed. Scale the agent count to the journey, not the journey to the agents.

Synthesis. When multiple lenses flag the same issue — analytics shows the leak, replay shows the rage click, the heuristic lens names the violation — merge them into ONE finding with the strongest evidence chain; never list the same issue separately under each lens. A lens that finds no material issue states what it checked and records it as a documented strength, not a silent omission.
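The merge rule can be illustrated with a small sketch. It assumes each lens finding has already been tagged with a normalized `issue` key (deciding that two findings describe the same issue is the hard, judgment-based part), and it assumes the merged severity is the highest one reported; both are assumptions of this example, and the dict keys are hypothetical.

```python
from collections import defaultdict

SEVERITY_RANK = {"low": 0, "medium": 1, "high": 2, "critical": 3}


def merge_findings(findings):
    """Collapse per-lens findings into one finding per (step, issue).

    findings: list of dicts with keys step, issue, lens, evidence, severity.
    """
    groups = defaultdict(list)
    for f in findings:
        groups[(f["step"], f["issue"])].append(f)

    merged = []
    for (step, issue), group in groups.items():
        merged.append({
            "step": step,
            "issue": issue,
            "lenses": sorted({f["lens"] for f in group}),
            # Keep every lens's evidence so the merged finding carries the full chain.
            "evidence": [f["evidence"] for f in group],
            # Assumption: reconcile to the highest severity any lens reported.
            "severity": max((f["severity"] for f in group), key=SEVERITY_RANK.__getitem__),
        })
    return merged
```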

Severity Scale

Severity reflects impact on journey completion; prefer measured impact (funnel/replay) over assumed impact.

Severity	Definition	Examples
Critical	Blocks the journey, or a measured/known >20–30% loss for affected users	dead-end or broken step, cannot complete on mobile, login/redirect loop, payment fails silently, keyboard trap on a required field, consent wall losing most users
High	Major friction, ~10–30% step impact	forced account creation before checkout, no inline validation on a high-error field, rage-click hotspot, surprise costs late in checkout, slow LCP on a key step, missing activation prompt
Medium	Measurable but <10% impact	confusing label, suboptimal field count, weak empty state, non-blocking contrast failures, ambiguous nav label
Low	Best-practice gap, marginal lift	micro-copy polish, cosmetic inconsistency, nice-to-have affordance
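Where a measured step loss exists, the scale can be applied mechanically as a first pass. The sketch below is an assumption-laden reading of the table: the Critical and High bands overlap between 20% and 30%, so the critical cutoff is exposed as a parameter (defaulting to 30%) rather than decided here, and a finding with no measurement is returned as "unmeasured" so that it is rated by judgment instead.

```python
def severity_from_impact(step_loss=None, blocks_journey=False, critical_cutoff=0.30):
    """Map a measured step loss (a fraction, 0.0 to 1.0) to a first-pass severity.

    critical_cutoff is an assumption; the table leaves the 20-30% band to judgment.
    """
    if blocks_journey:
        return "critical"  # a blocked journey is critical regardless of numbers
    if step_loss is None:
        return "unmeasured"  # no measured impact: rate by judgment and say so
    if not 0.0 <= step_loss <= 1.0:
        raise ValueError("step_loss must be a fraction between 0 and 1")
    if step_loss > critical_cutoff:
        return "critical"
    if step_loss >= 0.10:
        return "high"
    if step_loss > 0.0:
        return "medium"
    return "low"
```

For example, `severity_from_impact(0.25)` returns "high" under the default cutoff and "critical" with `critical_cutoff=0.20`, which makes the choice of cutoff visible in the report rather than implicit.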

Finding Contract

Every finding block must use this exact template:

Finding: [what was observed + which journey step]
Evidence: [funnel number / replay observation / heuristic violated / code element / WCAG SC / benchmark]
Why it matters: [impact, with a measured number or an attributed benchmark]
Fix: [concrete, specific change — not "improve this"]
Validate: [A/B | before-after | painted-door | ship+monitor] + a one-line hypothesis — "changing [X] will improve [metric] for [segment] because [evidence]"
Severity: [critical/high/medium/low]
Journey step: [step name]

Assert a cause only when evidence supports it. When a step leaks but the cause is unknown, say so and recommend the evidence that would resolve it rather than guessing.
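A lightweight check that a finding block follows the template might look like this. It is a sketch: the function name and the problem messages are illustrative, and it only verifies that the seven labels are present and non-empty and that the severity is one of the four allowed values, not that the content is any good.

```python
REQUIRED_FIELDS = [
    "Finding", "Evidence", "Why it matters", "Fix",
    "Validate", "Severity", "Journey step",
]
SEVERITIES = {"critical", "high", "medium", "low"}


def check_finding(block):
    """Return a list of template problems for one finding block (empty if none)."""
    values = {}
    for line in block.splitlines():
        # Split on the first colon only, so values may themselves contain colons.
        label, sep, rest = line.partition(":")
        if sep and label.strip() in REQUIRED_FIELDS:
            values[label.strip()] = rest.strip()

    problems = [f"missing or empty: {name}" for name in REQUIRED_FIELDS if not values.get(name)]
    severity = values.get("Severity", "").lower()
    if severity and severity not in SEVERITIES:
        problems.append(f"unknown severity: {severity}")
    return problems
```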

Routing

Route the leading lens by the evidence and the symptom: with analytics, lead with the funnel and send the biggest leaks to replay for cause; without it, lead with heuristic evaluation and a cognitive walkthrough and state that magnitudes are unmeasured. The full evidence/symptom routing, the journey archetypes, and the success criteria are in guardrails-decision-logic.md.
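When analytics is available, "send the biggest leaks to replay" implies ranking transitions by users lost, not by drop rate alone. A minimal sketch, assuming per-step user counts have already been pulled from a trusted funnel report (the function name and dict keys are illustrative):

```python
def rank_leaks(step_counts):
    """Rank funnel transitions by absolute users lost, largest first.

    step_counts: ordered list of (step_name, users_reaching_step).
    """
    leaks = []
    for (name, n), (next_name, next_n) in zip(step_counts, step_counts[1:]):
        lost = n - next_n
        leaks.append({
            "from": name,
            "to": next_name,
            "lost": lost,
            "drop_rate": lost / n if n else 0.0,
        })
    # Sort by absolute loss: a high rate on a low-volume step can matter less.
    return sorted(leaks, key=lambda leak: leak["lost"], reverse=True)
```

With illustrative counts of 1000, 400, 300, and 150 users across four steps, the first transition loses the most users (600), and the last transition (150 lost, a 50% rate) outranks the middle one (100 lost, 25%). The top entries are the ones to review in replay for cause.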

Final Report Contract

Default to the report template in synthesis-report.md. The final answer must include:

Executive summary: the journey audited, the evidence base and its limits, and the biggest opportunity sized by recoverable conversion.
Journey map: the steps in order with per-step conversion/drop-off where known, the largest absolute leak marked, and data-trust caveats stated.
Findings by lens: funnel & behavioral, usability, friction & qualitative, conversion & hotspots, technical (performance / a11y / mobile); omit a lens with no findings and record it under What's Working Well.
Prioritized backlog: sorted by absolute recoverable conversion (drop rate × step volume, then by revenue where ACV is known) or a stated ICE/PXL score; each row carries a concrete fix and a validation method with a one-line hypothesis.
What's Working Well: specific strengths with evidence, to prevent over-correction of things that are not broken.
Evidence & Assumptions: sources used, what was missing, and every assumption, so the reader can weight the findings.
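The backlog ordering by drop rate × step volume can be sketched as below. This reads "then by revenue where ACV is known" as a tie-breaker, which is an interpretation; the dict keys (`fix`, `drop_rate`, `step_volume`, `acv`) are hypothetical, and rows without an ACV are treated as zero revenue for ordering purposes only.

```python
def prioritize(items):
    """Sort backlog rows by users lost (drop rate x step volume), then by revenue.

    items: list of dicts with keys fix, drop_rate, step_volume, and optional acv.
    """
    def sort_key(item):
        lost_users = item["drop_rate"] * item["step_volume"]
        acv = item.get("acv")
        # Revenue only breaks ties, and only where ACV is known.
        revenue = lost_users * acv if acv is not None else 0.0
        return (lost_users, revenue)

    return sorted(items, key=sort_key, reverse=True)
```

A team using a stated ICE or PXL score instead would swap the sort key for that score and say so in the report.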

Every finding is gated by the guardrails in guardrails-decision-logic.md — read them before finalizing the report.
