
sharaf/product-experience-audit

Use when the user wants to audit a user journey, audit a signup/onboarding/checkout flow, do a UX audit, find the friction in a funnel, understand why users are dropping off or where they are being lost, or improve conversion in a web app — any diagnostic review of a multi-step, in-product flow. Use it whenever the user mentions drop-off, funnels, session replay, heatmaps, activation, time-to-value, cart or checkout abandonment, onboarding friction, or rage clicks, or wants to know where users struggle and what to fix first, even if they don't say "audit." Produces a severity-ranked, prioritized, experiment-validated improvement backlog via evidence-first intake, five parallel specialist lenses, and synthesis.

94

1.26x
Quality: 100% (Does it follow best practices?)

Impact: 72%, 1.26x (average score across 3 eval scenarios)

Security by Snyk: Passed, no known issues


evals/scenario-1/criteria.json

{
  "context": "Tests whether the agent correctly applies the five-lens parallel audit methodology: assembling a journey brief in the required schema, dispatching (or documenting) all five specialist lenses, deduplicating findings, flagging dark patterns for removal with regulatory notes, and attaching falsifiable hypotheses to every significant recommendation. The evidence set deliberately omits analytics to test honest gap-flagging and evidence-led routing.",
  "type": "weighted_checklist",
  "checklist": [
    {
      "name": "Journey brief present",
      "description": "The report contains a named journey brief section before any findings or lens results",
      "max_score": 8
    },
    {
      "name": "Brief schema — 9 fields",
      "description": "The journey brief includes all nine required fields: Product, Journey audited, Conversion goal, Segments & devices, Evidence available, Step inventory, Quantitative signal, Behavioral/qualitative signal, Data-trust notes",
      "max_score": 8
    },
    {
      "name": "Analytics marked missing",
      "description": "The Evidence available field (or equivalent) explicitly states that product analytics / funnel data is not provided (not invented or estimated)",
      "max_score": 7
    },
    {
      "name": "Five lenses addressed",
      "description": "The report addresses all five specialist lenses (e.g. heuristic/usability, funnel/analytics, conversion & persuasion, technical, accessibility) — either by running them or explicitly noting which were skipped and why",
      "max_score": 8
    },
    {
      "name": "Clean lens documented",
      "description": "At least one lens that finds no material issue explicitly states what was checked rather than being silently omitted",
      "max_score": 7
    },
    {
      "name": "Finding block template",
      "description": "At least three findings use the labeled block format with distinct fields: Finding, Evidence, Why it matters, Fix, Validate, Severity, Journey step",
      "max_score": 8
    },
    {
      "name": "Invite-step finding deduplicated",
      "description": "The rage-click / no-feedback issue on the invite step appears as a single merged finding (not listed separately under the qualitative quotes and replay observations as two different findings)",
      "max_score": 8
    },
    {
      "name": "Dark pattern flagged for removal",
      "description": "The persistent scarcity banner ('Only 2 seats left at this price!') is flagged as a finding to remove, NOT recommended as a persuasion technique to keep or improve",
      "max_score": 8
    },
    {
      "name": "Regulatory exposure cited",
      "description": "The dark-pattern finding references at least one of: FTC Act Section 5, EU DSA Article 25, or an equivalent regulation or regulatory body",
      "max_score": 7
    },
    {
      "name": "Falsifiable hypothesis",
      "description": "At least two recommendations include a one-line falsifiable hypothesis in the form 'changing [X] will improve [metric] for [segment] because [evidence]'",
      "max_score": 8
    },
    {
      "name": "LIFT or Fogg model referenced",
      "description": "The conversion/persuasion lens (or equivalent section) explicitly references LIFT model components (value proposition, relevance, clarity, urgency, anxiety, distraction) OR the Fogg model (B=MAP / motivation, ability, prompt)",
      "max_score": 7
    },
    {
      "name": "What's Working Well section",
      "description": "Report includes a named 'What's Working Well' section (or equivalent) with at least one specific strength cited with evidence",
      "max_score": 7
    },
    {
      "name": "No analytics invented",
      "description": "The report does NOT present fabricated funnel percentages, drop rates, or session counts as real data — all quantitative claims are either from the provided evidence or clearly labeled as estimates/benchmarks",
      "max_score": 9
    }
  ]
}
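
The page does not say how a `weighted_checklist` is turned into a final score. One plausible reading is that a grader awards each item between 0 and its `max_score`, and the result is the earned share of the available points. The sketch below works under that assumption; `total_score` and the `awarded` mapping are hypothetical names, and the embedded JSON is a trimmed three-item copy of the checklist above, not the full file.

```python
import json

# Trimmed copy of the checklist above: three of the thirteen items.
CRITERIA_JSON = """
{
  "type": "weighted_checklist",
  "checklist": [
    {"name": "Journey brief present", "max_score": 8},
    {"name": "Analytics marked missing", "max_score": 7},
    {"name": "No analytics invented", "max_score": 9}
  ]
}
"""


def total_score(criteria: dict, awarded: dict) -> float:
    """Return earned points as a percentage of available points.

    Assumption: each item is graded from 0 up to its max_score, and
    items missing from `awarded` count as 0.
    """
    if criteria.get("type") != "weighted_checklist":
        raise ValueError("expected a weighted_checklist")

    available = 0
    earned = 0
    for item in criteria["checklist"]:
        cap = item["max_score"]
        # Clamp so a grader cannot award more than the item is worth.
        score = max(0, min(awarded.get(item["name"], 0), cap))
        available += cap
        earned += score

    if available == 0:
        raise ValueError("checklist has no points to award")
    return 100.0 * earned / available


criteria = json.loads(CRITERIA_JSON)
awarded = {
    "Journey brief present": 8,
    "Analytics marked missing": 7,
    "No analytics invented": 3,  # partial credit
}
print(f"{total_score(criteria, awarded):.1f}%")  # prints "75.0%"
```

In the full file the thirteen `max_score` values sum to 100, so under this reading the earned points can be read directly as a percentage.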
