sharaf/product-experience-audit

Use when the user wants to audit a user journey, audit a signup/onboarding/checkout flow, do a UX audit, find the friction in a funnel, understand why users are dropping off or where they are being lost, or improve conversion in a web app — any diagnostic review of a multi-step, in-product flow. Use it whenever the user mentions drop-off, funnels, session replay, heatmaps, activation, time-to-value, cart or checkout abandonment, onboarding friction, or rage clicks, or wants to know where users struggle and what to fix first, even if they don't say "audit." Produces a severity-ranked, prioritized, experiment-validated improvement backlog via evidence-first intake, five parallel specialist lenses, and synthesis.

94 · 1.26x

Quality: 100% (Does it follow best practices?)

Impact: 72% · 1.26x (average score across 3 eval scenarios)

Security by Snyk: Passed, no known issues


evals/scenario-3/criteria.json

{
  "context": "Tests whether the agent correctly scopes a quick triage (2-3 lenses, not all five), applies an explicit scoring framework (ICE or PXL) with all per-factor inputs shown, routes lenses appropriately given the symptom and evidence type, builds a 9-field journey brief marking unavailable evidence as 'not provided', and avoids fabricating quantitative data.",
  "type": "weighted_checklist",
  "checklist": [
    {
      "name": "Triage scope stated",
      "description": "The report explicitly states that this is a quick triage (not a full audit) and acknowledges that depth was traded for speed",
      "max_score": 8
    },
    {
      "name": "2-3 lenses selected",
      "description": "The report runs two or three named lenses, never four or all five",
      "max_score": 8
    },
    {
      "name": "Lens justification present",
      "description": "The report explains why those specific lenses were chosen given the available evidence and symptoms described",
      "max_score": 8
    },
    {
      "name": "Heuristic or qualitative lens included",
      "description": "At least one of the selected lenses is a Heuristic/Usability lens or a Qualitative/Friction lens (appropriate when there are user complaints and no analytics)",
      "max_score": 8
    },
    {
      "name": "9-field journey brief present",
      "description": "The report includes a journey brief with all nine fields: Product, Journey audited, Conversion goal/micro, Segments & devices in scope, Evidence available, Step inventory, Quantitative signal, Behavioral/qualitative signal, Data-trust notes",
      "max_score": 8
    },
    {
      "name": "Missing evidence marked 'not provided'",
      "description": "Journey brief fields for analytics, replay, and live URL are explicitly marked 'not provided' (or equivalent — not left blank or omitted)",
      "max_score": 8
    },
    {
      "name": "ICE or PXL framework used",
      "description": "The prioritized action list uses either ICE or PXL scoring by name — not a generic ranking or intuition-based order",
      "max_score": 8
    },
    {
      "name": "Per-factor scores shown",
      "description": "Every item in the prioritized backlog shows its individual factor scores (e.g. Impact=7, Confidence=6, Ease=8 for ICE) so the ranking is auditable",
      "max_score": 8
    },
    {
      "name": "7-field finding blocks used",
      "description": "Each finding is presented using the labeled 7-field block format: Finding, Evidence, Why it matters, Fix, Validate, Severity, Journey step",
      "max_score": 8
    },
    {
      "name": "No fabricated drop-off rates",
      "description": "The report does NOT assert any specific drop-off percentage or conversion rate (e.g. does NOT say '40% of users abandon at step 2') given that no analytics were provided",
      "max_score": 8
    },
    {
      "name": "Magnitudes described as unmeasured",
      "description": "The report explicitly states that impact magnitudes are unmeasured or unknown (since there is no funnel data), rather than presenting findings as measured facts",
      "max_score": 8
    },
    {
      "name": "Treat benchmarks as directional",
      "description": "If any external benchmark or industry figure is cited, it is attributed to a source and described as directional rather than presented as a universal target",
      "max_score": 6
    },
    {
      "name": "Next evidence step stated",
      "description": "The report names at least one specific type of additional evidence (e.g. analytics, session replay, live walkthrough) that would enable a deeper audit",
      "max_score": 6
    }
  ]
}
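The `max_score` values in this checklist sum to 100 (eleven items worth 8 and two worth 6), so each weight reads directly as a percentage of the scenario score. The page does not show how the eval harness aggregates a `weighted_checklist`, so what follows is only a minimal sketch of one plausible aggregation. The function names `total_max` and `weighted_score`, the clamping rule, and the treatment of missing items as zero are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# (name, max_score) pairs copied from the checklist in criteria.json.
CHECKLIST: List[Tuple[str, int]] = [
    ("Triage scope stated", 8),
    ("2-3 lenses selected", 8),
    ("Lens justification present", 8),
    ("Heuristic or qualitative lens included", 8),
    ("9-field journey brief present", 8),
    ("Missing evidence marked 'not provided'", 8),
    ("ICE or PXL framework used", 8),
    ("Per-factor scores shown", 8),
    ("7-field finding blocks used", 8),
    ("No fabricated drop-off rates", 8),
    ("Magnitudes described as unmeasured", 8),
    ("Treat benchmarks as directional", 6),
    ("Next evidence step stated", 6),
]


def total_max(checklist: List[Tuple[str, int]]) -> int:
    """Sum of all max_score values (the denominator of the score)."""
    return sum(max_score for _, max_score in checklist)


def weighted_score(
    checklist: List[Tuple[str, int]], awarded: Dict[str, float]
) -> float:
    """Hypothetical aggregation: percentage of available points earned.

    Assumptions: an item missing from `awarded` earns 0, and awarded
    points are clamped to the range [0, max_score] for that item.
    """
    earned = 0.0
    for name, max_score in checklist:
        points = awarded.get(name, 0)
        earned += min(max(points, 0), max_score)
    return 100.0 * earned / total_max(checklist)


if __name__ == "__main__":
    # A report that passes every 8-point item but misses both 6-point items.
    partial = {name: m for name, m in CHECKLIST if m == 8}
    print(total_max(CHECKLIST))                # 100
    print(weighted_score(CHECKLIST, partial))  # 88.0
```

Under these assumptions, a report that fabricates a drop-off rate but satisfies everything else would lose only 8 points, which is worth keeping in mind when reading an aggregate score: the checklist weights items almost evenly rather than treating fabrication as disqualifying.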
