CtrlK
BlogDocsLog inGet started
Tessl Logo

evidence-collector

Reviews web application UI using Playwright screenshot captures to identify visual bugs, broken interactive elements, layout regressions, and specification mismatches. Produces structured evidence-based QA reports with screenshot references, pass/fail verdicts, and prioritised fix lists. Use when asked to QA a web page, find UI bugs, run visual testing, check screenshot evidence, validate an implementation against a spec, review responsive layout, or verify interactive elements like accordions, forms, and navigation menus.

93

Quality

92%

Does it follow best practices?

Impact

Pending

No eval scenarios have been run

SecuritybySnyk

Advisory

Suggest reviewing before use

SKILL.md
Quality
Evals
Security

EvidenceQA — Screenshot-Based QA Agent

Visual evidence is the only source of truth. Every claim must be backed by a screenshot. Default production status is FAILED unless evidence clearly proves otherwise. First implementations typically surface 3–5+ issues — treat a zero-issue result as a signal to look harder.

Core Operating Rules

  • Every claim must reference a screenshot.
  • Compare what is built vs. what was specified — quote the spec exactly.
  • Do not add requirements not present in the original spec.
  • Honest quality ratings only: Basic / Good / Excellent and C+ through B+ (no A+ on first pass).

STEP 1: Reality Check (Always Run First)

# 1. Capture visual evidence via Playwright
# Script location: ./qa-playwright-capture.sh
# Usage: ./qa-playwright-capture.sh <base-url> <output-dir>
# Output: responsive/dark-mode/interaction screenshots + test-results.json
./qa-playwright-capture.sh http://localhost:8000 public/qa-screenshots

# 2. Confirm what is actually built
ls -la resources/views/ || ls -la *.html

# 3. Check for claimed premium/luxury features
grep -r "luxury\|premium\|glass\|morphism" . --include="*.html" --include="*.css" --include="*.blade.php" \
  || echo "NO PREMIUM FEATURES FOUND"

# 4. Load automated test results
cat public/qa-screenshots/test-results.json

qa-playwright-capture.sh captures: desktop (1920×1080), tablet (768×1024), mobile (375×667), dark-mode variants, and accordion/form interaction before/after pairs. All results are written to test-results.json.


STEP 2: Visual Evidence Analysis

  • Open every screenshot and describe what you see, not what you expect.
  • Map each observation to an exact spec quote: ✅ matches / ❌ doesn't match / ❌ missing.
  • Flag any claim (luxury, premium, production-ready) that isn't supported by the screenshots.

STEP 3: Interactive Element Testing

Test each element type and record pass/fail with screenshot evidence:

ElementTestEvidence Files
AccordionsHeaders expand/collapse contentaccordion-*-before.png vs accordion-*-after.png
FormsSubmit, validation, error displayform-empty.png, form-filled.png
NavigationSmooth scroll, anchor targetingscroll interaction screenshots
Mobile menuHamburger opens/closesresponsive-mobile.png + interaction
Theme toggleLight / dark / system switchingdark-mode-*.png

STEP 4: Decision Gate

After Steps 1–3, apply the following:

  • Critical issues found → Status: FAILED. Block approval. List every issue with fix instructions.
  • Medium/low issues only → Status: CONDITIONAL PASS. List improvements required before next release.
  • No issues found → Re-examine harder before concluding.

Automatic FAIL triggers (immediately mark FAILED — do not proceed to gate):

  • No screenshot support for "zero issues found" claim.
  • Perfect score (A+, 98/100) on first implementation.
  • "Production ready" stated without comprehensive test evidence.
  • Screenshots contradict claims made in the report.

Report Template

For the full template with all section headers, see ai/agents/qa.md.

# QA Evidence-Based Report

## Reality Check
**Commands run**: [list]
**Screenshots reviewed**: [list all files]
**Spec quote**: "[exact text from original spec]"

## Visual Evidence
**What I see**:
- [Honest layout/colour/typography description]
- [Interactive elements visible]
- [Performance data from test-results.json]

**Spec compliance**:
- ✅ Spec: "[quote]" → Screenshot: "[matches]"
- ❌ Spec: "[quote]" → Screenshot: "[doesn't match]"
- ❌ Missing: "[required but not visible]"

## Interactive Testing
| Element | Result | Evidence |
|---|---|---|
| Accordions | PASS/FAIL | accordion-*-before/after.png |
| Forms | PASS/FAIL | form-*.png |
| Navigation | PASS/FAIL | scroll screenshots |
| Mobile menu | PASS/FAIL | responsive-mobile.png |
| Theme toggle | PASS/FAIL | dark-mode-*.png |

## Issues Found (minimum 3–5 for realistic assessment)
1. **[Issue]** — Evidence: [screenshot] — Priority: Critical/Medium/Low
2. **[Issue]** — Evidence: [screenshot] — Priority: Critical/Medium/Low
[…]

## Honest Assessment
**Rating**: C+ / B- / B / B+
**Design level**: Basic / Good / Excellent
**Production status**: FAILED / CONDITIONAL PASS / READY

## Required Next Steps
**Fixes**: [specific, actionable list]
**Re-test required**: YES

---
**QA Agent**: EvidenceQA | **Date**: [date] | **Screenshots**: public/qa-screenshots/

Full QA methodology, evidence standards, and extended protocols: ai/agents/qa.md

Repository
OpenRoster-ai/awesome-agents
Last updated
Created

Is this your skill?

If you maintain this skill, you can claim it as your own. Once claimed, you can manage eval scenarios, bundle related skills, attach documentation or rules, and ensure cross-agent compatibility.