dogfood

Systematically explore and test a web application to find bugs, UX issues, and other problems. Use when asked to "dogfood", "QA", "exploratory test", "find issues", "bug hunt", "test this app/site/platform", or review the quality of a web application. Produces a structured report with full reproduction evidence -- step-by-step screenshots, repro videos, and detailed repro steps for every issue -- so findings can be handed directly to the responsible teams.

1.97x

Quality

85%

Does it follow best practices?

Impact

87%

1.97x

Average score across 3 eval scenarios
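
As a rough check on the figure above, the sketch below assumes the 87% is a plain arithmetic mean of the first percentage shown for each of the three scenarios on this page (100%, 72%, 90%). The page does not state how the number is computed, so the `average_score` helper and the choice of inputs are illustrative assumptions. The 1.97x multiplier is not reproduced here.

```python
# Assumption: the first percentage listed under each scenario is its score.
# These values are copied from this page; the aggregation rule is a guess.
scenario_scores = {
    "QA Pass on a Demo Web Application": 100,
    "Document Issues Found in a Public Demo Site": 72,
    "Full QA Pass on a Public E-Commerce Demo": 90,
}


def average_score(scores):
    """Return the arithmetic mean of a mapping of percentage scores."""
    return sum(scores.values()) / len(scores)


mean = average_score(scenario_scores)
# Rounded to a whole percent, as the page appears to display it.
print(f"{mean:.1f}% -> displayed as {round(mean)}%")
```

Under this assumption the mean rounds to the 87% shown for Impact.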

Security

Risky

Do not use without reviewing

Evaluation results

100%

34%

QA Pass on a Demo Web Application

Session setup and command usage

Criteria (Without context / With context)

Direct binary used: 100%
Screenshots subdirectory: 100%
Videos subdirectory: 100%
Report file present: 100%
Named session used: 50% / 100%
Wait after open: 60% / 100%
snapshot -i for interaction: 100%
snapshot without flag for content: 100%
Scroll via scroll command: 100%
Session closed: 100%
Report header filled: 66% / 100%
Initial annotated screenshot: 100%

72%

64%

Document Issues Found in a Public Demo Site

Issue documentation and repro evidence

Criteria (Without context / With context)

Video for interactive issues: 100%
Video started before repro: 100%
Video stopped after repro: 100%
sleep 1 between steps
sleep 2 before result screenshot
type used during recording: 100%
No video for static issues: 62%
Annotated screenshots: 100%
ISSUE-NNN numbering: 100%
Step screenshots in report: 87%
Issues not batched
fill used outside recording: 100%

90%

30%

Full QA Pass on a Public E-Commerce Demo

Systematic exploration and wrap-up quality

Criteria (Without context / With context)

Per-page errors check: 62%
Per-page console check: 62%
Per-page snapshot: 33%
Per-page screenshot: 100%
Reproducibility verified: 100%
5-10 issues documented: 100%
Severity summary accurate: 100%
Session closed: 100%
No source code reading: 100%
End-to-end workflows tested: 100%
No output files deleted: 100%
Final summary to user: 100%

Repository: vercel-labs/agent-browser
Commit: fa043a4

Evaluated: about 2 months ago
Agent: Claude Code
Model: Claude Sonnet 4.6

