Run e2e tests, fix flaky and outdated tests, and identify bugs against the spec. Use when running e2e tests, debugging test failures, or fixing flaky tests. Never changes source code logic or API without spec backing.
These apply whenever working with e2e tests, test failures, or test flakiness:
Every e2e failure is exactly one of:
A. Flaky (test infrastructure issue)
B. Outdated (test no longer matches implementation)
C. Bug (implementation doesn't match spec)
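One way to keep these categories explicit during triage is to model them as a small type. The sketch below is illustrative only: `Evidence`, its fields, and `categorize` are hypothetical names, not part of this skill. The `UNVERIFIED` fallback is an assumption that mirrors the report label used for failures with no spec to decide against.

```typescript
// Hypothetical triage model; all names here are assumptions for illustration.
type Category = "FLAKY" | "OUTDATED" | "BUG" | "UNVERIFIED";

interface Evidence {
  // The unchanged test passes when re-run (points at test infrastructure).
  passesOnRerun: boolean;
  // true: the spec says the implementation is wrong.
  // false: the spec agrees with the implementation, so the test is stale.
  // null: no spec covers this behavior.
  specSaysImplementationIsWrong: boolean | null;
}

function categorize(e: Evidence): Category {
  if (e.passesOnRerun) return "FLAKY";
  if (e.specSaysImplementationIsWrong === null) return "UNVERIFIED";
  return e.specSaysImplementationIsWrong ? "BUG" : "OUTDATED";
}
```

Checking the re-run result first reflects the idea that an intermittent failure says nothing reliable about the implementation, so it is sorted out before the spec is consulted.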
Flaky fixes:
- Replace `waitForTimeout` with auto-waiting locators
- Use `getByRole`/`getByLabel`/`getByTestId` locators
- Use `expect()` web-first assertions
Outdated fixes:
Bug fixes:
E2e test fixes must not change:
The only exception: bug fixes where a spec explicitly defines the correct behavior and unit tests cover the fix.
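The exception above can be read as a simple guard on any proposed fix. The following is a minimal sketch, assuming a hypothetical `ProposedFix` shape; none of these names come from the skill itself, and `touchesSource` stands for any change to source code logic or API.

```typescript
// Hypothetical shape of a proposed fix; field names are assumptions.
interface ProposedFix {
  category: "FLAKY" | "OUTDATED" | "BUG";
  touchesSource: boolean;      // changes source code logic or API
  specReference?: string;      // e.g. "SPEC.md#transfers"; absent if no spec applies
  unitTestsCoverFix: boolean;  // unit tests exercise the corrected behavior
}

// A fix confined to test code is always allowed. A fix that touches source
// is allowed only for a bug with spec backing and unit-test coverage.
function isFixAllowed(fix: ProposedFix): boolean {
  if (!fix.touchesSource) return true;
  return (
    fix.category === "BUG" &&
    Boolean(fix.specReference) &&
    fix.unitTestsCoverFix
  );
}
```

A fix rejected by this guard would be reported rather than applied, leaving the decision to the user.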
- `playwright.config.ts`, `vitest.config.ts`, or project-specific setup
- `package.json` for the canonical e2e command
- `*.spec.md`, `docs/*.spec.md` - source of truth for bug decisions
Run with minimal reporter to avoid context overflow:
# Playwright
yarn playwright test --reporter=line
# Or project-specific
yarn test:e2e
If a filter is specified, apply it:
yarn playwright test --reporter=line -g "transfer"
yarn test:e2e -- --grep "transfer"
Parse failures into:
| Test | File | Error | Category |
|---|---|---|---|
| login flow | auth.spec.ts:42 | timeout waiting for selector | TBD |
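As a rough illustration of producing this table from already-parsed failures, the sketch below renders rows in the same column order. The `Failure` interface and `renderFailureTable` are hypothetical helpers, and extracting failures from the reporter output is assumed to have happened elsewhere.

```typescript
// Hypothetical record for one parsed failure; not part of any reporter API.
interface Failure {
  test: string;
  file: string;     // e.g. "auth.spec.ts:42"
  error: string;
  category: string; // "TBD" until the failure is classified
}

// Collapse whitespace and escape pipes so a cell cannot break the table.
const cell = (s: string): string =>
  s.replace(/\s+/g, " ").trim().replace(/\|/g, "\\|");

function renderFailureTable(failures: Failure[]): string {
  const header = ["| Test | File | Error | Category |", "|---|---|---|---|"];
  const rows = failures.map(
    (f) =>
      `| ${cell(f.test)} | ${cell(f.file)} | ${cell(f.error)} | ${cell(f.category)} |`
  );
  return [...header, ...rows].join("\n");
}
```

Keeping each error to a single collapsed line serves the same goal as the minimal reporter: the table stays small enough to scan.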
For each failure:
Apply fixes following the Principles above, in order:
After all fixes, re-run the suite:
## E2E Results
**Run**: `yarn test:e2e` on <date>
**Result**: X/Y passed
### Fixed
- FLAKY: `auth.spec.ts:42` - replaced waitForTimeout with getByRole wait
- OUTDATED: `profile.spec.ts:88` - updated selector after header redesign
- BUG: `transfer.spec.ts:120` - fixed amount validation per SPEC.md#transfers
### Remaining Failures
- UNVERIFIED: `settings.spec.ts:55` - no spec, needs user decision
### Unit Tests Added
- `src/transfer.test.ts` - amount validation edge cases (covers BUG fix)