e2e

Run e2e tests, fix flaky and outdated tests, and identify bugs against the spec. Use when running e2e tests, debugging test failures, or fixing flaky tests. Never changes source code logic or APIs without spec backing.

1.22x

Quality

—

Does it follow best practices?

Impact

75%

1.22x

Average score across 3 eval scenarios

Security

Passed

No known issues

Evaluation results

85%

20%

Fix Intermittently Failing Playwright Tests

Flaky test remediation

Criteria

Without context

With context

No waitForTimeout

100%

Semantic role selectors

Web-first assertions

100%

No arbitrary delays

100%

Mock before navigation

100%

No manual retry loops

66%

100%

Assertions not weakened

100%

No source code changes

100%

41%

21%

Resolve Failing E2E Suite Before Release

Failure taxonomy and fix ordering

Criteria

Without context

With context

Auth failure: outdated category

Auth fix: test updated not source

Transfer failure: flaky category

100%

Settings failure: unverified category

No source changes for non-bug

Failure table produced

50%

75%

Fix ordering

Report: E2E Results header

50%

Report: Fixed section

75%

Report: Remaining Failures section

25%

Source code boundary respected

100%

Report: Unit Tests Added section
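
The criteria for this scenario imply a triage step: each failing test gets a category, only a spec-backed bug justifies a source change, and the outcome is written up under fixed report headings. A minimal TypeScript sketch of that flow follows. The category names (including the bug category, which comes from the checkout scenario on this page) and the report headings are taken from the criteria; the type names, the `canTouchSource` and `renderReport` helpers, the fix ordering, and the sample unit test name are hypothetical illustrations, not the skill's actual implementation.

```typescript
// Failure categories named in the eval criteria on this page.
type Category = "outdated" | "flaky" | "unverified" | "bug";

interface Failure {
  test: string; // short label for the failing test
  category: Category;
  fixed: boolean;
}

// ASSUMPTION: this ordering is illustrative only; the skill's real
// fix ordering is not shown on this page.
const FIX_ORDER: Category[] = ["outdated", "flaky", "bug", "unverified"];

// Only a bug (behaviour that contradicts the spec) justifies a source change.
function canTouchSource(f: Failure): boolean {
  return f.category === "bug";
}

// Returns a new array sorted by the assumed fix order.
function sortByFixOrder(failures: Failure[]): Failure[] {
  return [...failures].sort(
    (a, b) => FIX_ORDER.indexOf(a.category) - FIX_ORDER.indexOf(b.category)
  );
}

// Renders a plain-text report using the section names from the criteria.
function renderReport(failures: Failure[], unitTestsAdded: string[] = []): string {
  const line = (f: Failure) => `- ${f.test} (${f.category})`;
  return [
    "E2E Results",
    "Fixed",
    ...failures.filter((f) => f.fixed).map(line),
    "Remaining Failures",
    ...failures.filter((f) => !f.fixed).map(line),
    "Unit Tests Added",
    ...unitTestsAdded.map((t) => `- ${t}`),
  ].join("\n");
}

// Sample data mirroring the three failures named in the criteria.
const failures: Failure[] = [
  { test: "settings", category: "unverified", fixed: false },
  { test: "transfer", category: "flaky", fixed: true },
  { test: "auth", category: "outdated", fixed: true },
];

console.log(renderReport(sortByFixOrder(failures)));
```

None of the sample failures is a bug, so `canTouchSource` returns `false` for each of them, which matches the "No source changes for non-bug" criterion.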

87%

100%

Investigate Consistent E2E Test Failure in Checkout Flow

Bug fix with spec and TDD gate

Criteria

Without context

With context

Bug classification

100%

Spec section cited

100%

Source code fixed

100%

E2E test unchanged

100%

Unit tests produced

100%

Unit test covers fix

100%

No API contract changes

100%

No source logic removed

100%

Investigation report produced

100%

Repository: NeverSight/skills_feed
Commit: ff6a653

Evaluated: 4 months ago
Agent: Claude Code
Model: Claude Sonnet 4.6

Table of Contents

Fix Intermittently Failing Playwright Tests

Resolve Failing E2E Suite Before Release

Investigate Consistent E2E Test Failure in Checkout Flow
