e2e

Run e2e tests, fix flake and outdated tests, identify bugs against spec. Use when running e2e tests, debugging test failures, or fixing flaky tests. Never changes source code logic or API without spec backing.

Quality

96%

Does it follow best practices?

Impact

75%

1.22x

Average score across 3 eval scenarios

Security

Passed

No known issues

Quality

Discovery

100%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

This is a strong skill description: it concisely covers specific actions, includes natural trigger terms, and explicitly states both what the skill does and when to use it. The added behavioral constraint ('Never changes source code logic or API without spec backing') further sharpens its identity and distinguishes it from general code-fixing skills. A minor improvement would be to mention file types or frameworks, but overall it is well crafted.

Dimension	Reasoning	Score
Specificity	Lists multiple specific concrete actions: 'Run e2e tests', 'fix flake and outdated tests', 'identify bugs against spec'. Also includes a behavioral constraint ('Never changes source code logic or API without spec backing').	3 / 3
Completeness	Clearly answers both what ('Run e2e tests, fix flake and outdated tests, identify bugs against spec') and when ('Use when running e2e tests, debugging test failures, or fixing flaky tests') with explicit trigger guidance.	3 / 3
Trigger Term Quality	Includes natural keywords users would say: 'e2e tests', 'test failures', 'flaky tests', 'debugging'. These are terms developers commonly use when dealing with end-to-end testing issues.	3 / 3
Distinctiveness Conflict Risk	Clearly scoped to e2e testing specifically, with distinct triggers like 'e2e tests', 'flaky tests', and 'test failures'. The constraint about not changing source code logic further distinguishes it from general coding or unit testing skills.	3 / 3
	Total	12 / 12 Passed

Implementation

92%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

This is a high-quality skill that provides clear, actionable guidance for e2e test management. Its strongest features are the precise failure taxonomy with explicit fix rules per category, strong safety boundaries (never change source code without spec backing), and a well-sequenced workflow with validation gates. The only minor weakness is that all content is inline rather than split across files, though the total length is borderline acceptable.

Dimension	Reasoning	Score
Conciseness	The content is lean and well-structured. Every section earns its place—the failure taxonomy, fix rules, and workflow steps all provide information Claude wouldn't inherently know about this project's conventions. No unnecessary explanations of what e2e testing is or how Playwright works.	3 / 3
Actionability	Provides concrete, executable commands (yarn playwright test --reporter=line), specific patterns to use and avoid (replace waitForTimeout with auto-waiting locators, never add retry loops), a structured table format for parsing failures, and a complete report template. The fix rules are specific and copy-paste actionable.	3 / 3
Workflow Clarity	The 5-step workflow is clearly sequenced with explicit validation checkpoints: categorize before fixing, fix in priority order (flaky→outdated→bug), re-run after all fixes, and the bug category has a unit test gate before completion. The feedback loop of re-running and reporting remaining failures is well-defined.	3 / 3
Progressive Disclosure	The content is well-organized with clear sections (Principles vs Workflow), but everything is in a single file. The failure taxonomy and fix rules sections are substantial and could potentially be split into referenced files. However, for a skill of this size (~120 lines), inline content is reasonable, just slightly long for a single SKILL.md.	2 / 3
	Total	11 / 12 Passed
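
The fix order praised under Workflow Clarity (flaky→outdated→bug) can be illustrated with a minimal sketch. The `Category` and `Failure` types, the `orderForFixing` helper, and the sample test names are hypothetical, introduced only for illustration. The review does not show how the skill itself represents failures.

```typescript
// Hypothetical types: the review names the three categories but not their representation.
type Category = "flaky" | "outdated" | "bug";

interface Failure {
  test: string; // test title as reported by the runner
  category: Category;
}

// Lower number = fixed earlier, following the flaky → outdated → bug order.
const PRIORITY: Record<Category, number> = { flaky: 0, outdated: 1, bug: 2 };

function orderForFixing(failures: Failure[]): Failure[] {
  // Copy before sorting so the caller's array is left untouched.
  return [...failures].sort((a, b) => PRIORITY[a.category] - PRIORITY[b.category]);
}

// Made-up sample data, categorized before any fix is attempted.
const example: Failure[] = [
  { test: "checkout totals", category: "bug" },
  { test: "login redirect", category: "flaky" },
  { test: "profile page title", category: "outdated" },
];

console.log(orderForFixing(example).map((f) => f.category).join(" -> "));
```

Categorizing every failure before sorting mirrors the "categorize before fixing" checkpoint: nothing is fixed until each failure has a category that determines its place in the queue.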

Validation

100%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation — 11 / 11 Passed

Validation for skill structure

No warnings or errors.
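
The section percentages follow from the rubric totals reported above, as the short sketch below works through. The `toPercent` helper is hypothetical, and the last step is an assumption: averaging the Discovery and Implementation percentages happens to reproduce the 96% Quality headline, but the page does not state how that figure is computed.

```typescript
// Rubric totals as reported in the review.
const totals = {
  discovery: { earned: 12, possible: 12 },
  implementation: { earned: 11, possible: 12 },
  validation: { earned: 11, possible: 11 },
};

// Hypothetical helper: convert a rubric total to a whole-number percentage.
function toPercent(earned: number, possible: number): number {
  return Math.round((earned / possible) * 100);
}

const discovery = toPercent(totals.discovery.earned, totals.discovery.possible);
const implementation = toPercent(totals.implementation.earned, totals.implementation.possible);
const validation = toPercent(totals.validation.earned, totals.validation.possible);

// Assumption: the headline Quality score is the mean of Discovery and Implementation.
const quality = Math.round((discovery + implementation) / 2);

console.log({ discovery, implementation, validation, quality });
```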

Repository: NeverSight/skills_feed
Commit: aa009ea

Reviewed: 41 minutes ago
