e2e-tests

Run, monitor, and fix frontend Cypress E2E tests. Handles local execution, CI monitoring, failure diagnosis, flaky test detection, and accessibility regression checks. Use this skill whenever the user mentions Cypress, E2E tests, end-to-end tests, test failures, CI failures, "CI is red", flaky tests, accessibility testing, or wants to run/debug/fix any frontend integration test — even if they don't say "e2e" explicitly.

Quality

88%

Does it follow best practices?

Impact

—

No eval scenarios have been run

Security

Passed

No known issues

Quality

Discovery

100%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

This is an excellent skill description. It provides specific concrete actions, comprehensive natural trigger terms, and explicit 'Use this skill whenever...' guidance, and it is clearly scoped to Cypress frontend E2E testing. The description is concise yet thorough, and covering edge cases such as users who 'don't say "e2e" explicitly' shows thoughtful design.

Dimension	Reasoning	Score
Specificity	Lists multiple specific concrete actions: run, monitor, fix Cypress E2E tests, local execution, CI monitoring, failure diagnosis, flaky test detection, and accessibility regression checks.	3 / 3
Completeness	Clearly answers both 'what' (run, monitor, fix Cypress E2E tests with specific sub-capabilities) and 'when' (explicit 'Use this skill whenever...' clause with comprehensive trigger scenarios).	3 / 3
Trigger Term Quality	Excellent coverage of natural trigger terms users would say: 'Cypress', 'E2E tests', 'end-to-end tests', 'test failures', 'CI failures', 'CI is red', 'flaky tests', 'accessibility testing', 'run/debug/fix', and even accounts for users who don't say 'e2e' explicitly.	3 / 3
Distinctiveness / Conflict Risk	Clearly scoped to frontend Cypress E2E tests specifically, with distinct triggers like 'Cypress', 'CI is red', and 'flaky tests' that are unlikely to conflict with other testing skills (e.g., unit testing, backend testing).	3 / 3
	Total	12 / 12 Passed

Implementation

77%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

This is a strong, highly actionable skill that provides comprehensive coverage of E2E test operations with concrete commands and clear workflows. Its main weakness is that it's somewhat long for a single SKILL.md — the configuration comparison table, custom commands reference, and detailed fix methodology could be split into supporting files to improve progressive disclosure. The repeated pre-flight check blocks across sections add some redundancy that could be consolidated.

Suggestions

Extract the 'Custom Cypress Commands' and 'Local vs CI Configuration Differences' sections into separate reference files (e.g., COMMANDS.md, CI-CONFIG.md) and link to them from the main skill to reduce monolithic content.

Consolidate the repeated gh CLI pre-flight check into a single referenced pattern or note at the top (e.g., 'All CI commands require gh CLI — run `gh auth status` first') rather than repeating the same block in sections 2, 3, and 4.
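The consolidation suggested above could look roughly like the following. This is a minimal sketch, not the skill's actual code: the helper name `require_gh` and its messages are hypothetical. Only `gh auth status` comes from the suggestion itself, and `gh auth login` is the standard GitHub CLI login command.

```shell
# Hypothetical shared pre-flight helper (the name is an assumption, not from the skill).
# Returns 0 when the gh CLI is installed and authenticated, 1 otherwise.
require_gh() {
  # Check that a gh command is available at all.
  if ! command -v gh >/dev/null 2>&1; then
    echo "gh CLI not found; install it before running CI commands" >&2
    return 1
  fi
  # Check that gh has valid credentials.
  if ! gh auth status >/dev/null 2>&1; then
    echo "gh CLI is not authenticated; run 'gh auth login' first" >&2
    return 1
  fi
  return 0
}
```

Each CI-related section could then begin with a single `require_gh || exit 1` line instead of repeating the full check.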

Dimension	Reasoning	Score
Conciseness	The skill is comprehensive but could be tightened. The local vs CI configuration table and custom commands reference are valuable additions Claude wouldn't know, but some sections have redundant explanations (e.g., repeated pre-flight checks for gh CLI across multiple sections could be consolidated). The overall length (~250 lines) is justified by the breadth of functionality but some verbosity remains.	2 / 3
Actionability	Excellent actionability throughout — every section provides concrete, copy-paste-ready bash commands and JavaScript code. Specific file paths, exact command flags, and real patterns (like glob matching for spec files) are provided. The fix workflow includes executable commands for each diagnostic step.	3 / 3
Workflow Clarity	Multi-step workflows are clearly sequenced with explicit validation checkpoints. The 'fix' workflow has 7 well-defined steps with a verify-locally step at the end. The 'run' workflow includes pre-flight checks, execution, and automatic flaky detection. Error recovery is addressed (e.g., 'If services are not running, tell the user and suggest...' and 'If errors: fix and re-validate' patterns).	3 / 3
Progressive Disclosure	The skill is well-organized with clear section headers and a comprehensive 'Key File Locations' reference section. However, the content is entirely monolithic — at ~250 lines, the custom commands reference, configuration differences table, and detailed fix workflow could be split into separate referenced files. The help summary at the end is a nice touch for discoverability.	2 / 3
	Total	10 / 12 Passed

Validation

81%

Warnings & errors only

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation — 9 / 11 Passed

Validation for skill structure

Criteria	Description	Result
allowed_tools_field	'allowed-tools' contains unusual tool name(s)	Warning
frontmatter_unknown_keys	Unknown frontmatter key(s) found; consider removing or moving to metadata	Warning
	Total	9 / 11 Passed

Repository: HHS/OPRE-OPS
Commit: e2a9461

Reviewed: 2 days ago

