Critically review and refine tests generated during a coding session. Use this skill when the user asks to review tests, clean up tests, refine tests, prune tests, improve test quality, or remove useless/low-value tests. Also use when the user says 'review my tests', 'clean up tests', or mentions test quality after a coding session.
68
83%
Does it follow best practices?
Impact: No eval scenarios have been run
Passed
No known issues
Quality
Discovery
89%
Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
This is a strong skill description that excels in completeness and trigger term coverage. It clearly defines both what the skill does and when to use it, with abundant natural language triggers. The main area for improvement is in specificity—the concrete actions could be more detailed to describe exactly what 'refine' and 'review' entail technically.
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | The description names the domain (test review/refinement) and mentions several actions like 'review', 'clean up', 'refine', 'prune', 'remove useless/low-value tests', but these are somewhat overlapping and don't describe deeply concrete distinct actions (e.g., it doesn't specify what 'refine' entails technically, like checking assertions, removing duplicates, improving coverage). | 2 / 3 |
| Completeness | Clearly answers both 'what' (critically review and refine tests generated during a coding session) and 'when' (explicit 'Use this skill when...' clause with multiple trigger scenarios). The 'Use when' guidance is thorough and explicit. | 3 / 3 |
| Trigger Term Quality | Excellent coverage of natural trigger terms users would say: 'review tests', 'clean up tests', 'refine tests', 'prune tests', 'improve test quality', 'remove useless/low-value tests', 'review my tests', 'test quality after a coding session'. These are highly natural phrases a user would actually use. | 3 / 3 |
| Distinctiveness / Conflict Risk | The skill has a clear niche: post-coding-session test review and refinement. The specific focus on reviewing/pruning existing tests (rather than writing new tests or running tests) makes it distinct and unlikely to conflict with general testing or code review skills. | 3 / 3 |
| Total | | 11 / 12 Passed |
Implementation
77%
Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
This is a well-crafted skill with a clear 5-step workflow, strong actionability through specific criteria and examples, and proper validation steps. Its main weakness is moderate verbosity — some explanatory sentences within pruning categories and refinement guidelines could be tightened without losing clarity. The progressive disclosure is adequate for the content length but could benefit from separating detailed criteria into reference files.
Suggestions
Tighten the pruning category descriptions by removing the explanatory second sentences (e.g., 'If the test would pass even with a broken implementation, it's not testing anything useful' — Claude can infer this from the category definition).
Consider extracting the detailed pruning criteria and refinement checklist into a referenced file (e.g., TEST_REVIEW_CRITERIA.md) to keep SKILL.md as a lean overview.
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | The content is mostly efficient and well-structured, but some explanations are slightly verbose; for example, the extended descriptions of each pruning category include explanatory sentences that Claude could infer. The examples within categories (like the 10,000-character input example) add clarity but also add length. | 2 / 3 |
| Actionability | The skill provides concrete, specific criteria for pruning (with clear examples of what constitutes trivial/weak/brittle tests), specific assertion guidance (e.g., `assertEquals` over `assertTrue`, `assertDatabaseHas`), naming conventions, and an executable test command. The guidance is directly actionable. | 3 / 3 |
| Workflow Clarity | The 5-step workflow is clearly sequenced: identify → prune → refine → fill gaps → run tests. Step 5 includes an explicit validation checkpoint with a feedback loop ('If any test fails, fix it'), and the output summary step provides a verification checklist. | 3 / 3 |
| Progressive Disclosure | The content is well-organized with clear headers and sub-sections, but it's a moderately long monolithic file. The pruning categories and refinement sub-sections could potentially be split into referenced files for a cleaner overview, though the current length is borderline acceptable. | 2 / 3 |
| Total | | 10 / 12 Passed |
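
The assertion guidance mentioned under Actionability (a specific equality assertion rather than a bare truthiness check) and the pruning criterion quoted in the suggestions (a test that would pass even with a broken implementation) can be illustrated with a minimal sketch. The skill's own examples are not shown in this review and appear to use a different test framework; the Python `unittest` version below is only an analogy, and `slugify`, `broken_slugify`, `make_case`, and `failed_tests` are hypothetical names introduced for illustration.

```python
import unittest


def slugify(title):
    """Hypothetical function under test: lowercase, join words with hyphens."""
    return "-".join(title.lower().split())


def broken_slugify(title):
    """Deliberately broken variant (hypothetical): forgets to lowercase."""
    return "-".join(title.split())


def make_case(impl):
    """Build a TestCase class that exercises the given implementation."""

    class SlugifyTest(unittest.TestCase):
        def test_weak_truthiness(self):
            # Weak: any non-empty string is truthy, so a broken
            # implementation still passes this check.
            self.assertTrue(impl("Hello World"))

        def test_exact_value(self):
            # Specific: pins the exact expected output.
            self.assertEqual(impl("Hello World"), "hello-world")

    return SlugifyTest


def failed_tests(impl):
    """Run both tests against impl and return the names of failing tests."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(make_case(impl))
    result = unittest.TestResult()
    suite.run(result)
    return sorted(test.id().split(".")[-1] for test, _ in result.failures)


if __name__ == "__main__":
    print(failed_tests(slugify))
    print(failed_tests(broken_slugify))
```

Under these assumptions, the truthiness test passes for both implementations, which is the kind of low-value test the pruning step targets, while the equality test is the only one that catches the broken variant.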
Validation
100%
Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation — 11 / 11 Passed
Validation for skill structure
No warnings or errors.
533a35f