Critically review and refine tests generated during a coding session. Use this skill when the user asks to review tests, clean up tests, refine tests, prune tests, improve test quality, or remove useless/low-value tests. Also use when the user says 'review my tests', 'clean up tests', or mentions test quality after a coding session.
89
87% Does it follow best practices?
Impact: Pending. No eval scenarios have been run.
Passed: No known issues.
Quality
Discovery
82% Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
This is a solid description with excellent trigger term coverage and complete what/when guidance. The main weakness is that the specific actions could be more concrete (what exactly does 'refine' or 'critically review' entail?), and it could be more distinctive from general code review or test writing skills.
Suggestions
Add more specific concrete actions like 'identify redundant test cases', 'consolidate duplicate assertions', 'remove tests that don't add coverage value', or 'improve test naming and organization'.
Strengthen distinctiveness by clarifying this is specifically for post-generation cleanup rather than initial test writing or general code review.
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | Names the domain (test review) and some actions ('review', 'refine', 'prune', 'remove useless/low-value tests'), but lacks specific concrete actions like 'identify redundant assertions', 'consolidate duplicate test cases', or 'improve test naming conventions'. | 2 / 3 |
| Completeness | Clearly answers both what ('Critically review and refine tests generated during a coding session') and when, with an explicit 'Use this skill when...' clause containing multiple trigger scenarios. | 3 / 3 |
| Trigger Term Quality | Excellent coverage of natural terms users would say: 'review tests', 'clean up tests', 'refine tests', 'prune tests', 'improve test quality', 'remove useless/low-value tests', 'review my tests', 'test quality after a coding session'. | 3 / 3 |
| Distinctiveness Conflict Risk | Reasonably specific to test review/refinement, but could potentially overlap with general code review skills or test generation skills. The 'coding session' context helps but isn't strongly distinctive. | 2 / 3 |
| Total | | 10 / 12 Passed |
Implementation
92% Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
This is a high-quality skill that provides clear, actionable guidance for test refinement. The workflow is well-sequenced with explicit validation steps, and the pruning criteria are specific enough to apply immediately. The only minor weakness is that the content could benefit from slightly better progressive disclosure for the detailed subsections.
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | The content is lean and efficient, with no unnecessary explanations of concepts Claude already knows. Each section earns its place by providing specific, actionable criteria rather than verbose descriptions. | 3 / 3 |
| Actionability | Provides concrete, specific guidance with clear categories for pruning (trivial tests, weak assertions, etc.), specific examples of bad patterns, and executable commands for running tests. The criteria are precise enough to act on immediately. | 3 / 3 |
| Workflow Clarity | Clear 5-step sequence with explicit validation (Step 5: run tests and fix failures). The workflow includes a feedback loop ('If any test fails, fix it') and logical progression from identification through pruning, refinement, gap-filling, and verification. | 3 / 3 |
| Progressive Disclosure | Content is well-organized with clear sections and subsections, but it's a moderately long single file. Some content (like the detailed pruning categories or project conventions) could potentially be split into reference files for a cleaner overview, though the current structure is functional. | 2 / 3 |
| Total | | 11 / 12 Passed |
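
To make the "weak assertions" pruning category named in the Actionability row concrete, here is a minimal sketch. The `slugify` function and both tests are hypothetical illustrations and are not taken from the skill under review; they only show the kind of contrast such a review would look for.

```python
import re


def slugify(title: str) -> str:
    """Hypothetical function under test (an assumption, not part of the skill)."""
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")


def test_slugify_weak():
    # Weak assertion: passes for almost any non-empty return value,
    # so it would not catch a broken slug format.
    assert slugify("Hello, World!")


def test_slugify_meaningful():
    # Stronger assertions: pin the exact expected output,
    # including an input with leading and trailing separators.
    assert slugify("Hello, World!") == "hello-world"
    assert slugify("  --Already--Slugged--  ") == "already-slugged"
```

Under this reading, a review pass would likely prune or strengthen the first test and keep the second.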
Validation
100% Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation — 11 / 11 Passed
Validation for skill structure
No warnings or errors.
32c20f1