Use when implementing any feature or bugfix, before writing implementation code
- Overall score: 45
- Quality: 56% (Does it follow best practices?)
- Impact: — (No eval scenarios have been run)
- Validation: Passed (No known issues)
Discovery (14%)
Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
This description is critically weak because it completely omits what the skill actually does, providing only a vague temporal trigger ('before writing implementation code'). Without any concrete actions or capabilities listed, Claude cannot meaningfully distinguish this skill from others or understand what it provides. The description reads more like a partial usage hint than a proper skill description.
Suggestions
- Add specific concrete actions describing what this skill does (e.g., 'Creates implementation plans, identifies edge cases, designs data structures and interfaces' or whatever the skill actually performs).
- Narrow the scope to reduce conflict risk by specifying the domain or type of pre-implementation activity (e.g., 'Generates test cases before implementation' vs 'Creates architecture diagrams' vs 'Writes technical design documents').
- Expand the 'Use when...' clause with more natural trigger terms users would say, such as 'plan', 'design', 'before coding', 'approach', 'strategy', or whatever matches the skill's actual purpose.
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | The description contains no concrete actions whatsoever. It doesn't describe what the skill does—only vaguely when to use it ('implementing any feature or bugfix, before writing implementation code'). There are no specific capabilities listed. | 1 / 3 |
| Completeness | The 'what does this do' is entirely missing—there is no description of capabilities or actions. While there is a 'when' clause ('before writing implementation code'), the absence of any 'what' makes this fundamentally incomplete. | 1 / 3 |
| Trigger Term Quality | It includes some relevant terms like 'feature', 'bugfix', and 'implementation code' that users might naturally mention, but the coverage is incomplete and lacks specificity about what kind of pre-implementation activity this involves (e.g., planning, design, testing, scaffolding). | 2 / 3 |
| Distinctiveness / Conflict Risk | The description is extremely generic—'any feature or bugfix' and 'before writing implementation code' could apply to countless skills (planning, architecture, test writing, requirements gathering, etc.). It would easily conflict with many other skills. | 1 / 3 |
| Total | | 5 / 12 (Passed) |
Implementation (62%)
Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
The skill has excellent actionability with concrete, executable examples and a well-structured workflow with proper validation checkpoints. However, it is severely undermined by extreme verbosity — roughly half the content is repetitive motivational argumentation against skipping TDD (covered in at least three separate sections saying the same things). The Graphviz diagram wastes tokens since it can't be rendered. Cutting the persuasive content would make this a strong skill.
Suggestions
- Remove or drastically condense the 'Why Order Matters', 'Common Rationalizations', and 'Red Flags' sections — they repeat the same message three times. A single concise 'Don't skip TDD' note suffices for Claude.
- Remove the Graphviz dot diagram — Claude cannot render it, and the Red-Green-Refactor steps already explain the workflow clearly.
- Consider extracting the Good/Bad test examples and the rationalizations table into a separate reference file if they must be retained, keeping SKILL.md focused on the actionable workflow.
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | Extremely verbose and repetitive. The 'Why Order Matters' section, 'Common Rationalizations' table, and 'Red Flags' list all cover the same ground — arguing against skipping TDD — repeated three different ways. The motivational/persuasive content (sunk cost fallacy explanations, 'pragmatic' rebuttals) is not actionable guidance Claude needs. The Graphviz dot diagram adds tokens without value since Claude can't render it. Much of this is preaching to Claude about discipline rather than teaching it how to do something. | 1 / 3 |
| Actionability | The skill provides fully executable TypeScript code examples for both tests and implementations, concrete bash commands for verification, and specific good/bad comparisons. The bug fix walkthrough is copy-paste ready with clear inputs and expected outputs. | 3 / 3 |
| Workflow Clarity | The Red-Green-Refactor cycle is clearly sequenced with explicit verification checkpoints at each stage ('Verify RED', 'Verify GREEN'). Feedback loops are present — if the test passes when it shouldn't, fix the test; if the test errors, fix and re-run. The verification checklist at the end provides a comprehensive final checkpoint. | 3 / 3 |
| Progressive Disclosure | References @testing-anti-patterns.md appropriately, but the main file is monolithic at ~250+ lines. The repetitive motivational sections (Why Order Matters, Common Rationalizations, Red Flags) could be extracted or eliminated. The content that remains inline is reasonably well-organized with clear headers, but the sheer volume hurts navigability. | 2 / 3 |
| Total | | 9 / 12 (Passed) |
Validation (100%)
Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation — 11 / 11 Passed
Validation for skill structure
No warnings or errors.