Use when implementing any feature or bugfix, before writing implementation code
51
38%
Does it follow best practices?
Impact
Pending
No eval scenarios have been run
Passed
No known issues
Optimize this skill with Tessl
npx tessl skill review --optimize ./skills/test-driven-development/SKILL.md
Quality
Discovery
14%
Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
This description is critically incomplete: it only provides a vague 'when' clause without any indication of what the skill actually does. The lack of concrete actions makes it impossible for Claude to distinguish this skill from other pre-implementation skills, and the overly broad trigger scope ('any feature or bugfix') creates high conflict risk.
Suggestions
Add a clear 'what' clause describing the specific actions this skill performs (e.g., 'Creates implementation plans, identifies edge cases, and outlines test scenarios before coding').
Narrow the scope or add distinguishing details to reduce conflict with other skills—specify what kind of pre-implementation activity this covers (planning, design review, test-first approach, etc.).
Include more specific trigger terms that reflect what users would actually say, such as 'plan implementation', 'design before coding', 'outline approach', or similar natural phrases.
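The three suggestions above could be combined into a frontmatter revision along the following lines. This is only a sketch: it assumes SKILL.md uses YAML frontmatter with `name` and `description` fields, the `name` value is inferred from the directory path in the command above, and the description wording and trigger phrases are illustrative rather than taken from the skill itself.

```yaml
---
# Assumed field names; the name is inferred from ./skills/test-driven-development/
name: test-driven-development
# Illustrative wording: a 'what' clause first, then a narrower 'when' clause
# with trigger phrases a user might plausibly say.
description: >-
  Guides a Red-Green-Refactor workflow: write a failing test, verify that it
  fails, write the minimal code to make it pass, then refactor. Use when
  implementing a feature or bugfix test-first, or when the user mentions TDD,
  writing tests first, or a failing test.
---
```

Leading with the concrete actions addresses Specificity and Completeness, while the test-first trigger phrases are meant to separate this skill from planning or design skills that also run before implementation.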
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | The description contains no concrete actions at all. It doesn't describe what the skill does, only when to use it. 'Implementing any feature or bugfix' and 'writing implementation code' are vague and don't specify any particular capability. | 1 / 3 |
| Completeness | The description answers 'when' (before writing implementation code for features or bugfixes) but completely omits 'what' the skill actually does. There is no indication of the skill's capabilities or actions. | 1 / 3 |
| Trigger Term Quality | It includes some relevant terms like 'feature', 'bugfix', and 'implementation code' that users might naturally mention, but lacks specificity about what kind of pre-implementation activity this covers (e.g., planning, design, test writing, architecture review). | 2 / 3 |
| Distinctiveness / Conflict Risk | Extremely generic: 'any feature or bugfix' and 'before writing implementation code' could overlap with planning skills, architecture skills, test-writing skills, design doc skills, or virtually any pre-coding workflow skill. | 1 / 3 |
| Total | | 5 / 12 Passed |
Implementation
62%
Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
The skill provides excellent actionable guidance with clear workflow steps, verification checkpoints, and executable code examples. However, it is severely bloated by repetitive philosophical arguments and rationalizations that Claude doesn't need—the same 'delete and start over' message is hammered home in at least five different sections. Cutting the motivational/argumentative content by 60-70% would dramatically improve token efficiency without losing any practical value.
Suggestions
Remove or drastically condense the 'Why Order Matters' prose section and 'Common Rationalizations' table—they repeat each other and explain concepts Claude already understands. A single 3-line 'Common excuses: all mean delete and restart with TDD' note suffices.
Consolidate the 'Red Flags' list with the rationalizations table into a single compact section, eliminating the heavy duplication between them.
Move the philosophical/motivational content (why TDD matters, sunk cost arguments) to a separate reference file like TDD-RATIONALE.md, keeping SKILL.md focused on the actionable workflow.
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | Extremely verbose at ~300+ lines. The 'Why Order Matters' section extensively argues for TDD philosophy that Claude already understands. The 'Common Rationalizations' table heavily overlaps with the preceding prose section. Multiple sections repeat the same points (e.g., 'delete code and start over' is stated at least 5 times). The 'Red Flags' list largely duplicates the rationalizations table. | 1 / 3 |
| Actionability | Provides fully executable Ruby code examples for each TDD phase (RED, GREEN, REFACTOR), concrete bash commands for verification, and a complete bug fix walkthrough. The good/bad comparisons with real code are immediately usable. | 3 / 3 |
| Workflow Clarity | The Red-Green-Refactor cycle is clearly sequenced with explicit verification checkpoints at each stage ('Verify RED', 'Verify GREEN'). Includes feedback loops (test errors → fix → re-run, test passes unexpectedly → fix test) and a comprehensive verification checklist before completion. | 3 / 3 |
| Progressive Disclosure | References @testing-anti-patterns.md and @testing-strategy.md at the end, which is good. However, the main file is monolithic with extensive inline content that could be split out (e.g., the philosophical arguments, rationalizations table, and anti-patterns could be separate documents). The file would benefit from being a concise overview pointing to detailed materials. | 2 / 3 |
| Total | | 9 / 12 Passed |
Validation
100%
Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation — 11 / 11 Passed
Validation for skill structure
No warnings or errors.
cb03f92