Use when implementing any feature or bugfix, before writing implementation code
40
38%
Does it follow best practices?
Impact
—
No eval scenarios have been run
Passed
No known issues
Optimize this skill with Tessl
npx tessl skill review --optimize ./skills/test-driven-development/SKILL.md
Quality
Discovery
14%
Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
This description is critically weak because it only provides a vague timing cue ('before writing implementation code') without ever stating what the skill actually does. It lacks concrete actions, has overly broad scope ('any feature or bugfix'), and would conflict with many other development-related skills. Without knowing the skill's purpose, Claude cannot make an informed selection decision.
Suggestions
Add explicit capability statements describing what the skill does (e.g., 'Creates implementation plans, identifies edge cases, and designs test strategies' or whatever the actual actions are).
Narrow the scope and add distinctive trigger terms—instead of 'any feature or bugfix', specify the type of pre-implementation activity (e.g., 'architecture planning', 'requirements analysis', 'test-driven design').
Restructure to follow the pattern: '[What it does]. Use when [specific triggers].' to ensure both what and when are clearly addressed.
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | The description contains no concrete actions whatsoever. It doesn't describe what the skill does, only vaguely when to use it ('implementing any feature or bugfix, before writing implementation code'). There are no specific capabilities listed. | 1 / 3 |
| Completeness | The description answers 'when' partially ('before writing implementation code') but completely fails to answer 'what does this do'. Without knowing what the skill actually does, it's fundamentally incomplete. | 1 / 3 |
| Trigger Term Quality | It includes some relevant terms like 'feature', 'bugfix', and 'implementation code' that users might naturally mention, but the coverage is incomplete and lacks specificity about what kind of pre-implementation activity this involves (e.g., planning, design, testing, scaffolding). | 2 / 3 |
| Distinctiveness / Conflict Risk | The phrase 'any feature or bugfix' is extremely broad and would conflict with virtually any development-related skill. There is no clear niche or distinct trigger that separates this from other coding or planning skills. | 1 / 3 |
| Total | | 5 / 12 Passed |
Implementation
62%
Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
The skill provides excellent actionable guidance with clear workflow steps, verification checkpoints, and executable code examples. However, it is severely undermined by extreme verbosity—roughly half the content is repetitive motivational argumentation against skipping TDD (covered in at least three overlapping sections), which wastes token budget and doesn't add actionable value for Claude. Trimming the philosophical content and deduplicating the rationalizations sections would dramatically improve this skill.
Suggestions
Consolidate 'Why Order Matters', 'Common Rationalizations', and 'Red Flags' into a single concise section or move to a separate reference file—these three sections repeat the same arguments against skipping TDD.
Remove explanations of cognitive biases (sunk cost fallacy) and persuasive arguments—Claude doesn't need to be convinced, it needs instructions.
Cut the dot graph definition—it's not renderable in most contexts and the workflow is already clearly explained in the subsequent sections.
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | Extremely verbose and repetitive. The 'Why Order Matters' section, 'Common Rationalizations' table, and 'Red Flags' list all cover the same ground (arguing against skipping TDD) three times over. The motivational/philosophical content (sunk cost fallacy explanations, 'pragmatic' rebuttals) is not actionable instruction and wastes significant token budget. Claude doesn't need to be convinced to follow TDD; it needs clear steps. | 1 / 3 |
| Actionability | Provides fully executable Ruby code examples for each TDD phase (RED, GREEN, REFACTOR), concrete bash commands for verification, good/bad comparisons, and a complete bug fix walkthrough. The guidance is specific and copy-paste ready. | 3 / 3 |
| Workflow Clarity | The Red-Green-Refactor cycle is clearly sequenced with explicit mandatory verification checkpoints at each stage ('Verify RED', 'Verify GREEN'). Includes feedback loops (test errors → fix → re-run, test passes unexpectedly → fix test) and a comprehensive verification checklist before completion. | 3 / 3 |
| Progressive Disclosure | References @testing-anti-patterns.md and @testing-strategy.md at the end, which is good progressive disclosure. However, the massive amount of motivational/philosophical content (rationalizations table, 'Why Order Matters' section, red flags list) is inlined when it could be split out or drastically condensed, making the main file much longer than necessary. | 2 / 3 |
| Total | | 9 / 12 Passed |
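For readers who have not seen the skill itself, the Red-Green-Refactor checkpoints that the Workflow Clarity row refers to can be sketched roughly as follows. This is a minimal illustration in plain Ruby, not an excerpt from the skill: the `slugify` method and the `check_slugify` helper are hypothetical names chosen for this example.

```ruby
# Hypothetical example; neither method comes from the reviewed skill.

# RED: write the check first. If check_slugify is called before `slugify`
# is defined, it fails with NoMethodError, which is the "Verify RED" checkpoint.
def check_slugify
  actual = slugify("  Hello, World!  ")
  raise "expected hello-world, got #{actual}" unless actual == "hello-world"
  :ok
end

# GREEN: the smallest implementation that makes the check pass.
def slugify(title)
  title.strip.downcase.gsub(/[^a-z0-9]+/, "-").gsub(/\A-+|-+\z/, "")
end

# Verify GREEN: re-run the check now that the implementation exists.
check_slugify
```

A refactor step would then change the body of `slugify` while re-running `check_slugify` after each change to confirm it still returns `:ok`.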
Validation
100%
Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation — 11 / 11 Passed
Validation for skill structure
No warnings or errors.
c037ce3