Execute a strict Red-Green-Refactor TDD cycle — one requirement at a time — in any language or framework.
97
Quality
100%
Does it follow best practices?
Impact
94%
1.11x
Average score across 5 eval scenarios
Discovery
100%
Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
This is an excellent skill description that hits all the marks. It provides specific concrete actions, comprehensive trigger terms that developers would naturally use, explicitly answers both what and when, and carves out a distinct niche around TDD methodology that won't conflict with general testing or coding skills.
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | Lists multiple specific concrete actions: 'Execute a strict Red-Green-Refactor TDD cycle', 'failing test written first', 'minimum implementation', 'clean refactor'. Also specifies test types (unit, integration, UI component, API) and supported stacks. | 3 / 3 |
| Completeness | Clearly answers both what ('Execute a strict Red-Green-Refactor TDD cycle...') and when ('Use when the user provides a business rule, acceptance criterion, or feature requirement...' plus explicit 'Triggered by phrases like...' clause). | 3 / 3 |
| Trigger Term Quality | Excellent coverage of natural trigger terms users would say: 'write a test first', 'TDD', 'red-green-refactor', 'behavioral test', 'test-driven', 'failing test first', 'write the test before the code'. These are phrases developers naturally use. | 3 / 3 |
| Distinctiveness Conflict Risk | Clear niche focused specifically on TDD methodology with distinct triggers. The emphasis on 'test first', 'red-green-refactor', and 'failing test first' clearly distinguishes it from general testing or coding skills. | 3 / 3 |
| Total | | 12 / 12 Passed |
Implementation
100%
Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
This is an excellent skill file that demonstrates best practices across all dimensions. It provides a clear, actionable TDD workflow with explicit phase boundaries and user confirmation checkpoints. The multi-stack examples are complete and executable, and the content efficiently conveys constraints and patterns without over-explaining concepts Claude already knows.
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | The content is lean and efficient, assuming Claude's competence. No unnecessary explanations of what TDD is or how testing frameworks work; it jumps straight into actionable phases and constraints. | 3 / 3 |
| Actionability | Provides fully executable code examples across three different stacks (Python/pytest, C#/xUnit, TypeScript/React), with complete RED-GREEN-REFACTOR cycles that are copy-paste ready. | 3 / 3 |
| Workflow Clarity | The three-phase workflow is explicitly sequenced with clear stopping points ('close your response and wait for the user to confirm'). Each phase has numbered steps and explicit validation through test pass/fail status. | 3 / 3 |
| Progressive Disclosure | Main content provides a complete overview with well-signaled one-level-deep references to mocking.md and tests.md for additional patterns and examples. Content is appropriately split between core workflow and supplementary material. | 3 / 3 |
| Total | | 12 / 12 Passed |
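
For readers who have not seen the cycle the review refers to, a minimal Python sketch of one Red-Green-Refactor pass might look like the following. It is not taken from the skill itself: the business rule (10% off orders of 100.00 or more), the `apply_discount` function, and the constant names are all hypothetical, chosen only to show the three phases for a single requirement.

```python
# Hypothetical requirement (not from the skill): orders of 100.00 or more
# get a 10% discount. Amounts are integer cents to avoid float rounding.

# RED: these tests are written first. Run before apply_discount exists,
# they fail with a NameError, which is the expected "red" state.
def test_discount_applied_at_threshold():
    assert apply_discount(10_000) == 9_000


def test_no_discount_below_threshold():
    assert apply_discount(9_900) == 9_900


# GREEN: the smallest implementation that makes both tests pass would
# hard-code 10_000 and 90 inline.
# REFACTOR: with the tests green, the magic numbers are given names;
# behavior is unchanged, so the tests must still pass.
DISCOUNT_THRESHOLD_CENTS = 10_000  # hypothetical threshold
DISCOUNT_PERCENT = 10              # hypothetical rate


def apply_discount(total_cents: int) -> int:
    """Return the order total in cents after any threshold discount."""
    if total_cents >= DISCOUNT_THRESHOLD_CENTS:
        return total_cents * (100 - DISCOUNT_PERCENT) // 100
    return total_cents
```

Saved as a test module, this can be run with pytest, which collects the `test_` functions by name; the next requirement would start a fresh cycle with a new failing test.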
Validation
100%
Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation — 16 / 16 Passed
Validation for skill structure
No warnings or errors.
Install with Tessl CLI
npx tessl i haletothewood/behavioural-tdd