
jbvc/agents-tdd-workflow

Test-Driven Development workflow principles. RED-GREEN-REFACTOR cycle.


Quality: 42% — Does it follow best practices?

Impact: Pending — No eval scenarios have been run.

Security (by Snyk): Passed — No known issues.


Quality

Discovery: 22%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

The description is too terse and abstract, naming a methodology (TDD, RED-GREEN-REFACTOR) without specifying concrete actions the skill performs or when Claude should select it. It lacks a 'Use when...' clause and relies on the reader already knowing what TDD entails rather than explicitly stating capabilities.

Suggestions

- Add explicit concrete actions such as 'Guides writing a failing test first, implementing minimal code to pass, then refactoring for clean design'.
- Add a 'Use when...' clause with natural trigger terms like 'Use when the user asks about TDD, writing tests first, test-driven development, red-green-refactor, or wants to follow a test-first coding workflow'.
- Include common user-facing keywords and variations: 'TDD', 'test first', 'write tests before code', 'unit test workflow', 'failing test'.
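Taken together, these suggestions might yield frontmatter along the following lines. This is a hypothetical sketch: the exact wording is illustrative, and the field names simply follow the common SKILL.md frontmatter convention, not this skill's actual metadata.

```yaml
# Hypothetical SKILL.md frontmatter -- wording is illustrative only.
name: agents-tdd-workflow
description: >
  Guides a strict Test-Driven Development (TDD) workflow: write a failing
  test first (RED), implement the minimal code to make it pass (GREEN),
  then refactor for clean design (REFACTOR). Use when the user asks about
  TDD, test-driven development, writing tests first, red-green-refactor,
  failing tests, or wants a test-first coding workflow.
```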

Specificity: 1 / 3

The description mentions 'RED-GREEN-REFACTOR cycle', which is a known concept, but doesn't list concrete actions like 'write failing tests first, implement minimal code to pass, then refactor'. It stays at the level of naming a methodology rather than describing specific capabilities.

Completeness: 1 / 3

The description loosely addresses 'what' (TDD workflow principles) but has no 'Use when...' clause or equivalent trigger guidance. The 'what' itself is also quite weak: 'principles' is vague. The missing explicit 'when' caps this at 2 per the guidelines, but the 'what' is weak enough to warrant a 1.

Trigger Term Quality: 2 / 3

Includes relevant terms like 'Test-Driven Development', 'TDD' (implied by the name), and 'RED-GREEN-REFACTOR', which users familiar with TDD would recognize. However, it misses common natural-language variations like 'write tests first', 'failing test', and 'unit testing workflow'.

Distinctiveness / Conflict Risk: 2 / 3

TDD and RED-GREEN-REFACTOR are fairly specific to a particular development methodology, which helps distinguish it from generic coding or testing skills. However, it could overlap with general testing skills or code quality skills due to the vague 'workflow principles' framing.

Total: 6 / 12 (Passed)

Implementation: 35%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

This skill reads more like a TDD cheat sheet or reference card than an actionable skill for Claude. It thoroughly covers TDD philosophy and principles but fails to provide any concrete, executable examples—no test code, no framework-specific commands, no sample RED-GREEN-REFACTOR cycle walkthrough. Much of the content restates well-known software engineering concepts that Claude already understands.

Suggestions

- Add at least one complete, executable TDD cycle example showing a failing test, minimal implementation, and refactored result in a specific language/framework (e.g., Python with pytest or JavaScript with Jest).
- Remove or drastically condense sections covering concepts Claude already knows (Three Laws of TDD, YAGNI definition, AAA pattern, When to Use TDD) to improve token efficiency.
- Add concrete validation steps: specific commands to run tests, expected output for failing vs passing tests, and what to do when the refactor phase breaks a test.
- Include a concrete workflow example: 'Given requirement X, here is the exact sequence of files to create and commands to run' rather than abstract principles.
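To illustrate the first suggestion, a complete cycle in Python with pytest might look like the following. This is a hedged sketch: the `slugify` requirement and function name are invented for illustration, not taken from the skill.

```python
# RED: write a failing test first. Running `pytest` at this point fails,
# since slugify does not exist yet. (slugify is a hypothetical
# requirement chosen for illustration.)
def test_slugify_replaces_spaces_with_hyphens():
    assert slugify("Hello World") == "hello-world"

# GREEN: the minimal implementation that makes the test pass.
def slugify(text):
    return text.lower().replace(" ", "-")

# REFACTOR: clean up without changing behavior, then re-run `pytest`
# to confirm the test still passes.
import re

def slugify(text):
    # Collapse any run of whitespace into a single hyphen.
    return re.sub(r"\s+", "-", text.strip().lower())
```

An example like this gives the agent something executable to anchor each phase to, instead of restating the methodology in the abstract.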

Conciseness: 2 / 3

The content is reasonably organized with tables, but includes several sections that explain concepts Claude already knows well (TDD principles, YAGNI, AAA pattern, anti-patterns). The 'Three Laws of TDD' and general TDD philosophy are well-known concepts that don't need restating. Some sections like 'When to Use TDD' and 'Test Prioritization' add little actionable value.

Actionability: 1 / 3

The skill is almost entirely conceptual with no executable code examples, no concrete commands, and no specific language/framework guidance. It describes TDD principles abstractly but never shows an actual test being written, a failing test output, or a concrete implementation example. For a skill about a development workflow, the absence of any executable examples is a significant gap.

Workflow Clarity: 2 / 3

The RED-GREEN-REFACTOR cycle is clearly sequenced with a visual diagram, and each phase has defined rules. However, there are no validation checkpoints (e.g., how to verify the test actually fails, how to confirm all tests pass before refactoring), no concrete commands for running tests, and no feedback loops for when things go wrong.

Progressive Disclosure: 2 / 3

The content is well-structured with numbered sections and tables, but it's a monolithic document with no references to external files for deeper topics. The 'AI-Augmented TDD' section and detailed anti-patterns could be split out. For a conceptual skill of this length (~100 lines), the organization is adequate but not optimal.

Total: 7 / 12 (Passed)

Validation: 90%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation: 10 / 11 Passed

Validation for skill structure

allowed_tools_field: Warning. 'allowed-tools' contains unusual tool name(s).

Total: 10 / 11 (Passed)
