test-driven-development

Use when implementing any feature or bugfix, before writing implementation code - write the test first, watch it fail, write minimal code to pass; ensures tests actually verify behavior by requiring failure first

Quality

61%

Does it follow best practices?

Impact

—

No eval scenarios have been run

Security

Passed

No known issues

Optimize this skill with Tessl

npx tessl skill review --optimize ./plugins/tdd/skills/test-driven-development/SKILL.md

Quality

Discovery

67%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

The description effectively communicates the TDD workflow and includes an explicit 'Use when' clause, which is a strength. However, it lacks key trigger terms users would naturally use (like 'TDD' or 'test-driven development') and its broad trigger conditions ('any feature or bugfix') create overlap risk with other implementation-focused skills. The description reads more as a process instruction than a capability description.

Suggestions

Add explicit trigger terms users would naturally say, such as 'TDD', 'test-driven development', 'red-green-refactor', 'unit test first', and 'write tests before code'.

Narrow the trigger scope to reduce conflict risk - instead of 'any feature or bugfix', specify 'when the user requests TDD, test-first development, or asks to write tests before implementation'.

List more concrete actions the skill enables, such as 'Generates failing test cases, validates test failure, implements minimal passing code, and refactors while keeping tests green'.

Dimension	Reasoning	Score
Specificity	It describes a methodology (write test first, watch it fail, write minimal code to pass) which names the domain (TDD) and some actions, but doesn't list multiple concrete specific actions beyond the general TDD cycle. The actions are more process-oriented than capability-oriented.	2 / 3
Completeness	The description explicitly answers both 'what' (write the test first, watch it fail, write minimal code to pass, ensures tests verify behavior) and 'when' (when implementing any feature or bugfix, before writing implementation code) with a clear 'Use when' clause at the beginning.	3 / 3
Trigger Term Quality	Includes some relevant terms like 'test first', 'feature', 'bugfix', 'implementation code', but misses common natural trigger terms users would say such as 'TDD', 'test-driven development', 'red-green-refactor', 'unit test', or 'write tests'.	2 / 3
Distinctiveness Conflict Risk	The description is somewhat specific to TDD methodology, but terms like 'feature', 'bugfix', and 'implementation code' are very broad and could overlap with general coding skills, code review skills, or testing skills that aren't specifically TDD-focused.	2 / 3
	Total	9 / 12 Passed

Implementation

55%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

The skill provides excellent actionable guidance with clear workflow steps, verification checkpoints, and executable code examples. However, it is severely bloated: the same core messages are repeated across multiple sections (rationalizations table, 'Why Order Matters' prose, red flags list), and the anti-patterns content should be a separate file. The persuasive/motivational tone ('Thinking about skipping TDD? Stop. That's rationalization.') spends tokens persuading Claude of things it does not need to be convinced of.

Suggestions

Cut the content by ~50%: merge the 'Why Order Matters' prose section and 'Common Rationalizations' table into a single concise table, and remove the 'Red Flags' list which largely duplicates both.

Split the Testing Anti-Patterns into a separate ANTI_PATTERNS.md file referenced from the main skill, reducing the monolithic structure and improving progressive disclosure.

Remove persuasive/motivational language ('Stop. That's rationalization.', 'Sunk cost fallacy.') — Claude doesn't need to be convinced, it needs instructions.

Eliminate the dot graph definition which is not renderable in most contexts and replace with a simple numbered list or ASCII flow that's already present in the step-by-step sections.

Dimension	Reasoning	Score
Conciseness	The skill is extremely verbose at 400+ lines. It repeats the same points multiple times (e.g., 'delete code and start over' appears in at least 5 places, and the rationalizations table duplicates the 'Why Order Matters' section almost entirely). The anti-patterns section rehashes TDD principles already covered. Claude already understands TDD conceptually; the skill should focus on the specific workflow rules, not persuasive essays about why TDD matters.	1 / 3
Actionability	The skill provides fully executable TypeScript code examples for both tests and implementations, concrete bash commands for verification steps, and specific good/bad comparisons. The bug fix example walks through a complete TDD cycle with real code.	3 / 3
Workflow Clarity	The Red-Green-Refactor cycle is clearly sequenced with explicit mandatory verification steps at each stage ('Verify RED - Watch It Fail' is marked MANDATORY). Feedback loops are present (wrong failure → fix test, test passes unexpectedly → fix test, other tests fail → fix now). The verification checklist at the end provides a comprehensive gate before completion.	3 / 3
Progressive Disclosure	This is a monolithic wall of text with no bundle files or external references. The TDD skill and Testing Anti-Patterns content are combined into a single massive file when they could easily be split. There's no navigation structure—the anti-patterns section alone could be a separate referenced file, and the rationalizations/common excuses sections are redundant with each other.	1 / 3
	Total	8 / 12 Passed
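
For readers unfamiliar with the cycle credited under Workflow Clarity above, the following is a minimal sketch of the "write the test first, watch it fail, write minimal code to pass" sequence. It is not taken from the reviewed skill: the `slugify` function, the `testSlugify` helper, and the stub are hypothetical names chosen for illustration, and a real project would normally use a test runner instead of a hand-rolled check.

```typescript
// Hypothetical example: `slugify` and `testSlugify` are illustrative names,
// not part of the reviewed skill.

type Slugify = (title: string) => string;

// Step 1 (RED): write the test first. It takes the function under test as a
// parameter so the same test can run against a stub and the real code.
// Returns null on success, or a failure message.
function testSlugify(fn: Slugify): string | null {
  const actual = fn("Hello World");
  return actual === "hello-world"
    ? null
    : `expected "hello-world", got "${actual}"`;
}

// Verify RED: with no implementation yet, the test must fail, and it must
// fail for the expected reason (wrong output, not a typo in the test).
const notImplemented: Slugify = () => "";
const redResult = testSlugify(notImplemented);

// Step 2 (GREEN): write the minimal code that makes the test pass.
const slugify: Slugify = (title) => title.toLowerCase().replace(/\s+/g, "-");

// Verify GREEN: the same, unchanged test now passes.
const greenResult = testSlugify(slugify);

console.log("RED:", redResult);
console.log("GREEN:", greenResult);
```

Running the unchanged test against both the stub and the implementation is what demonstrates that the test verifies behavior: a test that passed against the stub would be checking nothing.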

Validation

90%


Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation — 10 / 11 Passed

Validation for skill structure

Criteria	Description	Result
skill_md_line_count	SKILL.md is long (699 lines); consider splitting into references/ and linking	Warning
	Total	10 / 11 Passed

Repository: NeoLabHQ/context-engineering-kit
Commit: dedca19

Reviewed: 29 days ago
