tdd

Test-driven development. Use when the user wants to build features or fix bugs test-first, mentions "red-green-refactor", or wants integration tests.

Quality

82%

Does it follow best practices?

Impact

88%

1.02x

Average score across 17 eval scenarios

Security

Passed

No known issues

Quality

Content

65%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

The body has a strong, well-sequenced workflow with explicit checkpoints, but it is verbose where it restates familiar philosophy, and it relies on references (tests.md, mocking.md, refactoring.md) that do not exist in the bundle, which undermines progressive disclosure.

Suggestions

Tighten or trim the Philosophy and Anti-Pattern prose that re-explains concepts Claude already knows (e.g. what makes tests good/bad, why horizontal slicing fails) to improve conciseness.

Add the referenced bundle files (tests.md, mocking.md, refactoring.md) or remove the inline links so navigation is not left dangling.

Include at least one concrete, executable test/example pair (language-agnostic or in a named language) so the guidance is copy-paste ready rather than purely procedural.
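
As an illustration of the last suggestion above, here is a minimal sketch of what a test/example pair could look like, written in Python with the standard-library unittest module. The `slugify` function and its expected behavior are hypothetical, invented for this illustration, and are not taken from the reviewed skill.

```python
import re
import unittest


# GREEN step: the smallest implementation that makes the tests below pass.
# `slugify` is a hypothetical function used only for this illustration.
def slugify(title: str) -> str:
    """Lowercase the title and join its alphanumeric runs with hyphens."""
    words = re.findall(r"[a-z0-9]+", title.lower())
    return "-".join(words)


# RED step: in a test-first flow these tests are written before `slugify`
# exists, so the first run fails (NameError) until the code above is added.
class SlugifyTest(unittest.TestCase):
    def test_title_becomes_hyphenated_lowercase_slug(self):
        self.assertEqual(slugify("Hello, World!"), "hello-world")

    def test_empty_title_gives_empty_slug(self):
        self.assertEqual(slugify(""), "")
```

Saved under an assumed filename such as test_slugify.py, the tests can be run with `python -m unittest test_slugify`. Each test asserts on observable behavior through the public function rather than on implementation details, so the body of `slugify` can be refactored while the tests stay green.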

Dimension	Reasoning	Score
Conciseness	Mostly efficient and assumes Claude's competence, but several passages over-explain familiar concepts: the extended 'Good tests... Bad tests...' philosophy and the rationale behind horizontal slicing pad the body beyond what is strictly necessary.	2 / 3
Actionability	The ASCII RED/GREEN diagrams and checklists give concrete shape, but core guidance is procedural prose rather than executable code or copy-paste commands; the skill is instruction-heavy with no runnable examples.	2 / 3
Workflow Clarity	A clearly sequenced multi-step workflow (Planning → Tracer Bullet → Incremental Loop → Refactor) with explicit checkpoints, a per-cycle checklist, and an explicit validation gate ('Never refactor while RED. Get to GREEN first.').	3 / 3
Progressive Disclosure	The body points to tests.md, mocking.md, and refactoring.md, but no bundle directories or files exist, so these are dangling references with no actual deeper content to navigate to.	1 / 3
	Total	8 / 12 Passed

Description

100%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

The description is concise, concrete, and well-triggered: it states what the skill does and exactly when to use it via natural terms like 'red-green-refactor'. It occupies a distinct niche with minimal conflict risk.

Dimension	Reasoning	Score
Specificity	Lists multiple concrete actions: 'build features', 'fix bugs test-first', and 'integration tests', naming specific activities rather than vague claims.	3 / 3
Completeness	Explicitly answers both what (test-driven development; build features / fix bugs test-first) and when ('Use when...') with explicit trigger guidance, meeting the rubric's top-score anchor.	3 / 3
Trigger Term Quality	Includes natural trigger phrases users would say — 'test-first', 'red-green-refactor', and 'integration tests' — giving good coverage of the domain's common vocabulary.	3 / 3
Distinctiveness / Conflict Risk	Targets a clear niche (TDD methodology) with distinct triggers like 'red-green-refactor' that are unlikely to collide with generic coding or testing skills.	3 / 3
	Total	12 / 12 Passed

Validation

93%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation — 15 / 16 Passed

Validation for skill structure

Criteria	Description	Result
relative_links	Relative link issues: 3 missing	Warning

	Total	15 / 16 Passed

Repository: coder/agent-tty
Commit: fae02cb

Reviewed: 23 days ago
