test-driven-development

Drives development with tests. Use when implementing any logic, fixing any bug, or changing any behavior. Use when you need to prove that code works, when a bug report arrives, or when you're about to modify existing functionality.

Quality

63%

Does it follow best practices?

Impact

—

No eval scenarios have been run

Security

Advisory

Suggest reviewing before use

Optimize this skill with Tessl

npx tessl skill review --optimize ./skills/test-driven-development/SKILL.md

Quality

Discovery

49%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

The description is complete, with explicit 'Use when' clauses, but its triggers are overly broad and its capabilities are vague. The scope ('any logic', 'any bug', 'any behavior') makes it a catch-all that would conflict with most development skills, and the 'what' portion lacks concrete actions such as writing specific test types or using particular testing patterns.

Suggestions

Narrow the trigger conditions to be more specific, e.g., 'Use when the user asks to write tests, add test coverage, or practice TDD' rather than 'any logic, any bug, any behavior'.

Add concrete actions to the capability description, e.g., 'Writes unit tests, integration tests, and regression tests using TDD methodology. Generates test cases, assertions, and mocks.'

Include natural trigger terms users would say, such as 'TDD', 'unit test', 'test coverage', 'spec', 'test suite', 'regression test'.

Dimension	Reasoning	Score
Specificity	The description uses vague language like 'drives development with tests' without listing concrete actions. It doesn't specify what kind of tests, what frameworks, or what specific operations (e.g., write unit tests, run test suites, generate test fixtures). 'Drives development' is abstract.	1 / 3
Completeness	The description answers both 'what' (drives development with tests) and 'when' with explicit triggers ('Use when implementing any logic, fixing any bug, or changing any behavior. Use when you need to prove that code works, when a bug report arrives, or when you're about to modify existing functionality').	3 / 3
Trigger Term Quality	Includes some relevant terms like 'tests', 'bug', 'logic', 'fixing', and 'behavior' that users might naturally say. However, it misses common variations like 'unit test', 'TDD', 'test-driven', 'test coverage', 'assertion', 'spec', or specific framework names.	2 / 3
Distinctiveness Conflict Risk	The triggers are extremely broad — 'implementing any logic', 'fixing any bug', 'changing any behavior' would match virtually every coding task. This would conflict with nearly any development-related skill and trigger far too often.	1 / 3
	Total	7 / 12 Passed

Implementation

77%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

This is a comprehensive, well-structured TDD skill with excellent actionability — concrete TypeScript examples, clear workflows, and explicit verification steps. Its main weakness is verbosity: it includes motivational content, philosophical justifications, and explanations of concepts Claude already understands (test pyramid, why TDD matters, common rationalizations). The content would benefit from trimming ~30-40% of explanatory prose and moving the Browser Testing and anti-patterns sections into referenced files.

Suggestions

Remove or drastically shorten the 'Common Rationalizations' table and 'Red Flags' section — these are motivational rather than actionable, and Claude doesn't need convincing about TDD's value.

Move the 'Browser Testing with DevTools' section into its own referenced file since it's a distinct concern from core TDD workflow and adds significant length.

Trim the Test Pyramid section — Claude knows what unit/integration/E2E tests are. Keep only the decision guide and resource model table.

Dimension	Reasoning	Score
Conciseness	The skill is well-written but verbose for its audience. Sections like 'Common Rationalizations', the test pyramid explanation, the Beyoncé Rule, and general TDD philosophy are things Claude already knows. The ASCII diagrams are nice but consume tokens. The core actionable content could be ~40% shorter.	2 / 3
Actionability	Excellent executable TypeScript examples throughout — the TDD cycle, Prove-It Pattern, Arrange-Act-Assert, good vs bad test examples are all concrete and copy-paste ready. The verification checklist and decision guide provide specific, actionable guidance.	3 / 3
Workflow Clarity	The TDD cycle (RED → GREEN → REFACTOR) and Prove-It Pattern are clearly sequenced with explicit validation checkpoints. The bug fix workflow includes verification steps (test fails → fix → test passes → full suite). The verification checklist at the end serves as a final checkpoint.	3 / 3
Progressive Disclosure	The skill references `browser-testing-with-devtools` and `references/testing-patterns.md`, but no bundle files are provided, so their existence cannot be verified. The SKILL.md itself is quite long (~300 lines) and could benefit from moving the Browser Testing section and anti-patterns table into separate referenced files.	2 / 3
	Total	10 / 12 Passed
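
The skill's own examples are not reproduced on this page, so the following is only a minimal sketch of the bug-fix workflow the review describes (test fails → fix → test passes → full suite), laid out in Arrange-Act-Assert form. The function `applyDiscount`, the reported bug, and the helper `expectEqual` are all hypothetical names invented for illustration; they are not taken from the skill.

```typescript
// Hypothetical function under test (assumption: not from the reviewed skill).
// Assumed bug report: a discount larger than the total produced a negative amount.
function applyDiscount(total: number, discount: number): number {
  // GREEN: minimal fix. The assumed pre-fix body was `return total - discount;`.
  return Math.max(0, total - discount);
}

// Tiny assertion helper so the sketch runs without a test framework.
function expectEqual<T>(actual: T, expected: T, label: string): void {
  if (actual !== expected) {
    throw new Error(`${label}: expected ${String(expected)}, got ${String(actual)}`);
  }
}

// RED: written before the fix, this test fails against the assumed pre-fix body.
function testDiscountNeverGoesNegative(): void {
  // Arrange
  const total = 10;
  const discount = 25;
  // Act
  const result = applyDiscount(total, discount);
  // Assert
  expectEqual(result, 0, "discount larger than total");
}

// Full-suite step: existing behavior must still pass after the fix.
function testRegularDiscount(): void {
  // Arrange
  const total = 100;
  const discount = 30;
  // Act
  const result = applyDiscount(total, discount);
  // Assert
  expectEqual(result, 70, "regular discount");
}

testDiscountNeverGoesNegative();
testRegularDiscount();
```

Both test functions throw on failure and return silently on success, so running the file is the "full suite" check. In a real project the same structure would normally live in whatever test runner the repository already uses.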

Validation

100%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation — 11 / 11 Passed

Validation for skill structure

No warnings or errors.

Repository: addyosmani/agent-skills
Commit: f17c6e8

Reviewed: about 11 hours ago
