test-driven-development

Use when implementing any feature or bugfix, before writing implementation code

Quality

38%

Does it follow best practices?


Security

Passed

No findings from the security scan

Fix and improve this skill with Tessl

tessl review fix ./skills/test-driven-development/SKILL.md

Quality

Content

62%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

The skill has excellent workflow clarity with a well-defined Red-Green-Refactor cycle including mandatory verification checkpoints and feedback loops, plus strong actionability with executable Ruby examples. However, it is severely bloated by repetitive motivational content — the 'Why Order Matters,' 'Common Rationalizations,' and 'Red Flags' sections all argue the same points about not skipping TDD, which is unnecessary persuasion for an AI assistant that will simply follow instructions. Roughly half the content could be removed or moved to a reference file without losing any actionable guidance.

Suggestions

Remove or drastically condense the 'Why Order Matters,' 'Common Rationalizations,' and 'Red Flags' sections — they repeat the same message (don't skip TDD) in three different formats. A single brief 'If tempted to skip: delete code, start over' line suffices for Claude.

Move the persuasive/philosophical content (sunk cost fallacy, tests-after vs tests-first arguments) to a separate reference file like RATIONALE.md for human readers, keeping SKILL.md focused on executable instructions.

Consolidate the Good/Bad comparison tables and inline examples. Reusing the retry_operation example across both the RED and GREEN sections works well, but the 'Good Tests' table and the anti-pattern guidance could be tighter.

Dimension	Reasoning	Score
Conciseness	Extremely verbose and repetitive. The 'Why Order Matters' section, 'Common Rationalizations' table, and 'Red Flags' list all cover the same ground multiple times — arguing against skipping TDD. Claude doesn't need to be persuaded; it needs instructions. The motivational/philosophical content (sunk cost fallacy explanations, 'pragmatic shortcuts = debugging in production') wastes significant tokens on things Claude already understands or doesn't need convincing about.	1 / 3
Actionability	Provides fully executable Ruby code examples for each TDD phase (RED, GREEN, REFACTOR), specific bash commands for running tests, good/bad comparisons with concrete code, and a complete bug fix walkthrough. The verification checklist and 'When Stuck' table give specific, actionable guidance.	3 / 3
Workflow Clarity	The Red-Green-Refactor cycle is clearly sequenced with explicit mandatory verification steps at each phase. Includes feedback loops (test errors → fix → re-run, test passes unexpectedly → fix test), a verification checklist before completion, and clear decision points (e.g., 'Test passes? You're testing existing behavior. Fix test.').	3 / 3
Progressive Disclosure	References @testing-anti-patterns.md and @testing-strategy.md at the end, which is good progressive disclosure. However, the massive amount of persuasive/motivational content (rationalizations, why order matters, red flags) is all inline when it could be in a separate reference file, making the main skill unnecessarily long. The core workflow is buried among argumentative content.	2 / 3
	Total	9 / 12 Passed
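
For readers who have not opened the skill itself, the Red-Green cycle and the retry_operation example praised above might look roughly like the sketch below. It is an illustration only: it assumes Minitest, and the `retry_operation` signature and the `max_attempts:` keyword are hypothetical, not copied from SKILL.md.

```ruby
require "minitest/autorun"

# RED: written first and run before any implementation exists, to confirm
# that it fails for the expected reason (the method is not defined yet).
class RetryOperationTest < Minitest::Test
  def test_retries_until_the_block_succeeds
    calls = 0
    result = retry_operation(max_attempts: 3) do
      calls += 1
      raise "transient failure" if calls < 3
      "success"
    end

    assert_equal "success", result
    assert_equal 3, calls
  end

  def test_reraises_after_the_final_attempt
    calls = 0
    assert_raises(RuntimeError) do
      retry_operation(max_attempts: 2) do
        calls += 1
        raise "permanent failure"
      end
    end
    assert_equal 2, calls
  end
end

# GREEN: the smallest implementation that makes both tests pass.
# Hypothetical helper; the name matches the review, the signature is assumed.
def retry_operation(max_attempts: 3)
  attempts = 0
  begin
    attempts += 1
    yield
  rescue StandardError
    # Re-run the begin block while attempts remain, otherwise re-raise.
    retry if attempts < max_attempts
    raise
  end
end
```

If one of these tests passed before `retry_operation` existed, that would be the decision point the review quotes: the test is exercising existing behavior and should be fixed before any implementation is written.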

Description

14%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

This description is critically incomplete — it only specifies a timing trigger ('before writing implementation code') without ever explaining what the skill actually does. The lack of concrete actions makes it impossible for Claude to distinguish this skill from other pre-implementation skills, and users would have no way to understand its purpose.

Suggestions

Add explicit capability statements describing what the skill does (e.g., 'Creates implementation plans with step-by-step approach' or 'Writes test cases before implementation' or 'Designs architecture for new features').

Narrow the scope to reduce conflict risk — specify what kind of pre-implementation activity this covers rather than claiming applicability to 'any feature or bugfix'.

Include natural trigger terms that reflect what users would actually say, such as 'plan', 'design', 'architecture', 'approach', 'strategy', or whatever the skill's actual function is.

Dimension	Reasoning	Score
Specificity	The description contains no concrete actions whatsoever. It doesn't describe what the skill actually does — only when to use it. 'Implementing any feature or bugfix' is vague about the skill's capabilities.	1 / 3
Completeness	The description answers 'when' (before writing implementation code) but completely fails to answer 'what does this do'. There is no indication of the skill's actual function or output.	1 / 3
Trigger Term Quality	It includes some relevant terms like 'feature', 'bugfix', and 'implementation code' that users might naturally mention, but lacks specificity about what kind of pre-implementation activity this covers (e.g., planning, design, test writing, architecture review).	2 / 3
Distinctiveness Conflict Risk	Extremely generic — 'any feature or bugfix' and 'before writing implementation code' could overlap with planning skills, design skills, testing skills, architecture skills, or virtually any pre-coding workflow skill.	1 / 3
	Total	5 / 12 Passed

Validation

100%


Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation — 11 / 11 Passed

Validation for skill structure

No warnings or errors.

Repository: lucianghinda/superpowers-ruby
Path: skills/test-driven-development/SKILL.md
Commit: 712d734

Reviewed: 1 day ago
