
jbvc/test-driven-development

Use when implementing any feature or bugfix, before writing implementation code

Quality: 58% (Does it follow best practices?)

Impact: Pending (No eval scenarios have been run)

Security (by Snyk): Passed (No known issues)


Quality

Discovery: 14%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

This description is critically weak because it completely omits what the skill actually does, providing only a vague timing cue ('before writing implementation code'). Without any concrete actions, capabilities, or domain specificity, Claude would have no reliable basis for selecting this skill over others. The description reads more like a partial usage note than a functional skill description.

Suggestions

Add concrete actions describing what this skill does (e.g., 'Creates implementation plans, identifies edge cases, and outlines test scenarios for features and bugfixes').

Add explicit trigger terms users would naturally say, such as 'plan implementation', 'design approach', 'before coding', 'implementation strategy', or whatever the skill's actual domain is.

Include a clear 'Use when...' clause that combines both the timing and the specific user intent, e.g., 'Use when the user asks for an implementation plan, wants to think through an approach before coding, or needs to break down a feature into steps.'
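Taken together, a revised description might look like the following sketch of SKILL.md frontmatter. This is hypothetical: the field names and wording are illustrative conventions, not this registry's actual schema, and they assume the skill's domain really is Red-Green-Refactor TDD.

```yaml
# Hypothetical SKILL.md frontmatter; field names and wording are
# illustrative, not the registry's actual schema.
name: test-driven-development
description: >
  Enforces the Red-Green-Refactor TDD loop: write a failing test, verify it
  fails, implement the minimal code to pass, verify it passes, then refactor.
  Use when the user asks to implement a feature or fix a bug, wants tests
  written before code, or mentions TDD, test-first, or red-green-refactor.
```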

Dimension scores

Specificity: 1 / 3
The description contains no concrete actions whatsoever. It does not describe what the skill does, only vaguely when to use it ('implementing any feature or bugfix, before writing implementation code'). There are no specific capabilities listed.

Completeness: 1 / 3
The 'what does this do' is entirely missing: there is no indication of what actions or outputs this skill provides. While there is a 'when' clause ('before writing implementation code'), the absence of 'what' makes the description fundamentally incomplete.

Trigger Term Quality: 2 / 3
It includes some natural terms like 'feature', 'bugfix', and 'implementation code' that users might mention, but these are extremely broad and lack specificity about the domain or tooling involved. Common variations and more precise trigger terms are missing.

Distinctiveness / Conflict Risk: 1 / 3
The description is extremely generic: 'any feature or bugfix' and 'before writing implementation code' could apply to countless skills (planning, testing, design, architecture, etc.). This would easily conflict with many other skills.

Total: 5 / 12

Passed

Implementation: 62%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

The skill provides excellent actionable guidance, with clear workflow steps, verification checkpoints, and executable code examples. However, it is severely bloated with repetitive philosophical arguments and rationalization rebuttals that Claude doesn't need: the same 'delete code, start over' message is repeated in at least five different sections. The content would be far more effective at half its current length, with persuasive and motivational content moved to a separate reference file.

Suggestions

Cut 'Why Order Matters' and 'Common Rationalizations' sections entirely or consolidate into a single 5-line table—these repeat the same points as 'Red Flags' and 'The Iron Law'

Move the 'Red Flags' list and rationalization rebuttals to a separate RATIONALE.md file, keeping only the core workflow and examples in SKILL.md

Remove the DOT graph definition; it adds tokens without actionable value since Claude can't render it. Replace it with a simple text flow like 'RED → verify fail → GREEN → verify pass → REFACTOR → repeat'

Consolidate the Good/Bad comparison tables and inline examples to avoid restating the same principles (e.g., 'one behavior per test' appears in at least 3 places)

Dimension scores

Conciseness: 1 / 3
The skill is extremely verbose at 300+ lines, with extensive sections arguing why TDD matters, debunking rationalizations, and repeating the same points multiple times. Claude already understands TDD concepts: the 'Why Order Matters' section, 'Common Rationalizations' table, and 'Red Flags' list all heavily overlap and explain things Claude doesn't need explained. The content could be cut by 60% or more without losing actionable guidance.

Actionability: 3 / 3
The skill provides fully executable TypeScript code examples for each TDD phase (RED, GREEN, REFACTOR), concrete bash commands for verification, good/bad comparisons, and a complete bug-fix walkthrough. The examples are copy-paste ready and specific.

Workflow Clarity: 3 / 3
The Red-Green-Refactor cycle is clearly sequenced with explicit verification checkpoints after each phase ('Verify RED', 'Verify GREEN'). Feedback loops are present (test errors → fix → re-run; test passes unexpectedly → fix the test). The verification checklist at the end provides a final validation gate.

Progressive Disclosure: 2 / 3
There is one external reference (@testing-anti-patterns.md), which is well signaled, but the massive amount of inline content (rationalizations, philosophy, anti-patterns) should be split into separate files. The skill tries to be both a quick reference and a persuasive essay, making it a near-monolithic wall of text.

Total: 9 / 12

Passed
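The Actionability notes above credit the skill with copy-paste-ready TypeScript examples for each TDD phase. As a rough illustration of what one RED-to-GREEN iteration looks like, here is a minimal self-contained sketch; the `slugify` function and the `assertEqual` helper are invented for this illustration and are not taken from the skill itself.

```typescript
// Minimal assertion helper so the sketch runs without a test framework.
function assertEqual<T>(actual: T, expected: T, label: string): void {
  if (actual !== expected) {
    throw new Error(`${label}: expected ${expected}, got ${actual}`);
  }
}

// GREEN: the minimal implementation that makes the tests below pass.
// (In the real cycle this is written only after the tests have been
// run once and confirmed to fail.)
function slugify(title: string): string {
  return title
    .trim()
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-") // collapse non-alphanumeric runs into "-"
    .replace(/^-+|-+$/g, "");    // strip leading/trailing dashes
}

// RED: one behavior per test, as the skill recommends.
assertEqual(slugify("Hello, World!"), "hello-world", "basic slug");
assertEqual(slugify("  Spaced  Out  "), "spaced-out", "whitespace handling");
```

In the actual workflow the two assertions would be written and run first, verified to fail (Verify RED), and only then would `slugify` be implemented and the run repeated (Verify GREEN).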

Validation: 100%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation: 11 / 11 Passed

Validation for skill structure

No warnings or errors.

Reviewed
