Creates detailed, bite-sized implementation plans with TDD structure, exact file paths, complete code, and test commands. Use when you have a spec, requirements, design doc, or feature request and need to plan before coding — especially for multi-step tasks, large features, or when handing off to another session. DO NOT TRIGGER when asked to write code directly or fix a simple bug.
90
88%
Does it follow best practices?
Impact
Pending
No eval scenarios have been run
Passed
No known issues
Quality
Discovery
100%
Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
This is an excellent skill description that clearly communicates what the skill does (creates TDD-structured implementation plans with specific deliverables), when to use it (specs, requirements, feature requests, multi-step tasks), and when NOT to use it (direct coding, simple bugs). The inclusion of negative triggers is a strong differentiator that reduces conflict with coding skills. The description is concise yet comprehensive.
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | Lists multiple specific concrete actions: 'detailed, bite-sized implementation plans', 'TDD structure', 'exact file paths', 'complete code', and 'test commands'. These are concrete, actionable outputs. | 3 / 3 |
| Completeness | Clearly answers both 'what' (creates implementation plans with TDD structure, file paths, code, test commands) and 'when' (explicit 'Use when' clause with multiple trigger scenarios, plus a 'DO NOT TRIGGER' clause for disambiguation). | 3 / 3 |
| Trigger Term Quality | Includes strong natural trigger terms users would say: 'spec', 'requirements', 'design doc', 'feature request', 'plan before coding', 'multi-step tasks', 'large features', 'handing off to another session'. Also includes negative triggers to reduce false matches. | 3 / 3 |
| Distinctiveness / Conflict Risk | Highly distinctive with a clear niche: implementation planning with TDD, not direct coding. The 'DO NOT TRIGGER' clause explicitly disambiguates from code-writing or bug-fixing skills, significantly reducing conflict risk. | 3 / 3 |
| Total | | 12 / 12 Passed |
Implementation
77%
Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
This is a strong, actionable skill that provides clear TDD-structured planning templates with concrete examples, exact commands, and explicit validation steps. Its main weaknesses are minor verbosity in the Overview section and some redundancy between the Task Structure template and the Common Mistakes section. The workflow is exceptionally clear, with well-defined checkpoints and a thoughtful execution handoff pattern.
Suggestions
Tighten the Overview section — remove personality-driven phrases like 'questionable taste' and 'don't know good test design very well' in favor of the concrete guidance that follows.
Consider deduplicating the Common Mistakes section by removing items already demonstrated in the Task Structure template (e.g., 'Missing test commands' is already shown with exact commands in the template).
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | Generally efficient but has some unnecessary verbosity: phrases like 'assuming the engineer has zero context for our codebase and questionable taste' and 'Assume they don't know good test design very well' are flavor text that doesn't add actionable value. The 'Common Mistakes' section partially overlaps with guidance already given in the task structure. | 2 / 3 |
| Actionability | Highly actionable with concrete templates, exact file path conventions, complete code examples in the task structure, specific test commands with expected outputs, and a clear plan document header template. Everything is copy-paste ready for generating plans. | 3 / 3 |
| Workflow Clarity | The workflow is crystal clear: plan header → bite-sized tasks with TDD steps (write failing test → verify failure → implement → verify pass) → execution handoff with explicit user choice. Each step has validation checkpoints (run tests, check expected output), and the granularity guidance (2-5 minutes per step, max 8 steps per task) provides clear guardrails. | 3 / 3 |
| Progressive Disclosure | The content is well-structured with clear sections, but everything is inline in a single file. The skill references other skills (kit:team-dev, brainstorming skill) but doesn't link to any supplementary documentation. For a skill of this length (~80 lines of meaningful content), this is acceptable, but the Common Mistakes section could be split out into a separate reference. | 2 / 3 |
| Total | | 10 / 12 Passed |
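The guardrails praised in the Workflow Clarity row (the TDD cycle of failing test, verified failure, implementation, verified pass, plus at most 8 steps per task at 2-5 minutes each) lend themselves to a mechanical check. The sketch below is purely illustrative and is not part of the reviewed skill: `Step`, `Task`, `check_task`, the constants, and the step-kind labels are all hypothetical names chosen for this example.

```python
from dataclasses import dataclass, field

# Guardrails quoted in the review; the constant names are hypothetical.
MAX_STEPS_PER_TASK = 8
MIN_MINUTES, MAX_MINUTES = 2, 5

# Assumed labels for the TDD cycle described in the review.
TDD_CYCLE = ["write_test", "verify_fail", "implement", "verify_pass"]


@dataclass
class Step:
    kind: str      # one of TDD_CYCLE, or any other label (e.g. "commit")
    minutes: int   # estimated duration of the step


@dataclass
class Task:
    name: str
    steps: list[Step] = field(default_factory=list)


def check_task(task: Task) -> list[str]:
    """Return human-readable problems; an empty list means the task passes."""
    problems = []
    if len(task.steps) > MAX_STEPS_PER_TASK:
        problems.append(
            f"{task.name}: {len(task.steps)} steps exceeds {MAX_STEPS_PER_TASK}"
        )
    for i, step in enumerate(task.steps, start=1):
        if not MIN_MINUTES <= step.minutes <= MAX_MINUTES:
            problems.append(
                f"{task.name}: step {i} is {step.minutes} min, "
                f"outside {MIN_MINUTES}-{MAX_MINUTES}"
            )
    # The TDD steps must open with the full cycle, in order;
    # unrelated step kinds may be interleaved between them.
    kinds = [s.kind for s in task.steps if s.kind in TDD_CYCLE]
    if kinds[: len(TDD_CYCLE)] != TDD_CYCLE:
        problems.append(f"{task.name}: TDD cycle missing or out of order")
    return problems


# Usage: a task that follows the cycle with 3-minute steps has no problems.
task = Task("add login endpoint", [Step(kind, 3) for kind in TDD_CYCLE])
print(check_task(task))  # prints []
```

A check like this would only cover the structural guardrails; whether each step contains complete code and exact commands still needs a human or agent review.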
Validation
100%
Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation — 11 / 11 Passed
Validation for skill structure
No warnings or errors.
a01bac9