Use when creating or editing any prompt (commands, hooks, skills, subagent instructions) to verify it produces desired behavior - applies RED-GREEN-REFACTOR cycle to prompt engineering using subagents for isolated testing
71
64%
Does it follow best practices?
Impact
Pending
No eval scenarios have been run
Passed
No known issues
Optimize this skill with Tessl
npx tessl skill review --optimize ./plugins/customaize-agent/skills/test-prompt/SKILL.md
Quality
Discovery
89%
Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
This is a solid description that clearly communicates both when to use the skill and what it does, with a distinctive niche in TDD-style prompt testing. The trigger terms are well-chosen, covering the various types of prompts a user might be working with. The main weakness is that the specific actions/capabilities could be more granular - it describes the methodology but not the discrete steps or outputs.
Suggestions
Add more specific concrete actions, e.g., 'writes test cases for prompts, runs them in isolated subagents, compares actual vs expected outputs, and iterates until tests pass'
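The discrete actions named in this suggestion could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the skill's implementation: `PromptTest`, `run_tests`, and `fake_agent` are hypothetical names, the commit-message rule is an invented example, and `fake_agent` stands in for whatever mechanism actually launches an isolated subagent.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class PromptTest:
    """One scenario: an input for the agent and a substring the reply must contain."""
    scenario: str
    expected: str


def run_tests(
    prompt: str,
    tests: List[PromptTest],
    runner: Callable[[str, str], str],
) -> List[Tuple[str, bool]]:
    """Run every scenario through the runner and compare actual vs expected output."""
    results = []
    for test in tests:
        # Hypothetical isolated run: each scenario sees only the prompt under test.
        actual = runner(prompt, test.scenario)
        results.append((test.scenario, test.expected in actual))
    return results


def fake_agent(prompt: str, scenario: str) -> str:
    """Stand-in for a subagent: it follows the rule only if the prompt states it."""
    if "conventional commits" in prompt.lower():
        return "feat: " + scenario
    return scenario


tests = [PromptTest(scenario="add login page", expected="feat:")]
baseline = run_tests("Write a commit message.", tests, fake_agent)
improved = run_tests(
    "Write a commit message. Use Conventional Commits.", tests, fake_agent
)
print(baseline)
print(improved)
```

In this toy setup the first prompt fails the scenario and the revised prompt passes it, which is the "iterate until tests pass" step reduced to a single revision.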
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | It names the domain (prompt engineering/testing) and mentions a specific methodology (RED-GREEN-REFACTOR cycle) and mechanism (subagents for isolated testing), but the concrete actions are somewhat vague - 'verify it produces desired behavior' is not as specific as listing discrete actions like 'run test cases, compare outputs, iterate on prompt wording'. | 2 / 3 |
| Completeness | Explicitly answers both 'what' (applies RED-GREEN-REFACTOR cycle to prompt engineering using subagents for isolated testing) and 'when' ('Use when creating or editing any prompt (commands, hooks, skills, subagent instructions) to verify it produces desired behavior'). | 3 / 3 |
| Trigger Term Quality | Includes strong natural trigger terms users would say: 'prompt', 'commands', 'hooks', 'skills', 'subagent instructions', 'testing', 'prompt engineering'. These cover multiple variations of what a user might mention when needing this skill. | 3 / 3 |
| Distinctiveness / Conflict Risk | This is a very specific niche - TDD-style prompt testing using subagents. The combination of prompt engineering + RED-GREEN-REFACTOR + subagent isolation creates a clear, distinct identity that is unlikely to conflict with other skills like general coding TDD or basic prompt writing skills. | 3 / 3 |
| Total | 11 / 12 Passed | |
Implementation
39%
Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
The skill has excellent workflow clarity with a well-structured RED-GREEN-REFACTOR cycle, explicit validation checkpoints, and comprehensive checklists. However, it is severely over-engineered for its purpose — at 400+ lines it explains TDD basics, subagent benefits, and prompt engineering concepts that Claude already knows, and inlines extensive examples that should be in separate files. The result is a token-expensive skill that buries its actionable content in verbose explanation.
Suggestions
Reduce content by 60-70%: remove explanations of why subagents provide isolation, what TDD is, and other concepts Claude already knows. Keep only the process steps, key rules, and one concise example.
Extract the detailed prompt-type-specific examples (instruction, discipline, guidance, reference) into a separate EXAMPLES.md file and reference it from the main skill.
Extract the full git commit walkthrough into a separate file (e.g., EXAMPLE-WALKTHROUGH.md) — it's valuable but consumes ~100 lines that duplicate the already-explained process.
Make Task tool invocations more concrete with actual tool call syntax rather than markdown pseudocode blocks describing what to launch.
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | Extremely verbose at ~400+ lines. Extensively explains concepts Claude already knows (what subagents are, why isolation matters, TDD basics). Massive redundancy: the git commit example repeats the entire RED-GREEN-REFACTOR cycle already explained in detail above. Multiple sections restate the same principles. The 'Why Use Subagents' section explains obvious benefits. References to other skills are repeated multiple times. | 1 / 3 |
| Actionability | Provides concrete examples and scenarios (git commit walkthrough, discipline-enforcing test), but much of the guidance is pseudocode-level Task tool invocations rather than actual executable commands. The subagent launch examples use vague markdown blocks rather than precise tool invocation syntax. The testing patterns section describes approaches abstractly rather than giving copy-paste ready implementations. | 2 / 3 |
| Workflow Clarity | The RED-GREEN-REFACTOR workflow is clearly sequenced with explicit phases, checklists, validation checkpoints (verify RED, verify GREEN, re-verify after refactor), and feedback loops (if agent still fails, revise and re-test; if new failures appear, revert). The comprehensive checklist at the end provides a clear verification gate before deployment. | 3 / 3 |
| Progressive Disclosure | Monolithic wall of text with no bundle files to offload content. The detailed examples for each prompt type (instruction, discipline-enforcing, guidance, reference, subagent), the full git commit walkthrough, and the testing patterns could all be separate reference files. Everything is inlined into one massive document, making it hard to navigate and consuming excessive context window. | 1 / 3 |
| Total | 7 / 12 Passed | |
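The checkpoints and feedback loops credited under Workflow Clarity (verify RED, verify GREEN, re-verify after refactor, revert on new failures) might be expressed as a small gate function. This is only a sketch: `red_green_refactor`, `passes`, and `toy_check` are hypothetical names, the example prompts are invented, and `toy_check` replaces a real subagent run with a string test.

```python
from typing import Callable, Optional


def red_green_refactor(
    passes: Callable[[Optional[str]], bool],
    draft: str,
    refactored: str,
) -> str:
    """Return the prompt to keep after applying the three checkpoints.

    `passes(prompt)` is a hypothetical check that runs the scenarios and
    reports whether the agent behaved as desired; `None` means no prompt.
    """
    # RED: the baseline must fail, otherwise the test proves nothing.
    if passes(None):
        raise ValueError("RED not verified: agent already passes without the prompt")
    # GREEN: the drafted prompt must fix the failure.
    if not passes(draft):
        raise ValueError("GREEN not verified: revise the prompt and re-test")
    # REFACTOR: keep the tightened prompt only if it still passes; else revert.
    return refactored if passes(refactored) else draft


def toy_check(prompt: Optional[str]) -> bool:
    """Stand-in for a subagent run: passes only when the rule is present."""
    return prompt is not None and "run tests first" in prompt


kept = red_green_refactor(
    toy_check,
    draft="Before committing, always run tests first, no exceptions.",
    refactored="Run checks before committing.",
)
print(kept)
```

Here the refactored wording drops the phrase the toy check looks for, so the function reverts to the draft, mirroring the "if new failures appear, revert" loop.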
Validation
90%
Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation — 10 / 11 Passed
Validation for skill structure
| Criteria | Description | Result |
|---|---|---|
| skill_md_line_count | SKILL.md is long (715 lines); consider splitting into references/ and linking | Warning |
| Total | 10 / 11 Passed | |
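A length check in the spirit of `skill_md_line_count` could look like the sketch below. The report does not state the validator's actual threshold or logic, so `check_line_count` and the `max_lines=500` default are assumptions chosen only for illustration.

```python
def check_line_count(text: str, max_lines: int = 500) -> str:
    """Return 'Warning' when the text exceeds max_lines, else 'Passed'.

    max_lines=500 is an assumed threshold; the report does not state the real one.
    """
    line_count = len(text.splitlines())
    return "Warning" if line_count > max_lines else "Passed"


# A synthetic 715-line document, matching the length reported for SKILL.md.
sample = "\n".join(f"line {i}" for i in range(715))
print(check_line_count(sample))
print(check_line_count("short file"))
```

Splitting long sections into a `references/` folder, as the warning suggests, would bring the main file back under whatever limit the real check applies.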
dedca19