test-prompt

Use when creating or editing any prompt (commands, hooks, skills, subagent instructions) to verify it produces desired behavior - applies RED-GREEN-REFACTOR cycle to prompt engineering using subagents for isolated testing

Quality

64%

Does it follow best practices?

Impact

—

No eval scenarios have been run

Security

Passed

No known issues

Optimize this skill with Tessl

npx tessl skill review --optimize ./plugins/customaize-agent/skills/test-prompt/SKILL.md

Quality

Discovery

89%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

This is a solid description that clearly communicates both when to use the skill and what it does, with a distinctive niche in TDD-style prompt testing. The trigger terms are well-chosen, covering the various types of prompts a user might be working with. The main weakness is that the specific actions/capabilities could be more granular - it describes the methodology but not the discrete steps or outputs.

Suggestions

Add more specific concrete actions, e.g., 'writes test cases for prompts, runs them in isolated subagents, compares actual vs expected outputs, and iterates until tests pass'
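The loop named in this suggestion (write test cases, run them in isolated subagents, compare actual against expected output, iterate) could look roughly like the sketch below. It is an illustration only, not the skill's implementation: `PromptTest`, `run_tests`, and `stub_subagent` are hypothetical names, the substring check is an assumed pass criterion, and the stub stands in for whatever mechanism actually launches an isolated subagent.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class PromptTest:
    """One test case: a scenario for the subagent and a check on its reply."""
    scenario: str
    expected_substring: str  # assumed pass criterion: reply must contain this text


def run_tests(prompt: str, tests: list[PromptTest],
              run_subagent: Callable[[str, str], str]) -> list[PromptTest]:
    """Run every test against `prompt` and return the tests that failed."""
    failures = []
    for test in tests:
        actual = run_subagent(prompt, test.scenario)
        if test.expected_substring not in actual:
            failures.append(test)
    return failures


def stub_subagent(prompt: str, scenario: str) -> str:
    # Hypothetical stand-in for a real isolated subagent:
    # it follows the commit convention only when the prompt asks for it.
    if "conventional commits" in prompt.lower():
        return "feat: add login form"
    return "added login form"


tests = [PromptTest("Write a commit message for a new login form", "feat:")]
print(len(run_tests("", tests, stub_subagent)))                      # 1 failing test
print(len(run_tests("Use Conventional Commits.", tests, stub_subagent)))  # 0 failing tests
```

With an empty prompt the single test fails, and with the one-line prompt it passes, which is the "iterate until tests pass" step in miniature.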

Dimension	Reasoning	Score
Specificity	It names the domain (prompt engineering/testing) and mentions a specific methodology (RED-GREEN-REFACTOR cycle) and mechanism (subagents for isolated testing), but the concrete actions are somewhat vague - 'verify it produces desired behavior' is not as specific as listing discrete actions like 'run test cases, compare outputs, iterate on prompt wording'.	2 / 3
Completeness	Explicitly answers both 'what' (applies RED-GREEN-REFACTOR cycle to prompt engineering using subagents for isolated testing) and 'when' ('Use when creating or editing any prompt (commands, hooks, skills, subagent instructions) to verify it produces desired behavior').	3 / 3
Trigger Term Quality	Includes strong natural trigger terms users would say: 'prompt', 'commands', 'hooks', 'skills', 'subagent instructions', 'testing', 'prompt engineering'. These cover multiple variations of what a user might mention when needing this skill.	3 / 3
Distinctiveness Conflict Risk	This is a very specific niche - TDD-style prompt testing using subagents. The combination of prompt engineering + RED-GREEN-REFACTOR + subagent isolation creates a clear, distinct identity that is unlikely to conflict with other skills like general coding TDD or basic prompt writing skills.	3 / 3
	Total	11 / 12 Passed

Implementation

39%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

The skill has excellent workflow clarity with a well-structured RED-GREEN-REFACTOR cycle, explicit validation checkpoints, and comprehensive checklists. However, it is severely over-engineered for its purpose: at 400+ lines (the validation check counts 715), it explains TDD basics, subagent benefits, and prompt engineering concepts that Claude already knows, and it inlines extensive examples that should be in separate files. The result is a token-expensive skill that buries its actionable content in verbose explanation.

Suggestions

Reduce content by 60-70%: remove explanations of why subagents provide isolation, what TDD is, and other concepts Claude already knows. Keep only the process steps, key rules, and one concise example.

Extract the detailed prompt-type-specific examples (instruction, discipline, guidance, reference) into a separate EXAMPLES.md file and reference it from the main skill.

Extract the full git commit walkthrough into a separate file (e.g., EXAMPLE-WALKTHROUGH.md) — it's valuable but consumes ~100 lines that duplicate the already-explained process.

Make Task tool invocations more concrete with actual tool call syntax rather than markdown pseudocode blocks describing what to launch.

Dimension	Reasoning	Score
Conciseness	Extremely verbose at 400+ lines. Extensively explains concepts Claude already knows (what subagents are, why isolation matters, TDD basics). Massive redundancy: the git commit example repeats the entire RED-GREEN-REFACTOR cycle already explained in detail above. Multiple sections restate the same principles. The 'Why Use Subagents' section explains obvious benefits. References to other skills are repeated multiple times.	1 / 3
Actionability	Provides concrete examples and scenarios (git commit walkthrough, discipline-enforcing test), but much of the guidance is pseudocode-level Task tool invocations rather than actual executable commands. The subagent launch examples use vague markdown blocks rather than precise tool invocation syntax. The testing patterns section describes approaches abstractly rather than giving copy-paste ready implementations.	2 / 3
Workflow Clarity	The RED-GREEN-REFACTOR workflow is clearly sequenced with explicit phases, checklists, validation checkpoints (verify RED, verify GREEN, re-verify after refactor), and feedback loops (if agent still fails, revise and re-test; if new failures appear, revert). The comprehensive checklist at the end provides a clear verification gate before deployment.	3 / 3
Progressive Disclosure	Monolithic wall of text with no bundle files to offload content. The detailed examples for each prompt type (instruction, discipline-enforcing, guidance, reference, subagent), the full git commit walkthrough, and the testing patterns could all be separate reference files. Everything is inlined into one massive document, making it hard to navigate and consuming excessive context window.	1 / 3
	Total	7 / 12 Passed
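The checkpoints credited under Workflow Clarity (verify RED, verify GREEN, re-verify after refactor, revert on new failures) can be read as a simple gate sequence. The sketch below is one way to express that order under stated assumptions: `red_green_refactor` and its callable parameters are hypothetical, and each callable stands in for a real subagent test run.

```python
from typing import Callable


def red_green_refactor(baseline_fails: Callable[[], bool],
                       passes_with: Callable[[str], bool],
                       draft: str, refactored: str) -> str:
    """Return the prompt version that should be kept after one cycle."""
    # RED: the scenario must fail without the prompt,
    # otherwise the test does not demonstrate that the prompt is needed.
    if not baseline_fails():
        raise ValueError("RED not verified: scenario already passes without the prompt")
    # GREEN: the draft prompt must make the same scenario pass.
    if not passes_with(draft):
        raise ValueError("GREEN not verified: revise the draft and re-test")
    # REFACTOR: re-run the test on the tightened prompt; revert to the draft on failure.
    return refactored if passes_with(refactored) else draft
```

Raising on an unverified RED or GREEN phase is a design choice for the sketch; a real workflow might loop back to revision instead of stopping.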

Validation

90%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation — 10 / 11 Passed

Validation for skill structure

Criteria	Description	Result
skill_md_line_count	SKILL.md is long (715 lines); consider splitting into references/ and linking	Warning

	Total	10 / 11 Passed
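A check like `skill_md_line_count` can be approximated locally before submitting a skill. The sketch below is an assumption-heavy illustration: `check_line_count` is a hypothetical helper, and the 500-line threshold is a guess, since the review does not state the validator's actual limit.

```python
def check_line_count(text: str, max_lines: int = 500) -> str:
    """Return a warning if `text` has more than `max_lines` lines, else "Passed".

    The default threshold is an assumption, not the validator's documented limit.
    """
    line_count = len(text.splitlines())
    if line_count > max_lines:
        return (f"Warning: SKILL.md is long ({line_count} lines); "
                "consider splitting into references/ and linking")
    return "Passed"


print(check_line_count("line\n" * 715))
print(check_line_count("line\n" * 10))
```

To run it against a real file, read the file contents first, for example with `pathlib.Path("SKILL.md").read_text()`.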

Repository: NeoLabHQ/context-engineering-kit
Commit: dedca19

Reviewed: 29 days ago
