Enforces fresh verification evidence before any completion or success claims. Use when about to say "done", "fixed", "tests pass", "build succeeds", or any synonym; before committing, creating PRs, or moving to the next task; before expressing satisfaction or positive statements about work state; and after agent delegation to independently verify results. Prevents unverified claims by requiring command execution, output inspection, and exit code confirmation.
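As a rough illustration of what "command execution, output inspection, and exit code confirmation" could look like in practice, here is a minimal Python sketch. It is not taken from the skill itself: the names `Evidence`, `run_and_capture`, and `may_claim_success`, and the idea of checking for an expected marker string in the output, are assumptions made for this example.

```python
import subprocess
import sys
from dataclasses import dataclass


@dataclass
class Evidence:
    """Hypothetical record of one fresh verification run."""
    command: list
    exit_code: int
    output: str


def run_and_capture(command):
    """Execute the command now and keep its exit code and combined output."""
    completed = subprocess.run(command, capture_output=True, text=True)
    return Evidence(command, completed.returncode, completed.stdout + completed.stderr)


def may_claim_success(evidence, expected_marker):
    """Permit a success claim only when the exit code is 0 AND the
    inspected output contains the marker we expected to see."""
    return evidence.exit_code == 0 and expected_marker in evidence.output


if __name__ == "__main__":
    # Stand-in for a real test or build command (an assumption for the demo).
    ev = run_and_capture([sys.executable, "-c", "print('3 passed')"])
    if may_claim_success(ev, "passed"):
        print("Verified:", ev.output.strip())
    else:
        print("Not verified; exit code", ev.exit_code)
```

The sketch requires both signals together: a zero exit code with unexpected output, or plausible output with a non-zero exit code, does not count as evidence.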
90
88%
Does it follow best practices?
Impact
Pending
No eval scenarios have been run
Passed
No known issues
Quality
Discovery
100%
Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
This is an excellent skill description that clearly defines a specific behavioral guardrail for verification before completion claims. It provides comprehensive trigger scenarios with natural language terms, explicitly states both what the skill does and when to use it, and occupies a distinct niche that is unlikely to conflict with other skills. The description uses proper third-person voice throughout.
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | Lists multiple specific concrete actions: 'command execution, output inspection, and exit code confirmation.' Also specifies concrete behaviors like 'committing, creating PRs, moving to the next task' and 'agent delegation to independently verify results.' | 3 / 3 |
| Completeness | Clearly answers both 'what' (enforces fresh verification evidence before completion claims, requires command execution/output inspection/exit code confirmation) and 'when' (explicit 'Use when...' clause with four detailed trigger scenarios). | 3 / 3 |
| Trigger Term Quality | Excellent coverage of natural trigger terms users/agents would encounter: 'done', 'fixed', 'tests pass', 'build succeeds', 'committing', 'creating PRs', 'moving to the next task', 'agent delegation'. These are terms that naturally arise in development workflows. | 3 / 3 |
| Distinctiveness Conflict Risk | Occupies a very clear niche: verification enforcement before claims of completion. This is distinct from testing skills, CI/CD skills, or code review skills. The focus on preventing unverified claims is unique and unlikely to conflict with other skills. | 3 / 3 |
| Total | | 12 / 12 Passed |
Implementation
77%
Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
This is a strong behavioral skill with excellent actionability and workflow clarity. The 5-step verification process, evidence tables, and OK/BAD patterns give Claude concrete, unambiguous guidance. The main weakness is moderate verbosity — several sections overlap in content (partial verification failures, evidence requirements, unverified claim patterns), and the motivational framing, while purposeful, adds tokens. The skill could be tightened by ~25% without losing clarity.
Suggestions
Consolidate the 'What Counts as Evidence', 'Why Partial Verification Fails', and 'Recognizing Unverified Claims' sections into a single reference table to eliminate redundancy and save tokens.
Move the '24 failure memories' motivational section to a brief inline note or remove it — Claude doesn't need persuasion, just instructions.
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | The skill is moderately efficient but has some redundancy: the 'Why Partial Verification Fails' table largely repeats information from the 'What Counts as Evidence' table and 'Recognizing Unverified Claims' section. The motivational framing ('dishonesty, not efficiency', 'trust broken') adds some bulk but is arguably justified for a behavioral skill. Some tightening is possible. | 2 / 3 |
| Actionability | The skill provides a concrete 5-step verification process, a detailed evidence table mapping claims to required proof, specific OK/BAD patterns showing exactly what to do vs. avoid, and clear triggers for when to apply. The guidance is specific and directly executable despite being an instruction-only (non-code) skill. | 3 / 3 |
| Workflow Clarity | The 5-step verification workflow is clearly sequenced with an explicit validation checkpoint (step 4) and a feedback loop (if no: state actual status). The regression test pattern includes a full red-green-restore cycle. The 'When To Apply' section clearly defines triggers. This is a well-structured workflow with proper validation gates. | 3 / 3 |
| Progressive Disclosure | The content is well-organized with clear section headers and tables, but it's a relatively long single file (~120 lines of content) with no references to external files. Some content like the detailed failure memories context or the full evidence tables could be split out. However, for a behavioral skill of this nature, inline content is more defensible than for a technical skill. | 2 / 3 |
| Total | | 10 / 12 Passed |
Validation
100%
Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation — 11 / 11 Passed
Validation for skill structure
No warnings or errors.
a01bac9