Run verification commands and confirm output before claiming success. Use when about to claim work is complete, fixed, or passing, before committing or creating PRs.
78
72%: Does it follow best practices?
Impact: Pending. No eval scenarios have been run.
Passed: No known issues.
Optimize this skill with Tessl
npx tessl skill review --optimize ./plugins/verification-before-completion/skills/verification-before-completion/SKILL.md
Quality
Discovery
67%
Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
The description clearly communicates when to use the skill through a 'Use when' clause tied to completion and commit scenarios. However, it does not specify which verification commands are run, and its trigger terms could be broader to capture more natural user language. The skill occupies a fairly distinct niche (pre-completion verification) but would benefit from more concrete action descriptions.
Suggestions
Add specific concrete actions like 'run tests, linting, type-checking, and build commands' instead of the vague 'verification commands'.
Expand trigger terms to include natural variations like 'tests passing', 'build succeeds', 'lint clean', 'CI', 'ready to merge', 'done with changes'.
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | It names a general action ('run verification commands and confirm output') but doesn't list specific concrete actions like running tests, linting, type-checking, building, etc. The description is somewhat vague about what 'verification commands' entails. | 2 / 3 |
| Completeness | Clearly answers both 'what' (run verification commands and confirm output before claiming success) and 'when' (when about to claim work is complete, fixed, or passing, before committing or creating PRs) with explicit trigger conditions. | 3 / 3 |
| Trigger Term Quality | Includes some relevant trigger terms like 'complete', 'fixed', 'passing', 'committing', 'PRs', but misses common natural variations users might say such as 'tests', 'build', 'lint', 'check', 'verify', 'CI', or 'done'. | 2 / 3 |
| Distinctiveness Conflict Risk | The concept of 'verification before completion' is somewhat distinct, but 'run verification commands' is broad enough that it could overlap with testing skills, CI/CD skills, or code review skills. The trigger of 'before committing or creating PRs' helps narrow it but could still conflict with commit-related or PR-related skills. | 2 / 3 |
| Total | | 9 / 12 Passed |
Implementation
77%
Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
This skill is highly actionable with clear verification commands, a well-defined gate function workflow, and excellent good/bad pattern examples. Its main weakness is verbosity—the core message (always run verification before claiming success) is repeated across many sections with overlapping content, consuming more tokens than necessary. The motivational framing and rationalization prevention sections, while well-intentioned, explain concepts Claude can internalize from a more concise presentation.
Suggestions
Consolidate the 'Red Flags', 'Rationalization Prevention', and 'Why This Matters' sections into a single compact section or remove them—the Gate Function and Common Failures table already convey the same information actionably.
Remove motivational/philosophical framing like 'Claiming work is complete without verification is dishonesty, not efficiency' and 'Honesty is a core value. If you lie, you'll be replaced'—Claude doesn't need emotional motivation, just clear instructions.
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | The skill is moderately verbose with repetitive emphasis on the same point (verification before claims) across multiple sections. The 'Rationalization Prevention' table, 'Red Flags', and 'Why This Matters' sections overlap significantly. Some motivational/philosophical content ('Claiming work is complete without verification is dishonesty, not efficiency') is unnecessary padding for Claude. | 2 / 3 |
| Actionability | Provides concrete, executable verification commands (bun test, npm test, bun run build, etc.), a clear gate function with specific steps, and concrete good/bad patterns showing exactly what to do vs. what not to do. The common failures table maps claims to required evidence precisely. | 3 / 3 |
| Workflow Clarity | The Gate Function provides a clear 5-step sequential workflow with an explicit validation checkpoint (step 4) and a feedback loop (if NO: state actual status). The regression test pattern includes a full red-green cycle with revert verification. The workflow is unambiguous for the task at hand. | 3 / 3 |
| Progressive Disclosure | The content is well-structured with clear headers and tables, but it's a monolithic document that could benefit from being more compact. For a skill of this length (~120 lines of substantive content), some sections like 'Rationalization Prevention' and 'Why This Matters' could be trimmed or moved to a separate reference. No external file references are used, though the content length arguably warrants it. | 2 / 3 |
| Total | | 10 / 12 Passed |
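
The Gate Function praised in the table is not reproduced in this review, so the following is only a minimal sketch of the general idea it describes: run the verification command fresh, check its exit status, and state the actual status instead of the claim when it fails. The function name `verify_then_claim` and the output wording are hypothetical and are not taken from the skill itself.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the reviewed skill): run one verification
# command and echo the claim only if that command actually succeeded.
verify_then_claim() {
  local claim="$1"
  shift
  local output status
  # Run the command now, capturing stdout and stderr together.
  output="$("$@" 2>&1)"
  status=$?
  if [ "$status" -eq 0 ]; then
    printf 'VERIFIED: %s (ran: %s, exit 0)\n' "$claim" "$*"
  else
    # Feedback path: report the real status and show the evidence.
    printf 'NOT VERIFIED: %s (ran: %s, exit %d)\n' "$claim" "$*" "$status"
    printf '%s\n' "$output"
  fi
  return "$status"
}
```

Under these assumptions, a call such as `verify_then_claim "tests pass" npm test` would print the claim only when `npm test` exits with status 0. Otherwise it prints the failing status followed by the captured output.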
Validation
100%
Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation — 11 / 11 Passed
Validation for skill structure
No warnings or errors.
88da5ff