A comprehensive verification system for Claude Code sessions.
51%
Does it follow best practices?
Impact
Pending
No eval scenarios have been run
Passed
No known issues
Quality
Discovery
0% Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
This description is critically underspecified. It fails to explain what concrete actions the skill performs, provides no natural trigger terms a user would use, and lacks any guidance on when Claude should select it. It reads as a vague tagline rather than a functional skill description.
Suggestions
Replace 'comprehensive verification system' with specific concrete actions, e.g., 'Validates code output, checks for regressions, and confirms task completion in Claude Code sessions.'
Add an explicit 'Use when...' clause with natural trigger terms, e.g., 'Use when the user asks to verify, validate, test, or confirm results of a Claude Code session.'
Clarify what 'verification' means in this context (e.g., output correctness, test execution, diff review) to distinguish this skill from general testing or code review skills.
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | The description uses vague, abstract language ('comprehensive verification system') without naming any concrete actions. It does not specify what is being verified or how. | 1 / 3 |
| Completeness | The description addresses 'what' (a verification system) only in the vaguest terms and completely omits 'when' Claude should use it. There is no 'Use when...' clause or equivalent trigger guidance. | 1 / 3 |
| Trigger Term Quality | The only potentially relevant terms are 'verification' and 'Claude Code sessions,' but these are not natural keywords a user would say. There are no actionable trigger terms like 'test,' 'validate,' 'check,' or specific task-related words. | 1 / 3 |
| Distinctiveness / Conflict Risk | The description is extremely generic: 'comprehensive verification system' could overlap with testing, linting, code review, validation, or any number of quality-assurance-related skills. It provides no distinct niche or triggers. | 1 / 3 |
| Total | | 4 / 12 Passed |
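
The Discovery suggestions above can be illustrated with a small check. This is a minimal sketch, not the reviewer's actual scoring logic: the function name `check_description` and the `TRIGGER_TERMS` list are hypothetical, and the term list is simply borrowed from the example words in the suggestions. It tests a description for a 'Use when' clause and for natural trigger words.

```python
import re

# Hypothetical trigger terms, borrowed from the examples in the suggestions;
# this list is an assumption, not the reviewer's real vocabulary.
TRIGGER_TERMS = ("verify", "validate", "test", "check", "confirm")


def check_description(description: str) -> dict:
    """Return simple discovery hints for a skill description."""
    text = description.lower()
    return {
        # Does the description say when the skill should be selected?
        "has_use_when": "use when" in text,
        # Which trigger terms start a word in the text (e.g. 'checks' counts for 'check')?
        "trigger_terms": [t for t in TRIGGER_TERMS if re.search(rf"\b{t}", text)],
    }


current = "A comprehensive verification system for Claude Code sessions."
revised = (
    "Validates code output, checks for regressions, and confirms task "
    "completion in Claude Code sessions. Use when the user asks to verify, "
    "validate, test, or confirm results of a Claude Code session."
)

print(check_description(current))
print(check_description(revised))
```

The current description yields no 'Use when' clause and no matching terms, while the revised text, assembled from the two example phrasings in the suggestions, satisfies both checks.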
Implementation
77% Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
This is a solid, actionable verification skill with clear sequential phases and executable commands. Its main strengths are the concrete bash commands for each phase and the explicit stop-gates between phases. Its weaknesses are a few unnecessary explanatory sections (When to Use, Continuous Mode, Integration with Hooks) and language-specific commands that are kept inline rather than split into separate references.
Suggestions
Remove or significantly trim the 'When to Use', 'Continuous Mode', and 'Integration with Hooks' sections — they explain things Claude can infer and add little actionable value.
Consider splitting language-specific commands (JS/TS vs Python) into separate referenced files to reduce inline bulk and improve navigation.
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | Generally efficient but includes some unnecessary sections like 'When to Use' (Claude knows when to verify) and the 'Continuous Mode' section which is vague. The 'Integration with Hooks' section adds little value. The core phases are reasonably lean though. | 2 / 3 |
| Actionability | Each phase provides concrete, executable bash commands with specific tools and flags. The output format template is copy-paste ready, and the commands include practical touches like piping to tail/head for manageable output. | 3 / 3 |
| Workflow Clarity | Clear sequential phases with explicit stop-gates ('If build fails, STOP and fix before continuing'). The workflow progresses logically from build → types → lint → tests → security → diff review, with validation checkpoints and a structured output report that serves as a final checklist. | 3 / 3 |
| Progressive Disclosure | Content is all inline in a single file, which is borderline acceptable given the length (~90 lines of content). However, the security scan patterns, language-specific commands, and output format template could be split into referenced files for better organization. The structure within the file is good with clear headers. | 2 / 3 |
| Total | | 10 / 12 Passed |
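
The stop-gate workflow praised under Workflow Clarity can be sketched as follows. This is an illustration only, assuming every phase gates the next one; the skill's real commands are not reproduced in this review, so `run_phases` is a hypothetical helper and the phase commands are placeholders that merely exit with a success or failure code.

```python
import subprocess
import sys
from typing import List, Sequence, Tuple

Phase = Tuple[str, Sequence[str]]


def run_phases(phases: Sequence[Phase]) -> List[Tuple[str, bool]]:
    """Run phases in order and stop at the first failure (the stop-gate)."""
    results = []
    for name, command in phases:
        completed = subprocess.run(command, capture_output=True, text=True)
        passed = completed.returncode == 0
        results.append((name, passed))
        if not passed:
            break  # later phases are skipped until this one is fixed
    return results


# Placeholder commands so the sketch runs anywhere; a real skill would
# substitute its own build, type-check, lint, and test commands.
ok = [sys.executable, "-c", "raise SystemExit(0)"]
fail = [sys.executable, "-c", "raise SystemExit(1)"]

demo = run_phases([("build", ok), ("types", fail), ("lint", ok)])
print(demo)  # [('build', True), ('types', False)]
```

Because the 'types' placeholder fails, the 'lint' phase never runs, which mirrors the 'STOP and fix before continuing' rule quoted in the table.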
Validation
100% Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation — 11 / 11 Passed
Validation for skill structure
No warnings or errors.