A comprehensive verification system for Claude Code sessions.
58
38%
Does it follow best practices?
Impact
92%
2.42x
Average score across 3 eval scenarios
Risky
Do not use without reviewing
Optimize this skill with Tessl
npx tessl skill review --optimize ./.agents/skills/verification-loop/SKILL.md
Quality
Discovery
0%
Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
This description is critically underspecified. It reads as a vague tagline rather than a functional description that would help Claude select the right skill. It lacks concrete actions, natural trigger terms, explicit usage guidance, and any distinguishing characteristics.
Suggestions
Specify what exactly is being verified (e.g., 'Validates code output correctness, checks for regressions, verifies tool call results') to replace the vague 'comprehensive verification system'.
Add an explicit 'Use when...' clause with natural trigger terms (e.g., 'Use when the user asks to verify, validate, check, or confirm results from a Claude Code session').
Clarify the distinct niche this skill occupies to avoid overlap with testing, linting, or code review skills (e.g., specify the type of verification and what makes it unique to Claude Code sessions).
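To make the first two suggestions concrete, here is a rough sketch that compares the current description with a revision assembled from the example wording above. The `check_description` helper and its trigger-term list are hypothetical illustrations; they are not how Tessl computes the discovery score.

```python
# Hypothetical heuristic, not Tessl's scoring: look for an explicit
# "Use when" clause and for the trigger terms named in the suggestions.
TRIGGER_TERMS = ("verify", "validate", "check", "confirm")


def check_description(description: str) -> dict:
    """Report whether a description has a 'Use when' clause and which trigger terms it contains."""
    text = description.lower()
    return {
        "has_use_when": "use when" in text,
        "trigger_terms": [term for term in TRIGGER_TERMS if term in text],
    }


# The description as currently published.
current = "A comprehensive verification system for Claude Code sessions."

# A revision built only from the example phrases in the suggestions above.
revised = (
    "Validates code output correctness, checks for regressions, "
    "verifies tool call results. Use when the user asks to verify, "
    "validate, check, or confirm results from a Claude Code session."
)

print(check_description(current))
print(check_description(revised))
```

Under this heuristic the current description has no "Use when" clause and matches none of the listed terms, while the revision has the clause and matches all four.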
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | The description uses vague, abstract language ('comprehensive verification system') without naming any concrete actions. It does not specify what is being verified or how. | 1 / 3 |
| Completeness | The description fails to clearly answer 'what does this do' (what specifically is verified?) and completely lacks any 'when should Claude use it' guidance. There is no 'Use when...' clause or equivalent. | 1 / 3 |
| Trigger Term Quality | The only potentially relevant terms are 'verification' and 'Claude Code sessions,' but these are not natural keywords a user would say. A user would more likely say 'check,' 'validate,' 'test,' or describe a specific verification task. | 1 / 3 |
| Distinctiveness Conflict Risk | 'Verification system' is extremely generic and could overlap with testing skills, linting skills, code review skills, or any quality assurance-related skill. There are no distinct triggers to differentiate it. | 1 / 3 |
| Total | 4 / 12 Passed | |
Implementation
77%
Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
This is a solid, actionable verification skill with clear sequential phases and executable commands. Its main strengths are the concrete bash commands for each phase and the explicit stop-gates between phases. Its weaknesses are some unnecessary explanatory sections ('When to Use', 'Continuous Mode', 'Integration with Hooks'), of which 'Continuous Mode' is also vague and not actionable.
Suggestions
Remove or significantly trim the 'When to Use', 'Continuous Mode', and 'Integration with Hooks' sections as they add little actionable value and consume tokens.
Consider extracting language-specific command variants (Python vs JS/TS) into separate referenced files to reduce cognitive load for single-language projects.
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | Generally efficient but includes some unnecessary sections like 'When to Use' (Claude knows when to verify) and the 'Continuous Mode' section which is vague. The 'Integration with Hooks' section adds little value. The core phases are reasonably lean though. | 2 / 3 |
| Actionability | Each phase provides concrete, executable bash commands with specific tools and flags. The output format template is copy-paste ready, and the commands include practical touches like piping to tail/head for manageable output. | 3 / 3 |
| Workflow Clarity | Clear sequential phases with explicit stop-gates ('If build fails, STOP and fix before continuing'). The workflow progresses logically from build → types → lint → tests → security → diff review, with validation checkpoints and a structured output report summarizing pass/fail status. | 3 / 3 |
| Progressive Disclosure | Content is well-structured with clear headers for each phase, but everything is inline in a single file. The security scan patterns, output format template, and language-specific commands could be split into referenced files. For a skill of this length (~100 lines), it's borderline acceptable but slightly monolithic. | 2 / 3 |
| Total | 10 / 12 Passed | |
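The gated workflow praised under Workflow Clarity can be sketched roughly as follows. The phase order comes from the review; the commands are placeholder assumptions (the skill's actual bash commands are not reproduced here), and the sketch assumes every phase gates the next, whereas the review only quotes the build gate explicitly. The `run_phases` and `run_command` names are hypothetical.

```python
import subprocess
from typing import Callable

# Phase order from the review: build -> types -> lint -> tests -> security -> diff review.
# The commands are placeholders chosen for illustration, NOT the skill's real commands.
PHASES: list[tuple[str, list[str]]] = [
    ("build", ["npm", "run", "build"]),
    ("types", ["npx", "tsc", "--noEmit"]),
    ("lint", ["npm", "run", "lint"]),
    ("tests", ["npm", "test"]),
    ("security", ["git", "grep", "-n", "API_KEY"]),
    ("diff review", ["git", "diff", "--stat"]),
]


def run_command(cmd: list[str]) -> bool:
    """Run one command; treat a missing executable or non-zero exit as failure."""
    try:
        return subprocess.run(cmd, capture_output=True).returncode == 0
    except OSError:
        return False


def run_phases(
    phases: list[tuple[str, list[str]]],
    runner: Callable[[list[str]], bool] = run_command,
) -> dict[str, str]:
    """Run phases in order; after the first failure, mark the rest as SKIPPED."""
    report: dict[str, str] = {}
    stopped = False
    for name, cmd in phases:
        if stopped:
            report[name] = "SKIPPED"
            continue
        passed = runner(cmd)
        report[name] = "PASS" if passed else "FAIL"
        if not passed:
            stopped = True  # stop-gate: later phases do not run
    return report


# Demo with a fake runner that fails only the lint phase, so nothing is executed.
demo_report = run_phases(PHASES, runner=lambda cmd: cmd != PHASES[2][1])
for phase, status in demo_report.items():
    print(f"{phase}: {status}")
```

Passing the runner in as a parameter keeps the gating logic testable without invoking any real toolchain.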
Validation
90%
Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation — 10 / 11 Passed
Validation for skill structure
| Criteria | Description | Result |
|---|---|---|
| frontmatter_unknown_keys | Unknown frontmatter key(s) found; consider removing or moving to metadata | Warning |
| Total | 10 / 11 Passed | |
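A check like `frontmatter_unknown_keys` might work roughly as sketched below. This is not the validator's actual implementation: the allowed-key set is an assumption, the parser only handles simple top-level `key: value` lines, and the `origin` key in the example is hypothetical, since the review does not say which key triggered the warning.

```python
# Assumed set of recognized frontmatter keys; the real validator's list may differ.
ALLOWED_KEYS = {"name", "description", "license", "allowed-tools", "metadata"}


def frontmatter_keys(skill_md: str) -> list[str]:
    """Return top-level keys from a leading '---' delimited frontmatter block."""
    lines = skill_md.splitlines()
    if not lines or lines[0].strip() != "---":
        return []  # no frontmatter block at the top of the file
    keys = []
    for line in lines[1:]:
        if line.strip() == "---":
            break  # end of frontmatter
        # Top-level keys only: skip indented lines, comments, and list items.
        is_top_level = bool(line) and not line[0].isspace()
        if is_top_level and not line.startswith(("#", "-")) and ":" in line:
            keys.append(line.split(":", 1)[0].strip())
    return keys


def unknown_keys(skill_md: str, allowed: set[str] = ALLOWED_KEYS) -> list[str]:
    """List frontmatter keys that are not in the allowed set."""
    return [key for key in frontmatter_keys(skill_md) if key not in allowed]


# Example file; 'origin' is a hypothetical unknown key used for illustration.
EXAMPLE = """---
name: verification-loop
description: A comprehensive verification system for Claude Code sessions.
origin: example
---
# Verification Loop
"""

print(unknown_keys(EXAMPLE))
```

Following the warning's own advice, a flagged key would either be removed or nested under `metadata`.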
5df943e