Use when about to claim work is complete, fixed, or passing, before committing or creating PRs - requires running verification commands and confirming output before making any success claims; evidence before assertions always
93
91%
Does it follow best practices?
Impact
Pending
No eval scenarios have been run
Risky
Do not use without reviewing
Quality
Discovery
89%
Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
This is a well-structured skill description with strong trigger terms and clear when/what guidance. The main weakness is that the specific verification actions could be more concrete (e.g., 'run tests, check linter output, verify builds'). The phrase 'evidence before assertions always' is a good principle but slightly redundant.
Suggestions
Add specific concrete verification actions like 'run tests, check linter, verify build succeeds' to improve specificity
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | Names the domain (verification before claims) and some actions ('running verification commands', 'confirming output'), but doesn't list specific concrete actions like 'run tests', 'check linter', 'verify build passes'. | 2 / 3 |
| Completeness | Clearly answers both what ('running verification commands and confirming output before making success claims') and when ('about to claim work is complete, fixed, or passing, before committing or creating PRs'). Has explicit 'Use when' clause. | 3 / 3 |
| Trigger Term Quality | Includes natural keywords users/Claude would encounter: 'complete', 'fixed', 'passing', 'committing', 'creating PRs', 'success claims'. These are terms that naturally appear when finishing work. | 3 / 3 |
| Distinctiveness Conflict Risk | Clear niche focused on verification before completion claims - distinct from general testing skills or commit message skills. The specific trigger of 'about to claim work is complete' creates a unique activation context. | 3 / 3 |
| Total | Passed | 11 / 12 |
Implementation
92%
Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
This is an excellent skill that clearly articulates verification requirements with concrete examples, actionable patterns, and explicit workflows. The content is appropriately stern given its purpose (preventing false completion claims) and provides comprehensive coverage of failure modes. A minor improvement would be to extract the BuildStream-specific content into a separate file for better progressive disclosure.
Suggestions
Consider moving the BuildStream-Specific Verification section to a separate file (e.g., BUILDSTREAM-VERIFICATION.md) and linking to it, keeping SKILL.md focused on universal verification principles
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | Every section earns its place with actionable tables, concrete examples, and no explanation of concepts Claude already knows. The content is dense with useful information without padding. | 3 / 3 |
| Actionability | Provides concrete verification patterns with clear ✅/❌ examples, specific command patterns, and a detailed gate function. The BuildStream section includes exact commands like 'just bst build <element>'. | 3 / 3 |
| Workflow Clarity | The 5-step gate function provides explicit sequencing with validation checkpoints. The regression test pattern shows a complete red-green cycle. Each workflow has clear decision points and feedback loops. | 3 / 3 |
| Progressive Disclosure | Content is well-organized with clear sections and tables, but it's a substantial document (~150 lines) that could benefit from splitting domain-specific verification (BuildStream) into a separate reference file. | 2 / 3 |
| Total | Passed | 11 / 12 |
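The principle the skill enforces, running a verification command and reading its result before making any success claim, might look roughly like the sketch below. It does not reproduce the skill's own 5-step gate function, which is not shown on this page. The names `Evidence`, `verify`, and `claim` are hypothetical, and the one-line Python command is only a stand-in for a real test, lint, or build command.

```python
import subprocess
import sys
from dataclasses import dataclass


@dataclass
class Evidence:
    """What a verification run actually produced (hypothetical helper type)."""
    command: list[str]
    exit_code: int
    output: str

    @property
    def passed(self) -> bool:
        # Only a zero exit code counts as passing in this sketch.
        return self.exit_code == 0


def verify(command: list[str]) -> Evidence:
    """Run the command fresh and capture its exit code and combined output."""
    result = subprocess.run(command, capture_output=True, text=True)
    return Evidence(command, result.returncode, result.stdout + result.stderr)


def claim(evidence: Evidence) -> str:
    """Phrase a status claim that cites the captured evidence."""
    cmd = " ".join(evidence.command)
    if evidence.passed:
        return f"VERIFIED: `{cmd}` exited 0"
    return f"NOT VERIFIED: `{cmd}` exited {evidence.exit_code}"


if __name__ == "__main__":
    # Stand-in for a real check; replace with the project's own command.
    ev = verify([sys.executable, "-c", "print('1 passed')"])
    print(claim(ev))
    print(ev.output.strip())
```

The design choice here is that a claim can only be built from an `Evidence` value, so there is no code path that reports success without a command having been run in the same session.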
Validation
100%
Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation — 11 / 11 Passed
Validation for skill structure
No warnings or errors.
f062bf8