Verify your own completed code changes using the repo's existing infrastructure and an independent evaluator context. Use after implementing a change when you need to run unit or integration tests, check build or lint gates, prove the real surface works with evidence, and challenge the changed code for clarity, deduplication, and maintainability. Do not use when the repo is not verifiable yet or when reviewing someone else's code.
97
98%
Does it follow best practices?
Impact
95%
1.05x
Average score across 3 eval scenarios
Passed
No known issues
Quality
Discovery
100%
Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
This is an excellent skill description that hits all the marks. It provides specific concrete actions, abundant natural trigger terms, explicit 'Use when' and 'Do not use when' clauses, and clear boundaries that distinguish it from related skills like code review or CI/CD. The description is comprehensive without being padded, and uses proper third-person voice throughout.
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | Lists multiple specific concrete actions: run repo guardrails (lint, typecheck, tests, build), exercise the real surface with evidence, catch self-correctable issues, and produces a specific verdict output (ready for review / needs more work / blocked). | 3 / 3 |
| Completeness | Clearly answers both 'what' (self-check completed changes, run guardrails, exercise real surface, produce a verdict) and 'when' (explicit 'Use when' clause with multiple triggers). Also includes helpful 'Do not use when' guidance for boundary cases. | 3 / 3 |
| Trigger Term Quality | Excellent coverage of natural trigger terms users would say: 'check your work', 'run checks', 'validate changes', 'make sure a change is ready', 'test it end-to-end', 'lint', 'typecheck', 'tests', 'build', 'ready for review'. These are phrases a user would naturally use. | 3 / 3 |
| Distinctiveness / Conflict Risk | Clearly carved niche as a pre-review self-check skill, distinct from code review of others' work, CI/CD pipeline skills, or general testing skills. The 'Do not use when auditing someone else's diff/branch/PR' explicitly prevents overlap with code review skills. | 3 / 3 |
| Total | | 12 / 12 Passed |
Implementation
100%
Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
This is an exceptionally well-crafted skill. It is concise yet comprehensive, with a clear multi-step workflow that includes explicit validation checkpoints and self-correction loops. The output format is precisely defined with a concrete example, and the progressive disclosure to reference files is well-signaled and appropriately scoped. The only minor note is that bundle files weren't provided to verify the referenced paths exist, but the structure itself is sound.
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | The content is lean and efficient throughout. It assumes Claude's competence, with no explanations of what linting, typechecking, or CI are. Every section earns its place: principles set boundaries, workflow gives concrete steps, output defines the exact format. The examples (curl commands, `make verify`, etc.) are illustrative without being padded. | 3 / 3 |
| Actionability | The skill provides concrete, executable guidance: specific commands (`make verify`, `curl http://127.0.0.1:3000/health`, `node dist/cli.js --help`), specific self-correction examples (replace `any` with real types, add a `throw`), a precise 3-value verdict enum, and a copy-paste-ready output template. The workflow steps are specific enough to follow without ambiguity. | 3 / 3 |
| Workflow Clarity | The 5-step workflow is clearly sequenced with explicit ordering rationale (guardrails first, then real surface). Validation is built into the structure: step 1 runs deterministic checks, step 2 exercises the real surface, step 3 self-corrects, step 4 probes edge cases, and step 5 synthesizes a verdict. The 'Before You Start' section adds pre-flight validation. Feedback loops are present (self-correct during exercise, re-test after fixes). | 3 / 3 |
| Progressive Disclosure | The main SKILL.md is a well-structured overview with clear references to deeper materials: evidence-rules.md, verification.md, and simplification.md are each referenced exactly where relevant with one-level-deep links. The References section at the bottom provides a clean navigation index with brief descriptions of each file's purpose. | 3 / 3 |
| Total | | 12 / 12 Passed |
Validation
100%
Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation — 11 / 11 Passed
Validation for skill structure
No warnings or errors.