Enforces fresh verification evidence before any completion or success claims. Use when about to say "done", "fixed", "tests pass", "build succeeds", or any synonym; before committing, creating PRs, or moving to the next task; before expressing satisfaction or positive statements about work state; and after agent delegation to independently verify results. Prevents unverified claims by requiring command execution, output inspection, and exit code confirmation.
Claiming work is complete without verification is dishonesty, not efficiency.
Core principle: Evidence before claims, always.
If you haven't run the verification command in this message, you cannot claim it passes. Stale results from previous runs don't count — code may have changed since then, and re-running is cheap compared to shipping a false claim.
Before claiming any status or expressing satisfaction:

1. Run the verification command, in this message.
2. Read the actual output, not just the tail.
3. Confirm the exit code.
Skipping steps means the claim is unsupported. Unsupported claims erode trust — your human partner can't distinguish "I checked and it passes" from "I assume it passes" unless you show evidence.
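The steps above can be wrapped in a small helper, sketched here in POSIX shell. `run_verify` is a hypothetical name, not part of any existing tool; substitute your real test or build command for `ls`:

```shell
# Sketch of a verification wrapper: run the command, capture its output,
# and confirm the exit code before making any claim.
run_verify() {
  desc="$1"; shift
  status=0
  output=$("$@" 2>&1) || status=$?
  if [ "$status" -eq 0 ]; then
    echo "VERIFIED: $desc (exit 0)"
  else
    echo "NOT VERIFIED: $desc (exit $status)"
    printf '%s\n' "$output"
  fi
  return "$status"
}

# Usage with a real command; substitute your test/build command here.
run_verify "listing /" ls /
```

The point is the shape, not the helper: output and exit code are captured and inspected before any "VERIFIED" line is emitted.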
| Claim | Requires | Not Sufficient |
|---|---|---|
| Tests pass | Test command output: 0 failures | Previous run, "should pass" |
| Linter clean | Linter output: 0 errors | Partial check, extrapolation |
| Build succeeds | Build command: exit 0 | Linter passing, logs look good |
| Bug fixed | Test original symptom: passes | Code changed, assumed fixed |
| Regression test works | Red-green cycle verified | Test passes once |
| Agent completed | VCS diff shows changes | Agent reports "success" |
| Requirements met | Line-by-line checklist | Tests passing |
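A concrete reading of the "exit 0" entries in the table: the exit code, not how the logs look, is the evidence. A minimal sketch, with `ls` standing in for a real test, lint, or build command:

```shell
# The exit code, not the look of the logs, is the evidence.
# `ls` stands in for a real test/lint/build command in this sketch.
ls / > /dev/null
echo "success exit code: $?"        # 0: the command actually succeeded

status=0
ls /no-such-path-for-demo 2> /dev/null || status=$?
echo "failure exit code: $status"   # nonzero (exact value varies by platform)
```

Capturing the status explicitly (or chaining with `&&` / `||`) is what turns "logs look good" into a checkable claim.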
Watch for these patterns in your own output — they indicate you're about to make a claim without evidence:
When you notice one of these, pause and run the verification command before continuing.
| Shortcut | Why it doesn't work |
|---|---|
| "Should work now" | Confidence isn't evidence. Run the command. |
| "Linter passed" | A linter checks style, not compilation or correctness; each tool verifies a different property. |
| "Agent said success" | Agents can report success while producing incomplete or broken output. Check the diff. |
| "Partial check is enough" | Each verification tool checks different properties. Passing one doesn't imply others pass. |
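The "check the diff" shortcut-buster can be made concrete: verify an agent's claimed edit by diffing actual file state, not by reading the report. A self-contained sketch, with plain `diff` on scratch files standing in for a VCS diff:

```shell
# Verify an agent's claimed edit by diffing actual state, not by trusting
# the report. Plain `diff` stands in for `git diff` in this sketch.
snapshot=$(mktemp)
workfile=$(mktemp)
echo "original content" > "$workfile"
cp "$workfile" "$snapshot"        # state captured before delegating

# ...agent runs here and reports "success", but changes nothing...

if diff -q "$snapshot" "$workfile" > /dev/null; then
  echo "no diff: the 'success' report is unverified"
else
  echo "diff found: inspect it before reporting actual state"
fi
rm -f "$snapshot" "$workfile"
```

In a real repository the same check is a VCS diff against the pre-delegation state; the structure is identical.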
Tests:
OK: [Run test command] [See: 34/34 pass] "All tests pass"
BAD: "Should pass now" / "Looks correct"

Regression tests (TDD Red-Green):
OK: Write -> Run (pass) -> Revert fix -> Run (fails) -> Restore -> Run (pass)
BAD: "I've written a regression test" (without red-green verification)

Build:
OK: [Run build] [See: exit 0] "Build passes"
BAD: "Linter passed" (linter doesn't check compilation)

Requirements:
OK: Re-read plan -> Create checklist -> Verify each -> Report gaps or completion
BAD: "Tests pass, phase complete"

Agent delegation:
OK: Agent reports success -> Check VCS diff -> Verify changes -> Report actual state
BAD: Trust agent report at face value

From 24 failure memories:
Verification takes seconds. Rebuilding trust takes much longer.
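The red-green cycle from the examples above can be run as a self-contained shell session. Here a `grep` on a scratch file stands in for a real regression test and fix:

```shell
# Red-green sketch: the regression test must fail without the fix and pass
# with it. A grep on a scratch file stands in for a real test suite.
workdir=$(mktemp -d)
echo "fixed" > "$workdir/code.txt"                    # code with the fix

run_test() { grep -q "fixed" "$workdir/code.txt"; }   # the "regression test"

run_test && echo "green: passes with the fix"
echo "broken" > "$workdir/code.txt"                   # revert the fix
run_test || echo "red: fails without the fix (as it should)"
echo "fixed" > "$workdir/code.txt"                    # restore the fix
run_test && echo "green: passes again with the fix restored"
rm -rf "$workdir"
```

A test that has never been seen to fail proves nothing; the revert step is what demonstrates it actually guards the fix.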
The rule applies to exact phrases, paraphrases, synonyms, and implications of success — any communication suggesting completion or correctness.
Run the command. Read the output. Then claim the result.