Verify your own completed code changes using the repo's existing infrastructure and an independent evaluator context. Use after implementing a change when you need to run unit or integration tests, check build or lint gates, prove the real surface works with evidence, and challenge the changed code for clarity, deduplication, and maintainability. If the repo is not verifiable yet, hand off to `agent-readiness`; if you are reviewing someone else's code, use `review`.
97
98%
Does it follow best practices?
Impact
94%
1.02xAverage score across 3 eval scenarios
Passed
No known issues
{
"context": "Tests whether verify refuses to call a change ready when the repo has no reliable boot/test/runtime surface, reports exact attempted commands, and hands off to agent-readiness instead of pretending static inspection is proof.",
"type": "weighted_checklist",
"checklist": [
{
"name": "Blocked verdict",
"description": "The report verdict is exactly `blocked`, not `ready for review`, `ship it`, or an optimistic equivalent",
"max_score": 16
},
{
"name": "Commands attempted",
"description": "The report lists exact commands attempted to find or run verification infrastructure, such as package scripts, tests, boot commands, or lockfile checks",
"max_score": 12
},
{
"name": "Missing infra evidence",
"description": "The report includes concrete evidence that verification infrastructure is missing or unusable, not just a generic statement that there are no tests",
"max_score": 12
},
{
"name": "No static proof substitution",
"description": "The report does not treat reading `src/worker.ts` or reasoning about the function as equivalent to runtime verification",
"max_score": 14
},
{
"name": "Readiness gaps listed",
"description": "The report names the missing readiness pieces, such as lockfile/install path, test script, boot command, smoke check, or service entrypoint",
"max_score": 12
},
{
"name": "agent-readiness handoff",
"description": "The recommended follow-up is `agent-readiness` or an equivalent readiness-building task before review",
"max_score": 14
},
{
"name": "Surfaces not exercised honestly",
"description": "The report is honest about what was attempted and what could not be exercised, without requiring a verbose Surfaces Exercised section",
"max_score": 10
},
{
"name": "No invented tests",
"description": "The solution does not add ad hoc tests or a custom harness solely to claim verification passed; it reports the repo readiness gap",
"max_score": 10
},
{
"name": "Compact blocked footer",
"description": "The final blocked verification footer is no more than 5 labeled lines and does not repeat the missing-infra evidence after listing it once",
"max_score": 8
}
]
}