Review existing code, diffs, branches, or pull requests using concern-specific reviewer personas and evidence. Use when auditing someone else's work, triaging risk in a PR, or producing a ship-it / needs-review / blocked verdict. Do not use to verify your own completed change; use `verify` for that.
98
100%
Does it follow best practices?
Impact
92%
1.31xAverage score across 3 eval scenarios
Passed
No known issues
{
"context": "Tests whether the agent correctly applies minimal persona selection for a tiny documentation-only change, avoids reviewer spam, correctly applies the shape-based shortcut for doc-heavy changes, produces a valid verdict, and does not over-inflate minor findings into defects.",
"type": "weighted_checklist",
"checklist": [
{
"name": "Minimal personas selected",
"description": "The report uses only 1-2 personas, NOT all 6 personas — specifically avoids spawning silent-failures, types, and cleanup for this purely documentary change",
"max_score": 12
},
{
"name": "General persona used",
"description": "The report includes the general reviewer persona (or equivalent broad code review lens)",
"max_score": 8
},
{
"name": "Comments persona used",
"description": "The report includes the comments reviewer persona (or equivalent documentation/docstring lens) — appropriate given docstring change",
"max_score": 10
},
{
"name": "Tests persona omitted",
"description": "The report does NOT apply a tests reviewer persona — no test coverage concern applies to a typo/docstring fix",
"max_score": 8
},
{
"name": "Verdict present",
"description": "The report contains exactly one verdict label: 'ship it', 'needs review', or 'blocked'",
"max_score": 10
},
{
"name": "Scope stated",
"description": "The report explicitly names the scope reviewed (e.g. the two files changed, README and format.ts, or the specific diff)",
"max_score": 8
},
{
"name": "Personas listed in output",
"description": "The report explicitly names the reviewer personas used in a dedicated section or label",
"max_score": 8
},
{
"name": "No nit inflation",
"description": "The report does NOT flag the parameter rename (amount → value) as a defect or high-severity finding — the JSDoc now correctly matches the actual parameter name",
"max_score": 10
},
{
"name": "Docstring accuracy noted",
"description": "The report identifies that the @param name was corrected from 'amount' to 'value' to match the actual function signature — and treats this as an accuracy improvement, not a bug",
"max_score": 8
},
{
"name": "Unverified areas or residual risk",
"description": "The report mentions any residual risk or unverified surfaces, OR explicitly states there are none (either outcome is acceptable — silence is not)",
"max_score": 8
},
{
"name": "Recommended follow-up",
"description": "The report includes a recommended follow-up action from the allowed set: implementation, verify, agent-readiness, or docs",
"max_score": 10
}
]
}