General-purpose coding policy for Baruch's AI agents
91
93%
Does it follow best practices?
Impact
91%
1.15x Average score across 12 eval scenarios
Advisory
Suggest reviewing before use
{
"context": "Tests whether the install-reviewer skill halts cleanly when a required piece of tooling (the gh-aw extension) is not installed, and hands the user a concrete recovery command instead of failing later with a cryptic compile error.",
"type": "weighted_checklist",
"checklist": [
{
"name": "Identifies the missing dependency",
"description": "The plan explicitly names the missing gh-aw GitHub CLI extension as the cause of the halt, not some later step or cryptic symptom",
"max_score": 25
},
{
"name": "Stops before making changes",
"description": "The plan does NOT proceed to create a feature branch, copy or compile workflow files, commit, or open a PR. The halt happens at preflight and leaves the repo in the same state as it started",
"max_score": 20
},
{
"name": "Provides the install command",
"description": "Tells the user to run `gh extension install github/gh-aw` (or equivalent explicit command), not just generic prose like \"install the gh-aw extension\"",
"max_score": 25
},
{
"name": "Explains why gh-aw is needed",
"description": "Gives a reason — e.g., gh-aw is what compiles the workflow source (`review.md`) into the runnable lock file (`review.lock.yml`) that GitHub Actions executes",
"max_score": 15
},
{
"name": "Invites re-invocation",
"description": "Tells the user that after installing gh-aw they should re-invoke the skill — the halt is recoverable, not terminal",
"max_score": 15
}
]
}
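
The behavior this checklist rewards can be illustrated with a small preflight check. This is a minimal sketch, not the skill's actual implementation: the names `preflight`, `list_gh_extensions`, and `INSTALL_COMMAND` are hypothetical, and it assumes that a substring match on `gh-aw` in the output of `gh extension list` is enough to detect the extension. The install command and the `review.md` / `review.lock.yml` file names come from the checklist itself.

```python
import shutil
import subprocess
from typing import Callable, Optional

# Install command quoted from the checklist above.
INSTALL_COMMAND = "gh extension install github/gh-aw"


def list_gh_extensions() -> str:
    """Return the output of `gh extension list`, or "" if gh is not on PATH."""
    if shutil.which("gh") is None:
        return ""
    result = subprocess.run(
        ["gh", "extension", "list"],
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout


def preflight(list_extensions: Callable[[], str] = list_gh_extensions) -> Optional[str]:
    """Return None if gh-aw appears to be installed, else a halt message.

    Assumption: the extension shows up in the listing with "gh-aw" somewhere
    in its line (for example as part of "github/gh-aw").
    """
    if "gh-aw" in list_extensions():
        return None
    return (
        "Halting at preflight: the gh-aw GitHub CLI extension is not installed.\n"
        "No branch, workflow file, commit, or PR has been created.\n"
        "gh-aw is needed to compile review.md into review.lock.yml, "
        "the file GitHub Actions executes.\n"
        f"Install it with: {INSTALL_COMMAND}\n"
        "Then re-invoke the skill."
    )


if __name__ == "__main__":
    message = preflight()
    if message is not None:
        # Stop here, before any step that would modify the repository.
        raise SystemExit(message)
    print("Preflight passed: gh-aw is available.")
```

Passing the listing function in as a parameter keeps the check testable without a real `gh` installation. The message touches each of the five checklist items: the missing dependency, the untouched repository, the install command, the reason gh-aw is needed, and the invitation to re-invoke.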