Creates, updates, or prunes an AGENTS.md for any repository by auditing the codebase, detecting non-discoverable gaps, and drafting minimal high-signal instructions that agents cannot infer from reading the code.
90
94%
Does it follow best practices?
Impact
78%
1.06xAverage score across 3 eval scenarios
Passed
No known issues
{
"context": "Tests whether the agent correctly recommends hierarchical AGENTS.md files for a large monorepo, avoids asking generic discoverable questions while asking targeted questions about genuine ambiguities, and keeps questions limited to the prescribed maximum.",
"type": "weighted_checklist",
"checklist": [
{
"name": "Hierarchical structure recommended",
"description": "The root AGENTS.md includes a '## Scope & routing' section that identifies which packages have their own AGENTS.md files, OR the agent creates package-level AGENTS.md files in at least 2 of the package directories",
"max_score": 12
},
{
"name": "No monolithic file for all modules",
"description": "The root AGENTS.md does NOT contain detailed per-package guidance for all 6 packages in a single file without routing to package-level files",
"max_score": 10
},
{
"name": "No generic discoverable questions",
"description": "The questions file or plan does NOT include questions about the tech stack, programming language, testing framework, or other information discoverable from reading the repo files",
"max_score": 10
},
{
"name": "Targeted ambiguity question asked",
"description": "The agent produces a questions.md (or similar file) with at most 3 questions, each tied to a specific genuine ambiguity found in the repo (e.g., the conflicting deploy scripts for the payments package, or the meaning of the DO-NOT-DEPLOY marker)",
"max_score": 12
},
{
"name": "Question count within limit",
"description": "No more than 3 questions are posed to the user — the questions.md or equivalent file contains at most 3 items",
"max_score": 8
},
{
"name": "CI vs local difference documented",
"description": "AGENTS.md (root or package-level) documents the non-obvious 'INTEGRATION=true' environment variable required by CI but not present in local test scripts or .env.example",
"max_score": 10
},
{
"name": "Do-not-touch constraint documented",
"description": "AGENTS.md documents the constraint on packages/payments/ (marked DO-NOT-DEPLOY, requires SRE sign-off) — not just that the directory exists",
"max_score": 10
},
{
"name": "No tech stack summaries in output",
"description": "AGENTS.md files produced do NOT contain sections describing the tech stack (language, framework, package manager) as general context",
"max_score": 8
},
{
"name": "Scope statement present",
"description": "The root AGENTS.md has a one-line or two-sentence scope statement immediately after the '# Agent instructions' heading",
"max_score": 8
},
{
"name": "Audit trail created",
"description": "An audit-notes.md file exists with at least 3 entries documenting items considered but excluded, each citing the file that makes it discoverable",
"max_score": 12
}
]
}