Run an evidence-grounded software architecture audit workflow that builds a repo brief, selects single-auditor or specialist-panel mode, inspects boundary, layering, dependency, composition, cohesion, and testability risks, writes required finding blocks, and sequences incremental refactors. Use when asked for an architecture audit, architecture review, repo-structure review, software architecture report, audit_report.md, structural issue findings, or specialist-panel synthesis across multi-module systems.
100
Does it follow best practices? 100%
Impact: 100% (1.85x average score across 3 eval scenarios)
Passed: no known issues
{
  "context": "Tests whether the agent correctly chooses single-auditor mode for a narrow single-package target, produces a complete repo brief with all required fields, applies severity levels correctly, and places ambiguous items in Open Evidence Gaps rather than making unfounded claims.",
  "type": "weighted_checklist",
  "checklist": [
    {
      "name": "Single-auditor mode",
      "description": "Report explicitly states or clearly uses 'single-auditor' mode (not specialist-panel), given the narrow single-package scope",
      "max_score": 10
    },
    {
      "name": "Repo brief: product name",
      "description": "The Repo Brief section names the product or package (Ledger Service or similar)",
      "max_score": 5
    },
    {
      "name": "Repo brief: language/framework",
      "description": "The Repo Brief section identifies the implementation language (Go) and dependencies (database/sql, lib/pq, net/http)",
      "max_score": 5
    },
    {
      "name": "Repo brief: audit scope",
      "description": "The Repo Brief section explicitly states the audit scope (single package, named files or directory)",
      "max_score": 5
    },
    {
      "name": "Repo brief: docs checked",
      "description": "The Repo Brief section lists which docs were checked (README.md, go.mod, or similar)",
      "max_score": 5
    },
    {
      "name": "Repo brief: code surfaces checked",
      "description": "The Repo Brief section names the primary code surfaces inspected (e.g. handler.go, service.go, repository.go, model.go, middleware.go)",
      "max_score": 5
    },
    {
      "name": "Repo brief: module map",
      "description": "The Repo Brief section includes a top-level module map or file layout description",
      "max_score": 5
    },
    {
      "name": "Repo brief: test layout",
      "description": "The Repo Brief section describes the test layout and testing style (unit tests, mocked repo, skipped tests)",
      "max_score": 5
    },
    {
      "name": "Severity levels used",
      "description": "At least two findings include a 'Severity:' field with a value of critical, high, or medium",
      "max_score": 10
    },
    {
      "name": "Evidence from actual files",
      "description": "At least two finding blocks cite specific file names (e.g. service.go, repository.go) as evidence rather than speaking in generalities",
      "max_score": 10
    },
    {
      "name": "Open Evidence Gaps section present",
      "description": "Report contains a '## Open Evidence Gaps' section",
      "max_score": 10
    },
    {
      "name": "Ambiguous items in Open Evidence Gaps",
      "description": "At least one item is placed in '## Open Evidence Gaps' rather than stated as a confirmed finding (e.g. unknown production behavior, external system assumptions)",
      "max_score": 10
    },
    {
      "name": "Incremental recommendations",
      "description": "Recommendations describe incremental structural changes (NOT a full rewrite or microservice split)",
      "max_score": 15
    }
  ]
}
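The rubric above does not say how a "weighted_checklist" is turned into a percentage. One plausible reading, offered here only as a sketch, is that the points awarded per criterion are summed and divided by the total of the max_score values (which add up to 100 across the thirteen criteria listed). The function name score_checklist, the awarded mapping, and the clamping rule are all hypothetical and are not taken from the evaluation tool itself; the embedded rubric is an abbreviated copy with three of the thirteen criteria.

```python
import json

# Abbreviated copy of the rubric above (three of the thirteen criteria),
# kept short so the example stays readable.
RUBRIC_JSON = """
{
  "type": "weighted_checklist",
  "checklist": [
    {"name": "Single-auditor mode", "max_score": 10},
    {"name": "Open Evidence Gaps section present", "max_score": 10},
    {"name": "Incremental recommendations", "max_score": 15}
  ]
}
"""


def score_checklist(rubric, awarded):
    """Return the fraction of available points that were awarded.

    Assumption (not confirmed by the source): the overall score is the sum
    of awarded points divided by the sum of max_score values. Criteria
    missing from `awarded` count as zero, and awarded points are clamped
    to the range [0, max_score].
    """
    total = sum(item["max_score"] for item in rubric["checklist"])
    if total == 0:
        raise ValueError("rubric has no scorable criteria")
    earned = 0
    for item in rubric["checklist"]:
        points = awarded.get(item["name"], 0)
        earned += max(0, min(points, item["max_score"]))
    return earned / total


rubric = json.loads(RUBRIC_JSON)

# Hypothetical grader output: full marks on one criterion, partial credit
# on another, and nothing recorded for the third.
awarded = {"Single-auditor mode": 10, "Incremental recommendations": 9}
result = score_checklist(rubric, awarded)
print(f"{result:.0%}")
```

With these sample values the earned total is 19 out of 35 available points. Clamping is a design choice of this sketch so that a grader bug cannot push a criterion above its weight; a real scorer might instead reject out-of-range values.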