Spec-driven workflow covering requirement gathering, spec authoring, implementation review, and verification — with skills, rules, and evaluation scenarios.
96
90%
Does it follow best practices?
Impact
98%
1.19xAverage score across 9 eval scenarios
Passed
No known issues
{
"context": "Tests whether the agent writes a properly structured .spec.md file following the spec-writer skill: correct file placement, valid frontmatter, API contract block, headings, and inline [@test] links.",
"type": "weighted_checklist",
"checklist": [
{
"name": "Correct file location",
"description": "The spec file is placed inside a specs/ directory (e.g. specs/inventory-reservation.spec.md or similar path under specs/)",
"max_score": 10
},
{
"name": ".spec.md extension",
"description": "The spec filename ends with .spec.md",
"max_score": 8
},
{
"name": "Frontmatter present",
"description": "The spec file opens with YAML frontmatter delimited by --- ... ---",
"max_score": 5
},
{
"name": "Frontmatter: name field",
"description": "The frontmatter contains a non-empty 'name' field",
"max_score": 5
},
{
"name": "Frontmatter: description field",
"description": "The frontmatter contains a non-empty 'description' field (one-line summary)",
"max_score": 5
},
{
"name": "Frontmatter: targets field",
"description": "The frontmatter contains a 'targets' list with at least one relative path pointing to src/inventory/reservation.py or a glob that would match it",
"max_score": 10
},
{
"name": "API contract code block",
"description": "The spec body includes a code block containing the public function signatures (reserve, release, get_reserved) as an API contract",
"max_score": 12
},
{
"name": "Headings used for sections",
"description": "The spec body uses at least two markdown headings (##) to organize requirements by feature area (e.g. Core Operations, Business Rules, Error Handling)",
"max_score": 8
},
{
"name": "Inline test links",
"description": "At least two [@test] links appear inline next to individual requirements (not grouped in a separate section at the bottom)",
"max_score": 12
},
{
"name": "Test links reference correct files",
"description": "[@test] links reference paths under tests/inventory/ (test_reservation.py or test_reservation_edge_cases.py)",
"max_score": 10
},
{
"name": "Single spec for service",
"description": "Exactly one spec file is created (not multiple files splitting the reservation service into unrelated chunks)",
"max_score": 8
},
{
"name": "Edge cases documented",
"description": "The spec explicitly documents at least two edge cases or error conditions (e.g. InsufficientStockError with quantities, idempotent release of expired reservations)",
"max_score": 7
}
]
}