Closing the intent-to-code chasm - specification-driven development with BDD verification chain
86
92%
Does it follow best practices?
Impact
86%
1.82x
Average score across 14 eval scenarios
Advisory
Suggest reviewing before use
{
  "context": "Tests whether the agent generates a plan with an ASCII architecture diagram and correctly populates .specify/context.json with node classifications, preserving existing data through a merge rather than overwrite.",
  "type": "weighted_checklist",
  "checklist": [
    {
      "name": "ASCII architecture diagram",
      "description": "plan.md contains an architecture diagram using ASCII box-drawing characters (e.g., +---, |, →, or similar)",
      "max_score": 12
    },
    {
      "name": "Named components in diagram",
      "description": "The architecture diagram names at least 3 distinct components (e.g., 'React SPA' or 'Web Client', 'Search Service' or 'API', 'PostgreSQL' or similar database, 'AI Ranking API' or similar external service)",
      "max_score": 10
    },
    {
      "name": "context.json exists",
      "description": ".specify/context.json exists and contains valid JSON",
      "max_score": 8
    },
    {
      "name": "planview.nodeClassifications key",
      "description": ".specify/context.json contains a 'planview' key with a 'nodeClassifications' sub-key",
      "max_score": 12
    },
    {
      "name": "Existing data preserved",
      "description": ".specify/context.json still contains the original keys from the input file ('projectName' and 'version') — the file was merged, not overwritten",
      "max_score": 14
    },
    {
      "name": "Client node classified",
      "description": "At least one component in nodeClassifications is classified as 'client' (browser, web app, CLI, or mobile app)",
      "max_score": 11
    },
    {
      "name": "Server node classified",
      "description": "At least one component in nodeClassifications is classified as 'server' (API, service, worker, or middleware)",
      "max_score": 11
    },
    {
      "name": "Storage node classified",
      "description": "At least one component in nodeClassifications is classified as 'storage' (database, cache, queue, or file store)",
      "max_score": 11
    },
    {
      "name": "External node classified",
      "description": "At least one component in nodeClassifications is classified as 'external' (third-party API, SaaS service, or system outside project boundary)",
      "max_score": 11
    }
  ]
}
evals
scenario-1
scenario-2
scenario-3
scenario-4
scenario-5
scenario-6
scenario-7
scenario-8
scenario-9
scenario-10
scenario-11
scenario-12
scenario-13
scenario-14
rules
skills
iikit-00-constitution
scripts
dashboard
iikit-01-specify
iikit-02-plan
iikit-03-checklist
scripts
bash
dashboard
iikit-04-testify
iikit-05-tasks
iikit-06-analyze
iikit-07-implement
iikit-08-taskstoissues
iikit-bugfix
scripts
dashboard
iikit-clarify
iikit-core
references
scripts
bash
dashboard
powershell
templates
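
For illustration, the merge-rather-than-overwrite behaviour that the eval checklist above looks for in .specify/context.json might be sketched in Python as follows. This is a sketch under stated assumptions, not the skill's actual implementation: the function names merge_node_classifications and update_context_file are hypothetical, and the shape of nodeClassifications (a flat map from component name to one of the four types) is assumed, since the checklist only names the key and the allowed classifications.

```python
import copy
import json
from pathlib import Path

# The four classifications named in the eval checklist.
VALID_TYPES = {"client", "server", "storage", "external"}


def merge_node_classifications(existing, classifications):
    """Return a copy of `existing` with `classifications` merged in.

    ASSUMPTION: nodeClassifications is a flat {component name: type} map.
    Top-level keys such as 'projectName' and 'version' are left untouched.
    """
    unknown = set(classifications.values()) - VALID_TYPES
    if unknown:
        raise ValueError(f"unknown node types: {sorted(unknown)}")

    merged = copy.deepcopy(existing)  # never mutate the caller's data
    planview = merged.setdefault("planview", {})
    planview.setdefault("nodeClassifications", {}).update(classifications)
    return merged


def update_context_file(path, classifications):
    """Read the context file if present, merge, and write it back (hypothetical helper)."""
    p = Path(path)
    existing = json.loads(p.read_text(encoding="utf-8")) if p.exists() else {}
    merged = merge_node_classifications(existing, classifications)
    p.parent.mkdir(parents=True, exist_ok=True)
    p.write_text(json.dumps(merged, indent=2) + "\n", encoding="utf-8")
    return merged
```

A call such as update_context_file(".specify/context.json", {"React SPA": "client", "PostgreSQL": "storage"}) would add the planview key while keeping whatever was already in the file. Working on a deep copy and validating the types before writing means a bad classification cannot leave a half-updated file behind.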