Closing the intent-to-code chasm - specification-driven development with BDD verification chain
86
92%
Does it follow best practices?
Impact
86%
1.82x
Average score across 14 eval scenarios
Advisory
Suggest reviewing before use
{
"context": "Tests whether the agent uses IIKit's bug fix task format: T-BNNN task IDs, TS-NNN references when TDD is mandatory, structured bugs.md fields with BUG-NNN identifiers, and no modification of existing tasks in tasks.md.",
"type": "weighted_checklist",
"checklist": [
{
"name": "T-BNNN task ID format",
"description": "All bug fix tasks use the T-BNNN format (e.g., T-B001, T-B002) rather than T001, T002 or any other format",
"max_score": 15
},
{
"name": "At least 1 fix task with TS ref",
"description": "Since TDD is mandatory (tdd_determination=mandatory in context.json), at least one bug fix task references a test spec ID (TS-NNN) in its description, ensuring the fix is test-driven",
"max_score": 15
},
{
"name": "TDD task references test spec",
"description": "The task that implements the fix explicitly references a test spec ID (TS-NNN) in its description",
"max_score": 10
},
{
"name": "bugs.md BUG-NNN entry",
"description": "bugs.md contains a structured entry with a BUG-NNN identifier (e.g., BUG-001)",
"max_score": 10
},
{
"name": "bugs.md required fields",
"description": "The bugs.md entry includes: a bug description, severity, status (reported), and reproduction steps",
"max_score": 10
},
{
"name": "bugs.md date format",
"description": "The Reported date in bugs.md is in YYYY-MM-DD format",
"max_score": 5
},
{
"name": "Existing tasks unmodified",
"description": "The existing tasks T001 through T008 (all marked [x]) are preserved unchanged in tasks.md — the new bug fix tasks are appended, not merged into existing sections",
"max_score": 12
},
{
"name": "Bug ID in task descriptions",
"description": "Each bug fix task description includes the BUG-NNN identifier (e.g., '[BUG-001]') linking task to bug record",
"max_score": 10
},
{
"name": "New .feature file created",
"description": "Since TDD is mandatory, a new .feature file (e.g., bugfix_BUG-001.feature) is created in the tests/features/ directory with a scenario for the bug",
"max_score": 13
}
]
}
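The identifier formats named in the checklist can be checked mechanically. The sketch below is a minimal illustration, not IIKit's own validation logic. It assumes that "NNN" means exactly three digits, as the examples T-B001 and BUG-001 suggest, and that each bug fix task occupies a single line in tasks.md. The function names and the sample task line are hypothetical.

```python
import re
from datetime import datetime

# Patterns inferred from the examples in the checklist (T-B001, BUG-001, TS-NNN).
# Assumption: "NNN" means exactly three digits.
BUGFIX_TASK_ID = re.compile(r"\bT-B\d{3}\b")
BUG_ID = re.compile(r"\bBUG-\d{3}\b")
TEST_SPEC_ID = re.compile(r"\bTS-\d{3}\b")


def check_bugfix_task(line: str) -> dict:
    """Report which checklist identifiers appear in one tasks.md line."""
    return {
        "task_id": bool(BUGFIX_TASK_ID.search(line)),
        "bug_ref": bool(BUG_ID.search(line)),
        "test_spec_ref": bool(TEST_SPEC_ID.search(line)),
    }


def is_iso_date(text: str) -> bool:
    """Return True if text is a valid calendar date written as YYYY-MM-DD."""
    try:
        parsed = datetime.strptime(text, "%Y-%m-%d")
    except ValueError:
        return False
    # strptime tolerates unpadded fields such as "2024-3-7", so round-trip
    # the value to require the zero-padded form.
    return parsed.strftime("%Y-%m-%d") == text


if __name__ == "__main__":
    # Hypothetical task line; the real tasks.md layout may differ.
    sample = "- [ ] T-B001 [BUG-001] Implement fix covered by TS-012"
    print(check_bugfix_task(sample))
    print(is_iso_date("2024-03-07"))
```

A check like this covers only the format criteria (task IDs, bug references, test spec references, and the Reported date). Criteria such as "Existing tasks unmodified" would need a comparison against the previous contents of tasks.md.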