Closing the intent-to-code chasm - specification-driven development with BDD verification chain
86
92%
Does it follow best practices?
Impact
86%
1.82x
Average score across 14 eval scenarios
Advisory
Suggest reviewing before use
{
  "context": "Tests the full IIKit pipeline (specify→plan→testify→tasks) on a greenfield feature with TDD-mandatory constitution. Validates complete traceability chain: every FR-XXX in spec must trace to plan decisions, every FR has at least one TS-XXX in .feature files, every TS appears in at least one task, phase separation is maintained throughout, and TDD ordering is respected in the task breakdown.",
  "type": "weighted_checklist",
  "checklist": [
    {
      "name": "Spec is technology-agnostic",
      "description": "spec.md does NOT mention Slack SDK, specific databases, programming languages, or frameworks. It describes WHAT the bot does, not HOW",
      "max_score": 8
    },
    {
      "name": "Plan has no governance content",
      "description": "plan.md does NOT restate constitutional principles about TDD, privacy, or minimal disruption. May reference the constitution but must not duplicate governance text",
      "max_score": 6
    },
    {
      "name": "FR-XXX to TS-XXX coverage",
      "description": "Every FR-XXX in spec.md has at least one @FR-XXX tag in the generated .feature files. No functional requirement is untested",
      "max_score": 15
    },
    {
      "name": "TS-XXX to task coverage",
      "description": "Every @TS-XXX tag in the .feature files is referenced by at least one task in tasks.md. No test scenario is orphaned without an implementation task",
      "max_score": 12
    },
    {
      "name": ".feature files have DO NOT MODIFY headers",
      "description": "Each generated .feature file starts with the DO NOT MODIFY header comment block warning against changing scenarios",
      "max_score": 6
    },
    {
      "name": ".feature files have required tags",
      "description": "Every Scenario in .feature files has @TS-XXX, @FR-XXX, @US-XXX, priority (@P1/@P2), and test type (@acceptance/@contract/@validation) tags",
      "max_score": 8
    },
    {
      "name": "Tasks ordered: Setup → Foundational → Stories → Polish",
      "description": "tasks.md follows the phase structure: Setup (project init), Foundational (shared infrastructure), User Story phases (by priority), Polish/Final",
      "max_score": 6
    },
    {
      "name": "TDD task ordering within story phases",
      "description": "Within each user story phase, test-related tasks (step definitions, test setup) appear before or alongside implementation tasks, consistent with the TDD-mandatory constitution",
      "max_score": 10
    },
    {
      "name": "No phantom requirements",
      "description": "No .feature scenarios, tasks, or plan components address features not in the spec (e.g., sprint planning integration, retrospective automation, Jira sync — all out of scope)",
      "max_score": 8
    },
    {
      "name": "Privacy constraint in .feature files",
      "description": "At least one .feature scenario tests the privacy requirement: cross-team blocker view shows only blocker text, not full standup responses",
      "max_score": 7
    },
    {
      "name": "No FR orphans in either direction",
      "description": "No FR-XXX in spec lacks plan coverage AND no plan component introduces requirements without spec FR basis. The traceability chain is bidirectional",
      "max_score": 8
    },
    {
      "name": "TS-XXX IDs are unique across files",
      "description": "All @TS-XXX tags across all .feature files are unique; no duplicate TS IDs",
      "max_score": 6
    }
  ]
}
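
Several of the checklist items above (FR-XXX to TS-XXX coverage, TS-XXX to task coverage, required scenario tags, unique TS-XXX IDs) are mechanical enough to sketch as a small checker. The following is a minimal illustration, not IIKit's own tooling: the function names, the numeric ID shapes (`FR-001`, `TS-001`), and the assumption that tags sit on the lines directly above each `Scenario:` are all hypothetical choices made for this example.

```python
import re
from collections import Counter

# Assumed ID shapes: the checklist only shows placeholders such as FR-XXX.
FR_ID = re.compile(r"\bFR-\d+\b")
TS_ID = re.compile(r"\bTS-\d+\b")

# Tag kinds the checklist requires on every Scenario (patterns are assumptions).
REQUIRED_TAGS = {
    "TS id": re.compile(r"@TS-\d+"),
    "FR id": re.compile(r"@FR-\d+"),
    "US id": re.compile(r"@US-\d+"),
    "priority": re.compile(r"@P\d+"),
    "test type": re.compile(r"@(acceptance|contract|validation)"),
}


def scenario_tags(feature_text):
    """Return one tag list per Scenario, read from the tag lines directly above it."""
    scenarios, pending = [], []
    for raw in feature_text.splitlines():
        line = raw.strip()
        if line.startswith("@"):
            pending.extend(line.split())
        elif line.startswith(("Scenario:", "Scenario Outline:")):
            scenarios.append(pending)
            pending = []
        elif line and not line.startswith("#"):
            # Any other content (Feature:, steps) ends the current tag run.
            pending = []
    return scenarios


def missing_tag_kinds(tags):
    """List the required tag kinds that a single scenario's tags do not satisfy."""
    return [
        kind
        for kind, pattern in REQUIRED_TAGS.items()
        if not any(pattern.fullmatch(tag) for tag in tags)
    ]


def check_traceability(spec_text, feature_texts, tasks_text):
    """Compare IDs across the spec, the .feature files, and the task list."""
    spec_frs = set(FR_ID.findall(spec_text))
    all_tags = [
        tag
        for text in feature_texts
        for tags in scenario_tags(text)
        for tag in tags
    ]
    tested_frs = {t[1:] for t in all_tags if FR_ID.fullmatch(t[1:])}
    ts_ids = [t[1:] for t in all_tags if TS_ID.fullmatch(t[1:])]
    task_refs = set(TS_ID.findall(tasks_text))
    return {
        "untested_frs": sorted(spec_frs - tested_frs),   # FR with no scenario
        "phantom_frs": sorted(tested_frs - spec_frs),    # scenario FR not in spec
        "orphaned_ts": sorted(set(ts_ids) - task_refs),  # scenario with no task
        "duplicate_ts": sorted(ts for ts, n in Counter(ts_ids).items() if n > 1),
    }
```

Called with the contents of spec.md, each .feature file, and tasks.md, `check_traceability` returns four lists that should all be empty when the chain is intact. Checks that need judgment, such as whether the spec is technology-agnostic or whether the plan duplicates governance text, are outside what a sketch like this can cover.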