Closing the intent-to-code chasm - specification-driven development with BDD verification chain
86
92%
Does it follow best practices?
Impact
86%
1.82x
Average score across 14 eval scenarios
Advisory
Suggest reviewing before use
{
"context": "Tests whether the agent creates a CONSTITUTION.md that follows governance best practices: technology-agnostic principles, structured versioning, TDD determination extracted to machine-readable config, and proper separation from feature-level concerns.",
"type": "weighted_checklist",
"checklist": [
{
"name": "No technology stack in constitution",
"description": "CONSTITUTION.md does NOT contain specific technology choices such as database names (PostgreSQL, DynamoDB, MySQL), frameworks (FastAPI, Django, React), languages or runtimes (Python, Node.js), or library/tool versions",
"max_score": 12
},
{
"name": "No implementation details",
"description": "CONSTITUTION.md does NOT contain implementation specifics like file structures, API schemas, data models, architecture diagrams, or service configurations",
"max_score": 8
},
{
"name": "At least 3 principles",
"description": "CONSTITUTION.md contains at least 3 distinct named principles (sections or subsections identifying governance rules)",
"max_score": 8
},
{
"name": "Principles are declarative",
"description": "Constitution principles use declarative language (MUST, SHALL, REQUIRED, PROHIBITED) rather than vague aspirational language",
"max_score": 8
},
{
"name": "Semver version present",
"description": "CONSTITUTION.md includes a version number in semver format (e.g., 1.0.0, 0.1.0) with a CONSTITUTION_VERSION or equivalent field",
"max_score": 7
},
{
"name": "Amendment procedure present",
"description": "CONSTITUTION.md includes a section describing how the constitution can be amended (versioning, ratification, or change process)",
"max_score": 7
},
{
"name": ".specify/context.json created",
"description": "A .specify/context.json file exists containing a tdd_determination field — the governance document's testing stance extracted to a machine-readable config file",
"max_score": 12
},
{
"name": "TDD determination value valid",
"description": "The tdd_determination value in .specify/context.json is one of: 'mandatory', 'optional', or 'forbidden'",
"max_score": 10
},
{
"name": "Principles are domain-agnostic",
"description": "Constitution principles are written as reusable governance rules, not project-specific requirements. They should apply regardless of which feature is being built — e.g., 'All PII MUST be encrypted at rest' not 'Patient HL7 messages must use TLS 1.3'",
"max_score": 10
},
{
"name": "Testing philosophy explicitly stated",
"description": "CONSTITUTION.md explicitly addresses testing approach — whether TDD is required, optional, or forbidden — as a named principle or clear subsection, not just implied by other rules",
"max_score": 8
},
{
"name": "Dates in ISO format",
"description": "Any dates in CONSTITUTION.md are in ISO 8601 format (YYYY-MM-DD)",
"max_score": 5
},
{
"name": "No feature-specific content",
"description": "CONSTITUTION.md does not contain user stories, acceptance criteria, functional requirements, or other feature-level artifacts that belong in a spec",
"max_score": 5
}
]
}
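Several of the checklist items above are mechanical enough to approximate in code. The following is a minimal sketch, not the evaluator's actual implementation: the function names, the regular expressions, and the slash-date heuristic are all assumptions made for illustration. Only the allowed `tdd_determination` values, the semver and ISO 8601 formats, and the declarative keywords come from the checklist itself.

```python
import json
import re

# Assumed patterns: the checklist names the formats but not exact regexes.
SEMVER_RE = re.compile(r"\b\d+\.\d+\.\d+\b")
# Hypothetical heuristic for non-ISO dates such as 03/15/2024.
SLASH_DATE_RE = re.compile(r"\b\d{1,2}/\d{1,2}/\d{2,4}\b")
# Declarative keywords listed in the "Principles are declarative" item.
DECLARATIVE_RE = re.compile(r"\b(MUST|SHALL|REQUIRED|PROHIBITED)\b")
# Allowed values from the "TDD determination value valid" item.
VALID_TDD = {"mandatory", "optional", "forbidden"}


def check_context(context_text: str) -> bool:
    """Return True if the JSON text carries a valid tdd_determination."""
    try:
        data = json.loads(context_text)
    except json.JSONDecodeError:
        return False
    if not isinstance(data, dict):
        return False
    value = data.get("tdd_determination")
    return isinstance(value, str) and value in VALID_TDD


def check_constitution(text: str) -> dict:
    """Run rough, text-level checks against the constitution's contents."""
    return {
        "semver_present": bool(SEMVER_RE.search(text)),
        "dates_iso": not SLASH_DATE_RE.search(text),
        "declarative": bool(DECLARATIVE_RE.search(text)),
    }
```

In use, `check_context` would be given the contents of `.specify/context.json` and `check_constitution` the contents of `CONSTITUTION.md`. Items such as "Principles are domain-agnostic" call for judgment and are not covered by pattern matching like this.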