
tessl-labs/intent-integrity-kit

Closing the intent-to-code chasm - specification-driven development with BDD verification chain

Quality: 92% (Does it follow best practices?)

Impact: 86%, 1.82x (average score across 14 eval scenarios)

Security by Snyk: Advisory. Suggest reviewing before use.


evals/scenario-10/criteria.json

{
  "context": "Tests whether the agent writes a feature spec that is technology-agnostic (WHAT not HOW), uses FR-XXX and SC-XXX numbered requirements, includes Given/When/Then acceptance scenarios, uses a 2-4 word action-noun branch name, and avoids leaving many [NEEDS CLARIFICATION] placeholders by making reasonable assumptions.",
  "type": "weighted_checklist",
  "checklist": [
    {
      "name": "No technology stack in spec",
      "description": "spec.md does NOT mention specific technologies, frameworks, databases, languages, or architectural patterns (e.g., no mention of REST, GraphQL, WebSocket, PostgreSQL, Redis, React, microservices)",
      "max_score": 15
    },
    {
      "name": "FR-XXX numbered requirements",
      "description": "spec.md contains at least 4 functional requirements numbered with the FR-XXX pattern (e.g., FR-001, FR-002)",
      "max_score": 10
    },
    {
      "name": "SC-XXX success criteria",
      "description": "spec.md contains at least 2 success criteria numbered with the SC-XXX pattern (e.g., SC-001, SC-002)",
      "max_score": 8
    },
    {
      "name": "Given/When/Then scenarios",
      "description": "spec.md contains at least 4 acceptance scenarios in Given/When/Then format covering the sharing permission use cases",
      "max_score": 10
    },
    {
      "name": "User stories present",
      "description": "spec.md contains explicitly labeled user stories (US-1, US-2, or similar) with role-based framing ('As a [role], I want to...')",
      "max_score": 8
    },
    {
      "name": "Measurable success criteria",
      "description": "At least one SC-XXX success criterion includes a measurable/quantifiable element (a number, percentage, time measurement, or explicit condition)",
      "max_score": 8
    },
    {
      "name": "Max 3 NEEDS CLARIFICATION",
      "description": "spec.md contains at most 3 [NEEDS CLARIFICATION] markers (agent makes reasonable assumptions rather than leaving many unresolved questions)",
      "max_score": 10
    },
    {
      "name": "No implementation details",
      "description": "spec.md does NOT describe HOW the system will work internally (no database schemas, API endpoints, service names, file structures, or deployment configurations)",
      "max_score": 12
    },
    {
      "name": "2-4 word branch name",
      "description": "spec-report.md mentions a feature branch name that is 2-4 hyphenated words in action-noun format (e.g., 'doc-sharing', 'document-permissions', 'share-document-access')",
      "max_score": 10
    },
    {
      "name": "Requirements.md checklist created",
      "description": "specs/004-doc-sharing/checklists/requirements.md exists and contains checklist items that evaluate requirement quality (not implementation correctness)",
      "max_score": 9
    }
  ]
}
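Several items in this checklist are mechanical enough to pre-check with a script. The following is a minimal sketch, not the grader used for these evals (the source does not describe it). The function names, the regular expressions, and the all-or-nothing scoring are assumptions made for illustration; only the item names, thresholds, and `max_score` values are taken from the JSON above.

```python
import re


def count_ids(text: str, prefix: str) -> int:
    """Count distinct identifiers such as FR-001 or SC-002 in the spec text."""
    return len(set(re.findall(rf"\b{prefix}-\d{{3}}\b", text)))


def count_clarifications(text: str) -> int:
    """Count [NEEDS CLARIFICATION ...] markers, with or without a trailing note."""
    return text.count("[NEEDS CLARIFICATION")


def is_valid_branch(name: str) -> bool:
    """True for 2-4 lowercase hyphenated words, e.g. 'doc-sharing'.

    Assumption: words contain letters only, so a numeric prefix is rejected.
    """
    return re.fullmatch(r"[a-z]+(?:-[a-z]+){1,3}", name) is not None


def score_spec(spec: str, branch: str) -> dict:
    """Award full marks or zero for four of the ten checklist items.

    Assumption: all-or-nothing scoring; a real grader may give partial credit.
    """
    checks = {
        "FR-XXX numbered requirements": (count_ids(spec, "FR") >= 4, 10),
        "SC-XXX success criteria": (count_ids(spec, "SC") >= 2, 8),
        "Max 3 NEEDS CLARIFICATION": (count_clarifications(spec) <= 3, 10),
        "2-4 word branch name": (is_valid_branch(branch), 10),
    }
    return {
        name: (max_score if passed else 0)
        for name, (passed, max_score) in checks.items()
    }


if __name__ == "__main__":
    sample = (
        "FR-001 ... FR-002 ... FR-003 ... FR-004 ...\n"
        "SC-001 ... SC-002 ...\n"
        "[NEEDS CLARIFICATION: retention period]\n"
    )
    for item, points in score_spec(sample, "doc-sharing").items():
        print(f"{item}: {points}")
```

The remaining items (technology-agnostic wording, absence of implementation details, role-based user stories, measurable criteria, checklist quality) depend on judging the content of the spec, so a pattern match of this kind would not be a reliable check for them.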
