tessl-labs/intent-integrity-kit

Closing the intent-to-code chasm - specification-driven development with BDD verification chain

Quality: 92% (Does it follow best practices?)
Impact: 86%, 1.82x (average score across 14 eval scenarios)
Security (by Snyk): Advisory. Suggest reviewing before use.

evals/scenario-7/criteria.json

{
  "context": "Tests whether the agent generates tasks that are fully traceable to the plan and spec: file paths match the plan's project structure, user story tags match spec stories, TS-XXX references are comma-separated (not ranges), and no tasks reference technologies or files not defined in the plan.",
  "type": "weighted_checklist",
  "checklist": [
    {
      "name": "File paths match plan structure",
      "description": "Task descriptions reference file paths that exist in the plan's Project Structure section (e.g., src/services/event-service.ts, src/routes/events.ts, prisma/schema.prisma). No invented paths like src/controllers/ or src/utils/ that aren't in the plan",
      "max_score": 15
    },
    {
      "name": "Every user story has tagged tasks",
      "description": "All three user stories (US-1, US-2, US-3) have at least one task tagged with [US1], [US2], [US3] respectively",
      "max_score": 10
    },
    {
      "name": "Setup/Foundational tasks have no story tags",
      "description": "Tasks in Setup and Foundational phases do NOT have [USn] labels — these are shared infrastructure, not story-specific",
      "max_score": 8
    },
    {
      "name": "TS references are comma-separated",
      "description": "When tasks reference multiple test specs, they use comma-separated format like [TS-003, TS-004, TS-005] — NOT prose ranges like 'TS-003 through TS-005' or 'TS-003 to TS-005'",
      "max_score": 12
    },
    {
      "name": "TS references match provided .feature files",
      "description": "TS-XXX references in tasks correspond to actual @TS-XXX tags in the provided .feature files (TS-001 through TS-007). No references to TS-008 or higher that don't exist",
      "max_score": 10
    },
    {
      "name": "Priority ordering respected",
      "description": "P1 user story tasks (US-1 Create Event, US-2 Purchase Ticket) appear before P2 tasks (US-3 View Listings) in the phase structure",
      "max_score": 8
    },
    {
      "name": "Phase structure complete",
      "description": "tasks.md has all required phases: Setup (project init, schema), Foundational (shared models, database), User Story phases (by priority), and a Polish/Final phase",
      "max_score": 8
    },
    {
      "name": "[P] markers only on parallelizable tasks",
      "description": "[P] markers appear only on tasks that can genuinely run in parallel (different files, no mutual dependencies). Tasks that depend on each other (e.g., model before service) are NOT marked [P]",
      "max_score": 8
    },
    {
      "name": "No technologies beyond the plan",
      "description": "Tasks do not introduce frameworks, libraries, or tools not mentioned in the plan (e.g., no Redis, no GraphQL, no React — the plan specifies Express, Prisma, Vitest, Resend)",
      "max_score": 10
    },
    {
      "name": "Checkbox format used",
      "description": "All tasks use the markdown checkbox format: - [ ] TNNN description",
      "max_score": 5
    },
    {
      "name": "Sequential T-prefixed IDs",
      "description": "Tasks use sequential zero-padded IDs: T001, T002, T003, etc. with no gaps or duplicates",
      "max_score": 6
    }
  ]
}
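
Several of the checklist items above (checkbox format, sequential T-prefixed IDs, comma-separated TS references, and TS references limited to TS-001 through TS-007) are mechanical enough to check with a small script. The following is a minimal sketch, not the grader this eval actually uses: the function name `lint_tasks`, the regular expressions, and the message strings are all hypothetical, and it assumes every task line is an unchecked `- [ ] TNNN description` item.

```python
import re

# Assumed task line shape, taken from the "- [ ] TNNN description" format above.
TASK_LINE = re.compile(r"^- \[ \] (T\d{3}) (.+)$")
# Any TS-XXX reference inside a task description.
TS_REF = re.compile(r"TS-(\d{3})")
# Prose ranges such as "TS-003 through TS-005" or "TS-003 to TS-005".
PROSE_RANGE = re.compile(r"TS-\d{3}\s+(?:through|to)\s+TS-\d{3}")


def lint_tasks(markdown: str, known_ts: set[str]) -> list[str]:
    """Return human-readable problems found in a tasks.md string."""
    problems = []
    expected = 1  # next task number we expect to see
    for raw in markdown.splitlines():
        line = raw.strip()
        if not line.startswith("- ["):
            continue  # headings and prose are ignored
        match = TASK_LINE.match(line)
        if match is None:
            problems.append(f"bad checkbox format: {line}")
            continue
        task_id, body = match.groups()
        # Gaps and duplicates both show up as a mismatch with the expected ID.
        if task_id != f"T{expected:03d}":
            problems.append(f"expected T{expected:03d}, found {task_id}")
        expected = int(task_id[1:]) + 1
        if PROSE_RANGE.search(body):
            problems.append(f"{task_id}: prose TS range")
        for number in TS_REF.findall(body):
            if f"TS-{number}" not in known_ts:
                problems.append(f"{task_id}: unknown TS-{number}")
    return problems


if __name__ == "__main__":
    sample = "\n".join([
        "- [ ] T001 Init project",
        "- [ ] T002 [P] [US1] Add event service [TS-001, TS-002]",
        "- [ ] T004 [US2] Purchase flow TS-003 through TS-005",
        "- [ ] T005 [US3] Listings [TS-008]",
    ])
    known = {f"TS-{i:03d}" for i in range(1, 8)}  # TS-001 .. TS-007
    for problem in lint_tasks(sample, known):
        print(problem)
```

The remaining items (file paths matching the plan, story tags, phase structure, [P] markers, and technologies) depend on the contents of the plan and spec, so they are left out of this sketch.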
