tessl-labs/spec-driven-development

Spec-driven workflow covering requirement gathering, spec authoring, implementation review, and verification — with skills, rules, and evaluation scenarios.

96 · 1.19x
Quality: 90% (Does it follow best practices?)
Impact: 98% · 1.19x (average score across 9 eval scenarios)
Security by Snyk: Passed, no known issues

evals/scenario-9/criteria.json

{
  "context": "Tests whether the agent performs a thorough work review following the work-review skill: checking each requirement, running linked tests, capturing discovered requirements, updating the spec, and producing a structured review summary.",
  "type": "weighted_checklist",
  "checklist": [
    {
      "name": "Review summary file created",
      "description": "A file named review.md is produced containing the review output",
      "max_score": 5
    },
    {
      "name": "Pass/fail per requirement",
      "description": "review.md contains a requirements section with individual pass/fail status for each spec requirement (using checkboxes or equivalent)",
      "max_score": 12
    },
    {
      "name": "File references on passing items",
      "description": "Passing requirements in review.md include a reference to the implementing file and approximate location (e.g. src/notifications/sender.py)",
      "max_score": 8
    },
    {
      "name": "Missing requirement identified",
      "description": "review.md identifies that send_sms silently skips when no phone number is on file (no MissingContactError raised), which is not documented in the spec",
      "max_score": 12
    },
    {
      "name": "Discovered requirement documented",
      "description": "review.md includes a 'Discovered requirements' section noting the account.reactivated event support added in the implementation (not in the original spec)",
      "max_score": 12
    },
    {
      "name": "Spec updated with discovered requirement",
      "description": "specs/notifications.spec.md is updated to include the account.reactivated event (or a note about the newly discovered behavior)",
      "max_score": 10
    },
    {
      "name": "Test results section",
      "description": "review.md contains a test results section listing the linked test files and their outcomes",
      "max_score": 8
    },
    {
      "name": "Linked tests referenced",
      "description": "review.md explicitly references at least three of the [@test]-linked files from the spec (test_delivery.py, test_events.py, test_retry.py, test_templates.py)",
      "max_score": 8
    },
    {
      "name": "Spec updates section",
      "description": "review.md contains a spec updates section describing what was changed in the spec",
      "max_score": 8
    },
    {
      "name": "Targets still accurate",
      "description": "After review, specs/notifications.spec.md targets still include both src/notifications/sender.py and src/notifications/templates.py",
      "max_score": 7
    },
    {
      "name": "Structured review format",
      "description": "review.md is organized into clear sections covering: requirement status, discovered differences, test outcomes, and spec changes — exact heading names do not matter as long as these four topics are covered in distinct sections",
      "max_score": 10
    }
  ]
}
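
The checklist above assigns each item a `max_score`, and the eleven values sum to 100. The page does not say how the grader combines them, so the following is only a minimal sketch of one plausible aggregation: the points awarded per item are clamped to that item's `max_score`, summed, and divided by the total. The `weighted_score` and `total_max` helpers and the `awarded` mapping are hypothetical names introduced for illustration, not part of the tile.

```python
import json

# Abbreviated copy of the checklist above (names and max_score values only).
CRITERIA = json.loads("""
{
  "type": "weighted_checklist",
  "checklist": [
    {"name": "Review summary file created", "max_score": 5},
    {"name": "Pass/fail per requirement", "max_score": 12},
    {"name": "File references on passing items", "max_score": 8},
    {"name": "Missing requirement identified", "max_score": 12},
    {"name": "Discovered requirement documented", "max_score": 12},
    {"name": "Spec updated with discovered requirement", "max_score": 10},
    {"name": "Test results section", "max_score": 8},
    {"name": "Linked tests referenced", "max_score": 8},
    {"name": "Spec updates section", "max_score": 8},
    {"name": "Targets still accurate", "max_score": 7},
    {"name": "Structured review format", "max_score": 10}
  ]
}
""")


def total_max(criteria):
    """Sum of max_score over every checklist item."""
    return sum(item["max_score"] for item in criteria["checklist"])


def weighted_score(criteria, awarded):
    """Return the earned fraction of the total (0.0 to 1.0).

    `awarded` maps item name -> points given (hypothetical grader output).
    Items missing from `awarded` count as 0; values are clamped to
    the range [0, max_score] for that item.
    """
    earned = 0
    for item in criteria["checklist"]:
        points = awarded.get(item["name"], 0)
        earned += max(0, min(points, item["max_score"]))
    return earned / total_max(criteria)


if __name__ == "__main__":
    # Assumed example: full marks everywhere except one missed item.
    full = {item["name"]: item["max_score"] for item in CRITERIA["checklist"]}
    partial = dict(full, **{"Missing requirement identified": 0})
    print(total_max(CRITERIA))
    print(round(weighted_score(CRITERIA, partial), 2))
```

Under this assumed aggregation, a review that satisfies every item except "Missing requirement identified" would earn 88 of 100 points. The actual grader may score partial credit or normalize differently.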