tessl-labs/spec-driven-development

Spec-driven workflow covering requirement gathering, spec authoring, implementation review, and verification — with skills, rules, and evaluation scenarios.

96

Quality: 90% (Does it follow best practices?)

Impact: 98%, 1.19x (average score across 9 eval scenarios)

Security by Snyk: Passed, no known issues


evals/scenario-2/criteria.json

{
  "context": "Tests whether the agent follows the requirement-gathering skill when preparing for a stakeholder interview: reviewing existing specs to identify what's already covered, identifying genuine gaps, formulating one-at-a-time questions based on those gaps, and producing a structured requirements summary.",
  "type": "weighted_checklist",
  "checklist": [
    {
      "name": "Output file created",
      "description": "A file named requirements-prep.md is produced",
      "max_score": 4
    },
    {
      "name": "Existing coverage section",
      "description": "requirements-prep.md contains a section summarizing what the existing specs (user-management.spec.md, auth.spec.md) already document that is relevant to the export feature",
      "max_score": 10
    },
    {
      "name": "Cites list_users as existing",
      "description": "The coverage summary specifically notes that list_users with role/status filtering already exists in the spec",
      "max_score": 8
    },
    {
      "name": "Gap analysis present",
      "description": "requirements-prep.md contains a section identifying specific ambiguous or undocumented areas for the export feature (not just generic categories)",
      "max_score": 10
    },
    {
      "name": "Questions are separate items",
      "description": "Interview questions are listed as individual numbered items — NOT bundled together as a multi-part question or a comma-separated list",
      "max_score": 12
    },
    {
      "name": "No redundant questions",
      "description": "Questions do NOT ask about things already documented in the existing specs (e.g. does NOT ask what roles exist, does NOT ask about pagination behavior already defined)",
      "max_score": 12
    },
    {
      "name": "Performance/scale addressed",
      "description": "Gap analysis or questions address the large-org scale requirement (50k+ users) — e.g. async job vs. synchronous, streaming, timeouts",
      "max_score": 10
    },
    {
      "name": "Preliminary scope section",
      "description": "requirements-prep.md contains a scope section distinguishing what is likely in scope versus out of scope",
      "max_score": 8
    },
    {
      "name": "Non-functional gaps identified",
      "description": "Gap analysis or questions address at least one non-functional requirement (performance, security, file size limits, rate limiting, or similar)",
      "max_score": 8
    },
    {
      "name": "Gap-driven questions",
      "description": "Each question clearly addresses a gap identified in the gap analysis — not generic process questions or things answerable from existing specs",
      "max_score": 8
    },
    {
      "name": "Questions are specific and bounded",
      "description": "Questions are specific and bounded (e.g. 'Should the export be synchronous or generated as a background job?' not 'How should the export work?')",
      "max_score": 10
    }
  ]
}
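
The `weighted_checklist` type suggests that each item contributes up to its `max_score` points, and the eleven `max_score` values above sum to 100. The page does not say how the evaluator combines them, so the following is only a sketch under the assumption that awarded points are summed and divided by the total. The function `score_percent` and the `awarded` mapping are hypothetical names introduced for illustration; only the item names and `max_score` values come from the file.

```python
import json

# Abbreviated copy of the checklist above: names and max_score values only.
CRITERIA_JSON = """
{
  "type": "weighted_checklist",
  "checklist": [
    {"name": "Output file created", "max_score": 4},
    {"name": "Existing coverage section", "max_score": 10},
    {"name": "Cites list_users as existing", "max_score": 8},
    {"name": "Gap analysis present", "max_score": 10},
    {"name": "Questions are separate items", "max_score": 12},
    {"name": "No redundant questions", "max_score": 12},
    {"name": "Performance/scale addressed", "max_score": 10},
    {"name": "Preliminary scope section", "max_score": 8},
    {"name": "Non-functional gaps identified", "max_score": 8},
    {"name": "Gap-driven questions", "max_score": 8},
    {"name": "Questions are specific and bounded", "max_score": 10}
  ]
}
"""


def score_percent(criteria: dict, awarded: dict) -> float:
    """Assumed aggregation: sum awarded points (clamped to [0, max_score]
    per item) and express them as a percentage of the total available."""
    items = criteria["checklist"]
    total = sum(item["max_score"] for item in items)
    earned = sum(
        min(max(awarded.get(item["name"], 0), 0), item["max_score"])
        for item in items
    )
    return 100.0 * earned / total


criteria = json.loads(CRITERIA_JSON)

# Hypothetical grading: full marks everywhere except bundled questions.
awarded = {item["name"]: item["max_score"] for item in criteria["checklist"]}
awarded["Questions are separate items"] = 0

print(score_percent(criteria, awarded))
```

Under this assumed aggregation, the two 12-point items ("Questions are separate items" and "No redundant questions") carry the most weight, so a run that bundles its interview questions loses more than one that omits the output file check worth 4 points.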
