
tessl-labs/spec-driven-development

Spec-driven workflow covering requirement gathering, spec authoring, implementation review, and verification — with skills, rules, and evaluation scenarios.

96 · 1.19x

Quality: 90% — Does it follow best practices?

Impact: 98% (1.19x) — Average score across 9 eval scenarios

Security (by Snyk): Passed — No known issues


evals/scenario-6/criteria.json

{
  "context": "Vague request analysis: agent must decompose 'make search better' into concrete gaps using the existing spec as a baseline, not jump to solutions.",
  "type": "weighted_checklist",
  "checklist": [
    {
      "name": "Output file created",
      "description": "A file named search-improvement-prep.md is produced",
      "max_score": 4
    },
    {
      "name": "Existing coverage summarized",
      "description": "Document summarizes what the current search spec already covers (query interface, TF-IDF ranking, filters, pagination, indexing)",
      "max_score": 10
    },
    {
      "name": "Does not propose solutions",
      "description": "Document focuses on identifying gaps and questions — does NOT jump to proposing implementation changes (e.g. does NOT say 'switch to Elasticsearch' or 'add caching')",
      "max_score": 12
    },
    {
      "name": "Ambiguity of 'slow' analyzed",
      "description": "Document identifies that 'slow' is ambiguous — could mean query latency, indexing speed, perceived UI slowness, etc.",
      "max_score": 10
    },
    {
      "name": "Ambiguity of 'doesn't find' analyzed",
      "description": "Document identifies that 'doesn't find what they need' is ambiguous — could mean relevance ranking, missing fields, typo tolerance, partial matching, etc.",
      "max_score": 10
    },
    {
      "name": "Questions are specific and bounded",
      "description": "Interview questions are specific (e.g. 'What is the current p95 query latency?' not 'What's wrong with search?')",
      "max_score": 12
    },
    {
      "name": "Questions are individual items",
      "description": "Questions are listed as separate numbered items, not bundled or multi-part",
      "max_score": 10
    },
    {
      "name": "No redundant questions",
      "description": "Questions do not ask about things already documented in the spec (e.g. does NOT ask 'what fields are searched?' — that's already in the spec)",
      "max_score": 10
    },
    {
      "name": "Performance gap identified",
      "description": "Gap list notes that the existing spec has no performance requirements (no SLAs, no latency targets, no index size limits)",
      "max_score": 8
    },
    {
      "name": "No implementation attempted",
      "description": "Agent does NOT modify any source files or create a new spec — this is a preparation task only",
      "max_score": 8
    }
  ]
}
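The `weighted_checklist` type above implies that each item contributes up to its `max_score` points, and the scenario score is the fraction of total available points earned. A minimal sketch of how such a rubric could be scored — the function name and the `awarded` mapping are hypothetical, not part of the actual evaluation harness:

```python
# Hypothetical scorer for a weighted_checklist criteria file.
# Assumes the grader produces an {item name: points awarded} mapping.
def score_checklist(criteria: dict, awarded: dict) -> float:
    """Return the percentage of available points earned,
    capping each item at its max_score."""
    items = criteria["checklist"]
    total_max = sum(item["max_score"] for item in items)
    total = sum(
        min(awarded.get(item["name"], 0), item["max_score"])
        for item in items
    )
    return 100.0 * total / total_max


# Small inline sample mirroring the structure of criteria.json above.
sample = {
    "type": "weighted_checklist",
    "checklist": [
        {"name": "Output file created", "max_score": 4},
        {"name": "Existing coverage summarized", "max_score": 10},
    ],
}
print(score_checklist(sample, {"Output file created": 4}))
```

An unlisted item scores zero, so partial runs degrade gracefully rather than erroring.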
