CtrlK
BlogDocsLog inGet started
Tessl Logo

markusdowne/detectability-contract

Creates boundary-point validation contracts, defines invariant-based success criteria, and sets up automated verification probes so reliability workflows trigger on objective evidence rather than intuition. Use when designing robust handoff, memory-persistence, or tool-call reliability workflows; when you need to verify handoffs work, check memory persistence, validate tool calls succeeded, or convert vague reliability goals into concrete, testable checks at each boundary point with explicit failure-class mapping (operational vs. critical); or when you want to test your workflow end-to-end, make sure it works, or verify your automation runs correctly using read-back probes and escalation triggers rather than agent confidence. Includes explicit untrusted-content/prompt-injection guardrails for third-party inputs.

96

1.25x

Quality

90%

Does it follow best practices?

Impact

98%

1.25x

Average score across 9 eval scenarios

Overview
Skills
Evals
Files

rubric.jsonevals/scenario-2/

{
  "context": "Tests whether the agent correctly identifies file handoff boundary points, defines invariants including artifact existence and schema validity, produces a contract table with the correct columns, and implements Python invariant checks using the expected code patterns.",
  "type": "weighted_checklist",
  "checklist": [
    {
      "name": "Boundary identification",
      "description": "The contract identifies at least 2 distinct boundary points in the pipeline (e.g. extract→transform handoff, transform→load handoff)",
      "max_score": 8
    },
    {
      "name": "Artifact exists invariant",
      "description": "The contract includes an 'artifact exists' invariant (e.g. file path exists, size > 0) for at least one boundary",
      "max_score": 8
    },
    {
      "name": "Schema valid invariant",
      "description": "The contract includes a schema/format validity invariant (e.g. parseable as JSON, required fields present) for at least one boundary",
      "max_score": 8
    },
    {
      "name": "Table format",
      "description": "contract.md contains a markdown table with exactly these column headers: Boundary, Required Invariants, Verification Probes, Failure Class, Escalation Trigger",
      "max_score": 10
    },
    {
      "name": "Failure class mapping",
      "description": "The table maps at least one invariant failure to 'critical' and at least one to 'operational' (or equivalent terms indicating severity distinction)",
      "max_score": 10
    },
    {
      "name": "Escalation trigger defined",
      "description": "Each boundary row in the table contains a non-empty escalation trigger specifying what action to take (e.g. retry once then halt, raise alert)",
      "max_score": 8
    },
    {
      "name": "Verification probe defined",
      "description": "The contract includes at least one verification probe per boundary (e.g. read-back, re-open and hash, schema validation)",
      "max_score": 8
    },
    {
      "name": "Path exists assert",
      "description": "verify.py contains a check using Path(...).exists() or os.path.exists(...) to verify the artifact is present",
      "max_score": 10
    },
    {
      "name": "Schema parse check",
      "description": "verify.py contains a JSON parse check (e.g. json.loads or json.load) to verify schema validity",
      "max_score": 10
    },
    {
      "name": "Missing file as critical",
      "description": "The contract or script treats a missing file as a critical failure (explicit critical label, non-zero exit, or halt instruction)",
      "max_score": 10
    },
    {
      "name": "Retry-then-halt escalation",
      "description": "At least one boundary specifies an escalation of 'retry once then halt and report' (or semantically equivalent)",
      "max_score": 10
    }
  ]
}

Install with Tessl CLI

npx tessl i markusdowne/detectability-contract@0.1.2

evals

SKILL.md

tile.json