
markusdowne/error-triage-ladder

Diagnoses and routes failures by analyzing error patterns, classifying severity, and applying retry logic, suppression budgets, and escalation rules. Use when handling errors, troubleshooting failures, recovering from API errors or timeouts, deciding whether to retry or escalate an issue, or managing service outages and tool dependency failures. Applies to any scenario where a check has failed, evidence of success is missing, or an unresolved error needs a structured response. Includes explicit untrusted-content/prompt-injection guardrails for third-party inputs.

98

Quality: 94% (Does it follow best practices?)

Impact: 100%, 1.16x (average score across 9 eval scenarios)


evals/scenario-6/rubric.json

{
  "context": "Tests whether the agent implements the suppression budget pattern correctly: a keyed store tracking count and first_seen, escalation when either MAX_RECURRENCE or MAX_WINDOW is exceeded, and clearing the record after escalation.",
  "type": "weighted_checklist",
  "checklist": [
    {
      "name": "Keyed budget store",
      "description": "Code uses a store (dict, database, file, etc.) keyed by failure identifier (failure_key or equivalent string), not a global counter",
      "max_score": 10
    },
    {
      "name": "Count initialization",
      "description": "Store initializes count to 0 and sets first_seen to the current time the first time a failure_key is seen",
      "max_score": 8
    },
    {
      "name": "Count increment",
      "description": "Code increments the count each time the same failure_key is checked",
      "max_score": 8
    },
    {
      "name": "Elapsed time calculation",
      "description": "Code calculates elapsed time as the difference between current time and first_seen",
      "max_score": 8
    },
    {
      "name": "Escalate on count threshold",
      "description": "Code escalates (calls escalate function or returns escalation decision) when count exceeds MAX_RECURRENCE",
      "max_score": 10
    },
    {
      "name": "Escalate on time threshold",
      "description": "Code escalates when elapsed time exceeds MAX_WINDOW (time-based escalation, not just count-based)",
      "max_score": 10
    },
    {
      "name": "Clear after escalation",
      "description": "Code clears (deletes or resets) the budget store entry for a failure_key after escalating — not after suppressing",
      "max_score": 10
    },
    {
      "name": "Suppress when within budget",
      "description": "Code suppresses (does not escalate) and updates the store when neither threshold is exceeded",
      "max_score": 8
    },
    {
      "name": "Configurable thresholds",
      "description": "MAX_RECURRENCE and MAX_WINDOW are defined as configurable parameters (not hardcoded magic numbers inline)",
      "max_score": 8
    },
    {
      "name": "Demo shows escalation trigger",
      "description": "demo_tracker.py output shows at least one suppression decision AND at least one escalation decision for the same failure_key",
      "max_score": 12
    },
    {
      "name": "Mocked time in demo",
      "description": "demo_tracker.py uses mocked or injected timestamps (not actual sleep or time.time() for waiting) to simulate time passage",
      "max_score": 8
    }
  ]
}
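
To make the checklist concrete, here is a minimal Python sketch of a tracker that would satisfy these items. It is not the skill's reference implementation: the names `SuppressionBudget`, `BudgetRecord`, `FakeClock`, and `run_demo`, the threshold values, and the `"suppress"`/`"escalate"` return strings are all assumptions made for illustration. Only `failure_key`, `count`, `first_seen`, `MAX_RECURRENCE`, and `MAX_WINDOW` come from the rubric.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Configurable thresholds. The values are assumed, not taken from the rubric.
MAX_RECURRENCE = 3      # escalate once count exceeds this
MAX_WINDOW = 3600.0     # escalate once this many seconds have passed since first_seen


@dataclass
class BudgetRecord:
    count: int
    first_seen: float


@dataclass
class SuppressionBudget:
    max_recurrence: int = MAX_RECURRENCE
    max_window: float = MAX_WINDOW
    clock: Callable[[], float] = time.time  # injectable so demos can mock time
    store: Dict[str, BudgetRecord] = field(default_factory=dict)

    def check(self, failure_key: str) -> str:
        """Return "escalate" or "suppress" for one occurrence of failure_key."""
        now = self.clock()
        record = self.store.get(failure_key)
        if record is None:
            # First sighting: count starts at 0, first_seen is the current time.
            record = BudgetRecord(count=0, first_seen=now)
            self.store[failure_key] = record
        record.count += 1
        elapsed = now - record.first_seen
        if record.count > self.max_recurrence or elapsed > self.max_window:
            # Clear only after escalating, never after suppressing.
            del self.store[failure_key]
            return "escalate"
        return "suppress"


class FakeClock:
    """Hypothetical mocked clock: time moves only when advance() is called."""

    def __init__(self, start: float = 0.0) -> None:
        self.now = start

    def __call__(self) -> float:
        return self.now

    def advance(self, seconds: float) -> None:
        self.now += seconds


def run_demo() -> List[str]:
    clock = FakeClock()
    budget = SuppressionBudget(max_recurrence=3, max_window=60.0, clock=clock)
    decisions = []
    # Count path: the fourth occurrence exceeds max_recurrence.
    for _ in range(4):
        decisions.append(budget.check("api_timeout"))
        clock.advance(1.0)
    # Time path: the record was cleared, so a fresh budget starts here.
    decisions.append(budget.check("api_timeout"))
    clock.advance(61.0)
    decisions.append(budget.check("api_timeout"))
    return decisions


if __name__ == "__main__":
    for step, decision in enumerate(run_demo(), start=1):
        print(step, decision)
```

Injecting the clock rather than calling `time.time()` inside `check` is what lets the demo simulate a 61-second gap without sleeping. Under these assumed thresholds the same `failure_key` is suppressed three times, escalated on the fourth occurrence, then suppressed once and escalated again when the window elapses.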

Install with Tessl CLI

npx tessl i markusdowne/error-triage-ladder
