
markusdowne/error-triage-ladder

Diagnoses and routes failures by analyzing error patterns, classifying severity, and applying retry logic, suppression budgets, and escalation rules. Use when handling errors, troubleshooting failures, recovering from API errors or timeouts, deciding whether to retry or escalate an issue, or managing service outages and tool dependency failures. Applies to any scenario where a check has failed, evidence of success is missing, or an unresolved error needs a structured response. Includes explicit untrusted-content/prompt-injection guardrails for third-party inputs.

98

1.16x

Quality: 94% (Does it follow best practices?)

Impact: 100%, 1.16x (average score across 9 eval scenarios)


Evaluation results

100%

Design an Error Triage Policy

error classification and escalation

| Criteria | Without context | With context |
| --- | --- | --- |
| Trigger conditions | 100% | 100% |
| Tier taxonomy | 100% | 100% |
| Workflow sequence | 100% | 100% |
| Suppression budget | 100% | 100% |
| Concrete mappings | 100% | 100% |

Without context: $0.2470 · 1m 18s · 12 turns · 18 in / 3,543 out tokens

With context: $0.3803 · 1m 33s · 22 turns · 369 in / 4,875 out tokens

100%

38%

API Integration Error Handler

Operational tier retry logic and suppression budget

| Criteria | Without context | With context |
| --- | --- | --- |
| Operational retry limit | 16% | 100% |
| Suppression budget store | 70% | 100% |
| Recurrence count tracking | 80% | 100% |
| Time window tracking | 75% | 100% |
| Auto-escalate on threshold | 50% | 100% |
| Suppress within budget | 100% | 100% |
| Escalation does not suppress | 0% | 100% |
| Structured triage output | 100% | 100% |
| Tier label used | 28% | 100% |
| No suppress on data-loss | 71% | 100% |
| README present | 100% | 100% |

Without context: $0.4936 · 2m 22s · 20 turns · 26 in / 8,994 out tokens

With context: $0.8826 · 3m 53s · 33 turns · 412 in / 14,124 out tokens

100%

Transaction Write Integrity Handler

Critical tier: halt and escalate on data-loss risk

| Criteria | Without context | With context |
| --- | --- | --- |
| Round-trip verification | 100% | 100% |
| Critical tier classification | 100% | 100% |
| Halt autonomous processing | 100% | 100% |
| Immediate escalation | 100% | 100% |
| No retry on data-loss | 100% | 100% |
| No suppression of data-loss | 100% | 100% |
| Evidence in report | 100% | 100% |
| Escalation status in report | 100% | 100% |
| Tests present | 100% | 100% |
| Tier label in code | 100% | 100% |

Without context: $0.5642 · 2m 14s · 30 turns · 84 in / 7,632 out tokens

With context: $0.5137 · 2m 11s · 24 turns · 28 in / 7,537 out tokens
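
The critical-tier criteria above (round-trip verification, halt, immediate escalation, no retry, no suppression) can be illustrated with a minimal Python sketch. Everything named here is hypothetical and not taken from the skill itself: `WriteReport`, `verified_write`, the `LossyStore` demo class, and the field names are assumptions chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class WriteReport:
    # Field names are assumptions; the criteria only ask for tier,
    # evidence, and escalation status to appear in the report.
    tier: str
    action: str
    escalated: bool
    evidence: dict = field(default_factory=dict)


def verified_write(store: dict, key: str, value: str) -> WriteReport:
    """Write, then read back and compare (round-trip verification)."""
    store[key] = value
    read_back = store.get(key)
    if read_back == value:
        return WriteReport(tier="none", action="continue", escalated=False)
    # A mismatch means possible data loss: classify as critical, halt,
    # and escalate at once. No retry and no suppression on this path.
    return WriteReport(
        tier="critical",
        action="halt",
        escalated=True,
        evidence={"key": key, "expected": value, "observed": read_back},
    )


class LossyStore(dict):
    """Hypothetical store that silently truncates values (demo only)."""

    def __setitem__(self, key, value):
        super().__setitem__(key, value[:3])


ok = verified_write({}, "txn-1", "amount=100")
bad = verified_write(LossyStore(), "txn-2", "amount=100")
print(ok.action, bad.tier, bad.action, bad.escalated)
```

The design point is that the critical path has a single exit: the caller receives `action="halt"` together with the evidence, and nothing in the function attempts the write again.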

100%

31%

Documentation Build Warning Handler

Cosmetic tier: bounded retry and log

| Criteria | Without context | With context |
| --- | --- | --- |
| Cosmetic tier label | 33% | 100% |
| Retry limit of 2 | 0% | 100% |
| Log after retries exhausted | 83% | 100% |
| No escalation for cosmetic | 100% | 100% |
| Tier-based branching | 50% | 100% |
| Report structure present | 100% | 100% |
| Report includes tier | 87% | 100% |
| POLICY.md present | 100% | 100% |
| Condition: output still correct | 100% | 100% |
| No suppression budget for cosmetic | 100% | 100% |

Without context: $0.3545 · 1m 56s · 15 turns · 21 in / 6,687 out tokens

With context: $0.4048 · 1m 58s · 18 turns · 361 in / 6,552 out tokens
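
As a rough illustration of the cosmetic-tier criteria above (retry at most 2 times, log once retries are exhausted, never escalate, and only when the output is still correct), here is a small Python sketch. The function name `handle_cosmetic`, the result keys, and the logger name are assumptions made for this example; only the retry limit of 2 comes from the criteria.

```python
import logging

logger = logging.getLogger("triage")  # hypothetical logger name
COSMETIC_RETRY_LIMIT = 2  # from the "Retry limit of 2" criterion


def handle_cosmetic(step, output_still_correct: bool) -> dict:
    """Retry a cosmetic failure a bounded number of times, then log it."""
    if not output_still_correct:
        # If the output is affected, the failure is not cosmetic.
        raise ValueError("not cosmetic: output is affected, re-triage")
    retries = 0
    while retries < COSMETIC_RETRY_LIMIT:
        retries += 1
        if step():
            return {"tier": "cosmetic", "action": "retried_ok",
                    "retries": retries, "escalated": False}
    # Retries exhausted: log and move on, with no escalation and
    # no suppression budget for this tier.
    logger.warning("cosmetic issue persists after %d retries", retries)
    return {"tier": "cosmetic", "action": "logged",
            "retries": retries, "escalated": False}


attempts = []


def flaky_step() -> bool:
    """Demo step that always fails, to exercise the retry bound."""
    attempts.append(1)
    return False


result = handle_cosmetic(flaky_step, output_still_correct=True)
print(result["action"], result["retries"], len(attempts))
```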

100%

3%

Error Classification Module for a Monitoring Agent

Unknown/ambiguous error defaults to operational

| Criteria | Without context | With context |
| --- | --- | --- |
| Unknown defaults to operational+ | 93% | 100% |
| Explicit unknown/unverifiable check | 100% | 100% |
| Three-tier classification | 100% | 100% |
| Validation before action | 100% | 100% |
| Ambiguous example classified operational+ | 83% | 100% |
| Action taken from tier | 100% | 100% |
| TriageDecision includes tier | 100% | 100% |
| TriageDecision includes action | 100% | 100% |
| DESIGN.md explains fallback | 100% | 100% |
| Examples script runs | 100% | 100% |

Without context: $0.6560 · 2m 54s · 27 turns · 36 in / 10,810 out tokens

With context: $0.6255 · 3m 7s · 22 turns · 366 in / 11,216 out tokens
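
The fallback rule tested in this scenario, where unknown or unverifiable errors are treated as at least operational, might look like the following Python sketch. The `TriageDecision` name and its tier and action fields come from the criteria; the signal names, action strings, and the `classify` function are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class Tier(Enum):
    COSMETIC = "cosmetic"
    OPERATIONAL = "operational"
    CRITICAL = "critical"


@dataclass(frozen=True)
class TriageDecision:
    tier: Tier
    action: str


# Hypothetical signal-to-tier mapping, for illustration only.
KNOWN_SIGNALS = {
    "doc_build_warning": Tier.COSMETIC,
    "api_timeout": Tier.OPERATIONAL,
    "write_mismatch": Tier.CRITICAL,
}

# The action is always derived from the tier, never chosen directly.
ACTIONS = {
    Tier.COSMETIC: "retry_then_log",
    Tier.OPERATIONAL: "retry_within_budget",
    Tier.CRITICAL: "halt_and_escalate",
}


def classify(signal) -> TriageDecision:
    """Validate the signal first; unknown input never maps to cosmetic."""
    tier = KNOWN_SIGNALS.get(signal) if isinstance(signal, str) else None
    if tier is None:
        # Explicit unknown/unverifiable check: default to operational.
        tier = Tier.OPERATIONAL
    return TriageDecision(tier=tier, action=ACTIONS[tier])


print(classify("weird_exit_code_137").tier.value)
print(classify("write_mismatch").action)
```

Defaulting upward rather than downward means an unrecognized failure still enters the retry and escalation path instead of being logged and forgotten.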

100%

2%

Recurring Failure Escalation Service

Suppression budget: recurrence tracking and auto-escalation

| Criteria | Without context | With context |
| --- | --- | --- |
| Keyed budget store | 100% | 100% |
| Count initialization | 75% | 100% |
| Count increment | 100% | 100% |
| Elapsed time calculation | 100% | 100% |
| Escalate on count threshold | 100% | 100% |
| Escalate on time threshold | 100% | 100% |
| Clear after escalation | 100% | 100% |
| Suppress when within budget | 100% | 100% |
| Configurable thresholds | 100% | 100% |
| Demo shows escalation trigger | 100% | 100% |
| Mocked time in demo | 100% | 100% |

Without context: $0.3030 · 1m 16s · 19 turns · 24 in / 4,365 out tokens

With context: $0.4331 · 1m 47s · 23 turns · 368 in / 5,738 out tokens
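
The suppression-budget criteria above describe a keyed store that counts recurrences, tracks elapsed time, escalates when either threshold is reached, and clears the entry afterwards. A minimal Python sketch under stated assumptions follows: the class and method names, the default thresholds (3 occurrences, 3600 seconds), and the use of `>=` for both comparisons are illustrative choices, not values from the skill.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class BudgetEntry:
    count: int
    first_seen: float


class SuppressionBudget:
    """Keyed budget store with configurable count and time thresholds."""

    def __init__(self, max_count: int = 3, max_age_s: float = 3600.0,
                 clock: Callable[[], float] = time.monotonic):
        self.max_count = max_count    # assumed default
        self.max_age_s = max_age_s    # assumed default
        self.clock = clock            # injectable so demos can mock time
        self.entries: Dict[str, BudgetEntry] = {}

    def record(self, key: str) -> str:
        """Record one recurrence; return 'suppress' or 'escalate'."""
        now = self.clock()
        entry = self.entries.get(key)
        if entry is None:
            # Count initialization on first sighting of this key.
            entry = self.entries[key] = BudgetEntry(count=0, first_seen=now)
        entry.count += 1
        elapsed = now - entry.first_seen
        if entry.count >= self.max_count or elapsed >= self.max_age_s:
            del self.entries[key]     # clear after escalation
            return "escalate"
        return "suppress"


# Demo with mocked time: one key trips the count threshold,
# another trips the time threshold.
fake_now = [0.0]
budget = SuppressionBudget(max_count=3, max_age_s=60.0,
                           clock=lambda: fake_now[0])
by_count = [budget.record("api:timeout") for _ in range(3)]
budget.record("db:slow")
fake_now[0] = 61.0
by_time = budget.record("db:slow")
print(by_count, by_time)
```

Injecting the clock keeps the demo deterministic, which is what the "Mocked time in demo" criterion appears to be checking for.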

100%

35%

CI/CD Artifact Verification Handler

Evidence-missing trigger and triage workflow sequence

| Criteria | Without context | With context |
| --- | --- | --- |
| Evidence-missing trigger | 100% | 100% |
| Evidence collection step | 100% | 100% |
| Tier classification present | 20% | 100% |
| Action derived from tier | 62% | 100% |
| triage_result.json: failure signal | 100% | 100% |
| triage_result.json: evidence observed | 100% | 100% |
| triage_result.json: tier assigned | 25% | 100% |
| triage_result.json: action taken | 100% | 100% |
| triage_result.json: escalation status | 0% | 100% |
| Expected files list used | 0% | 100% |
| INTEGRATION.md present | 100% | 100% |

Without context: $0.4255 · 1m 43s · 27 turns · 34 in / 5,628 out tokens

With context: $0.6389 · 2m 29s · 28 turns · 372 in / 9,039 out tokens
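
To make the evidence-missing workflow concrete, the sketch below checks a build directory against an expected files list and writes a `triage_result.json` with the five fields named in the criteria. It is only an illustration: the snake_case key names, the `verify_artifacts` function, the file names, the choice of the operational tier for missing artifacts, and the action string are all assumptions.

```python
import json
import tempfile
from pathlib import Path


def verify_artifacts(artifact_dir: Path, expected_files: list):
    """Collect evidence, and triage when expected artifacts are missing."""
    # Evidence collection: what is actually on disk right now.
    present = sorted(p.name for p in artifact_dir.iterdir() if p.is_file())
    missing = [name for name in expected_files if name not in present]
    if not missing:
        return None  # evidence of success exists; nothing to triage
    result = {
        "failure_signal": "expected artifacts missing",
        "evidence_observed": {
            "expected": list(expected_files),
            "present": present,
            "missing": missing,
        },
        "tier_assigned": "operational",           # assumed tier
        "action_taken": "block_release_and_retry_build",  # assumed action
        "escalation_status": "not_escalated",
    }
    out_path = artifact_dir / "triage_result.json"
    out_path.write_text(json.dumps(result, indent=2))
    return result


with tempfile.TemporaryDirectory() as tmp:
    build_dir = Path(tmp)
    (build_dir / "app.tar.gz").write_text("payload")
    triage = verify_artifacts(build_dir, ["app.tar.gz", "checksums.txt"])
    written = json.loads((build_dir / "triage_result.json").read_text())

print(triage["evidence_observed"]["missing"])
```

The trigger here is the absence of proof rather than an explicit error: a build step that exits cleanly but leaves `checksums.txt` out still produces a triage record.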

100%

6%

Error Triage Report Generator

Structured triage report output format

| Criteria | Without context | With context |
| --- | --- | --- |
| Report: failure signal field | 100% | 100% |
| Report: evidence observed field | 100% | 100% |
| Report: tier assigned field | 100% | 100% |
| Report: action taken field | 75% | 100% |
| Report: escalation status field | 100% | 100% |
| evt-001 classified operational | 60% | 100% |
| evt-002 classified critical | 100% | 100% |
| evt-003 classified cosmetic | 100% | 100% |
| evt-002 escalated | 100% | 100% |
| Three report files produced | 100% | 100% |
| REPORT_SCHEMA.md present | 100% | 100% |

Without context: $0.5250 · 2m 21s · 26 turns · 33 in / 7,535 out tokens

With context: $0.5769 · 2m 36s · 28 turns · 371 in / 9,052 out tokens

100%

5%

System Health Digest Service

Never hide operational issues from reporting

| Criteria | Without context | With context |
| --- | --- | --- |
| Operational issues always included | 86% | 100% |
| No suppression of unresolved operational | 100% | 100% |
| Critical issues always included | 100% | 100% |
| Tier-based inclusion logic | 90% | 100% |
| digest_output.json contains operational events | 100% | 100% |
| Unresolved operational not omitted in demo | 100% | 100% |
| Digest includes event details | 100% | 100% |
| DIGEST_POLICY.md: operational always included | 80% | 100% |
| DIGEST_POLICY.md present | 100% | 100% |
| Resolved events may be filtered | 100% | 100% |

Without context: $0.5118 · 2m 16s · 27 turns · 32 in / 7,782 out tokens

With context: $0.5854 · 2m 45s · 28 turns · 479 in / 8,498 out tokens
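
One possible reading of the digest criteria above is sketched here in Python: unresolved operational and critical events always appear in the digest, while resolved events may be dropped. The event shape, the `build_digest` function, and the optional `include_cosmetic` flag are assumptions, as is the decision to filter resolved events regardless of tier.

```python
ALWAYS_REPORTED_TIERS = {"operational", "critical"}


def build_digest(events: list, include_cosmetic: bool = False) -> list:
    """Tier-based inclusion: never hide unresolved operational or critical events."""
    digest = []
    for event in events:
        if event["resolved"]:
            continue  # resolved events may be filtered (assumed for all tiers)
        if event["tier"] in ALWAYS_REPORTED_TIERS or include_cosmetic:
            digest.append(event)  # keep full event details in the digest
    return digest


# Hypothetical events for the demo.
events = [
    {"id": "e1", "tier": "operational", "resolved": False, "detail": "API timeouts"},
    {"id": "e2", "tier": "cosmetic", "resolved": False, "detail": "doc warning"},
    {"id": "e3", "tier": "critical", "resolved": False, "detail": "write mismatch"},
    {"id": "e4", "tier": "operational", "resolved": True, "detail": "cache miss spike"},
]
digest_ids = [event["id"] for event in build_digest(events)]
print(digest_ids)
```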

Install with Tessl CLI

npx tessl i markusdowne/error-triage-ladder

Evaluated
Agent: Claude Code
Model: Claude Sonnet 4.6
