Diagnose failed checks, flaky runs, API/tool errors, missing evidence, false-success risk, and unresolved incidents by classifying failure severity, applying bounded retries or suppression budgets, and deciding when to escalate. Use when troubleshooting jobs, pipelines, automations, integrations, agent/tool failures, timeouts, rate limits, stale read-backs, or any situation where you need a clear retry-vs-escalate decision instead of ad hoc recovery.
94
94%
Does it follow best practices?
Impact
Pending
No eval scenarios have been run
Passed
No known issues
Loading evals