incident-runbook-templates

Create structured incident response runbooks with step-by-step procedures, escalation paths, and recovery actions. Use this skill when building a service outage runbook for a payment processing system; creating database incident procedures covering connection pool exhaustion, replication lag, and disk space alerts; onboarding new on-call engineers who need step-by-step recovery guides written for a 3 AM brain; or standardizing escalation matrices across multiple engineering teams.

Quality

77%

Does it follow best practices?

Impact

—

No eval scenarios have been run

Security

Advisory

Suggest reviewing before use

Optimize this skill with Tessl

npx tessl skill review --optimize ./plugins/incident-response/skills/incident-runbook-templates/SKILL.md

Quality

Discovery

100%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

This is an excellent skill description that clearly defines what it does (creates incident response runbooks with specific components) and when to use it (with four detailed, realistic trigger scenarios). The description uses appropriate third-person voice, includes rich natural trigger terms that engineers would actually use, and occupies a clear, distinctive niche. The only minor note is that the description is somewhat long, but the detail is substantive rather than padded.

Dimension	Reasoning	Score
Specificity	Lists multiple specific concrete actions: 'step-by-step procedures, escalation paths, recovery actions' and gives detailed examples like 'connection pool exhaustion, replication lag, disk space alerts' and 'escalation matrices'.	3 / 3
Completeness	Clearly answers both 'what' (create structured incident response runbooks with step-by-step procedures, escalation paths, and recovery actions) and 'when' (explicit 'Use this skill when...' clause with four detailed trigger scenarios).	3 / 3
Trigger Term Quality	Excellent coverage of natural terms users would say: 'incident response', 'runbook', 'service outage', 'on-call', 'escalation', 'recovery guides', 'database incident procedures', 'disk space alerts', 'replication lag'. These are terms engineers naturally use when seeking this kind of help.	3 / 3
Distinctiveness / Conflict Risk	Highly distinctive niche focused specifically on incident response runbooks for engineering/ops teams. The specific domain terms like 'runbook', 'on-call', 'escalation matrices', and 'incident procedures' make it very unlikely to conflict with other skills.	3 / 3
	Total	12 / 12 Passed

Implementation

55%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

This skill provides highly actionable, well-structured incident response templates with executable commands and clear workflows. However, it is severely bloated — the entire content is inlined into one massive file with no progressive disclosure, and it includes unnecessary explanations and verbose best-practices sections that Claude doesn't need. The content would be significantly improved by extracting the two runbook templates into separate files and trimming the explanatory prose.

Suggestions

Extract Template 1 (Service Outage) and Template 2 (Database Incident) into separate referenced files (e.g., SERVICE_OUTAGE_RUNBOOK.md, DATABASE_RUNBOOK.md) and keep SKILL.md as a concise overview with links.

Remove the 'Best Practices' Do's/Don'ts section entirely — these are generic incident management principles Claude already knows, not actionable skill content.

Trim the troubleshooting section significantly — most items (e.g., 'stakeholder communication is delayed', 'engineer panics') are process advice rather than technical guidance, and the solutions are verbose restatements of what's already in the templates.

Remove the 'When to Use This Skill' and 'Core Concepts' sections — the severity table and runbook structure outline add little value when the full templates already demonstrate these concepts.
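The first suggestion above (moving each template into its own file and leaving links behind) could be sketched roughly as follows. This is a minimal illustration, not the reviewer's tooling: it assumes each template begins at a level-2 heading of the form `## Template N: ...`, which is a hypothetical layout since the review does not show the actual headings in SKILL.md. The function name `split_skill` is also made up for this example; the target filenames are the ones the suggestion proposes.

```python
import re

# Assumption: sections start at level-2 markdown headings, and template
# sections are titled "## Template <number> ...". Adjust to the real file.
SECTION_START = re.compile(r"^## ", re.MULTILINE)
TEMPLATE_TITLE = re.compile(r"## Template \d+")


def split_skill(text, filenames):
    """Return (overview_text, {filename: template_text}).

    Each template section is moved into its own file body and replaced
    in the overview by a markdown link to that file.
    """
    # Cut the document at every level-2 heading; the first chunk is the preamble.
    cuts = [m.start() for m in SECTION_START.finditer(text)]
    bounds = [0] + cuts + [len(text)]
    chunks = [text[a:b] for a, b in zip(bounds, bounds[1:])]

    names = iter(filenames)
    overview_parts = []
    extracted = {}
    for chunk in chunks:
        if TEMPLATE_TITLE.match(chunk):
            name = next(names, None)
            if name is None:
                raise ValueError("more template sections than filenames")
            extracted[name] = chunk
            title = chunk.splitlines()[0][3:].strip()
            overview_parts.append(f"- [{title}]({name})\n\n")
        else:
            overview_parts.append(chunk)
    if next(names, None) is not None:
        raise ValueError("more filenames than template sections")
    return "".join(overview_parts), extracted


if __name__ == "__main__":
    # Tiny stand-in for SKILL.md; the real file is far longer.
    sample = (
        "# Incident Runbook Templates\n\nIntro.\n\n"
        "## Template 1: Service Outage\n\nsteps\n\n"
        "## Template 2: Database Incident\n\nqueries\n\n"
        "## Related Skills\n\nlinks\n"
    )
    overview, files = split_skill(
        sample, ["SERVICE_OUTAGE_RUNBOOK.md", "DATABASE_RUNBOOK.md"]
    )
    print(overview)
    print(sorted(files))
```

The function only returns strings, so a caller can review the resulting overview before writing anything to disk. The link text reuses each template's heading so the trimmed SKILL.md still names what it delegates.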

Dimension	Reasoning	Score
Conciseness	The skill is extremely verbose at roughly 350 lines or more. It includes extensive best-practices sections ('Do's and Don'ts') and explains concepts Claude already knows (what severity levels are, what runbooks are), and its troubleshooting section rehashes common-sense advice at length. The communication templates, while useful as examples, are repeated three times. Much of this could be condensed to a fraction of the size.	1 / 3
Actionability	The skill provides fully executable bash commands, SQL queries, kubectl commands, and curl requests that are copy-paste ready. Each mitigation procedure has numbered, concrete steps with real commands. The database runbook includes specific SQL queries for common scenarios.	3 / 3
Workflow Clarity	Multi-step procedures are clearly numbered and sequenced. Verification steps are explicit (dedicated 'Verification Steps' section with specific commands). The troubleshooting section adds dry-run checks before destructive operations, and the escalation matrix provides clear decision points. Feedback loops are present (e.g., 'If errors: fix and re-validate' pattern in rollback and mitigation sections).	3 / 3
Progressive Disclosure	The entire skill is a monolithic wall of text with no bundle files to offload detailed content. The two full runbook templates (Service Outage and Database) should be separate referenced files. The communication templates, troubleshooting section, and best practices could all be split out. Everything is inlined into a single massive document with no external references except the 'Related Skills' at the bottom.	1 / 3
	Total	8 / 12 Passed

Validation

100%

Warnings & errors only

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation — 11 / 11 Passed

Validation for skill structure

No warnings or errors.

Repository: wshobson/agents
Commit: 34632bc

Reviewed: 5 days ago
