Execute Databricks incident response procedures with triage, mitigation, and postmortem. Use when responding to Databricks-related outages, investigating job failures, or running post-incident reviews for pipeline failures. Trigger with phrases like "databricks incident", "databricks outage", "databricks down", "databricks on-call", "databricks emergency", "job failed".
89
88%
Does it follow best practices?
Impact
Pending
No eval scenarios have been run
Passed
No known issues
Quality
Discovery
100%
Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
This is a strong skill description that clearly defines its scope (Databricks incident response), lists concrete actions (triage, mitigation, postmortem), and provides explicit trigger guidance with natural user phrases. It uses proper third-person voice and is concise without being vague. The description would effectively differentiate this skill from others in a large skill library.
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | Lists multiple specific concrete actions: 'triage', 'mitigation', and 'postmortem'. Also mentions investigating job failures and running post-incident reviews, giving a clear picture of what the skill does. | 3 / 3 |
| Completeness | Clearly answers both 'what' (execute incident response procedures with triage, mitigation, and postmortem) and 'when' (explicit 'Use when' clause covering outages, job failures, and post-incident reviews, plus a 'Trigger with phrases' section). | 3 / 3 |
| Trigger Term Quality | Excellent coverage of natural trigger terms users would say: 'databricks incident', 'databricks outage', 'databricks down', 'databricks on-call', 'databricks emergency', 'job failed'. These are realistic phrases a user would type during an incident. | 3 / 3 |
| Distinctiveness / Conflict Risk | Highly distinctive with a clear niche: Databricks-specific incident response. The combination of 'Databricks' + 'incident response' creates a very specific domain unlikely to conflict with general monitoring, alerting, or other platform skills. | 3 / 3 |
| Total | 12 / 12 Passed | |
Implementation
77%
Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
This is a strong, highly actionable incident runbook with excellent workflow clarity and executable commands throughout. Its main weakness is length — at 200+ lines with communication templates, postmortem templates, and multiple remediation paths all inline, it would benefit from splitting detailed sub-procedures into referenced files. The decision tree and step-by-step structure are exemplary for incident response.
Suggestions
Split communication templates, postmortem template, and evidence collection script into separate referenced files (e.g., COMMS_TEMPLATES.md, POSTMORTEM.md) to reduce the main skill's token footprint.
Remove the overview paragraph — the title and structure already convey the purpose, and the YAML description covers the trigger context.
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | The skill is fairly efficient and avoids explaining basic concepts, but it's quite long (~200+ lines) with some sections that could be tightened. The severity level table and some communication templates add bulk, though most content earns its place. The overview paragraph is somewhat redundant given the structure speaks for itself. | 2 / 3 |
| Actionability | Excellent actionability throughout: every step has executable bash commands, SQL queries, or copy-paste templates. The triage script, cluster diagnostics, run repair commands, and evidence collection script are all concrete and immediately usable. | 3 / 3 |
| Workflow Clarity | The workflow is clearly sequenced (triage → decision tree → specific remediation → communication → evidence → postmortem) with explicit validation at each step. The decision tree provides clear branching logic, and the triage script serves as an initial validation checkpoint. Error handling table covers recovery scenarios. | 3 / 3 |
| Progressive Disclosure | The content is well-structured with clear headers and logical sections, but it's entirely monolithic: the detailed remediation steps, communication templates, and postmortem template could be split into separate referenced files. For a skill this long, inline content for every scenario makes it heavy. The single 'Next Steps' reference to databricks-data-handling is good but insufficient. | 2 / 3 |
| Total | 10 / 12 Passed | |
Validation
81%
Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation — 9 / 11 Passed
Validation for skill structure
| Criteria | Description | Result |
|---|---|---|
| allowed_tools_field | 'allowed-tools' contains unusual tool name(s) | Warning |
| frontmatter_unknown_keys | Unknown frontmatter key(s) found; consider removing or moving to metadata | Warning |
| Total | 9 / 11 Passed | |
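
To make the `frontmatter_unknown_keys` warning concrete, here is a minimal sketch of how such a check might work. It is not the validator's actual implementation: the allow-list `ASSUMED_KNOWN_KEYS`, the skill name, and the `version` key in the sample are all hypothetical, chosen only for illustration.

```python
# Hypothetical allow-list; the real validator's list is not shown in this review.
ASSUMED_KNOWN_KEYS = {"name", "description", "license", "allowed-tools", "metadata"}


def frontmatter_keys(skill_text: str) -> list[str]:
    """Return the top-level keys of a leading '---' delimited frontmatter block."""
    lines = skill_text.splitlines()
    if not lines or lines[0].strip() != "---":
        return []  # no frontmatter at the top of the file
    keys = []
    for line in lines[1:]:
        if line.strip() == "---":
            break  # closing delimiter ends the frontmatter
        # Top-level keys start in column 0; skip nested values, comments, list items.
        if line and not line[0].isspace() and line[0] not in "#-" and ":" in line:
            keys.append(line.split(":", 1)[0].strip())
    return keys


def unknown_keys(skill_text: str, known=ASSUMED_KNOWN_KEYS) -> list[str]:
    """List frontmatter keys that are not in the (assumed) allow-list."""
    return [key for key in frontmatter_keys(skill_text) if key not in known]


# Hypothetical skill file: 'version' is not in the assumed allow-list.
example = """---
name: databricks-incident-runbook
description: Execute Databricks incident response procedures.
allowed-tools: Read, Bash
version: 1.0.0
---
# Body
"""

print(unknown_keys(example))  # ['version']
```

Under these assumptions, clearing the warning would mean either deleting the flagged key or nesting it under `metadata`, as the table's description suggests.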
c8a915c