Execute Databricks incident response procedures with triage, mitigation, and postmortem. Use when responding to Databricks-related outages, investigating job failures, or running post-incident reviews for pipeline failures. Trigger with phrases like "databricks incident", "databricks outage", "databricks down", "databricks on-call", "databricks emergency", "job failed".
89
88%: Does it follow best practices?
Impact: no eval scenarios have been run
Passed: no known issues
Quality
Discovery
100%. Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
This is a strong skill description that clearly defines its scope (Databricks incident response), lists concrete actions (triage, mitigation, postmortem), provides explicit 'Use when' guidance, and includes a comprehensive set of natural trigger phrases. It uses third-person voice throughout and is concise without being vague. A minor improvement would be to mention specific artifact types such as runbooks or alerting systems, but overall the description is well crafted.
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | Lists multiple specific concrete actions: 'triage', 'mitigation', 'postmortem', 'investigating job failures', 'running post-incident reviews'. These are clear, actionable procedures rather than vague language. | 3 / 3 |
| Completeness | Clearly answers both 'what' (execute incident response procedures with triage, mitigation, and postmortem) and 'when' (responding to outages, investigating job failures, running post-incident reviews) with an explicit 'Use when' clause and additional trigger phrases. | 3 / 3 |
| Trigger Term Quality | Excellent coverage of natural trigger terms users would actually say: 'databricks incident', 'databricks outage', 'databricks down', 'databricks on-call', 'databricks emergency', 'job failed'. These cover multiple natural phrasings a user in an incident scenario would use. | 3 / 3 |
| Distinctiveness Conflict Risk | Highly distinctive with a clear niche: Databricks-specific incident response. The combination of 'Databricks' + 'incident response' creates a very specific domain unlikely to conflict with general Databricks skills or general incident response skills. | 3 / 3 |
| Total | 12 / 12 Passed | |
Implementation
77%. Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
This is a strong, highly actionable incident runbook with excellent workflow clarity and concrete, executable commands at every step. Its main weakness is that it packs too much into a single file—communication templates, postmortem templates, and detailed remediation steps would benefit from being split into referenced files to improve token efficiency and progressive disclosure. The decision tree and error handling table are particularly well done.
Suggestions
Extract the postmortem template and communication templates into separate referenced files (e.g., POSTMORTEM_TEMPLATE.md, COMMS_TEMPLATES.md) to reduce the main file's token footprint.
Move detailed remediation steps (3a-3d) into a separate REMEDIATION.md file, keeping only the decision tree and brief summaries in the main SKILL.md.
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | The skill is fairly efficient and avoids explaining basic concepts, but it's quite long (~200 lines), with some sections that could be tightened: the communication templates and postmortem template add bulk that could live in separate files. The severity table and decision tree are useful, but the overall document is heavy for a single SKILL.md. | 2 / 3 |
| Actionability | Excellent actionability throughout: every step has executable bash commands, SQL queries, or copy-paste-ready templates. The triage script, cluster diagnostics, run repair commands, and evidence collection script are all concrete and immediately usable. | 3 / 3 |
| Workflow Clarity | The workflow is clearly sequenced (triage → decision tree → specific remediation → communication → evidence → postmortem) with explicit validation checkpoints. The decision tree provides clear branching logic, and the error handling table covers common failure modes with specific recovery steps. | 3 / 3 |
| Progressive Disclosure | The content is well-structured with clear headers and logical sections, but it's monolithic: the communication templates, postmortem template, and detailed remediation steps could be split into separate referenced files. The single reference to 'databricks-data-handling' at the end is good but insufficient given the document's length. | 2 / 3 |
| Total | 10 / 12 Passed | |
Validation
81%. Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation — 9 / 11 Passed
Validation for skill structure
| Criteria | Description | Result |
|---|---|---|
| allowed_tools_field | 'allowed-tools' contains unusual tool name(s) | Warning |
| frontmatter_unknown_keys | Unknown frontmatter key(s) found; consider removing or moving to metadata | Warning |
| Total | 9 / 11 Passed | |
3a2d27d