
debugging-dags

Comprehensive DAG failure diagnosis and root cause analysis. Use for complex debugging requests requiring deep investigation like "diagnose and fix the pipeline", "full root cause analysis", "why is this failing and how to prevent it". For simple debugging ("why did dag fail", "show logs"), the airflow entrypoint skill handles it directly. This skill provides structured investigation and prevention recommendations.


Quality: 83% (Does it follow best practices?)

Impact: Pending (No eval scenarios have been run)

Security by Snyk: Passed (No known issues)


Quality

Discovery: 89%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

This is a well-crafted description that excels at completeness and distinctiveness by explicitly defining when to use this skill versus a simpler alternative, with natural trigger phrases. Its main weakness is that the specific capabilities could be more concrete—listing particular investigation steps or analysis techniques rather than the somewhat abstract 'structured investigation and prevention recommendations.'

Suggestions

Add more concrete action verbs describing what the skill does, e.g., 'traces dependency chains, analyzes task logs, identifies resource bottlenecks, and generates prevention recommendations.'

Dimension scores:

Specificity (2 / 3): The description names the domain (DAG failure diagnosis, root cause analysis) and mentions 'structured investigation and prevention recommendations,' but doesn't list multiple concrete actions like 'analyze task logs, trace dependency failures, identify resource bottlenecks, recommend retry policies.'

Completeness (3 / 3): Clearly answers both 'what' (comprehensive DAG failure diagnosis and root cause analysis, structured investigation and prevention recommendations) and 'when' (complex debugging requests requiring deep investigation, with explicit example triggers and a clear boundary distinguishing it from simpler debugging handled elsewhere).

Trigger Term Quality (3 / 3): Includes strong natural trigger terms users would say: 'diagnose and fix the pipeline', 'full root cause analysis', 'why is this failing and how to prevent it', plus differentiating terms like 'complex debugging'. Also references 'DAG failure' and 'pipeline', which are natural user terms.

Distinctiveness / Conflict Risk (3 / 3): Explicitly distinguishes itself from the 'airflow entrypoint skill' for simple debugging, carving out a clear niche for complex/deep investigation scenarios. The boundary between simple and complex debugging is well-articulated with examples for each.

Total: 11 / 12 (Passed)

Implementation: 77%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

This is a well-structured diagnostic skill with a clear 4-step workflow and concrete CLI commands throughout. Its main strengths are actionability and workflow clarity, providing specific commands and a logical investigation sequence. Minor weaknesses include some verbosity in explaining diagnostic concepts Claude already understands (like what constitutes a data vs infrastructure issue) and keeping all content inline rather than splitting platform-specific guidance into separate files.

Suggestions

Trim the failure categorization list in Step 2 and the Prevention bullet points in Step 4, as Claude can infer these categories without explicit enumeration.

Consider moving the Astro-specific and OSS-specific guidance into separate referenced files to improve progressive disclosure.

Dimension scores:

Conciseness (2 / 3): Generally efficient, but includes some unnecessary elaboration. Phrases like 'Be specific - not "the task failed" but "the task failed because column X was null in 15% of rows when the code expected 0%"' are helpful examples, but some sections, such as the Impact Assessment and Prevention bullet points, explain things Claude already knows how to do as a data engineer.
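The "column X was null in 15% of rows" example above boils down to a simple data-quality measurement. A minimal sketch of such a check (the `null_rate` helper and the sample rows are illustrative, not part of the skill):

```python
def null_rate(rows, column):
    """Fraction of rows in which `column` is missing (None)."""
    if not rows:
        return 0.0
    missing = sum(1 for row in rows if row.get(column) is None)
    return missing / len(rows)

# Hypothetical sample: 1 of 4 rows has a null in column "x"
rows = [{"x": 1}, {"x": None}, {"x": 3}, {"x": 4}]
print(f'column x was null in {null_rate(rows, "x"):.0%} of rows')
```

Reporting the measured rate alongside the expectation (here, 0%) is what turns "the task failed" into a specific, actionable root cause.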

Actionability (3 / 3): Provides specific, executable CLI commands throughout (e.g., `af runs diagnose <dag_id> <dag_run_id>`, `af tasks logs`, `af runs clear`). Each step has concrete actions to take, and the Quick Commands section gives copy-paste ready commands for remediation.
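The command sequence the review praises can be modeled as data rather than run directly. A sketch, assuming the subcommand names quoted above; only `af runs diagnose <dag_id> <dag_run_id>` has its argument shape given in the review, so the arguments to `af tasks logs` and `af runs clear` below are guesses:

```python
def diagnosis_commands(dag_id, dag_run_id, task_id=None):
    """Assemble one diagnostic pass as a list of argv-style commands."""
    # Step 1: structured diagnosis of the failed run (shape from the review)
    cmds = [["af", "runs", "diagnose", dag_id, dag_run_id]]
    if task_id is not None:
        # Step 2: pull the failing task's logs
        # (argument order for `af tasks logs` is an assumption)
        cmds.append(["af", "tasks", "logs", dag_id, dag_run_id, task_id])
    # Step 3: clear and re-run once a fix is in place
    # (arguments for `af runs clear` are an assumption)
    cmds.append(["af", "runs", "clear", dag_id, dag_run_id])
    return cmds
```

Keeping the sequence as plain lists makes it easy to print for review before handing anything to a shell.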

Workflow Clarity (3 / 3): Clear 4-step sequential workflow with logical progression from identification → error details → context gathering → actionable output. Includes branching logic (if DAG specified vs not), failure categorization, and the output structure serves as a validation checkpoint ensuring thorough diagnosis before recommending fixes.
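The failure-categorization step mentioned above can be sketched as a keyword heuristic. The category names and keyword lists here are illustrative assumptions, not the skill's actual enumeration:

```python
# Illustrative categories; the skill's real Step 2 list may differ.
CATEGORIES = {
    "data": ("null", "schema", "type mismatch", "duplicate"),
    "infrastructure": ("oom", "timeout", "connection refused", "pod evicted"),
    "code": ("traceback", "importerror", "keyerror"),
}

def categorize_failure(log_text):
    """Return the first category whose keywords appear in the log text."""
    text = log_text.lower()
    for category, keywords in CATEGORIES.items():
        if any(keyword in text for keyword in keywords):
            return category
    return "unknown"
```

A real diagnostic pass would weigh more context than log keywords, but even a coarse bucket like this lets the workflow branch between data fixes, infrastructure remediation, and code changes.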

Progressive Disclosure (2 / 3): Content is well-structured with clear headers and sections, but everything is inline in a single file. The Astro vs OSS Airflow sections and the detailed output template (Root Cause, Impact Assessment, etc.) could potentially be split into referenced files. However, the total length is moderate enough that this is a minor issue.

Total: 10 / 12 (Passed)

Validation: 100%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation: 11 / 11 checks passed

Validation for skill structure

No warnings or errors.

Repository: astronomer/agents (Reviewed)

