debugging-dags

Comprehensive DAG failure diagnosis and root cause analysis. Use for complex debugging requests requiring deep investigation like "diagnose and fix the pipeline", "full root cause analysis", "why is this failing and how to prevent it". For simple debugging ("why did dag fail", "show logs"), the airflow entrypoint skill handles it directly. This skill provides structured investigation and prevention recommendations.

68

Quality

83%

Does it follow best practices?

Impact

No eval scenarios have been run

Security by Snyk

Passed

No known issues

Quality

Content

77%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

This is a strong, actionable diagnostic skill with a clear four-step workflow and concrete guidance, including specific CLI commands, code snippets, and structured output templates. Its main weakness is length: the package version changes section, while valuable and non-obvious, is detailed for inline content and could be extracted to a referenced file. Minor verbosity in some sections (platform-specific tips, some context-gathering items) could also be trimmed.

Suggestions

Extract the 'Package version changes' deep-dive into a separate reference file (e.g., PACKAGE_DEBUGGING.md) and link to it from the main skill to improve progressive disclosure and reduce inline length.

Trim the 'On Astro' and 'On OSS Airflow' sections — they add little actionable guidance beyond 'check the UI' and could be condensed to a single line each or removed.

Dimension | Reasoning | Score

Conciseness

The skill is mostly efficient and provides useful, non-obvious information (especially the package version changes section), but some sections are verbose. For example, the 'On Astro' and 'On OSS Airflow' sections add little actionable value, and the Step 3 context-gathering checklist includes items Claude could infer. The package version changes section, while detailed and valuable, could be tightened.

2 / 3

Actionability

The skill provides specific, executable CLI commands throughout (af runs diagnose, af tasks logs, af runs clear, etc.), concrete code snippets for image diffing and PyPI querying, and clear categorization frameworks. The guidance is copy-paste ready and leaves little ambiguity about what to do.

3 / 3

Workflow Clarity

The four-step workflow is clearly sequenced with logical progression from identification → error details → context gathering → actionable output. Each step has explicit decision points (e.g., 'If a specific DAG was mentioned' vs not), and the output structure in Step 4 includes validation-like checkpoints (root cause, impact assessment, immediate fix, prevention). The feedback loop of 'clear and rerun' is explicitly provided.

3 / 3

Progressive Disclosure

The content is well-structured with clear headers and sub-sections, and uses an internal anchor link for 'Package version changes'. However, the skill is quite long (~150 lines of substantive content) with no references to external files. The detailed package version changes section and platform-specific sections could be split into separate reference files to keep the main skill leaner.

2 / 3

Total

10 / 12

Passed

Description

89%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

This is a well-crafted description that excels at completeness and distinctiveness: it explicitly defines when to use this skill versus a simpler alternative and includes natural trigger phrases. Its main weakness is that the capabilities could be more concrete; listing actual diagnostic actions rather than abstract terms like 'structured investigation' would strengthen it further.

Suggestions

Replace 'structured investigation and prevention recommendations' with concrete actions like 'analyzes task logs, traces dependency chains, identifies upstream failures, and generates prevention recommendations.'

Dimension | Reasoning | Score

Specificity

The description names the domain (DAG failure diagnosis, root cause analysis) and mentions 'structured investigation and prevention recommendations,' but doesn't list multiple concrete actions like 'analyze task logs, trace dependency failures, check scheduler health, review XCom data.' The actions remain somewhat abstract.

2 / 3

Completeness

Clearly answers both 'what' (comprehensive DAG failure diagnosis and root cause analysis, structured investigation and prevention recommendations) and 'when' (complex debugging requests with explicit example phrases, plus a clear boundary distinguishing it from the simpler airflow entrypoint skill).

3 / 3

Trigger Term Quality

Includes strong natural trigger terms users would say: 'diagnose and fix the pipeline', 'full root cause analysis', 'why is this failing and how to prevent it', 'dag fail', 'show logs'. Also differentiates from simpler queries, which helps with routing.

3 / 3

Distinctiveness Conflict Risk

Explicitly distinguishes itself from the simpler 'airflow entrypoint skill' by defining the complexity boundary. The trigger phrases for complex vs. simple debugging are clearly delineated, making it unlikely to conflict with related skills.

3 / 3

Total

11 / 12

Passed

Validation

100%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation: 11 / 11 Passed

Validation for skill structure

No warnings or errors.

Repository
astronomer/agents
Reviewed
