testing-dags

Complex DAG testing workflows with debugging and fixing cycles. Use for multi-step testing requests like "test this dag and fix it if it fails", "test and debug", "run the pipeline and troubleshoot issues". For simple test requests ("test dag", "run dag"), the airflow entrypoint skill handles it directly. This skill is for iterative test-debug-fix cycles.

Quality

—

Does it follow best practices?

Impact

—

No eval scenarios have been run

Security

Passed

No known issues

Quality

Content

85%

Reviews the quality of instructions and guidance provided to agents. A good implementation is clear, handles edge cases, and produces reliable results.

The content is highly actionable, with a clear, well-validated test-debug-fix loop and concrete commands. It is longer than necessary, however, because of redundant listings and an ASCII diagram, and the single-file structure mixes quick-start guidance with extensive reference material. Trim the repetition and consider extracting the CLI reference and scenarios into a separate reference file.

Suggestions

Remove the ASCII workflow diagram and the repeated CLI listings across the reference table, scenarios, and phase sections; a single canonical command reference is enough.

Extract the full 'CLI Quick Reference' table and the six 'Testing Scenarios' into a separate reference file, leaving SKILL.md as a lean overview that links to it.

Tighten the 'DO NOT' pre-flight list since the 'FIRST ACTION: Just Trigger the DAG' section already makes the point concisely.

Dimension	Reasoning	Score
Conciseness	The body is largely command-driven and efficient, but the ASCII workflow diagram, repeated command listings in both the reference table and scenario sections, and redundant 'DO NOT' framing restate the same guidance multiple times, adding tokens Claude doesn't need.	2 / 3
Actionability	Every phase provides concrete, executable `af` commands with realistic arguments, expected JSON response shapes, and copy-paste-ready examples (e.g. `af runs trigger-wait my_dag --timeout 300`).	3 / 3
Workflow Clarity	The test -> debug -> fix loop is explicitly sequenced with numbered phases, a clear trigger-first philosophy, and a retest feedback loop ('Repeat the test → debug → fix loop until the DAG succeeds').	3 / 3
Progressive Disclosure	No bundle files exist, so all content is inline in SKILL.md; the body is well-sectioned, but the ~400-line single file with repeated CLI reference tables and multiple scenario blocks could be split into a quick-start overview with a separate reference rather than kept monolithic.	2 / 3
	Total	10 / 12 Passed
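
The test -> debug -> fix loop described in the Workflow Clarity row can be sketched as below. This is a minimal illustration under stated assumptions, not the skill's actual implementation: `run_until_green`, `run_test`, `debug_and_fix`, and `max_attempts` are hypothetical names introduced here. In practice `run_test` would wrap a command such as the `af runs trigger-wait my_dag --timeout 300` example quoted above, and `debug_and_fix` would inspect the failure and patch the DAG.

```python
from typing import Callable


def run_until_green(
    run_test: Callable[[], bool],
    debug_and_fix: Callable[[int], None],
    max_attempts: int = 3,
) -> int:
    """Repeat test -> debug -> fix until the test passes.

    Returns the 1-based attempt number that succeeded, or 0 if every
    attempt failed. All names here are hypothetical placeholders.
    """
    for attempt in range(1, max_attempts + 1):
        if run_test():           # trigger first: just run the DAG
            return attempt
        debug_and_fix(attempt)   # on failure, debug and apply a fix, then retest
    return 0


# Simulated usage: the DAG fails twice, then succeeds after two fixes.
outcomes = iter([False, False, True])
fixes_applied: list[int] = []
succeeded_on = run_until_green(lambda: next(outcomes), fixes_applied.append)
```

Bounding the loop with `max_attempts` is a design choice of this sketch, so that a DAG that never passes cannot retry forever.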

Description

100%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

The description is specific, well-triggered, and clearly scoped to iterative test-debug-fix cycles with explicit boundary guidance against a simpler sibling skill. It answers both what and when with natural trigger terms. No changes needed.

Dimension	Reasoning	Score
Specificity	It names concrete actions ('testing', 'debugging', 'fixing cycles', 'iterative test-debug-fix cycles') and a specific domain (DAG testing), listing multiple specific concrete operations.	3 / 3
Completeness	It clearly states what the skill does (complex DAG testing workflows with debug/fix cycles) and when to use it ('Use for multi-step testing requests like...'), plus when not to use it (simple requests handled by the airflow entrypoint skill).	3 / 3
Trigger Term Quality	It embeds natural user phrases like 'test this dag and fix it if it fails', 'test and debug', and 'run the pipeline and troubleshoot issues' that users would actually say.	3 / 3
Distinctiveness / Conflict Risk	It carves a clear niche (iterative test-debug-fix cycles) and explicitly distinguishes it from the simpler 'airflow entrypoint skill' for basic test requests, reducing conflict risk.	3 / 3
	Total	12 / 12 Passed

Validation

100%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation — 16 / 16 Passed

Validation for skill structure

No warnings or errors.

Repository: astronomer/agents
Commit: 8827e93

Reviewed: 8 days ago
