Automate ML workflows with Airflow, Kubeflow, MLflow. Use for reproducible pipelines, retraining schedules, MLOps, or encountering task failures, dependency errors, experiment tracking issues.
Score: 86

Quality: 81% (Does it follow best practices?)
Impact: 94% (1.28x average score across 3 eval scenarios)
Advisory: Suggest reviewing before use

Quality

Discovery: 89%
Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
This is a solid skill description with excellent trigger term coverage and clear 'when to use' guidance. The main weakness is the somewhat vague action verb 'automate' - the description would benefit from more specific capabilities like 'create DAGs', 'debug pipeline failures', or 'configure experiment tracking'. Overall, it should effectively distinguish itself from other skills in a large skill library.
Suggestions
Replace 'Automate ML workflows' with more specific actions like 'Create DAGs, configure pipeline dependencies, debug task failures, set up experiment tracking'
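As a hypothetical sketch of that suggestion, the skill's frontmatter description could lead with concrete actions. The skill name and exact field set below are illustrative assumptions, not the skill's actual file:

```yaml
# Hypothetical rewrite of the frontmatter description, not the skill's actual file.
name: ml-pipeline-orchestration
description: >
  Create Airflow DAGs, configure pipeline dependencies, debug task failures,
  and set up MLflow experiment tracking. Use for reproducible ML pipelines,
  retraining schedules, MLOps, or when hitting task failures, dependency
  errors, or experiment tracking issues.
```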
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | Names the domain (ML workflows) and specific tools (Airflow, Kubeflow, MLflow), but actions are somewhat vague - 'automate' is broad and doesn't list concrete actions like 'create DAGs', 'configure pipelines', or 'set up experiment tracking'. | 2 / 3 |
| Completeness | Clearly answers both what ('Automate ML workflows with Airflow, Kubeflow, MLflow') and when ('Use for reproducible pipelines, retraining schedules, MLOps, or encountering task failures, dependency errors, experiment tracking issues') with explicit trigger scenarios. | 3 / 3 |
| Trigger Term Quality | Good coverage of natural terms users would say: 'ML workflows', 'Airflow', 'Kubeflow', 'MLflow', 'pipelines', 'retraining', 'MLOps', 'task failures', 'dependency errors', 'experiment tracking'. These are terms practitioners naturally use. | 3 / 3 |
| Distinctiveness / Conflict Risk | Clear niche focused on ML pipeline orchestration tools. The specific tool names (Airflow, Kubeflow, MLflow) and problem types (task failures, dependency errors) create distinct triggers unlikely to conflict with general coding or data science skills. | 3 / 3 |
| Total | | 11 / 12 (Passed) |
Implementation: 72%
Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
This is a solid, actionable skill with excellent code examples and good progressive disclosure structure. The main weaknesses are some verbosity in explaining concepts Claude already knows (like pipeline stages) and missing explicit validation checkpoints in the core workflow examples despite having a section on data validation as a known issue.
Suggestions
Remove or significantly condense the 'When to Use This Skill' and 'Core Concepts: Pipeline Stages' sections - Claude knows when ML pipelines are needed and what pipeline stages are
Add explicit validation checkpoints to the main DAG example workflow (e.g., 'validate >> train >> evaluate >> [deploy if metrics pass]') rather than only showing validation as a separate known issue
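The suggested gating pattern can be sketched framework-agnostically. This is a minimal illustration, not code from the skill itself: the task names and the 0.9 accuracy threshold are assumptions. In Airflow, the same ordering would be expressed with `>>` dependencies and a branching or short-circuit step before deploy.

```python
# Sketch of validate >> train >> evaluate >> [deploy if metrics pass].
# Task bodies are stand-ins; the 0.9 threshold is an illustrative assumption.

def validate(data):
    # Gate 1: fail fast on bad input instead of discovering it mid-training.
    if not data or any(x is None for x in data):
        raise ValueError("validation failed: empty or null rows")
    return data

def train(data):
    # Stand-in for a real training step.
    return {"model": "stub", "n_rows": len(data)}

def evaluate(model):
    # Stand-in metric; a real pipeline computes this on a holdout set.
    return {"accuracy": 0.93}

def deploy(model):
    return f"deployed model trained on {model['n_rows']} rows"

def run_pipeline(data, accuracy_threshold=0.9):
    data = validate(data)                 # gate 1: data quality
    model = train(data)
    metrics = evaluate(model)
    if metrics["accuracy"] < accuracy_threshold:   # gate 2: metric check
        return "skipped deploy: metrics below threshold"
    return deploy(model)
```

The point of the sketch is that validation is part of the main flow, not a separate known-issues appendix: bad data stops the run before training, and weak metrics stop it before deploy.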
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | The skill is reasonably efficient but includes some unnecessary content like the 'When to Use This Skill' section that restates obvious use cases, and the 'Core Concepts' pipeline stages section explains concepts Claude already knows. The comparison table adds value but some explanatory text could be trimmed. | 2 / 3 |
| Actionability | Excellent executable code throughout - the Quick Start provides copy-paste ready commands, all code examples are complete and runnable Python with proper imports, and patterns like conditional execution and parallel training are fully implemented with real syntax. | 3 / 3 |
| Workflow Clarity | While the Quick Start has numbered steps, the skill lacks explicit validation checkpoints in the main workflows. The 'Known Issues Prevention' section addresses problems but doesn't integrate validation into the core pipeline workflow - there's no 'validate before proceeding' pattern in the main DAG examples. | 2 / 3 |
| Progressive Disclosure | Well-structured with clear sections progressing from Quick Start to Core Concepts to detailed patterns. The 'When to Load References' section provides excellent one-level-deep navigation to external files with clear descriptions of when each is needed. | 3 / 3 |
| Total | | 10 / 12 (Passed) |
Validation: 90%
Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation — 10 / 11 Passed
Validation for skill structure
| Criteria | Description | Result |
|---|---|---|
| frontmatter_unknown_keys | Unknown frontmatter key(s) found; consider removing or moving to metadata | Warning |
| Total | 10 / 11 Passed | |
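One way to clear the warning, sketched here as an assumption about the skill's file (the key names `maintainer` and `tags` are illustrative, not taken from the skill), is to move unrecognized frontmatter keys under a `metadata` mapping as the check message suggests:

```yaml
# Hypothetical frontmatter: unknown top-level keys moved under `metadata`.
name: ml-pipeline-orchestration
description: Automate ML workflows with Airflow, Kubeflow, MLflow.
metadata:
  maintainer: data-platform-team
  tags: [airflow, kubeflow, mlflow]
```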