ml-pipeline

Designs and implements production-grade ML pipeline infrastructure: configures experiment tracking with MLflow or Weights & Biases, creates Kubeflow or Airflow DAGs for training orchestration, builds feature store schemas with Feast, deploys model registries, and automates retraining and validation workflows. Use when building ML pipelines, orchestrating training workflows, automating model lifecycle, implementing feature stores, managing experiment tracking systems, setting up DVC for data versioning, tuning hyperparameters, or configuring MLOps tooling like Kubeflow, Airflow, MLflow, or Prefect.

Quality

88%

Does it follow best practices?

Impact

—

No eval scenarios have been run

Security

Advisory

Suggest reviewing before use

Quality

Content

77%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

This is a solid, well-structured ML pipeline skill with strong actionability through executable code templates and clear workflow sequencing with validation checkpoints. Its main weaknesses are the missing bundle files that the reference table points to, and some verbosity in the constraints and knowledge reference sections that could be trimmed. The progressive disclosure design is sound in concept but unverifiable without the referenced files.

Suggestions

Remove or significantly trim the 'Knowledge Reference' section: listing tools Claude already knows wastes tokens without adding actionable value.

Create the referenced bundle files (references/feature-engineering.md, etc.) to make the progressive disclosure table functional rather than aspirational.

Dimension	Reasoning	Score
Conciseness	Generally efficient but padded in places: the opening line restates the role description, the Knowledge Reference section lists tools Claude already knows, and some code comments explain the obvious (e.g., '# Log metrics'). The Always and Never lists in the constraints section partly overlap. Could be tightened.	2 / 3
Actionability	Provides three fully executable, copy-paste-ready code templates (MLflow logging, Kubeflow component, Great Expectations validation) with concrete parameters, imports, and realistic patterns. The output format section specifies exactly what deliverables to produce. Guidance is specific and instructive rather than abstract.	3 / 3
Workflow Clarity	The Core Workflow provides a clear 6-step sequence with explicit validation checkpoints — step 2 mandates schema validation with halt-on-failure, step 6 includes evaluation gates before promotion. The data validation code template reinforces the feedback loop pattern with raise-on-failure semantics. The constraints reinforce never skipping validation or deploying without metrics.	3 / 3
Progressive Disclosure	The reference table with 5 topic-specific files and 'Load When' conditions is well-structured for progressive disclosure. However, no bundle files are provided, meaning all referenced files (references/feature-engineering.md, etc.) don't actually exist, making the references non-functional. The inline code templates are appropriately sized for the main file, but the missing bundle undermines the disclosure strategy.	2 / 3
	Total	10 / 12 Passed
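
For illustration, the halt-on-failure and evaluation-gate patterns credited under Workflow Clarity might look like the minimal sketch below. It is not the skill's own template (the review says that one uses Great Expectations); the names `ValidationError`, `Metrics`, `validate_schema`, and `evaluation_gate` are hypothetical and use only the Python standard library.

```python
from dataclasses import dataclass


class ValidationError(Exception):
    """Raised to halt the pipeline when a gate fails (hypothetical name)."""


@dataclass
class Metrics:
    """Illustrative metric container; a real pipeline would track more."""
    accuracy: float


def validate_schema(rows, required_columns):
    """Raise if any row lacks a required column, so later steps never run."""
    for index, row in enumerate(rows):
        missing = sorted(set(required_columns) - set(row))
        if missing:
            raise ValidationError(f"row {index} is missing columns: {missing}")
    return rows


def evaluation_gate(candidate, baseline):
    """Raise unless the candidate model is at least as accurate as the baseline."""
    if candidate.accuracy < baseline.accuracy:
        raise ValidationError(
            f"candidate accuracy {candidate.accuracy} is below "
            f"baseline {baseline.accuracy}; promotion blocked"
        )
    return True


if __name__ == "__main__":
    data = validate_schema([{"id": 1, "label": 0}], ["id", "label"])
    evaluation_gate(Metrics(accuracy=0.91), Metrics(accuracy=0.90))
    print(f"validated {len(data)} row(s); candidate cleared for promotion")
```

Raising an exception instead of returning a status flag means an orchestrator step fails loudly and downstream training or promotion steps are skipped by default.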

Description

100%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

This is an excellent skill description that hits all the marks. It provides highly specific capabilities with named tools and concrete actions, includes a comprehensive 'Use when...' clause with abundant natural trigger terms, and occupies a clearly distinct niche in MLOps pipeline infrastructure. The description uses proper third-person voice throughout and is both comprehensive and well-organized.

Dimension	Reasoning	Score
Specificity	Lists multiple specific concrete actions: configures experiment tracking with MLflow/W&B, creates Kubeflow/Airflow DAGs, builds feature store schemas with Feast, deploys model registries, and automates retraining/validation workflows.	3 / 3
Completeness	Clearly answers both 'what' (designs and implements ML pipeline infrastructure with specific tools and actions) and 'when' (explicit 'Use when...' clause listing multiple trigger scenarios like building ML pipelines, orchestrating training workflows, etc.).	3 / 3
Trigger Term Quality	Excellent coverage of natural terms users would say: ML pipelines, training workflows, model lifecycle, feature stores, experiment tracking, DVC, data versioning, hyperparameters, MLOps, Kubeflow, Airflow, MLflow, Prefect, Weights & Biases, Feast.	3 / 3
Distinctiveness / Conflict Risk	Highly distinctive niche focused on MLOps infrastructure and pipeline orchestration with specific tool names (MLflow, Kubeflow, Feast, Airflow, DVC, Prefect). Unlikely to conflict with general ML/data science skills or generic DevOps skills.	3 / 3
	Total	12 / 12 Passed

Validation

100%


Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation — 11 / 11 Passed

Validation for skill structure

No warnings or errors.

Repository: Jeffallan/claude-skills
Commit: e8be415

Reviewed: about 1 month ago
