Designs and implements production-grade ML pipeline infrastructure: configures experiment tracking with MLflow or Weights & Biases, creates Kubeflow or Airflow DAGs for training orchestration, builds feature store schemas with Feast, deploys model registries, and automates retraining and validation workflows. Use when building ML pipelines, orchestrating training workflows, automating model lifecycle, implementing feature stores, managing experiment tracking systems, setting up DVC for data versioning, tuning hyperparameters, or configuring MLOps tooling like Kubeflow, Airflow, MLflow, or Prefect.
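The lifecycle the description covers (validate data, train, gate on evaluation, promote) can be sketched in plain Python. This is an illustrative stand-in, not code from the skill itself: the function names (`validate_schema`, `run_pipeline`) and the trivial mean-label "model" are hypothetical, and a real pipeline would run these steps as Airflow or Kubeflow tasks.

```python
# Hypothetical sketch of the validate -> train -> evaluate -> promote
# sequence. All names here are illustrative, not from any real library.

def validate_schema(rows, required_columns):
    """Halt the pipeline early if the input data is malformed."""
    missing = [c for c in required_columns if any(c not in r for r in rows)]
    if missing:
        raise ValueError(f"schema check failed, missing columns: {missing}")
    return rows

def train_model(rows):
    # Stand-in for a real training step: "predict" the mean label.
    labels = [r["label"] for r in rows]
    mean = sum(labels) / len(labels)
    return {"predict": lambda _feats, m=mean: m}

def evaluate(model, rows, threshold=0.5):
    """Evaluation gate: pass only if accuracy meets the threshold."""
    correct = sum(1 for r in rows if round(model["predict"](r)) == r["label"])
    return correct / len(rows) >= threshold

def run_pipeline(rows):
    validated = validate_schema(rows, required_columns=["label"])
    model = train_model(validated)
    if not evaluate(model, validated):
        raise RuntimeError("evaluation gate failed; model not promoted")
    return "promoted"

data = [{"label": 1}, {"label": 1}, {"label": 0}]
print(run_pipeline(data))  # → promoted
```

The point of the structure is that both gates raise rather than warn, so a failed schema check or evaluation stops the run before anything reaches the registry.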
Overall score: 90

| Metric | Score | Notes |
|---|---|---|
| Quality | 92% | Does it follow best practices? |
| Impact | 87% | 1.12x average score across 6 eval scenarios |
| Validation | Passed | No known issues |

Quality

Discovery
Score: 100%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
This is an excellent skill description that clearly articulates specific capabilities with named tools and concrete actions, provides comprehensive trigger terms covering both tool names and task descriptions, and includes an explicit 'Use when...' clause. The description is well-scoped to a distinct MLOps/pipeline infrastructure niche, making it highly distinguishable from adjacent skills like general ML modeling or data analysis.
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | Lists multiple concrete actions: configures experiment tracking with MLflow/W&B, creates Kubeflow/Airflow DAGs, builds feature store schemas with Feast, deploys model registries, and automates retraining/validation workflows. | 3 / 3 |
| Completeness | Clearly answers both 'what' (designs and implements ML pipeline infrastructure with specific tools and actions) and 'when' (explicit 'Use when...' clause listing multiple trigger scenarios such as building ML pipelines and orchestrating training workflows). | 3 / 3 |
| Trigger Term Quality | Excellent coverage of the natural terms users would say: ML pipelines, training workflows, model lifecycle, feature stores, experiment tracking, DVC, data versioning, hyperparameters, MLOps, Kubeflow, Airflow, MLflow, Prefect, Weights & Biases, Feast. | 3 / 3 |
| Distinctiveness / Conflict Risk | Highly distinctive niche focused on MLOps infrastructure and pipeline orchestration, with specific tool names (MLflow, Kubeflow, Feast, Airflow, Prefect, DVC). Unlikely to conflict with general coding or data science skills given the specificity of the domain. | 3 / 3 |
| Total | | 12 / 12 (Passed) |
Implementation
Score: 85%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
This is a strong, well-structured skill that provides actionable code templates, clear workflow sequencing with validation checkpoints, and excellent progressive disclosure through a well-organized reference table. Minor weaknesses include a somewhat unnecessary 'Knowledge Reference' keyword dump at the end and a few constraint items that state obvious best practices, but overall the content is highly effective and production-ready.
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | The content is mostly efficient but includes some unnecessary elements: the 'Knowledge Reference' section at the end is just a keyword list that adds no actionable value, and some constraint items restate obvious best practices. The code templates are well sized but could be slightly tighter. | 2 / 3 |
| Actionability | Provides three fully executable code templates (MLflow logging, Kubeflow pipeline component, Great Expectations validation) that are copy-paste ready, with concrete imports, parameters, and realistic patterns. The constraints and output format sections give specific, actionable guidance. | 3 / 3 |
| Workflow Clarity | The core workflow is clearly sequenced in six numbered steps with explicit validation checkpoints (step 2: schema checks with halt-on-failure; step 6: evaluation gates before promotion). The data validation template includes a raise-on-failure pattern, and the constraints reinforce validation-before-training as mandatory. | 3 / 3 |
| Progressive Disclosure | Excellent use of a reference table with five clearly signaled one-level-deep references, each with explicit 'Load When' conditions. The main skill provides a concise overview with templates while deferring detailed guidance to topic-specific reference files. | 3 / 3 |
| Total | | 11 / 12 (Passed) |
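The MLflow logging template the Actionability row refers to is not reproduced in this report. As a rough stand-in for the pattern it describes (record parameters and metrics per run, then compare runs to pick a winner), here is a stdlib-only sketch; the `Tracker` class is illustrative and not a real MLflow API.

```python
# Stdlib-only stand-in for MLflow-style experiment tracking:
# one record per run, with logged params and metrics.
import json

class Tracker:
    def __init__(self):
        self.runs = []

    def log_run(self, params, metrics):
        """Record one training run's hyperparameters and results."""
        self.runs.append({"params": params, "metrics": metrics})

    def best_run(self, metric, maximize=True):
        """Select the run with the best value for the given metric."""
        return (max if maximize else min)(
            self.runs, key=lambda r: r["metrics"][metric]
        )

tracker = Tracker()
tracker.log_run({"lr": 0.1, "epochs": 5}, {"accuracy": 0.81})
tracker.log_run({"lr": 0.01, "epochs": 10}, {"accuracy": 0.87})

best = tracker.best_run("accuracy")
print(json.dumps(best["params"]))  # → {"lr": 0.01, "epochs": 10}
```

In real MLflow the same shape appears as `mlflow.log_param` and `mlflow.log_metric` calls inside a `mlflow.start_run()` context, with run comparison handled by the tracking server.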
Validation
Score: 100%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

11 / 11 checks passed (validation for skill structure). No warnings or errors.