Build end-to-end MLOps pipelines from data preparation through model training, validation, and production deployment. Use when creating ML pipelines, implementing MLOps practices, or automating model training and deployment workflows.
No eval scenarios have been run. No known issues.

Optimize this skill with Tessl:

```shell
npx tessl skill review --optimize ./plugins/machine-learning-ops/skills/ml-pipeline-workflow/SKILL.md
```

Quality — 37%
Does it follow best practices?
Discovery — 67%
Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
The description is structurally sound with a clear 'what' and explicit 'Use when' clause, which is its strongest aspect. However, the capabilities listed are high-level pipeline stages rather than specific concrete actions, and the trigger terms could be expanded to cover more natural user language variations. It occupies a reasonable niche but could better differentiate itself from adjacent ML or DevOps skills.
Suggestions

- Add more specific concrete actions such as 'configure experiment tracking, set up model registries, implement feature stores, automate retraining schedules' to increase specificity.
- Expand trigger terms to include natural variations like 'machine learning pipeline', 'model serving', 'CI/CD for ML', 'experiment tracking', 'model monitoring', and common tool names users might reference.
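Taken together, the two suggestions could be sketched in the skill's frontmatter roughly as follows. The wording below is hypothetical, assembled from the suggestions above, and is not the skill's actual description:

```yaml
# Hypothetical revision; the extra actions and trigger terms come from
# the review's suggestions, not from the skill itself.
description: >
  Build end-to-end MLOps pipelines from data preparation through model
  training, validation, and production deployment: configure experiment
  tracking, set up model registries, implement feature stores, and
  automate retraining schedules. Use when creating machine learning
  pipelines, setting up CI/CD for ML, model serving, experiment tracking,
  or model monitoring, or when working with tools such as MLflow or
  Kubeflow.
```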
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | Names the domain (MLOps) and lists some actions ('data preparation', 'model training', 'validation', 'production deployment'), but these are high-level pipeline stages rather than multiple specific concrete actions like 'create feature stores, configure hyperparameter tuning, set up model registries, implement A/B testing'. | 2 / 3 |
| Completeness | Clearly answers both 'what' (build end-to-end MLOps pipelines from data preparation through model training, validation, and production deployment) and 'when' (explicit 'Use when' clause covering creating ML pipelines, implementing MLOps practices, or automating model training and deployment workflows). | 3 / 3 |
| Trigger Term Quality | Includes relevant keywords like 'ML pipelines', 'MLOps', 'model training', 'deployment workflows', but misses common natural variations users might say such as 'machine learning', 'CI/CD for models', 'model serving', 'experiment tracking', 'feature engineering', 'model registry', or specific tools like 'Kubeflow', 'MLflow'. | 2 / 3 |
| Distinctiveness / Conflict Risk | The MLOps focus provides some distinctiveness, but terms like 'model training' and 'data preparation' could overlap with general data science or machine learning skills. The description doesn't clearly delineate boundaries from adjacent skills like a general ML skill or a deployment/DevOps skill. | 2 / 3 |
| Total | | 9 / 12 — Passed |
Implementation — 7%
Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
This skill is essentially a high-level outline or table of contents for MLOps concepts, not an actionable skill. It extensively describes what an ML pipeline should contain but provides almost no executable code, concrete commands, or specific implementation guidance. The vast majority of content is general knowledge that Claude already possesses, making it a poor use of context window tokens.
Suggestions

- Replace the descriptive sections with concrete, executable code examples — e.g., a complete minimal Airflow DAG or Dagster pipeline definition that actually runs, rather than a Python list of stage name strings.
- Remove sections that describe concepts Claude already knows (tool listings, generic best practices like 'modularity' and 'idempotency', platform enumerations) and replace them with specific implementation patterns with real code.
- Add explicit validation checkpoints with actual commands — e.g., 'Run `great_expectations checkpoint run my_checkpoint` and verify all expectations pass before proceeding to training'.
- Either provide the referenced bundle files (references/*.md, assets/*.yaml.template) or inline the critical content; currently the skill defers all actionable content to files that don't exist.
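As a minimal sketch of what "executable content with validation checkpoints" could look like, the following plain-Python pipeline wires a fail-fast validation gate between data preparation and training. All names (`prepare_data`, `validate_data`, `train_model`, `run_pipeline`) and thresholds are illustrative assumptions, not taken from the reviewed skill:

```python
# Illustrative sketch, not from the skill: a tiny pipeline with an
# explicit validation gate between data preparation and training.

def prepare_data(raw):
    """Data preparation: drop rows containing missing values."""
    return [row for row in raw if None not in row]

def validate_data(rows, min_rows=3):
    """Validation checkpoint: fail fast before training on bad data."""
    if len(rows) < min_rows:
        raise ValueError(f"expected at least {min_rows} rows, got {len(rows)}")
    return rows

def train_model(rows):
    """Stand-in for real training: average the final (label) column."""
    labels = [row[-1] for row in rows]
    return {"mean_label": sum(labels) / len(labels)}

def run_pipeline(raw):
    # Each stage's output feeds the next; the gate sits in the middle.
    rows = validate_data(prepare_data(raw))
    return train_model(rows)

model = run_pipeline([(1, 2, 10), (3, None, 20), (4, 5, 30), (6, 7, 40)])
print(model)
```

In a real skill the same gate pattern would call an actual tool (e.g. a Great Expectations checkpoint) instead of a hand-rolled row count, but the control flow — prepare, validate, only then train — is the point.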
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | Extremely verbose and padded with information Claude already knows. Lists of orchestration tools, deployment platforms, best practices like 'modularity' and 'idempotency' are general knowledge. The 'When to Use This Skill' and 'What This Skill Provides' sections are meta-descriptions that waste tokens describing the skill rather than teaching it. The entire document reads more like a table of contents or marketing overview than actionable instruction. | 1 / 3 |
| Actionability | Almost no executable code or concrete commands. The Python code examples are either trivial (a list of stage name strings) or empty comments pointing to other files ('# See references/data-preparation.md'). The YAML example is a skeleton with no real content. The skill describes what to do at a high level but never shows how to actually do any of it. | 1 / 3 |
| Workflow Clarity | The Production Workflow section lists phases but provides no concrete commands, no validation checkpoints with actual tools/commands, and no feedback loops for error recovery. The 'Debugging Steps' are generic advice ('check pipeline logs'). For a skill involving complex multi-step pipeline operations, there are no explicit validation gates or retry mechanisms with actual implementation. | 1 / 3 |
| Progressive Disclosure | The skill references external files in references/ and assets/ directories with clear signaling, and has a structured section layout. However, no bundle files are provided, so all those references are dead links. The content that is inline is mostly high-level description rather than useful overview content, and the 'Progressive Disclosure' section ironically just lists complexity levels without actually implementing progressive disclosure. | 2 / 3 |
| Total | | 5 / 12 — Passed |
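The "feedback loops for error recovery" gap called out under Workflow Clarity can be illustrated with a small retry wrapper around a flaky pipeline step. This is a generic sketch under assumed names (`with_retries`, `flaky_step`), not part of the reviewed skill:

```python
import time

# Generic sketch of an error-recovery loop for a flaky pipeline step.
def with_retries(step, attempts=3, delay=0.0):
    """Retry a step, re-raising only after the final attempt fails."""
    def wrapped(*args, **kwargs):
        for attempt in range(1, attempts + 1):
            try:
                return step(*args, **kwargs)
            except Exception:
                if attempt == attempts:
                    raise
                time.sleep(delay)  # back off before the next attempt
    return wrapped

calls = {"n": 0}

def flaky_step():
    """Fails twice, then succeeds, simulating a transient error."""
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("transient failure")
    return "ok"

result = with_retries(flaky_step)()
print(result)
```

A skill with real workflow clarity would pair a mechanism like this with explicit guidance on which steps are safe to retry (idempotent data loads) and which are not.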
Validation — 100%
Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation for skill structure: 11 / 11 checks passed. No warnings or errors.