
ml-pipeline-workflow

Build end-to-end MLOps pipelines from data preparation through model training, validation, and production deployment. Use when creating ML pipelines, implementing MLOps practices, or automating model training and deployment workflows.

Quality: 43% (Does it follow best practices?)

Impact: 73%, 0.98x (Average score across 3 eval scenarios)

Security (by Snyk): Passed. No known issues.

Optimize this skill with Tessl

npx tessl skill review --optimize ./tests/ext_conformance/artifacts/agents-wshobson/machine-learning-ops/skills/ml-pipeline-workflow/SKILL.md

Quality

Discovery

67%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

The description is structurally sound with a clear 'what' and explicit 'when' clause, which is its strongest aspect. However, it operates at a high level of abstraction—listing pipeline phases rather than concrete actions—and could benefit from more specific trigger terms and concrete capabilities to better differentiate it from adjacent skills in data science, DevOps, or data engineering.

Suggestions

Add more specific concrete actions such as 'configure experiment tracking, set up model registries, implement A/B testing, create feature stores, build CI/CD for models'.

Expand trigger terms in the 'Use when' clause to include natural user phrases like 'machine learning', 'model registry', 'experiment tracking', 'model serving', 'ML CI/CD', or specific tools like 'MLflow', 'Kubeflow'.

| Dimension | Reasoning | Score |
| --- | --- | --- |
| Specificity | Names the domain (MLOps) and lists high-level stages (data preparation, model training, validation, deployment), but these are broad phases rather than multiple specific concrete actions like 'fill forms, merge documents'. It doesn't specify concrete tools, formats, or granular operations. | 2 / 3 |
| Completeness | Clearly answers both 'what' (build end-to-end MLOps pipelines from data preparation through deployment) and 'when' with an explicit 'Use when...' clause covering ML pipelines, MLOps practices, and automating training/deployment workflows. | 3 / 3 |
| Trigger Term Quality | Includes relevant terms like 'ML pipelines', 'MLOps', 'model training', 'deployment workflows', but misses common user variations such as 'machine learning', 'CI/CD for models', 'model serving', 'experiment tracking', 'feature engineering', or specific framework names users might mention. | 2 / 3 |
| Distinctiveness / Conflict Risk | The MLOps focus provides some distinctiveness, but terms like 'data preparation', 'model training', and 'deployment' could overlap with general data science skills, deployment/DevOps skills, or data engineering skills. The scope is broad enough to potentially conflict with more specialized skills. | 2 / 3 |
| Total | | 9 / 12 |

Passed

Implementation

20%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

This skill reads like a high-level overview document or course outline rather than an actionable skill file. It extensively catalogs MLOps concepts, tools, and patterns that Claude already knows, while providing almost no executable code, specific commands, or concrete implementation details. The content would benefit enormously from being condensed to a lean overview with actual working examples and deferring detailed content to the referenced files.

Suggestions

Replace the abstract descriptions with executable, copy-paste-ready code examples — e.g., a minimal working Airflow DAG, a concrete MLflow experiment tracking snippet, or a real validation script with Great Expectations.
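To illustrate the kind of copy-paste-ready pipeline step this suggestion asks for, here is a minimal sketch using only the standard library. The function, parameter names, and artifact layout are hypothetical stand-ins; a real skill file would show the named tools (an Airflow DAG, `mlflow.log_param`/`mlflow.log_metric` calls, or a Great Expectations suite) instead of this simplified run-logging pattern.

```python
import json
import tempfile
from pathlib import Path


def train_step(params: dict, out_dir: Path) -> Path:
    """Hypothetical training step: fits a trivial 'model' and records
    params and metrics to a run artifact, the way an experiment-tracking
    tool such as MLflow would."""
    data = [1.0, 2.0, 3.0, 4.0]          # stand-in training data
    model = sum(data) / len(data)        # stand-in "model": the mean
    run = {"params": params, "metrics": {"train_mean": model}}
    artifact = out_dir / "run.json"
    artifact.write_text(json.dumps(run))
    return artifact


with tempfile.TemporaryDirectory() as d:
    artifact = train_step({"lr": 0.01}, Path(d))
    record = json.loads(artifact.read_text())
    print(record["metrics"]["train_mean"])  # 2.5
```

The point is not the trivial model but the shape: a step that takes explicit parameters, produces a named artifact, and records what it did, so an agent can copy, run, and adapt it immediately.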

Cut sections that merely list concepts Claude already knows (Integration Points, Deployment Strategies bullet lists, 'When to Use This Skill') and replace with a concise overview pointing to reference files.

Add explicit validation checkpoints with concrete commands in the workflow (e.g., 'Run `great_expectations checkpoint run my_checkpoint` — only proceed if all expectations pass').
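A validation checkpoint of this kind can be sketched in plain Python. The expectations and row schema below are hypothetical stand-ins for a Great Expectations checkpoint; the pattern to copy is the hard gate that halts the pipeline unless every expectation passes.

```python
def run_checkpoint(rows: list) -> bool:
    """Hypothetical validation gate: stands in for
    `great_expectations checkpoint run my_checkpoint`."""
    expectations = [
        lambda r: r["price"] >= 0,       # cf. expect_column_values_to_be_between
        lambda r: r["sku"] is not None,  # cf. expect_column_values_to_not_be_null
    ]
    return all(check(row) for row in rows for check in expectations)


rows = [{"sku": "A1", "price": 9.99}, {"sku": "A2", "price": 0.0}]
if not run_checkpoint(rows):
    # Feedback loop: stop here and surface the failure instead of training
    # on bad data.
    raise SystemExit("Validation failed: halting pipeline before training")
print("checkpoint passed")
```

Only proceed to the training phase when the gate returns True; on failure, the explicit exit gives the agent an unambiguous recovery point.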

Move the bulk of the content into the referenced files (data-preparation.md, model-training.md, etc.) and keep SKILL.md as a lean entry point with one concrete quick-start example and clear navigation.

| Dimension | Reasoning | Score |
| --- | --- | --- |
| Conciseness | Extremely verbose and padded with high-level descriptions Claude already knows. Sections like 'When to Use This Skill', 'What This Skill Provides', 'Integration Points', and 'Common Patterns' are largely enumerations of concepts (canary deployments, A/B testing, DAG orchestration) without adding actionable knowledge. The document reads like a table of contents or course syllabus rather than a lean skill file. | 1 / 3 |
| Actionability | Almost no executable code or concrete commands. The Python snippets are either trivial lists of strings, comments pointing to other files ('See assets/...'), or empty pseudocode placeholders. There are no copy-paste-ready examples of actually building a pipeline step, configuring an orchestrator, or deploying a model. The 'Real-time Feature Pipeline' and 'Continuous Training' sections are literally just comments. | 1 / 3 |
| Workflow Clarity | The Production Workflow section provides a clear four-phase sequence with sub-steps, and the Debugging Steps section offers a reasonable troubleshooting sequence. However, there are no explicit validation checkpoints with concrete commands, no feedback loops for error recovery, and the validation phase is described abstractly ('Run validation test suite') rather than with specific tools or commands. | 2 / 3 |
| Progressive Disclosure | References to external files (references/ directory, assets/ directory) are clearly signaled and appear to be one level deep, which is good. However, the main SKILL.md itself is a monolithic wall of text with extensive inline content that should be in those reference files. The 'Progressive Disclosure' section ironically describes levels of complexity rather than actually implementing progressive disclosure in the document structure. | 2 / 3 |
| Total | | 6 / 12 |

Passed

Validation

100%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation: 11 / 11 Passed

Validation for skill structure

No warnings or errors.

Repository: Dicklesworthstone/pi_agent_rust (Reviewed)
