
ml-pipeline-workflow

Build end-to-end MLOps pipelines from data preparation through model training, validation, and production deployment. Use when creating ML pipelines, implementing MLOps practices, or automating model training and deployment workflows.

53

0.98x
Quality: 37% (Does it follow best practices?)

Impact: 73% (0.98x), average score across 3 eval scenarios

Security (by Snyk): Passed, no known issues

Optimize this skill with Tessl

npx tessl skill review --optimize ./tests/ext_conformance/artifacts/agents-wshobson/machine-learning-ops/skills/ml-pipeline-workflow/SKILL.md

Quality

Discovery

67%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

The description is competent with a clear structure including both 'what' and 'when' clauses, and it correctly uses third person voice. However, it stays at a relatively high level of abstraction for the MLOps domain without listing enough specific concrete actions or natural trigger terms that users might use. Adding more specific capabilities and user-facing keywords would strengthen it.

Suggestions

Add more specific concrete actions such as 'configure experiment tracking, manage model registries, set up feature stores, implement model monitoring and drift detection'.

Expand trigger terms in the 'Use when' clause to include natural variations like 'machine learning', 'CI/CD for models', 'model serving', 'experiment tracking', 'model monitoring', '.pkl files'.
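
The trigger-term gap can be checked mechanically. The sketch below is a hypothetical helper, not part of Tessl or the skill: it takes the skill description quoted at the top of this review and the terms listed in the suggestion above, and reports which terms the description does not contain. Case-insensitive substring matching is an assumption; a real discovery system may match differently.

```python
from typing import List

# Skill description, copied from the top of this review.
DESCRIPTION = (
    "Build end-to-end MLOps pipelines from data preparation through model "
    "training, validation, and production deployment. Use when creating ML "
    "pipelines, implementing MLOps practices, or automating model training "
    "and deployment workflows."
)

# Trigger terms proposed in the suggestion above.
SUGGESTED_TERMS = [
    "machine learning",
    "CI/CD for models",
    "model serving",
    "experiment tracking",
    "model monitoring",
    ".pkl files",
]


def missing_terms(description: str, terms: List[str]) -> List[str]:
    """Return the terms that do not occur in the description (case-insensitive)."""
    lowered = description.lower()
    return [term for term in terms if term.lower() not in lowered]


if __name__ == "__main__":
    for term in missing_terms(DESCRIPTION, SUGGESTED_TERMS):
        print(f"missing trigger term: {term}")
```

Run against the current description, every suggested term is reported as missing, while terms the review credits (such as 'MLOps' and 'model training') are found.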

Dimension (Score): Reasoning

Specificity (2 / 3): Names the domain (MLOps) and lists some actions (data preparation, model training, validation, deployment), but these are fairly high-level stages rather than multiple specific concrete actions like 'create feature stores, version datasets, configure hyperparameter sweeps, set up A/B testing'.

Completeness (3 / 3): Clearly answers both 'what' (build end-to-end MLOps pipelines from data preparation through deployment) and 'when' (explicit 'Use when' clause covering ML pipelines, MLOps practices, and automating training/deployment workflows).

Trigger Term Quality (2 / 3): Includes relevant terms like 'ML pipelines', 'MLOps', 'model training', 'deployment workflows', but misses common user variations such as 'machine learning', 'CI/CD for models', 'model serving', 'experiment tracking', 'feature engineering', 'model registry'.

Distinctiveness Conflict Risk (2 / 3): The MLOps focus provides some distinctiveness, but terms like 'data preparation', 'model training', and 'deployment' could overlap with general data science skills, deployment/DevOps skills, or data engineering skills. The scope is broad enough to potentially conflict with more specialized skills.

Total: 9 / 12 (Passed)

Implementation

7%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

This skill is essentially a high-level overview document that describes MLOps concepts Claude already knows, without providing any concrete, executable guidance. It reads like a course outline or documentation index rather than an actionable skill. The code examples are either trivial (a list of stage name strings) or empty placeholders pointing to non-existent reference files, making the entire skill non-actionable.

Suggestions

Replace abstract descriptions with complete, executable pipeline code examples — e.g., a working Airflow DAG or Dagster pipeline definition that can be adapted, not just a list of stage names.

Remove sections that explain concepts Claude already knows (tool comparisons, definitions of canary deployments, what idempotency means) and focus on project-specific conventions or non-obvious implementation details.

Add explicit validation checkpoints with concrete commands — e.g., 'Run `great_expectations checkpoint run data_validation` and only proceed to training if all expectations pass.'

Either provide the referenced bundle files (data-preparation.md, model-training.md, pipeline-dag.yaml.template, etc.) with real content, or inline the essential actionable content directly in the skill.
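
To make the validation-checkpoint suggestion concrete, here is a minimal sketch of a pipeline runner in which every stage has a gate that must pass before the next stage starts. It uses only the Python standard library and is not an Airflow or Dagster definition. All names (`Stage`, `CheckpointFailed`, the toy stages, and the `min_rows` / `min_accuracy` thresholds) are hypothetical and chosen for illustration; the skill under review defines none of them.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List

Context = Dict[str, Any]


class CheckpointFailed(Exception):
    """Raised when a stage's gate fails, so later stages never run."""


@dataclass
class Stage:
    name: str
    run: Callable[[Context], Context]   # does the work, returns a new context
    check: Callable[[Context], bool]    # gate: must return True to proceed


def run_pipeline(stages: List[Stage], context: Context) -> Context:
    """Run stages in order, stopping at the first failed checkpoint."""
    for stage in stages:
        context = stage.run(context)
        if not stage.check(context):
            raise CheckpointFailed(f"checkpoint failed after stage '{stage.name}'")
        context.setdefault("completed", []).append(stage.name)
    return context


# --- Toy stages (illustrative only) ---------------------------------------
def prepare(ctx: Context) -> Context:
    # Drop rows without a label.
    rows = [r for r in ctx["raw"] if r.get("label") is not None]
    return {**ctx, "rows": rows}


def train(ctx: Context) -> Context:
    # Stand-in "model": always predict the most common label.
    labels = [r["label"] for r in ctx["rows"]]
    majority = max(set(labels), key=labels.count)
    return {**ctx, "model": majority, "accuracy": labels.count(majority) / len(labels)}


def deploy(ctx: Context) -> Context:
    return {**ctx, "deployed": True}


STAGES = [
    Stage("prepare", prepare, lambda c: len(c["rows"]) >= c["min_rows"]),
    Stage("train", train, lambda c: c["accuracy"] >= c["min_accuracy"]),
    Stage("deploy", deploy, lambda c: c["deployed"]),
]

if __name__ == "__main__":
    raw = [{"label": 1}, {"label": 1}, {"label": 1}, {"label": 0}, {"label": None}]
    result = run_pipeline(STAGES, {"raw": raw, "min_rows": 3, "min_accuracy": 0.7})
    print(result["completed"])
```

In a real skill, each `check` would likely wrap an external validation command (the review's own example is a Great Expectations checkpoint run) and the stages would map onto tasks in whichever orchestrator the project uses. The point of the sketch is only the gating structure: a failed check raises, and deployment is never reached.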

Dimension (Score): Reasoning

Conciseness (1 / 3): Extremely verbose and padded with information Claude already knows. Lists of orchestration tools, deployment platforms, experiment tracking tools, and best practices like 'modularity' and 'idempotency' are all general knowledge. The 'When to Use This Skill' and 'What This Skill Provides' sections are meta-descriptions that waste tokens without adding actionable value. The entire document reads like a table of contents or course syllabus rather than a skill.

Actionability (1 / 3): Almost no executable code or concrete commands. The Python code block is just a list of strings. The YAML example is a skeleton. Multiple code blocks contain only comments pointing to other files (e.g., '# See references/model-training.md'). The troubleshooting section gives vague advice like 'Check dependencies and data availability.' Nothing is copy-paste ready or executable.

Workflow Clarity (1 / 3): While the production workflow lists four phases with sub-steps, these are entirely abstract descriptions with no concrete commands, validation checkpoints, or feedback loops. There's no explicit validation step that gates progression (e.g., 'only proceed if X passes'). For a skill involving multi-step pipeline operations, the absence of any concrete validation or error recovery mechanism is a significant gap.

Progressive Disclosure (2 / 3): The skill references external files in references/ and assets/ directories with clear descriptions, which is good structure. However, no bundle files are provided, so these references are broken. The main file itself is a monolithic wall of text that includes extensive content (best practices, integration points, troubleshooting) that should be in reference files. The 'Progressive Disclosure' section ironically describes levels of complexity rather than implementing progressive disclosure.

Total: 5 / 12 (Passed)
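
The broken-reference finding under Progressive Disclosure can be detected before publishing. The following sketch scans a skill's text for paths under `references/` or `assets/` (the two directories named in the review) and lists those that do not exist next to the skill file. The regular expression and both function names are assumptions made for illustration; this is not how Tessl performs its check.

```python
import re
from pathlib import Path
from typing import List

# Assumed convention: bundle files are mentioned as references/... or assets/...
REF_PATTERN = re.compile(r"\b(?:references|assets)/[\w./-]+")


def referenced_paths(skill_text: str) -> List[str]:
    """Return unique referenced bundle paths, in order of first appearance."""
    seen: List[str] = []
    for match in REF_PATTERN.findall(skill_text):
        path = match.rstrip(".")  # drop sentence-ending punctuation
        if path not in seen:
            seen.append(path)
    return seen


def missing_references(skill_text: str, skill_dir: Path) -> List[str]:
    """Return referenced paths that are not files under skill_dir."""
    return [p for p in referenced_paths(skill_text) if not (skill_dir / p).is_file()]


if __name__ == "__main__":
    skill_file = Path("SKILL.md")
    if skill_file.is_file():
        for path in missing_references(skill_file.read_text(), skill_file.parent):
            print(f"broken reference: {path}")
```

Running this from the skill's directory prints one line per referenced file that is absent, which gives the author a concrete list to either supply or inline.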

Validation

100%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation: 11 / 11 Passed

Validation for skill structure

No warnings or errors.

Repository: Dicklesworthstone/pi_agent_rust (Reviewed)
