ml-pipeline-workflow

Build end-to-end MLOps pipelines from data preparation through model training, validation, and production deployment. Use when creating ML pipelines, implementing MLOps practices, or automating model training and deployment workflows.

Install with Tessl CLI

npx tessl i github:Dicklesworthstone/pi_agent_rust --skill ml-pipeline-workflow

66

0.98x

Quality: 56% (Does it follow best practices?)

Impact: 73% · 0.98x (Average score across 3 eval scenarios)
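As a quick check on how the Impact figure could be derived, the sketch below averages the three headline scenario scores listed under Evaluation results (64%, 56%, 100%). It assumes those headline numbers are the with-context scores for each scenario and that Impact is their simple unweighted mean; the page does not state the formula, and the 0.98x multiplier is not reproduced here.

```python
# Headline score shown above each eval scenario.
# Assumption: these are the with-context scores and Impact is their plain mean.
scenario_scores = {
    "Customer Churn Prediction": 64,
    "House Price Model": 56,
    "Fraud Detection Model": 100,
}

impact = sum(scenario_scores.values()) / len(scenario_scores)
print(f"Impact: {impact:.0f}%")
```

Under these assumptions the mean rounds to the 73% shown for Impact.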

Optimize this skill with Tessl

npx tessl skill review --optimize ./tests/ext_conformance/artifacts/agents-wshobson/machine-learning-ops/skills/ml-pipeline-workflow/SKILL.md

Evaluation results

64%

8%

Customer Churn Prediction: Data Preparation Pipeline

Data preparation pipeline with validation and lineage

| Criteria | Without context | With context |
| --- | --- | --- |
| Data validation library | 0% | 0% |
| Dataset versioning tool | 0% | 0% |
| Feature engineering documentation | 100% | 100% |
| Data lineage tracking | 70% | 80% |
| Stage-level metric logging | 100% | 100% |
| Pipeline stage ordering | 100% | 100% |
| Stage modularity | 100% | 100% |
| Idempotency design | 0% | 0% |
| Train/val/test split | 0% | 50% |
| Input/output validation at boundaries | 100% | 100% |
| Failure handling | 62% | 100% |

Without context: $0.6694 · 3m 12s · 27 turns · 74 in / 11,651 out tokens

With context: $0.8624 · 3m 43s · 27 turns · 189 in / 14,591 out tokens

56%

-17%

House Price Model: Reproducible Training Pipeline

Model training pipeline with experiment tracking and registry

| Criteria | Without context | With context |
| --- | --- | --- |
| Experiment tracking tool | 100% | 0% |
| Model registry usage | 100% | 16% |
| Named pipeline stages | 100% | 100% |
| Per-stage metric logging | 20% | 30% |
| Data version tracking | 100% | 100% |
| Code version tracking | 25% | 87% |
| Model version tracking | 100% | 100% |
| Hyperparameter logging | 100% | 100% |
| Validation stage present | 100% | 100% |
| Failure handling | 62% | 62% |
| Stage idempotency | 0% | 0% |

Without context: $1.1116 · 4m 47s · 41 turns · 50 in / 14,994 out tokens

With context: $0.7214 · 2m 55s · 31 turns · 349 in / 10,070 out tokens

100%

7%

Fraud Detection Model: Safe Production Rollout

Production deployment strategy with canary and rollback

| Criteria | Without context | With context |
| --- | --- | --- |
| Shadow deployment stage | 100% | 100% |
| Canary release stage | 100% | 100% |
| A/B testing infrastructure | 100% | 100% |
| Automated rollback trigger | 100% | 100% |
| Rollback mechanism | 100% | 100% |
| Latency monitoring | 100% | 100% |
| Throughput monitoring | 37% | 100% |
| Model performance drift monitoring | 100% | 100% |
| Separated training/serving infra | 100% | 100% |
| No direct hard cutover | 100% | 100% |
| Production traffic validation | 100% | 100% |
| Serving platform reference | 50% | 100% |

Without context: $0.5862 · 3m 52s · 21 turns · 76 in / 12,146 out tokens

With context: $0.7417 · 3m 38s · 28 turns · 289 in / 12,261 out tokens

Evaluated
Agent: Claude Code
Model: Claude Sonnet 4.6
