ML engineering skill for productionizing models, building MLOps pipelines, and integrating LLMs. Covers model deployment, feature stores, drift monitoring, RAG systems, and cost optimization. Use when the user asks about deploying ML models to production, setting up MLOps infrastructure (MLflow, Kubeflow, Kubernetes, Docker), monitoring model performance or drift, building RAG pipelines, or integrating LLM APIs with retry logic and cost controls. Focused on production and operational concerns rather than model research or initial training.
87
78%
Does it follow best practices?
Impact
93%
1.57x
Average score across 6 eval scenarios
Passed
No known issues
Optimize this skill with Tessl
npx tessl skill review --optimize ./engineering-team/senior-ml-engineer/SKILL.md
LLM integration with retry, fallback, and cost tracking
Abstract provider class | 100% | 100%
Concrete providers | 100% | 100%
Tenacity retry decorator | 0% | 100%
Retry parameters | 0% | 100%
Fallback implementation | 0% | 100%
tiktoken token counting | 0% | 0%
Cost calculation | 100% | 100%
Correct pricing values | 50% | 100%
Pydantic response model | 0% | 100%
Response validation used | 0% | 100%
Response caching | 0% | 0%
Cost summary output | 100% | 100%
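The retry and fallback criteria above can be illustrated with a minimal sketch. The scenario names the Tenacity library for retries; this example swaps in a plain standard-library loop with exponential backoff so it runs without dependencies. The names `call_with_retry` and `complete_with_fallback`, and the idea that each provider is a plain callable taking a prompt, are hypothetical and not taken from the skill itself.

```python
import time
from typing import Callable, Sequence


def call_with_retry(fn: Callable[[str], str], prompt: str,
                    attempts: int = 3, base_delay: float = 0.5) -> str:
    """Call fn(prompt), retrying on any exception with exponential backoff."""
    last_error: Exception | None = None
    for attempt in range(attempts):
        try:
            return fn(prompt)
        except Exception as exc:  # a real client would catch narrower errors
            last_error = exc
            if attempt < attempts - 1:
                # Wait base_delay, 2*base_delay, 4*base_delay, ... between tries
                time.sleep(base_delay * (2 ** attempt))
    assert last_error is not None
    raise last_error


def complete_with_fallback(providers: Sequence[Callable[[str], str]], prompt: str,
                           attempts: int = 3, base_delay: float = 0.5) -> str:
    """Try each provider in order; move to the next once retries are exhausted."""
    errors = []
    for fn in providers:
        try:
            return call_with_retry(fn, prompt, attempts, base_delay)
        except Exception as exc:
            errors.append(exc)
    raise RuntimeError(f"all providers failed: {errors}")
```

A caller would pass the primary provider first and cheaper or more available backups after it, so the fallback order doubles as a preference order.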
Model deployment containerization and canary workflow
python:3.11-slim base image | 100% | 100%
Health check endpoint | 37% | 100%
Uvicorn CMD | 62% | 100%
Port 8080 exposed | 0% | 100%
Model export step | 0% | 0%
Canary at 5% | 100% | 100%
1 hour canary window | 0% | 100%
p95 latency threshold | 0% | 100%
Error rate threshold | 100% | 100%
K8s memory limits | 100% | 100%
K8s CPU limits | 100% | 100%
K8s readiness probe | 100% | 100%
RAG pipeline with hybrid search, reranking, and chunking
RecursiveCharacterTextSplitter | 0% | 0%
Chunking separators | 0% | 50%
Embedding cache using hash | 100% | 100%
Batch embedding support | 100% | 100%
BM25 sparse retrieval | 100% | 80%
Hybrid score combination | 50% | 100%
Alpha parameter | 100% | 100%
Reranking step | 90% | 100%
Reranker sorts results | 100% | 100%
Query function integration | 100% | 100%
Demo runs without errors | 100% | 100%
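The "Hybrid score combination" and "Alpha parameter" criteria can be pictured with a short sketch. It assumes, without confirmation from the skill, that dense and sparse (for example BM25) scores are min-max normalized and then blended as alpha * dense + (1 - alpha) * sparse. The function names and the dict-of-scores input shape are hypothetical.

```python
def min_max(scores: dict[str, float]) -> dict[str, float]:
    """Rescale scores to [0, 1] so dense and sparse values are comparable."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All candidates tie; treat them as equally relevant
        return {doc_id: 1.0 for doc_id in scores}
    return {doc_id: (v - lo) / (hi - lo) for doc_id, v in scores.items()}


def hybrid_scores(dense: dict[str, float], sparse: dict[str, float],
                  alpha: float = 0.5) -> list[tuple[str, float]]:
    """Blend normalized scores; alpha=1.0 is dense only, alpha=0.0 is sparse only."""
    d, s = min_max(dense), min_max(sparse)
    combined = {
        doc_id: alpha * d.get(doc_id, 0.0) + (1.0 - alpha) * s.get(doc_id, 0.0)
        for doc_id in set(d) | set(s)
    }
    # Highest combined score first, ready to hand to a reranker
    return sorted(combined.items(), key=lambda kv: kv[1], reverse=True)
```

Documents missing from one retriever get a score of 0.0 on that side, which is one possible choice; dropping them or using rank fusion are alternatives.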
Drift detection and alert thresholds
KS test for drift | 100% | 100%
KS output fields | 50% | 100%
KS drift flag threshold | 100% | 100%
p95 latency warning | 0% | 100%
p95 latency critical | 0% | 100%
Error rate warning | 0% | 100%
Error rate critical | 0% | 100%
PSI warning threshold | 0% | 100%
PSI critical threshold | 0% | 100%
Accuracy drop thresholds | 0% | 100%
Retraining at PSI>0.2 | 0% | 100%
Demo output | 100% | 100%
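The "Retraining at PSI>0.2" criterion can be made concrete with a small sketch of the Population Stability Index. The 0.2 retraining threshold comes from the criterion above; the equal-width binning over the reference range, the epsilon floor for empty bins, and the names `psi` and `should_retrain` are assumptions for illustration.

```python
import math


def psi(expected: list[float], actual: list[float],
        bins: int = 10, eps: float = 1e-6) -> float:
    """Population Stability Index between a reference and a current sample.

    Assumes both samples are non-empty. Bins are equal-width over the
    reference range; values outside it are clamped to the edge bins.
    """
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins or 1.0  # avoid zero width for constant data

    def fractions(values: list[float]) -> list[float]:
        counts = [0] * bins
        for v in values:
            idx = min(max(int((v - lo) / width), 0), bins - 1)
            counts[idx] += 1
        # Floor at eps so the log term is defined for empty bins
        return [max(c / len(values), eps) for c in counts]

    e, a = fractions(expected), fractions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


def should_retrain(psi_value: float, threshold: float = 0.2) -> bool:
    """Flag retraining when PSI exceeds the threshold named in the criteria."""
    return psi_value > threshold
```

Identical samples give a PSI of zero, and the value grows as mass moves between bins, so the threshold acts as a tolerance for distribution shift.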
MLflow model registry and feature store
Feast Entity definition | 0% | 100%
Feast FeatureView | 0% | 100%
FileSource in FeatureView | 0% | 100%
TTL set on FeatureView | 0% | 100%
mlflow.start_run used | 100% | 100%
mlflow.log_metric called | 100% | 100%
Model logged to MLflow | 100% | 100%
mlflow.register_model called | 40% | 100%
PSI drift trigger | 80% | 100%
Scheduled retrain trigger | 100% | 100%
Performance drop trigger | 100% | 100%
Pipeline runs without error | 100% | 100%
A/B testing with deterministic traffic splitting
Hash-based assignment | 100% | 100%
Experiment key in hash input | 100% | 100%
Bucket conversion | 100% | 100%
Sticky assignment | 100% | 100%
Configurable control_pct | 100% | 100%
Returns control/treatment | 100% | 100%
Primary metric collection | 100% | 100%
Guardrail metric tracking | 100% | 100%
p-value < 0.05 threshold | 100% | 100%
Minimum sample size check | 70% | 100%
Demo runs without errors | 100% | 100%
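The first six criteria above describe deterministic traffic splitting, which a minimal sketch can illustrate. The parameter name `control_pct` and the "control"/"treatment" return values come from the criteria; the function name `assign_variant`, the choice of SHA-256, and the "experiment:user" key format are assumptions.

```python
import hashlib


def assign_variant(user_id: str, experiment_key: str,
                   control_pct: float = 0.5) -> str:
    """Deterministically map a user to 'control' or 'treatment'.

    The experiment key is part of the hash input, so the same user can
    land in different arms across different experiments.
    """
    digest = hashlib.sha256(f"{experiment_key}:{user_id}".encode("utf-8")).hexdigest()
    # Convert the first 32 bits of the hash to a bucket in [0, 1)
    bucket = int(digest[:8], 16) / 0x100000000
    return "control" if bucket < control_pct else "treatment"
```

Because the assignment depends only on the inputs, it is sticky without any stored state: calling it again for the same user and experiment returns the same arm.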
a96cc20