
senior-ml-engineer

tessl i github:alirezarezvani/claude-skills --skill senior-ml-engineer

World-class ML engineering skill for productionizing ML models, MLOps, and building scalable ML systems. Expertise in PyTorch, TensorFlow, model deployment, feature stores, model monitoring, and ML infrastructure. Includes LLM integration, fine-tuning, RAG systems, and agentic AI. Use when deploying ML models, building ML platforms, implementing MLOps, or integrating LLMs into production systems.

Overall: 52%


Validation: 88%

| Criteria | Description | Result |
| --- | --- | --- |
| metadata_version | 'metadata' field is not a dictionary | Warning |
| license_field | 'license' field is missing | Warning |

Total: 14 / 16 Passed

Implementation: 7%

This skill is a verbose collection of buzzwords and generic best practices rather than actionable ML engineering guidance. It lists technologies and concepts without providing concrete implementation details, executable code, or clear workflows. The content describes what a senior ML engineer should know rather than teaching Claude how to perform specific ML engineering tasks.
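By way of illustration, the concrete, self-contained code the review finds missing might look like this minimal retrieval-and-prompt sketch for a RAG pipeline. It uses a toy bag-of-words retriever so the snippet runs as-is; every document and function name here is illustrative, and a real pipeline would swap in an embedding model and a vector store.

```python
# Toy RAG retrieval sketch: bag-of-words "embeddings" + cosine similarity.
# All names are illustrative; a production pipeline would use a real
# embedding model and vector store in place of embed() and retrieve().

import math
from collections import Counter

def embed(text: str) -> Counter:
    """Toy 'embedding': a term-frequency vector over whitespace tokens."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    """Return the k documents most similar to the query."""
    q = embed(query)
    ranked = sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]

def build_prompt(query: str, docs: list[str]) -> str:
    """Assemble retrieved context into an LLM prompt (the generation step)."""
    context = "\n".join(f"- {d}" for d in retrieve(query, docs))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"
```

A skill containing even a sketch at this level of concreteness gives Claude a workflow to execute and extend, rather than a list of concepts to recognize.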

Suggestions

- Replace fake script references with actual executable code examples for specific tasks (e.g., a real model deployment script, actual RAG implementation code)
- Remove generic sections like 'Senior-Level Responsibilities', 'Best Practices', and 'Tech Stack' lists - Claude already knows these concepts
- Add concrete step-by-step workflows with validation checkpoints for key tasks like 'deploying a PyTorch model to Kubernetes' or 'building a RAG pipeline'
- Provide specific, copy-paste-ready code snippets for common ML operations instead of abstract pattern descriptions
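As an example of the copy-paste-ready style these suggestions point toward, here is a minimal inference-wrapper sketch with input validation and thresholding. The feature schema, weights, and labels are hypothetical stand-ins; a real skill would replace the fixed linear scorer with an actual model's forward pass (e.g. a loaded TorchScript module).

```python
# Minimal inference-wrapper sketch: validate the request payload, score it,
# and return a structured prediction. The feature names, weights, and labels
# are hypothetical; the scorer stands in for a real model's forward pass.

from dataclasses import dataclass

FEATURES = ["age", "income", "tenure_months"]  # hypothetical feature schema

@dataclass
class Prediction:
    score: float
    label: str

def validate(payload: dict) -> list[float]:
    """Reject malformed requests before they reach the model."""
    missing = [f for f in FEATURES if f not in payload]
    if missing:
        raise ValueError(f"missing features: {missing}")
    return [float(payload[f]) for f in FEATURES]

def predict(payload: dict, threshold: float = 0.5) -> Prediction:
    x = validate(payload)
    # Stand-in for model inference; a fixed linear scorer keeps the sketch
    # self-contained and runnable. Scores are clamped to [0, 1].
    weights = [0.01, 0.00001, 0.02]
    score = min(1.0, max(0.0, sum(w * v for w, v in zip(weights, x))))
    return Prediction(score=score, label="churn" if score >= threshold else "retain")
```

The point is not this particular scorer but the shape: explicit schema, validation with clear errors, and a typed result that a serving layer (FastAPI, KServe, etc.) can wrap directly.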

| Dimension | Reasoning | Score |
| --- | --- | --- |
| Conciseness | Extremely verbose with extensive lists of concepts Claude already knows (what TDD is, what code reviews are, generic best practices). The 'Senior-Level Responsibilities' section is entirely unnecessary padding about soft skills. Tech stack lists and generic performance targets add no actionable value. | 1 / 3 |
| Actionability | Despite showing bash commands, they reference non-existent scripts (model_deployment_pipeline.py, rag_system_builder.py). No actual executable code is provided - just abstract descriptions like 'Horizontal scaling architecture' and 'Fault-tolerant design' without concrete implementation guidance. | 1 / 3 |
| Workflow Clarity | No clear workflows for any ML task. Lists concepts like 'Model serving with low latency' and 'A/B testing infrastructure' without explaining how to actually implement them. No validation checkpoints, no step-by-step processes, no error handling guidance for complex ML operations. | 1 / 3 |
| Progressive Disclosure | References external files (references/mlops_production_patterns.md, etc.), which is good structure, but the main file itself is bloated with content that should either be in those references or removed entirely. The overview doesn't provide enough actionable quick-start content. | 2 / 3 |

Total: 5 / 12 Passed

Activation: 92%

This is a strong, well-crafted description that clearly articulates capabilities with specific technologies and includes explicit trigger guidance. The main weakness is its broad scope spanning traditional MLOps through to agentic AI, which could create selection conflicts with more specialized skills in either domain.

| Dimension | Reasoning | Score |
| --- | --- | --- |
| Specificity | Lists multiple specific concrete actions and technologies: 'productionizing ML models', 'model deployment', 'feature stores', 'model monitoring', 'ML infrastructure', 'LLM integration', 'fine-tuning', 'RAG systems', and 'agentic AI'. | 3 / 3 |
| Completeness | Clearly answers both what (productionizing ML models, MLOps, scalable ML systems, various technologies) AND when with an explicit 'Use when...' clause covering deploying ML models, building ML platforms, implementing MLOps, or integrating LLMs. | 3 / 3 |
| Trigger Term Quality | Excellent coverage of natural terms users would say: 'PyTorch', 'TensorFlow', 'MLOps', 'ML models', 'feature stores', 'LLM', 'fine-tuning', 'RAG systems', 'agentic AI', 'ML platforms' - these are all terms practitioners naturally use. | 3 / 3 |
| Distinctiveness / Conflict Risk | While specific to ML engineering, the broad scope covering both traditional ML and LLM/agentic AI could overlap with separate LLM-specific skills or general Python/data science skills. The combination of MLOps AND LLM integration in one skill creates potential conflict with more focused skills. | 2 / 3 |

Total: 11 / 12 Passed
