`tessl i github:jeremylongshore/claude-code-plugins-plus-skills --skill setting-up-experiment-tracking`

Implement machine learning experiment tracking using MLflow or Weights & Biases. Configures environment and provides code for logging parameters, metrics, and artifacts. Use when asked to "setup experiment tracking" or "initialize MLflow". Trigger with relevant phrases based on skill purpose.
Validation
81%

| Criteria | Description | Result |
|---|---|---|
| allowed_tools_field | 'allowed-tools' contains unusual tool name(s) | Warning |
| metadata_version | 'metadata' field is not a dictionary | Warning |
| frontmatter_unknown_keys | Unknown frontmatter key(s) found; consider removing or moving to metadata | Warning |
| Total | 13 / 16 Passed | |
Implementation
7%

This skill is a template/placeholder with no actionable content. It describes what it would do rather than providing executable code, concrete commands, or specific guidance. The content is padded with generic explanations and repeated information while lacking the actual implementation details needed to set up experiment tracking.
Suggestions
- Replace the abstract 'How It Works' section with actual executable code for MLflow and W&B setup (pip install commands, initialization code, logging examples)
- Remove redundant overview text and generic sections like 'Prerequisites', 'Instructions', 'Output', and 'Error Handling' that contain only placeholder content
- Add concrete, copy-paste ready code examples showing parameter logging, metric logging, and artifact saving for both MLflow and W&B
- Include validation steps such as verifying package installation and testing connection to tracking servers
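To make the first, third, and fourth suggestions concrete: the snippet below is a minimal sketch of the kind of copy-paste-ready logging code the skill should ship. The experiment/project names and the parameter and metric values are placeholders, and the `mlflow`/`wandb` calls assume those packages have been installed (`pip install mlflow wandb`); the `ImportError` guards double as the suggested installation-verification step.

```python
import os

# Illustrative hyperparameters and results; values are placeholders.
params = {"learning_rate": 0.01, "epochs": 10}
metrics = {"accuracy": 0.93, "loss": 0.21}

# --- MLflow: logs to a local ./mlruns directory by default ---
try:
    import mlflow

    mlflow.set_experiment("demo-experiment")
    with mlflow.start_run():
        mlflow.log_params(params)           # all hyperparameters in one call
        for name, value in metrics.items():
            mlflow.log_metric(name, value)  # one metric at a time
        if os.path.exists("model.pkl"):
            mlflow.log_artifact("model.pkl")  # save any file as a run artifact
except ImportError:
    print("mlflow not installed; run `pip install mlflow`")

# --- Weights & Biases: requires a one-time `wandb login` ---
try:
    import wandb

    run = wandb.init(project="demo-project", config=params)
    run.log(metrics)                        # all metrics in a single call
    run.finish()
except ImportError:
    print("wandb not installed; run `pip install wandb`")
```

With either backend installed, the logged runs can then be inspected via `mlflow ui` or the W&B web dashboard, which also serves as the suggested check that the tracking server connection works.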
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | Extremely verbose with redundant explanations (overview repeated twice), generic filler content ('This skill provides automated assistance'), and explains concepts Claude already knows like what experiment tracking is and basic tool comparisons. | 1 / 3 |
| Actionability | No executable code provided despite claiming to 'provide code snippets'. Examples describe what the skill 'will do' rather than showing actual commands or code. The Instructions section is completely generic placeholder text with no concrete guidance. | 1 / 3 |
| Workflow Clarity | The 'How It Works' section describes abstract steps without any concrete commands, validation checkpoints, or actual workflow. No feedback loops or error recovery steps for environment configuration, which can easily fail. | 1 / 3 |
| Progressive Disclosure | Has section headers providing some structure, but content is monolithic with no references to external files. The organization exists but sections contain filler rather than appropriately split content. | 2 / 3 |
| Total | | 5 / 12 |
Activation
68%

The description effectively identifies its specific domain (ML experiment tracking) and names concrete tools and actions. However, it's weakened by the vague filler phrase 'Trigger with relevant phrases based on skill purpose', which adds no value, and it misses common trigger term variations like 'W&B', 'wandb', or 'track my experiments'.
Suggestions
- Remove the vague filler 'Trigger with relevant phrases based on skill purpose' and replace with specific trigger terms like 'W&B', 'wandb', 'track experiments', 'log training runs'
- Expand the 'Use when' clause to include more natural user phrases such as 'track my model training', 'log experiment results', or 'compare model runs'
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | Lists multiple specific concrete actions: 'implement machine learning experiment tracking', 'configures environment', 'logging parameters, metrics, and artifacts'. Names specific tools (MLflow, Weights & Biases). | 3 / 3 |
| Completeness | Has a clear 'what' (implement ML experiment tracking with specific tools) and includes a 'Use when' clause, but the final sentence 'Trigger with relevant phrases based on skill purpose' is vague filler that doesn't add explicit trigger guidance. | 2 / 3 |
| Trigger Term Quality | Includes some natural keywords like 'setup experiment tracking', 'initialize MLflow', but missing common variations users might say like 'W&B', 'wandb', 'track experiments', 'log metrics', or 'ML logging'. | 2 / 3 |
| Distinctiveness / Conflict Risk | Clear niche focused specifically on ML experiment tracking with named tools (MLflow, Weights & Biases). Unlikely to conflict with general ML skills or other tracking/logging skills due to specific domain focus. | 3 / 3 |
| Total | | 10 / 12 |
Reviewed