# Wandb Experiment Logger

The skill's current description reads:

> Auto-activating skill for ML Training. Triggers on: wandb experiment logger, wandb experiment logger Part of the ML Training skill category.
Does it follow best practices?

- Impact: 97%
- Average score across 3 eval scenarios: 1.02x
- Status: Passed, no known issues
Optimize this skill with Tessl:

```shell
npx tessl skill review --optimize ./planned-skills/generated/07-ml-training/wandb-experiment-logger/SKILL.md
```

## Quality
### Discovery: 7%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
This description is extremely thin: it essentially just names the skill and its category without describing any concrete capabilities or providing meaningful trigger guidance. The trigger terms are duplicated and miss common user phrasings. The description fails to answer both 'what does this do?' and 'when should Claude use it?' in any meaningful way.
#### Suggestions

- Add specific concrete actions the skill performs, e.g., 'Initializes wandb runs, logs training metrics, tracks hyperparameters, saves model artifacts, and creates experiment comparisons.'
- Add an explicit 'Use when...' clause with natural trigger scenarios, e.g., 'Use when the user wants to track ML experiments, log training metrics to Weights & Biases, compare runs, or integrate wandb into their training pipeline.'
- Expand trigger terms to include natural variations users would say: 'weights and biases', 'W&B', 'experiment tracking', 'log metrics', 'track training runs', 'wandb.init', '.wandb'.
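Taken together, a revised description might look like the following sketch. The exact frontmatter schema depends on the skill spec in use, so field names here are illustrative, not authoritative:

```yaml
name: wandb-experiment-logger
description: >
  Initializes wandb runs, logs training metrics, tracks hyperparameters,
  saves model artifacts, and creates experiment comparisons. Use when the
  user wants to track ML experiments, log training metrics to Weights &
  Biases, compare runs, or integrate wandb into their training pipeline.
  Triggers: wandb, weights and biases, W&B, experiment tracking,
  log metrics, track training runs, wandb.init, .wandb.
```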
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | The description names the domain ('ML Training') and mentions 'wandb experiment logger' but does not describe any concrete actions. There are no specific capabilities listed like 'log metrics', 'track experiments', 'visualize runs', etc. | 1 / 3 |
| Completeness | The 'what' is essentially absent: it only names itself without explaining what it does. The 'when' is limited to a redundant trigger phrase with no explicit 'Use when...' clause describing scenarios that should activate this skill. | 1 / 3 |
| Trigger Term Quality | The trigger terms are just 'wandb experiment logger' repeated twice. Missing natural variations users would say like 'weights and biases', 'W&B', 'experiment tracking', 'log training runs', 'wandb logging', 'track metrics'. | 1 / 3 |
| Distinctiveness / Conflict Risk | The mention of 'wandb' provides some specificity that distinguishes it from generic ML skills, but the lack of concrete actions and the broad 'ML Training' category could cause overlap with other ML training or experiment tracking skills. | 2 / 3 |
| Total | | 5 / 12 (Passed) |
### Implementation: 0%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
This skill is an empty template with no substantive content. It repeatedly names 'wandb experiment logger' without providing any actual instructions, code examples, API usage patterns, or workflows for using Weights & Biases for experiment tracking. It fails on every dimension because it contains zero actionable information.
#### Suggestions

- Add concrete, executable code examples showing wandb.init(), wandb.log(), wandb.config usage, and artifact logging with real Python snippets.
- Define a clear workflow for setting up experiment tracking: initialization, metric logging, artifact saving, and run comparison, with validation steps.
- Remove all boilerplate sections (Purpose, When to Use, Example Triggers, Capabilities) and replace them with actual technical content such as a quick-start guide and common patterns.
- If advanced topics exist (sweeps, custom charts, model versioning), reference them via clearly signaled links to separate files rather than listing vague capabilities.
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | The content is entirely filler and boilerplate. It explains nothing Claude doesn't already know, repeats 'wandb experiment logger' excessively, and provides zero actual technical content about how to use wandb for experiment logging. | 1 / 3 |
| Actionability | There is no concrete code, no commands, no API examples, no configuration snippets: nothing actionable whatsoever. The skill describes what it could do rather than providing any executable guidance. | 1 / 3 |
| Workflow Clarity | No workflow, steps, or process is defined. The content only lists vague 'capabilities' like 'provides step-by-step guidance' without actually providing any steps. | 1 / 3 |
| Progressive Disclosure | The content is a flat, monolithic block of generic text with no references to external files, no structured sections with real content, and no navigation to deeper resources. | 1 / 3 |
| Total | | 4 / 12 (Passed) |
### Validation: 81%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Result: Passed (9 / 11 checks).

#### Validation for skill structure
| Criteria | Description | Result |
|---|---|---|
| allowed_tools_field | 'allowed-tools' contains unusual tool name(s) | Warning |
| frontmatter_unknown_keys | Unknown frontmatter key(s) found; consider removing or moving to metadata | Warning |
| Total | 9 / 11 Passed | |