Wandb Experiment Logger - Auto-activating skill for ML Training. Triggers on: wandb experiment logger, wandb experiment logger. Part of the ML Training skill category.
36
Quality
3%
Does it follow best practices?
Impact
97%
1.02x
Average score across 3 eval scenarios
Passed
No known issues
Optimize this skill with Tessl
npx tessl skill review --optimize ./planned-skills/generated/07-ml-training/wandb-experiment-logger/SKILL.md
Quality
Discovery
7%
Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
This description is severely underdeveloped, functioning more as a label than a useful skill description. It lacks any concrete actions, has redundant and limited trigger terms, and provides no guidance on when Claude should select this skill. The description would be nearly useless for skill selection among multiple ML-related options.
Suggestions
Add specific actions the skill performs, e.g., 'Logs training metrics, tracks hyperparameters, saves model artifacts, visualizes training curves to Weights & Biases'
Include a 'Use when...' clause with natural trigger terms: 'Use when the user mentions wandb, W&B, weights and biases, experiment tracking, logging training runs, or ML metrics visualization'
Remove the redundant trigger term and expand to include variations users would naturally say like 'track my training', 'log metrics', 'experiment logging'
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | The description only names the tool ('Wandb Experiment Logger') and category ('ML Training') without describing any concrete actions. No verbs or specific capabilities are listed. | 1 / 3 |
| Completeness | The description fails to answer 'what does this do' beyond naming itself, and the 'when' clause is just a duplicate trigger term rather than meaningful usage guidance. No explicit 'Use when...' clause exists. | 1 / 3 |
| Trigger Term Quality | The trigger terms are redundant ('wandb experiment logger' repeated twice) and overly specific. Missing natural variations users would say like 'log experiments', 'track training', 'weights and biases', 'W&B', or 'experiment tracking'. | 1 / 3 |
| Distinctiveness Conflict Risk | The specific mention of 'Wandb' provides some distinctiveness from generic logging skills, but 'ML Training' is broad and could overlap with other ML-related skills. The lack of specific capabilities makes differentiation harder. | 2 / 3 |
| Total | 5 / 12 Passed | |
Implementation
0%
Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
This skill is an empty template with no actual wandb content. It contains only generic placeholder text that describes what a skill should do without providing any concrete guidance on experiment logging, metric tracking, or wandb API usage. The skill fails on all dimensions as it provides zero actionable information.
Suggestions
Add executable Python code showing wandb.init(), wandb.log(), and wandb.finish() with realistic ML training examples
Include a clear workflow: 1) Initialize run with config, 2) Log metrics during training loop, 3) Log artifacts/models, 4) Finish run with validation
Remove all generic template text (Purpose, When to Use, Capabilities sections) and replace with concise, wandb-specific guidance
Add concrete examples of logging different data types: scalars, images, tables, and model checkpoints
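The first two suggestions above could be met with something like the following minimal sketch. It is an illustration, not content from the reviewed skill: the project name `demo-project`, the config values, the metric name `train/loss`, and the toy loss update are all hypothetical placeholders. The loop takes the run object as a parameter so it can be exercised without a W&B account, and `wandb` is imported only inside `main()`.

```python
import random


def train(run, epochs=3, lr=0.1, seed=0):
    """Toy loop standing in for real training.

    `run` is whatever wandb.init() returned; only its .log() method is used.
    """
    rng = random.Random(seed)
    loss = 1.0
    for epoch in range(epochs):
        # Placeholder update (not a real optimizer): shrink the loss, add noise.
        loss = loss * (1.0 - lr) + rng.uniform(0.0, 0.01)
        # Step 2 of the workflow: log scalar metrics once per epoch.
        run.log({"train/loss": loss, "epoch": epoch}, step=epoch)
    return loss


def main():
    import wandb  # imported here so train() does not require wandb

    # Hypothetical hyperparameters; the keys match train()'s keyword arguments.
    config = {"epochs": 3, "lr": 0.1, "seed": 0}

    # Step 1: initialize the run with its config.
    # mode="offline" stores the run locally and needs no API key or network.
    run = wandb.init(project="demo-project", config=config, mode="offline")
    try:
        final_loss = train(run, **config)
        # Record a headline number in the run summary.
        run.summary["final_loss"] = final_loss
    finally:
        # Step 4: always close the run, even if training raised.
        run.finish()
```

Calling `main()` runs the whole workflow. Model checkpoints (step 3 in the suggested workflow) would typically be attached between training and `run.finish()` using `wandb.Artifact` and `run.log_artifact`; that step is omitted here to keep the sketch short.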
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | The content is entirely filler text with no actual wandb-specific information. It explains generic concepts Claude already knows (what triggers are, what capabilities mean) without providing any concrete wandb logging guidance. | 1 / 3 |
| Actionability | No executable code, no wandb API examples, no concrete commands. The content describes rather than instructs - phrases like 'Provides step-by-step guidance' without actually providing any steps. | 1 / 3 |
| Workflow Clarity | No workflow is defined. There are no steps for setting up wandb, initializing experiments, logging metrics, or any validation checkpoints. The skill promises guidance but delivers none. | 1 / 3 |
| Progressive Disclosure | No structure beyond generic headings. No references to detailed documentation, no links to examples or API references. The content is a template shell with no actual organization of wandb-specific material. | 1 / 3 |
| Total | 4 / 12 Passed | |
Validation
81%
Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation — 9 / 11 Passed
Validation for skill structure
| Criteria | Description | Result |
|---|---|---|
| allowed_tools_field | 'allowed-tools' contains unusual tool name(s) | Warning |
| frontmatter_unknown_keys | Unknown frontmatter key(s) found; consider removing or moving to metadata | Warning |
| Total | 9 / 11 Passed | |
f17dd51