
wandb-experiment-logger

Wandb Experiment Logger - Auto-activating skill for ML Training. Triggers on: wandb experiment logger, wandb experiment logger Part of the ML Training skill category.


Quality: 3% (Does it follow best practices?)

Impact: 97%, 1.02x (average score across 3 eval scenarios)

Security by Snyk: Passed (no known issues)

Optimize this skill with Tessl

npx tessl skill review --optimize ./planned-skills/generated/07-ml-training/wandb-experiment-logger/SKILL.md

Quality

Discovery

7%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

This description is extremely thin—it essentially just names the skill and its category without describing any concrete capabilities or providing meaningful trigger guidance. The trigger terms are duplicated and miss common user phrasings. It fails to answer both 'what does this do' and 'when should Claude use it' in any meaningful way.

Suggestions

Add specific concrete actions the skill performs, e.g., 'Initializes wandb runs, logs training metrics, tracks hyperparameters, saves model artifacts, and creates experiment comparisons.'

Add an explicit 'Use when...' clause with natural trigger scenarios, e.g., 'Use when the user wants to track ML experiments, log training metrics to Weights & Biases, compare runs, or integrate wandb into their training pipeline.'

Expand trigger terms to include natural variations users would say: 'weights and biases', 'W&B', 'experiment tracking', 'log metrics', 'track training runs', 'wandb.init', '.wandb'.
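Taken together, the suggestions above could land in the SKILL.md frontmatter roughly like this. This is a sketch only: the skill name and category come from this page, while the description text simply combines the example phrasings suggested above.

```yaml
name: wandb-experiment-logger
description: >
  Initializes wandb runs, logs training metrics, tracks hyperparameters,
  saves model artifacts, and creates experiment comparisons. Use when the
  user wants to track ML experiments, log training metrics to Weights &
  Biases (W&B), compare runs, or integrate wandb into their training
  pipeline. Triggers on: wandb, weights and biases, W&B, experiment
  tracking, log metrics, track training runs, wandb.init.
```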

| Dimension | Reasoning | Score |
| --- | --- | --- |
| Specificity | The description names the domain ('ML Training') and mentions 'wandb experiment logger' but does not describe any concrete actions. There are no specific capabilities listed like 'log metrics', 'track experiments', 'visualize runs', etc. | 1 / 3 |
| Completeness | The 'what' is essentially absent—it only names itself without explaining what it does. The 'when' is limited to a redundant trigger phrase with no explicit 'Use when...' clause describing scenarios that should activate this skill. | 1 / 3 |
| Trigger Term Quality | The trigger terms are just 'wandb experiment logger' repeated twice. Missing natural variations users would say like 'weights and biases', 'W&B', 'experiment tracking', 'log training runs', 'wandb logging', 'track metrics'. | 1 / 3 |
| Distinctiveness / Conflict Risk | The mention of 'wandb' provides some specificity that distinguishes it from generic ML skills, but the lack of concrete actions and the broad 'ML Training' category could cause overlap with other ML training or experiment tracking skills. | 2 / 3 |
| Total | | 5 / 12 |

Passed

Implementation

0%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

This skill is an empty template with no substantive content. It repeatedly names 'wandb experiment logger' without providing any actual instructions, code examples, API usage patterns, or workflows for using Weights & Biases for experiment tracking. It fails on every dimension because it contains zero actionable information.

Suggestions

Add concrete, executable code examples showing wandb.init(), wandb.log(), wandb.config usage, and artifact logging with real Python snippets.

Define a clear workflow for setting up experiment tracking: initialization, metric logging, artifact saving, and run comparison, with validation steps.
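A minimal version of the quick-start pattern these two suggestions describe might look like the sketch below. The project name, hyperparameters, and toy loss curve are all illustrative; the wandb calls shown (`wandb.init`, `wandb.log`, `run.config`, `run.finish`) are the standard public API, but this is an assumption-laden example, not content from the skill itself.

```python
import math


def train_step(step, lr=0.1):
    """Toy, deterministic loss curve so the logging pattern has
    something to record; a real skill would run actual training here."""
    return math.exp(-lr * step)


def run_experiment(steps=5, lr=0.1, log=None):
    """Run the toy loop, calling `log(metrics)` once per step if given.

    Keeping the logger injectable means the loop can be tested without
    wandb installed, then wired to `wandb.log` for a real run.
    """
    losses = []
    for step in range(steps):
        loss = train_step(step, lr)
        losses.append(loss)
        if log is not None:
            log({"loss": loss, "step": step})
    return losses


def main():
    """Launch a real tracked run; requires wandb to be installed and
    the user to be logged in (or running with mode="offline")."""
    import wandb

    run = wandb.init(
        project="my-project",            # hypothetical project name
        config={"lr": 0.1, "steps": 5},  # hyperparameters live in run.config
    )
    run_experiment(
        steps=run.config["steps"],
        lr=run.config["lr"],
        log=wandb.log,                   # stream metrics to the run
    )
    run.finish()
```

Calling `main()` starts the tracked run; the pure loop can be exercised on its own, e.g. `run_experiment(steps=3)` returns a strictly decreasing list of losses starting at 1.0.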

Remove all boilerplate sections (Purpose, When to Use, Example Triggers, Capabilities) and replace with actual technical content like a quick-start guide and common patterns.

If advanced topics exist (sweeps, custom charts, model versioning), reference them via clearly signaled links to separate files rather than listing vague capabilities.
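For the first of those advanced topics, a sweep sketch could be as small as the following. With wandb, a sweep is declared as a plain dict and registered via `wandb.sweep()`; `wandb.agent()` then calls a training function once per sampled configuration. The project name, parameter grid, and stand-in training function are all hypothetical.

```python
# Sweep definition: a plain dict, no wandb import needed yet.
sweep_config = {
    "method": "random",  # sampling strategy: "grid", "random", or "bayes"
    "metric": {"name": "loss", "goal": "minimize"},
    "parameters": {
        "lr": {"values": [0.01, 0.1, 0.3]},
        "batch_size": {"value": 32},
    },
}


def launch_sweep():
    """Register the sweep and run 3 trials; requires wandb installed."""
    import wandb

    def train():
        run = wandb.init()              # config is injected by the sweep agent
        lr = run.config["lr"]
        wandb.log({"loss": 1.0 / lr})   # stand-in for a real training loop
        run.finish()

    sweep_id = wandb.sweep(sweep_config, project="my-project")
    wandb.agent(sweep_id, function=train, count=3)
```

Keeping the dict separate from `launch_sweep()` lets the skill document the configuration schema without forcing a wandb dependency on readers who only want the shape.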

| Dimension | Reasoning | Score |
| --- | --- | --- |
| Conciseness | The content is entirely filler and boilerplate. It explains nothing Claude doesn't already know, repeats 'wandb experiment logger' excessively, and provides zero actual technical content about how to use wandb for experiment logging. | 1 / 3 |
| Actionability | There is no concrete code, no commands, no API examples, no configuration snippets—nothing actionable whatsoever. The skill describes what it could do rather than providing any executable guidance. | 1 / 3 |
| Workflow Clarity | No workflow, steps, or process is defined. The content only lists vague 'capabilities' like 'provides step-by-step guidance' without actually providing any steps. | 1 / 3 |
| Progressive Disclosure | The content is a flat, monolithic block of generic text with no references to external files, no structured sections with real content, and no navigation to deeper resources. | 1 / 3 |
| Total | | 4 / 12 |

Passed

Validation

81%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation: 9 / 11 Passed

Validation for skill structure

| Criteria | Description | Result |
| --- | --- | --- |
| allowed_tools_field | 'allowed-tools' contains unusual tool name(s) | Warning |
| frontmatter_unknown_keys | Unknown frontmatter key(s) found; consider removing or moving to metadata | Warning |
| Total | | 9 / 11 |

Passed

Repository: jeremylongshore/claude-code-plugins-plus-skills (Reviewed)
