
wandb-primary

Primary W&B skill for broad or mixed Weights & Biases work: project overviews, W&B runs and artifacts, Weave traces and evaluations, Reports, Signal Builder, and Launch workflows. Use when the task spans multiple W&B surfaces or the user asks generally what is happening in a W&B project.

74

Quality

Does it follow best practices?

Impact

No eval scenarios have been run

Security by Snyk

Passed

No known issues


Quality

Content

85%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

A highly actionable, well-structured skill body with executable recipes, clear workflow checkpoints, and solid progressive disclosure via real reference files. The main weakness is conciseness: the body is very long and contains some duplicated recipe blocks that could be consolidated.

Suggestions

Remove the duplicated run-counting code block (it appears both in 'Fast recipes' and again in 'Key patterns') — keep a single canonical version and cross-reference it.

Consolidate the duplicated W&B Report authoring code (appears in 'Create a W&B Report' and again in 'Report authoring (W&B Reports)') into one example, pointing to references/REPORTS.md for variants.

Consider moving some of the more specialized inline recipes (e.g. embedding-dimension analysis, model-name extraction) into a reference file to reduce the SKILL.md token footprint while keeping the fast-path recipes up top.
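The duplication called out in the suggestions above could be located mechanically before consolidating. The following is a minimal sketch, assuming SKILL.md uses triple-backtick fences; the function name `find_duplicate_code_blocks` and the whitespace normalization are illustrative choices, not part of the review tooling.

```python
import hashlib
import re
from collections import defaultdict

# Matches a fenced block: an opening fence line (with optional language tag),
# the body, and a closing fence on its own line.
FENCE_RE = re.compile(r"^`{3}[^\n]*\n(.*?)^`{3}[ \t]*$", re.DOTALL | re.MULTILINE)


def find_duplicate_code_blocks(markdown_text):
    """Return sorted groups of 1-based start lines for fenced blocks whose bodies match.

    Bodies are compared after stripping trailing whitespace on each line,
    so purely cosmetic differences do not hide a duplicate.
    """
    seen = defaultdict(list)
    for match in FENCE_RE.finditer(markdown_text):
        body_lines = match.group(1).strip().splitlines()
        body = "\n".join(line.rstrip() for line in body_lines)
        if not body:
            continue  # ignore empty fences
        digest = hashlib.sha256(body.encode("utf-8")).hexdigest()
        start_line = markdown_text.count("\n", 0, match.start()) + 1
        seen[digest].append(start_line)
    return sorted(lines for lines in seen.values() if len(lines) > 1)


if __name__ == "__main__":
    fence = "`" * 3
    # Hypothetical miniature SKILL.md with one recipe repeated in two sections.
    sample = "\n".join([
        "## Fast recipes",
        fence + "python",
        "print('count runs')",
        fence,
        "## Key patterns",
        fence + "python",
        "print('count runs')",
        fence,
    ])
    print(find_duplicate_code_blocks(sample))
```

Each returned group lists the lines where an identical block starts, which gives a starting point for choosing one canonical copy and cross-referencing the rest.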

| Dimension | Reasoning | Score |
| --- | --- | --- |
| Conciseness | Content is mostly efficient domain-specific operational guidance Claude does not already know, but the ~1461-line body contains duplicated recipe blocks (run-count code appears twice; report-authoring code appears twice) that could be tightened. | 2 / 3 |
| Actionability | Provides fully executable, copy-paste-ready Python and bash recipes with specific helper imports, concrete commands, and exact API call patterns throughout. | 3 / 3 |
| Workflow Clarity | Multi-step processes (Launch, eval analysis, trace counting) are clearly sequenced with explicit validation checkpoints and feedback loops such as 'check first', 'validate immediately', and re-run-on-error rules. | 3 / 3 |
| Progressive Disclosure | SKILL.md keeps fast recipes inline and offloads deep API surfaces to real, one-level-deep reference files (REPORTS.md, WANDB_SDK.md, WEAVE_SDK.md, SIGNALS.md, etc.) and helper scripts, all clearly signaled and verified to exist in the bundle. | 3 / 3 |

Total: 11 / 12 (Passed)

Description

100%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

A strong, third-person description that concisely names the concrete W&B surfaces it covers and provides an explicit 'Use when...' trigger. It is specific, complete, and well-distinguished from narrower W&B skills.

| Dimension | Reasoning | Score |
| --- | --- | --- |
| Specificity | Lists multiple concrete W&B surfaces and actions ('project overviews, W&B runs and artifacts, Weave traces and evaluations, Reports, Signal Builder, and Launch workflows') rather than vague language. | 3 / 3 |
| Completeness | Explicitly answers both what it does and when to use it via the 'Use when the task spans multiple W&B surfaces or the user asks generally what is happening in a W&B project' trigger clause. | 3 / 3 |
| Trigger Term Quality | Good coverage of natural terms users would say ('Weights & Biases', 'W&B', 'runs', 'artifacts', 'Weave traces', 'evaluations', 'Reports', 'Launch'). | 3 / 3 |
| Distinctiveness Conflict Risk | Has a clear niche as the broad/mixed multi-surface router skill with a distinct 'spans multiple W&B surfaces' trigger unlikely to fire for narrower W&B skills. | 3 / 3 |

Total: 12 / 12 (Passed)

Validation

93%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation: 15 / 16 Passed

Validation for skill structure

| Criteria | Description | Result |
| --- | --- | --- |
| skill_md_line_count | SKILL.md is long (1461 lines); consider splitting into references/ and linking | Warning |

Total: 15 / 16 (Passed)
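A check like `skill_md_line_count` can be approximated locally before publishing a skill. The sketch below is an assumption about how such a check might work, not the validator's actual implementation: the function name `check_skill_md_line_count` and the 500-line default threshold are hypothetical, since the review does not state the limit that triggers the warning.

```python
def check_skill_md_line_count(skill_md_text, max_lines=500):
    """Return a result row shaped like the validation table.

    max_lines is a hypothetical threshold chosen for illustration;
    the real validator's limit is not given in the review.
    """
    line_count = len(skill_md_text.splitlines())
    if line_count > max_lines:
        return {
            "criterion": "skill_md_line_count",
            "description": (
                f"SKILL.md is long ({line_count} lines); "
                "consider splitting into references/ and linking"
            ),
            "result": "Warning",
        }
    return {
        "criterion": "skill_md_line_count",
        "description": f"SKILL.md has {line_count} lines",
        "result": "Passed",
    }


if __name__ == "__main__":
    # Stand-in for a 1461-line SKILL.md; a real run would read the file instead.
    report = check_skill_md_line_count("line\n" * 1461)
    print(report["result"], "-", report["description"])
```

Running the same function after moving specialized recipes into `references/` shows whether the body has dropped below whatever threshold is chosen.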

Repository: wandb/skills
Reviewed
