skill-creator

Create new skills, modify and improve existing skills, and measure skill performance. Use when users want to create a skill from scratch, edit, or optimize an existing skill, run evals to test a skill, benchmark skill performance with variance analysis, or optimize a skill's description for better triggering accuracy.

1.87x

Quality

—

Does it follow best practices?

Impact

88%

1.87x

Average score across 3 eval scenarios

Securityby

Passed

No known issues

Quality

Content

62%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

A well-sequenced, largely actionable skill body with strong workflow clarity, but it is weakened by chatty padding, a ~485-line length, and several references to bundle paths (eval-viewer/, agents/) that do not exist.

Suggestions

Tighten or remove the conversational asides (e.g. 'Cool? Cool.', the plumber/grandparent paragraph, the all-caps apology) and the end-of-file restatement of the core loop to reduce token cost — these drag down conciseness.

Fix the broken references that hurt actionability and navigation: either bundle the missing 'eval-viewer/generate_review.py' and 'agents/grader.md', 'agents/comparator.md', 'agents/analyzer.md' files, or update the paths to point at the scripts/ and references/ files that actually ship in the skill.

Consider moving the long Description Optimization and Claude.ai/Cowork environment-specific sections into a reference file to keep SKILL.md closer to a lean overview with well-signaled one-level-deep pointers.

Dimension	Reasoning	Score
Conciseness	The body is mostly efficient, actionable guidance, but is padded with chatty asides ('Cool? Cool.', the plumber/grandparent tangent, 'Sorry in advance but I'm gonna go all caps here') and a redundant restatement of the core loop at the end that could be tightened.	2 / 3
Actionability	Many concrete, copy-paste-ready commands with full flags and JSON examples, but the central viewer step repeatedly references 'eval-viewer/generate_review.py' and grading references 'agents/grader.md' — files that do not exist in the bundle, so the guidance is incomplete as written.	2 / 3
Workflow Clarity	Multi-step processes are clearly sequenced (Step 1–5 for running evals, the iteration loop, description-optimization Step 1–4) with explicit validation/feedback checkpoints (grader, user review, 'repeat until satisfied' stop conditions).	3 / 3
Progressive Disclosure	Good section structure and one genuine one-level-deep reference ('references/schemas.md'), but the body also points to a missing 'agents/' directory (grader.md, comparator.md, analyzer.md) and a missing 'eval-viewer/' directory — dead references that break navigation against the actual bundle.	2 / 3
	Total	9 / 12 Passed

Description

100%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

A strong, third-person description that states concrete capabilities and provides an explicit 'Use when...' trigger clause covering the main use cases. Trigger terms are natural and the niche is distinct.

Dimension	Reasoning	Score
Specificity	Lists multiple concrete actions — 'Create new skills, modify and improve existing skills, and measure skill performance' plus 'run evals', 'benchmark skill performance', 'optimize a skill's description' — rather than vague language.	3 / 3
Completeness	Explicitly answers both what it does and when to use it via the 'Use when users want to create a skill from scratch, edit, or optimize...' trigger clause.	3 / 3
Trigger Term Quality	Covers natural user phrasings like 'create a skill from scratch', 'edit, or optimize an existing skill', 'run evals', 'benchmark skill performance', and 'optimize a skill's description for better triggering accuracy'.	3 / 3
Distinctiveness Conflict Risk	Occupies a clear niche (skill authoring/optimization) with distinct triggers unlikely to collide with other skills.	3 / 3
	Total	12 / 12 Passed

Validation

100%

Warnings & errors only

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation — 16 / 16 Passed

Validation for skill structure

No warnings or errors.

Repository: anthropics/claude-plugins-official
Commit: 30a213f

Reviewed: 3 days ago

Table of Contents

Discovery Implementation Validation

Is this your skill?

If you maintain this skill, you can claim it as your own. Once claimed, you can manage eval scenarios, bundle related skills, attach documentation or rules, and ensure cross-agent compatibility.