Create new skills, modify and improve existing skills, and measure skill performance. Use when users want to create a skill from scratch, update or optimize an existing skill, run evals to test a skill, benchmark skill performance with variance analysis, or optimize a skill's description for better triggering accuracy.
89
Does it follow best practices? 85%
Impact: 95% (1.90x), average score across 3 eval scenarios
Passed: No known issues
Skill authoring conventions

| Criterion | Without skill | With skill |
| --- | --- | --- |
| Pushy description | 0% | 100% |
| When-to-use in frontmatter only | 100% | 100% |
| Description covers what AND when | 0% | 100% |
| Imperative voice in body | 100% | 100% |
| SKILL.md under 500 lines | 100% | 100% |
| Evals without assertions | 100% | 0% |
| Realistic test prompts | 100% | 100% |
| Evals.json schema | 0% | 100% |
| Valid YAML frontmatter | 0% | 100% |
Eval workflow orchestration

| Criterion | Without skill | With skill |
| --- | --- | --- |
| Descriptive eval names | 0% | 100% |
| Empty assertions array | 0% | 100% |
| Correct metadata schema | 30% | 100% |
| Same-turn spawn, both directions | 0% | 100% |
| Correct baseline type | 0% | 100% |
| Workspace sibling convention | 40% | 100% |
| generate_review.py for viewer | 0% | 100% |
| Viewer before revisions | 0% | 100% |
| Timing data capture noted | 0% | 100% |
Description optimization query design

| Criterion | Without skill | With skill |
| --- | --- | --- |
| Query count ~20 | 100% | 100% |
| Has both trigger types | 100% | 100% |
| Should-trigger count in range | 100% | 100% |
| Should-not-trigger count in range | 100% | 100% |
| Concrete query specificity | 100% | 100% |
| Near-miss negative queries | 100% | 100% |
| HTML eval data replaced | 100% | 100% |
| HTML placeholders replaced | 100% | 100% |
| run_loop model flag | 0% | 100% |
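
As a rough cross-check of the 1.90x impact figure, the per-criterion scores listed above can be averaged per scenario and then across the 3 scenarios. This is only a sketch under stated assumptions: the first percentage for each criterion is taken to be the score without the skill and the second the score with it, and every criterion is weighted equally. The page states neither, and names such as `scenarios` and `scenario_means` are hypothetical.

```python
from statistics import mean

# (without_skill, with_skill) percentages per criterion, copied from the page.
# ASSUMPTION: first value = baseline run, second value = run with the skill.
scenarios = {
    "Skill authoring conventions": [
        (0, 100), (100, 100), (0, 100), (100, 100), (100, 100),
        (100, 0), (100, 100), (0, 100), (0, 100),
    ],
    "Eval workflow orchestration": [
        (0, 100), (0, 100), (30, 100), (0, 100), (0, 100),
        (40, 100), (0, 100), (0, 100), (0, 100),
    ],
    "Description optimization query design": [(100, 100)] * 8 + [(0, 100)],
}


def scenario_means(pairs):
    """Return (baseline mean, with-skill mean) for one scenario.

    ASSUMPTION: criteria are unweighted; the page does not say.
    """
    baseline = mean(b for b, _ in pairs)
    with_skill = mean(w for _, w in pairs)
    return baseline, with_skill


per_scenario = {name: scenario_means(p) for name, p in scenarios.items()}

# Average the per-scenario means across the three scenarios.
baseline_avg = mean(b for b, _ in per_scenario.values())
with_skill_avg = mean(w for _, w in per_scenario.values())
multiplier = with_skill_avg / baseline_avg

print(f"baseline average:   {baseline_avg:.1f}%")
print(f"with-skill average: {with_skill_avg:.1f}%")
print(f"multiplier:         {multiplier:.2f}x")
```

Under these assumptions the multiplier comes out at about 1.90x, matching the reported figure. The unweighted with-skill average is about 96.3% rather than the reported 95%, so the page may apply per-criterion weights or rounding that are not shown here.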
48aa435