tessl/skill-optimizer

Optimize your skills and tiles: review SKILL.md quality, generate eval scenarios, run evals, compare across models, diagnose gaps, and re-run until scores improve.

Quality: 91% (Does it follow best practices?)
Impact: 86%, 1.22x (Average score across 29 eval scenarios)
Security: Advisory (Suggest reviewing before use)

Verify Description Edits Still Route Correctly

Name: tessl/skill-optimizer
Rating: 86.51 (1 review)
Author: tessl

Problem Description

I rewrote the descriptions on two of my skills this morning, trying to make them clearer and add some natural trigger phrasings. The new wording feels better to me, but I'm worried I might have accidentally narrowed the trigger surface and broken activation for requests that used to route correctly.

I don't want to run a full content eval just to find this out; that's hours of agent time. I just want to know: do the new descriptions still pick up the right user requests?

What's the fastest way to verify this without rerunning the slow scored evals?

Output Specification

Tell me the right command and workflow for verifying that description edits didn't break routing. Include how to compare against the prior state if possible.

File: evals/scenario-29/task.md
