tessl/skill-optimizer

Optimize your skills and tiles: review SKILL.md quality, generate eval scenarios, run evals, compare across models, diagnose gaps, and re-run until scores improve.

Quality: 91% (Does it follow best practices?)

Impact: 86%, 1.22x (average score across 29 eval scenarios)

Security: Advisory (suggest reviewing before use)

Pre-Publish Skill Reachability Check

Name: tessl/skill-optimizer
Rating: 86.51 (1 review)
Author: tessl

Problem Description

I'm about to publish my tile and I have 4 skills inside it. Before I commit to a slow content eval that will take a few hours, I want a fast sanity check: are my skills actually reachable from the kinds of questions real users would ask?

In other words — if someone phrases a request the way I expect them to, will Claude pick the right skill out of my tile, or will it just answer from scratch and ignore the tile entirely?
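
To make "reachable" concrete, here is a minimal Python sketch of the bookkeeping such a check implies. Everything in it is an assumption for illustration: the `Trial` record, the skill names, and the sample results are hypothetical, and it does not call or reproduce any Tessl command.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass
class Trial:
    """One hypothetical routing trial: a user-style prompt and what happened."""
    prompt: str
    expected_skill: str
    selected_skill: Optional[str]  # None means the tile was ignored entirely


def reachability(trials):
    """Return, per expected skill, the fraction of trials that selected it."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for t in trials:
        totals[t.expected_skill] += 1
        if t.selected_skill == t.expected_skill:
            hits[t.expected_skill] += 1
    return {skill: hits[skill] / totals[skill] for skill in totals}


# Hypothetical sample data, not real eval output.
trials = [
    Trial("review my SKILL.md", "review", "review"),
    Trial("is my skill file any good?", "review", None),
    Trial("generate eval scenarios", "generate", "generate"),
    Trial("compare scores across models", "compare", "generate"),
]

rates = reachability(trials)
```

In this sketch a skill at 1.0 was picked every time it was expected; anything lower marks prompts where a different skill was chosen or the tile was ignored.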

What's the fastest way to verify this, and what should I do with the result before I run the full eval?

Output Specification

Tell me the fastest way to run this sanity check, what command to use, how to read the output, and what counts as "passing" the check vs. needing fixes before I move on to the full eval.


evals/scenario-28/task.md