Optimize your skills and tiles: review SKILL.md quality, generate eval scenarios, run evals, compare across models, diagnose gaps, and re-run until scores improve.
86
Does it follow best practices? 91%
Impact: 86% (1.22x), average score across 29 eval scenarios
Advisory: Suggest reviewing before use
I just ran an activation eval on my tile, and most of the scenarios show that no skill fired at all ("—" across the routing table). Some of my skills never appear as activated in any scenario.
I'm not sure how to read this:
Help me work through this so I can decide what to do next: rewrite descriptions, add scenarios, or accept the result.
Walk me through how to interpret the zero-activation result. For each "—" row, help me decide whether it's a routing gap or out of scope. For any skill that never fired, propose what to do about it.
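As a rough illustration of the triage this prompt asks for, the sketch below splits a routing table into scenarios where nothing fired and skills that never fired. The dictionary format, the function name `summarize_activations`, and the sample values are all assumptions made for the example; they do not reflect the eval tool's real output format.

```python
from typing import Optional


def summarize_activations(
    routing: dict[str, Optional[str]],
    all_skills: list[str],
) -> tuple[list[str], list[str]]:
    """Return (scenarios where no skill fired, skills that never fired).

    `routing` maps a scenario name to the skill that activated, or None
    for a "—" row. This shape is hypothetical, chosen for illustration.
    """
    no_fire = [name for name, skill in routing.items() if skill is None]
    fired = {skill for skill in routing.values() if skill is not None}
    never_fired = [skill for skill in all_skills if skill not in fired]
    return no_fire, never_fired


# Invented sample data: one scenario routed, two rows showing "—".
routing = {
    "scenario-1": "optimize-skill-instructions",
    "scenario-2": None,
    "scenario-3": None,
}
all_skills = ["compare-skill-model-performance", "optimize-skill-instructions"]

no_fire, never_fired = summarize_activations(routing, all_skills)
print("No skill fired:", no_fire)
print("Never fired:", never_fired)
```

Each scenario in the first list still needs a manual call (routing gap or out of scope); the second list only flags skills whose descriptions or scenario coverage may deserve a closer look.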
evals
scenario-1
scenario-2
scenario-3
scenario-4
scenario-5
scenario-6
scenario-7
scenario-8
scenario-9
scenario-10
scenario-11
scenario-12
scenario-13
scenario-14
scenario-15
scenario-16
scenario-17
scenario-18
scenario-19
scenario-20
scenario-21
scenario-22
scenario-23
scenario-24
scenario-25
scenario-26
scenario-27
scenario-28
scenario-29
skills
compare-skill-model-performance
optimize-skill-instructions
references
optimize-skill-performance
optimize-skill-performance-and-instructions