Optimize your skills and tiles: review SKILL.md quality, generate eval scenarios, run evals, compare across models, diagnose gaps, and re-run until scores improve.
88
93%
Does it follow best practices?
Impact
88%
1.07x
Average score across 24 eval scenarios
Passed
No known issues
This plugin was archived by the owner on May 19, 2026
Reason: Tile archived: Superseded by tessl/skill-optimizer - go to https://tessl.io/registry/tessl/skill-optimizer
94%
Run the full optimization cycle for a tile — review best practices, generate eval scenarios, run evals, diagnose gaps, fix, and re-run until scores improve. Use when someone says "optimize my skill", "improve my tile", "run evals", "benchmark my tile", or wants to measure and improve how well a tile helps agents solve tasks.
90%
Generate eval scenarios from a tile, run baseline evals, and present results. Use when setting up evaluation pipelines, running benchmarks, generating test scenarios for a tile, or measuring how well a skill helps agents solve tasks.
90%
Run task evals, analyze results, diagnose failures, apply targeted fixes, and re-run to verify improvements. Use when debugging evaluation scores, fixing failing or regressed criteria, improving tile content after an eval run, or iterating on agent performance test results.
85%
Run task evals across multiple Claude models, compare results side-by-side, and optimise. Use when you want to understand how a skill performs across different models, identify model-specific gaps versus universal tile issues, or validate a skill before publishing it to the registry.
100%
Review and improve your SKILL.md with actionable recommendations. Reads skill bundle (SKILL.md + related docs), validates syntax, explains rubric, shows before/after scores. Use when reviewing skill quality, improving a skill file, checking skill scoring, making your skill better, or learning the skill rubric. This is the standalone review skill — for the full optimization cycle (review + evals + improve), use the `optimize-skill-performance-and-instructions` skill instead.
Quality
Discovery
100%
Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
This is a strong skill description that excels across all dimensions. It provides specific actions, natural trigger terms, explicit 'Use when...' guidance, and even proactively distinguishes itself from a related skill to prevent selection conflicts. The description is concise yet comprehensive.
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | Lists multiple specific concrete actions: 'Reads skill bundle', 'validates syntax', 'explains rubric', 'shows before/after scores'. These are clear, actionable capabilities. | 3 / 3 |
| Completeness | Clearly answers both what ('Review and improve your SKILL.md with actionable recommendations') and when ('Use when reviewing skill quality, improving a skill file...'). Includes explicit 'Use when...' clause with multiple triggers. | 3 / 3 |
| Trigger Term Quality | Includes natural keywords users would say: 'reviewing skill quality', 'improving a skill file', 'checking skill scoring', 'making your skill better', 'learning the skill rubric'. Good coverage of variations. | 3 / 3 |
| Distinctiveness / Conflict Risk | Highly distinctive with clear niche (SKILL.md review specifically). Explicitly differentiates from related skill ('optimize-skill-performance-and-instructions') to prevent conflicts. Unique domain with specific triggers. | 3 / 3 |
| Total | | 12 / 12 Passed |
Implementation
100%
Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
This is an excellent skill that practices what it preaches. It demonstrates strong conciseness by avoiding unnecessary explanations, provides highly actionable guidance with concrete commands and examples, has a crystal-clear workflow with validation checkpoints and feedback loops, and appropriately references external documentation for detailed examples. The skill also includes thoughtful guidance about trade-offs and when to ask users, showing mature design.
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | The skill is lean and efficient, avoiding explanations of concepts Claude already knows. Every section serves a purpose with no padding or unnecessary context about what skills are or how reviews work. | 3 / 3 |
| Actionability | Provides concrete, executable commands (`tessl skill review`, `ast.parse`, bash commands for orphan detection), specific workflow phases, and copy-paste ready examples with clear before/after patterns. | 3 / 3 |
| Workflow Clarity | Eight clearly sequenced phases with explicit validation checkpoints (Phase 4, Phase 8), feedback loops (re-run review to verify), and prioritization guidance (Critical → High → Medium → Low). | 3 / 3 |
| Progressive Disclosure | Well-structured with clear sections, appropriate reference to external file ([references/REFERENCE.md]) for detailed validation examples, and the content itself teaches progressive disclosure best practices. | 3 / 3 |
| Total | | 12 / 12 Passed |
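The Actionability row above names `ast.parse` among the skill's validation commands. As a rough illustration only, here is a minimal sketch of what a syntax check built on Python's standard `ast` module might look like, assuming the aim is to confirm that a Python snippet parses cleanly. The helper name `check_python_snippet` is hypothetical and is not taken from the skill itself.

```python
import ast


def check_python_snippet(source: str) -> tuple[bool, str]:
    """Hypothetical helper: report whether a Python snippet parses.

    Returns (True, "ok") when the snippet is syntactically valid,
    otherwise (False, "<line and parser message>").
    """
    try:
        # ast.parse only builds a syntax tree; it never executes the code.
        ast.parse(source)
    except SyntaxError as err:
        return False, f"line {err.lineno}: {err.msg}"
    return True, "ok"


if __name__ == "__main__":
    print(check_python_snippet("x = 1 + 2"))
    print(check_python_snippet("def broken(:"))
```

Because `ast.parse` parses without running anything, a check like this is safe to apply to untrusted snippets, though it catches syntax errors only and says nothing about runtime behavior.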
Validation
100%
Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation — 11 / 11 Passed
Validation for skill structure
No warnings or errors.