tessl-labs/skill-optimizer

Optimize your skills and tiles: review SKILL.md quality, generate eval scenarios, run evals, compare across models, diagnose gaps, and re-run until scores improve.

Quality

93%

Does it follow best practices?

Impact

88%

1.07x

Average score across 24 eval scenarios
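
One way to read the two Impact figures together is sketched below. It assumes the 1.07x multiplier is the average score with the skill divided by a baseline average without it; this page does not confirm that definition, and `implied_baseline` is a hypothetical helper, not part of any Tessl tooling.

```python
def implied_baseline(score_with_skill: float, multiplier: float) -> float:
    """Back out the baseline average, ASSUMING multiplier = with_skill / baseline."""
    return score_with_skill / multiplier


# 88% and 1.07x are the Impact figures reported on this page.
baseline = implied_baseline(88.0, 1.07)
print(round(baseline, 1))  # prints 82.2
```

Under that assumption, the skill would lift the average across the 24 eval scenarios from roughly 82% to 88%.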

Security by Snyk

Passed

No known issues

This plugin was archived by the owner on May 19, 2026

Reason: Tile archived: Superseded by tessl/skill-optimizer. Go to https://tessl.io/registry/tessl/skill-optimizer

Quality

Discovery

89%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

This is a solid description that clearly communicates both what the skill does and when to use it, with an explicit 'Use when...' clause containing relevant trigger terms. The main weakness is the use of domain-specific jargon ('tile') without clarification, which slightly reduces specificity for those unfamiliar with the terminology. Overall, it performs well across all dimensions and would be distinguishable in a large skill library.

| Dimension | Reasoning | Score |
| --- | --- | --- |
| Specificity | Names several actions ('generate eval scenarios', 'run baseline evals', 'present results') but uses domain-specific jargon like 'tile' without explanation, and the actions are somewhat high-level rather than deeply concrete. | 2 / 3 |
| Completeness | Clearly answers both 'what' (generate eval scenarios from a tile, run baseline evals, present results) and 'when' (explicit 'Use when...' clause covering evaluation pipelines, benchmarks, test scenarios, and measuring skill performance). | 3 / 3 |
| Trigger Term Quality | Includes strong natural trigger terms: 'evaluation pipelines', 'running benchmarks', 'test scenarios', 'tile', 'measuring', 'eval scenarios'. These cover multiple ways a user might phrase requests related to evaluation workflows. | 3 / 3 |
| Distinctiveness Conflict Risk | The combination of 'tile', 'eval scenarios', 'baseline evals', and 'skill helps agents solve tasks' creates a very specific niche that is unlikely to conflict with generic testing or benchmarking skills. | 3 / 3 |
| Total | | 11 / 12 |

Passed
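
The total is the plain sum of the per-dimension scores. A minimal sketch of that arithmetic follows; the dictionary values are copied from the Discovery table, while `rubric_total` is a hypothetical helper name introduced only for illustration.

```python
# Per-dimension Discovery scores as listed on this page (each out of 3).
discovery_scores = {
    "Specificity": 2,
    "Completeness": 3,
    "Trigger Term Quality": 3,
    "Distinctiveness Conflict Risk": 3,
}
MAX_PER_DIMENSION = 3


def rubric_total(scores, max_per_dimension=MAX_PER_DIMENSION):
    """Return (points earned, points available) for a rubric."""
    earned = sum(scores.values())
    available = max_per_dimension * len(scores)
    return earned, available


earned, available = rubric_total(discovery_scores)
print(f"{earned} / {available}")  # prints "11 / 12"
```

The headline Discovery figure of 89% is not the plain ratio 11 / 12 (about 92%), so the percentage presumably weights the dimensions in a way this page does not spell out.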

Implementation

85%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

This is a well-structured orchestration skill that excels at workflow clarity and progressive disclosure. The scope selection mechanism and phase mapping are particularly strong. The main weakness is that actionability depends almost entirely on the referenced phase files: the SKILL.md itself provides no executable examples or complete commands, so it is hard to judge whether the full pipeline is copy-paste ready without seeing those references.

Suggestions

Include at least one complete CLI command example inline (e.g., a full `tessl scenario generate` invocation with typical flags) so the skill is partially actionable even without loading reference files.
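
A rough way to check a SKILL.md against this suggestion is to look for a fenced code block that contains one of the commands, as opposed to an inline mention. The sketch below is an illustration only, not the reviewer's actual check: `has_inline_command_example` is a hypothetical helper, and the sample text uses the bare command because this page does not list its flags.

```python
import re

# Captures the body of a fenced code block (three backticks to three backticks).
FENCED_BLOCK = re.compile(r"^`{3}[^\n]*\n(.*?)^`{3}", re.MULTILINE | re.DOTALL)


def has_inline_command_example(markdown_text: str, command: str) -> bool:
    """True if any fenced code block in the text contains the command."""
    return any(command in body for body in FENCED_BLOCK.findall(markdown_text))


fence = "`" * 3  # built at runtime so this example does not nest fences

# Hypothetical SKILL.md fragments: one only mentions the command inline,
# the other shows it in a fenced block.
mentions_only = "Run `tessl scenario generate` as described in the phase file.\n"
with_example = f"Generate scenarios:\n\n{fence}sh\ntessl scenario generate\n{fence}\n"

print(has_inline_command_example(mentions_only, "tessl scenario generate"))  # False
print(has_inline_command_example(with_example, "tessl scenario generate"))   # True
```

A check like this only shows that a command appears in a code block; whether the invocation is complete still needs a human or a rubric-based review.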

| Dimension | Reasoning | Score |
| --- | --- | --- |
| Conciseness | The content is lean and well-structured. It doesn't explain concepts Claude already knows, avoids unnecessary padding, and every section serves a clear purpose. Time expectations and scope mapping are genuinely useful additions, not filler. | 3 / 3 |
| Actionability | The skill provides a clear decision framework (scope table, phase mapping) and references specific CLI commands like `tessl scenario generate` and `tessl eval run`, but all actual procedures are delegated to reference files. The SKILL.md itself contains no executable code or complete command examples; it is an orchestration overview that depends entirely on external files for concrete guidance. | 2 / 3 |
| Workflow Clarity | The multi-step workflow is clearly sequenced across 6 numbered phases with explicit scope-to-phase mapping. Phase 3 includes quality-check validation (rubric anti-patterns), and the overall flow has clear entry/exit conditions and a 'when to stop' section. The scope selection table elegantly handles partial runs. | 3 / 3 |
| Progressive Disclosure | Excellent progressive disclosure: the SKILL.md serves as a concise overview with one-level-deep references to 6 phase-specific files. Each reference is clearly signaled with a brief description of what the phase covers, and the instruction to skip loading unused reference files for partial runs is a thoughtful touch. | 3 / 3 |
| Total | | 11 / 12 |

Passed

Validation

100%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation: 11 / 11 Passed

Validation for skill structure

No warnings or errors.
