
tessl-labs/skill-optimizer

Optimize your skills and tiles: review SKILL.md quality, generate eval scenarios, run evals, compare across models, diagnose gaps, and re-run until scores improve.

Quality: 94% (Does it follow best practices?)

Impact: 88% (1.07x), average score across 24 eval scenarios

Security (by Snyk): Passed, no known issues


evals/scenario-8/criteria.json

{
  "context": "Tests whether the agent recommends and applies progressive disclosure principles: identifying content from SKILL.md that duplicates detail already in REFERENCE.md and recommending linking instead of inlining. The revised SKILL.md should be significantly shorter by leveraging the existing reference file.",
  "type": "weighted_checklist",
  "checklist": [
    {
      "name": "Linking over inlining",
      "description": "Recommendations explicitly state that content should link to REFERENCE.md rather than being inlined in SKILL.md",
      "max_score": 15
    },
    {
      "name": "Reference file identified",
      "description": "Recommendations specifically identify REFERENCE.md (by name) as the file to link to for detailed content",
      "max_score": 10
    },
    {
      "name": "Severity mappings removed",
      "description": "Revised SKILL.md removes inline severity color/urgency mapping tables (they exist in REFERENCE.md) and links there instead",
      "max_score": 10
    },
    {
      "name": "Flag tables removed",
      "description": "Revised SKILL.md removes detailed flag reference tables (these are in REFERENCE.md) — tables like the Slack/Email/PagerDuty flag lists",
      "max_score": 10
    },
    {
      "name": "Template list removed",
      "description": "Revised SKILL.md removes the inline list of available email templates (already in REFERENCE.md) and links there instead",
      "max_score": 8
    },
    {
      "name": "SKILL.md substantially shorter",
      "description": "Output SKILL.md is at least 40% shorter in line count than the input SKILL.md",
      "max_score": 12
    },
    {
      "name": "Core examples preserved",
      "description": "Output SKILL.md retains the main quickstart command examples for each channel (the bash code blocks showing basic usage)",
      "max_score": 10
    },
    {
      "name": "Before/after shown",
      "description": "recommendations.md includes before/after text for at least two of the proposed changes",
      "max_score": 10
    },
    {
      "name": "WHY explained",
      "description": "recommendations.md explains why progressive disclosure (linking vs inlining) improves the skill — not just what to change",
      "max_score": 10
    },
    {
      "name": "REFERENCE.md not modified",
      "description": "REFERENCE.md content is not changed or recreated — only SKILL.md is produced as modified output",
      "max_score": 5
    }
  ]
}
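
The page does not say how a `weighted_checklist` is aggregated into a single score. A minimal sketch, assuming each item is awarded between 0 and its `max_score` and the result is the awarded total divided by the maximum total, might look like the following. The function names, the `awarded` mapping, and the trimmed three-item checklist are illustrative assumptions, not part of the tile. The `line_reduction` helper shows one way to check the "at least 40% shorter in line count" criterion.

```python
import json

# Trimmed copy of the checklist above; only "name" and "max_score" are needed here.
CRITERIA_JSON = """
{
  "type": "weighted_checklist",
  "checklist": [
    {"name": "Linking over inlining", "max_score": 15},
    {"name": "SKILL.md substantially shorter", "max_score": 12},
    {"name": "REFERENCE.md not modified", "max_score": 5}
  ]
}
"""


def weighted_score(criteria, awarded):
    """Return awarded points as a fraction of the maximum possible points.

    Assumption: aggregation is a plain sum; the real evaluator may differ.
    """
    total_max = sum(item["max_score"] for item in criteria["checklist"])
    total = 0
    for item in criteria["checklist"]:
        points = awarded.get(item["name"], 0)
        # Clamp so a grader cannot award less than 0 or more than max_score.
        total += max(0, min(points, item["max_score"]))
    return total / total_max


def line_reduction(before, after):
    """Return the fractional drop in line count from `before` to `after`."""
    before_lines = len(before.splitlines())
    if before_lines == 0:
        raise ValueError("input text has no lines")
    return (before_lines - len(after.splitlines())) / before_lines


criteria = json.loads(CRITERIA_JSON)
# Hypothetical grader output: full marks on one item, half on another.
awarded = {"Linking over inlining": 15, "SKILL.md substantially shorter": 6}
score = weighted_score(criteria, awarded)

# A five-line input cut to three lines is a 40% reduction.
meets_threshold = line_reduction("a\nb\nc\nd\ne", "a\nb\nc") >= 0.4
```

In the full file the `max_score` values sum to 100, so under this assumption the awarded total can be read directly as a percentage.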
