tessl/skill-optimizer

Optimize your skills and plugins: review SKILL.md quality, generate eval scenarios, run evals, compare across models, diagnose gaps, and re-run until scores improve.

Quality: 88% (Does it follow best practices?)

Impact: 85%, 1.08x (average score across 29 eval scenarios)

Security (by Snyk): Advisory. Suggest reviewing before use.

evals/scenario-27/criteria.json

{
  "context": "Tests whether the agent correctly performs pre-flight verification before running multi-model plugin evals, including finding the plugin safely, verifying scenarios and login, confirming the model configuration, and communicating time expectations.",
  "type": "weighted_checklist",
  "checklist": [
    {
      "name": "Excludes .tessl cache",
      "description": "The plugin search command or script explicitly excludes paths containing .tessl/ (e.g., uses -not -path '*/.tessl/*' or equivalent)",
      "max_score": 10
    },
    {
      "name": ".tessl/plugins warning",
      "description": "The output warns the user if their plugin path is inside a .tessl/plugins/ directory and explains this is the local install cache (not usable for evals)",
      "max_score": 10
    },
    {
      "name": "Scenario existence check",
      "description": "The verification checks for the presence of eval scenario files under the plugin's evals/ directory (e.g., checks for evals/*/task.md or equivalent)",
      "max_score": 10
    },
    {
      "name": "Scenario generation guidance",
      "description": "If no scenarios are found, the output provides the tessl scenario generate command (not just a generic message) with the plugin path argument",
      "max_score": 10
    },
    {
      "name": "Login verification",
      "description": "The output includes a step to run tessl whoami to verify login status before proceeding",
      "max_score": 10
    },
    {
      "name": "No --workspace flag",
      "description": "The output does NOT mention or include a --workspace flag in any tessl eval context, because plugin eval workflows do not use it.",
      "max_score": 8
    },
    {
      "name": "Default model names",
      "description": "The output lists all three default model identifiers: claude-haiku-4-5, claude-sonnet-4-6, and claude-opus-4-8",
      "max_score": 10
    },
    {
      "name": "Model subset confirmation",
      "description": "The output asks the user to confirm whether to use all three models or a subset, rather than assuming all three",
      "max_score": 8
    },
    {
      "name": "Time estimate provided",
      "description": "The output includes a time estimate per scenario per model (10-15 minutes) or a total estimate formula (e.g., N scenarios × 30-45 minutes when all three models are run)",
      "max_score": 12
    },
    {
      "name": "Run count option",
      "description": "The output asks whether to run each scenario once or multiple times, and mentions that 3 runs is recommended before publishing",
      "max_score": 12
    }
  ]
}
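
As a rough illustration of how a `weighted_checklist` like the one above could be turned into a percentage, the sketch below sums the points awarded per item and divides by the sum of the `max_score` values. The aggregation rule, the function names (`total_max_score`, `percent_score`), and the `awarded` mapping are assumptions made for this example; the page does not say how Tessl actually computes its scores. The embedded JSON is an abbreviated copy of the checklist that keeps only `name` and `max_score`.

```python
import json

# Abbreviated copy of the checklist above (descriptions omitted).
CRITERIA_JSON = """
{
  "type": "weighted_checklist",
  "checklist": [
    {"name": "Excludes .tessl cache", "max_score": 10},
    {"name": ".tessl/plugins warning", "max_score": 10},
    {"name": "Scenario existence check", "max_score": 10},
    {"name": "Scenario generation guidance", "max_score": 10},
    {"name": "Login verification", "max_score": 10},
    {"name": "No --workspace flag", "max_score": 8},
    {"name": "Default model names", "max_score": 10},
    {"name": "Model subset confirmation", "max_score": 8},
    {"name": "Time estimate provided", "max_score": 12},
    {"name": "Run count option", "max_score": 12}
  ]
}
"""


def total_max_score(criteria: dict) -> int:
    """Sum of max_score over every checklist item."""
    return sum(item["max_score"] for item in criteria["checklist"])


def percent_score(criteria: dict, awarded: dict) -> float:
    """Hypothetical aggregation: awarded points / maximum points, as a percentage.

    `awarded` maps an item name to the points a grader gave it. Items that
    are missing count as 0, and each value is clamped to [0, max_score].
    """
    earned = 0
    for item in criteria["checklist"]:
        points = awarded.get(item["name"], 0)
        earned += max(0, min(points, item["max_score"]))
    return 100.0 * earned / total_max_score(criteria)


criteria = json.loads(CRITERIA_JSON)

if __name__ == "__main__":
    example_awards = {"Login verification": 10, "Run count option": 6}
    print(total_max_score(criteria))
    print(percent_score(criteria, example_awards))
```

Because the ten `max_score` values in this file add up to 100, awarded points map directly onto a percentage under this assumed rule.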
