
tessl/skill-optimizer

Optimize your skills and tiles: review SKILL.md quality, generate eval scenarios, run evals, compare across models, diagnose gaps, and re-run until scores improve.

Quality: 91% (Does it follow best practices?)

Impact: 92% (1.10x), average score across 25 eval scenarios

Security (by Snyk): Passed, no known issues


evals/scenario-19/criteria.json

{
  "context": "Tests whether the agent uses the correct download strategy when adding scenarios to a tile that already has existing scenarios. The correct strategy is --strategy merge (not --strategy replace and not omitting the flag). The script should also verify the downloaded structure.",
  "type": "weighted_checklist",
  "checklist": [
    {
      "name": "Uses --strategy merge",
      "description": "Script includes --strategy merge in the tessl scenario download command",
      "max_score": 25
    },
    {
      "name": "Does NOT use --strategy replace",
      "description": "Script does NOT use --strategy replace anywhere in the download command",
      "max_score": 20
    },
    {
      "name": "Correct base command",
      "description": "Script uses tessl scenario download as the base command for downloading scenarios (not tessl download scenarios or other variants)",
      "max_score": 10
    },
    {
      "name": "Output directory specified",
      "description": "Script includes -o ./shopify-connector/evals/ (or equivalent path to the tile's evals directory) as the output destination",
      "max_score": 12
    },
    {
      "name": "Verification step present",
      "description": "Script includes a command to list files in the evals directory after download — e.g., ls ./shopify-connector/evals/*/task.md or equivalent — to verify the download succeeded",
      "max_score": 15
    },
    {
      "name": "Run ID or --last used",
      "description": "Script references the run using --last or specifies the run ID scen-gen-7742 in the download command",
      "max_score": 10
    },
    {
      "name": "Existing scenarios preserved",
      "description": "Script or accompanying comments explicitly note that existing scenarios will be preserved (not overwritten), either via the --strategy merge flag explanation or a comment in the script",
      "max_score": 8
    }
  ]
}
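
To make the checklist concrete, here is a minimal sketch of a command that would satisfy the command-level items, together with a naive self-check. The flag spellings (`--last`, `--strategy merge`, `-o`) and the base command are taken from the checklist descriptions; the argument order, the helper names `build_download_command` and `check_command`, and the string-matching checks are assumptions made for illustration, not the grader's actual logic.

```python
import shlex


def build_download_command(evals_dir):
    """Assemble the download command described by the checklist.

    Hypothetical helper: flag spellings come from the checklist text,
    but the argument order is an assumption.
    """
    return [
        "tessl", "scenario", "download",
        "--last",               # reference the most recent generation run
        "--strategy", "merge",  # keep existing scenarios, add the new ones
        "-o", evals_dir,        # the tile's evals directory
    ]


def check_command(argv):
    """Naive string checks mirroring the command-level checklist items."""
    joined = " ".join(argv)
    return {
        "Uses --strategy merge": "--strategy merge" in joined,
        "Does NOT use --strategy replace": "--strategy replace" not in joined,
        "Correct base command": argv[:3] == ["tessl", "scenario", "download"],
        "Output directory specified": "-o" in argv,
        "Run ID or --last used": "--last" in argv or "scen-gen-7742" in argv,
    }


command = build_download_command("./shopify-connector/evals/")
print(shlex.join(command))
for name, passed in check_command(command).items():
    print(f"{'PASS' if passed else 'FAIL'}  {name}")
```

The sketch covers only the items that can be read off the command itself. The remaining two items, the verification step (for example `ls ./shopify-connector/evals/*/task.md`) and the note that existing scenarios are preserved, would live in the surrounding script and its comments.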
