tessl-labs/skill-optimizer

Optimize your skills and tiles: review SKILL.md quality, generate eval scenarios, run evals, compare across models, diagnose gaps, and re-run until scores improve.

Quality: 94% (Does it follow best practices?)
Impact: 88% (1.07x), average score across 24 eval scenarios
Security (by Snyk): Passed, no known issues


evals/scenario-11/criteria.json

{
  "context": "Tests whether the agent uses the correct `tessl skill review` command in the automation script, runs it both before and after changes, structures the workflow so validation happens before changes are applied, and incorporates multiple validation methods from Phase 4.",
  "type": "weighted_checklist",
  "checklist": [
    {
      "name": "tessl skill review command",
      "description": "Script uses `tessl skill review <path>` (exact command name) for the evaluation step — not a generic alternative",
      "max_score": 15
    },
    {
      "name": "Review before changes",
      "description": "Script runs `tessl skill review` BEFORE making any changes to capture the baseline score",
      "max_score": 10
    },
    {
      "name": "Review after changes",
      "description": "Script runs `tessl skill review` a SECOND TIME after changes are applied to verify improvement",
      "max_score": 10
    },
    {
      "name": "Validation before apply",
      "description": "Script includes a validation step that is placed BEFORE the change/edit step in the workflow",
      "max_score": 10
    },
    {
      "name": "Python ast.parse validation",
      "description": "Script includes Python syntax validation using `ast.parse` or `python -c 'import ast; ...'` — specifically the ast module",
      "max_score": 8
    },
    {
      "name": "node --check JS validation",
      "description": "Script includes JavaScript syntax validation using `node --check <file>` command",
      "max_score": 8
    },
    {
      "name": "Command --help flag validation",
      "description": "Script validates command flags by consulting the command's `--help` output",
      "max_score": 8
    },
    {
      "name": "File reference validation",
      "description": "Script checks that file references in the SKILL.md actually exist on disk",
      "max_score": 7
    },
    {
      "name": "Before/after score output",
      "description": "Script captures or outputs both the before and after scores side-by-side, enabling comparison",
      "max_score": 8
    },
    {
      "name": "Script accepts SKILL.md path",
      "description": "Script accepts the SKILL.md path as an argument (e.g. `$1` or script parameter), not hardcoded",
      "max_score": 8
    },
    {
      "name": "Phases are ordered",
      "description": "Script structure follows: baseline review → validation → (change step) → post-change review, in that order",
      "max_score": 8
    }
  ]
}
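
The weights in the checklist above sum to 100. As a minimal sketch of how a `weighted_checklist` might be aggregated, the Python below sums the points awarded per item, clamps each to its `max_score`, and divides by the total. The aggregation rule and the names `CHECKLIST`, `total_max`, and `weighted_percent` are assumptions for illustration; the file itself does not say how the scores are combined.

```python
from typing import Dict, List, Tuple

# (name, max_score) pairs copied from the checklist above.
CHECKLIST: List[Tuple[str, int]] = [
    ("tessl skill review command", 15),
    ("Review before changes", 10),
    ("Review after changes", 10),
    ("Validation before apply", 10),
    ("Python ast.parse validation", 8),
    ("node --check JS validation", 8),
    ("Command --help flag validation", 8),
    ("File reference validation", 7),
    ("Before/after score output", 8),
    ("Script accepts SKILL.md path", 8),
    ("Phases are ordered", 8),
]


def total_max(checklist: List[Tuple[str, int]]) -> int:
    """Sum of all max_score values in the checklist."""
    return sum(max_score for _, max_score in checklist)


def weighted_percent(checklist: List[Tuple[str, int]],
                     awarded: Dict[str, float]) -> float:
    """Assumed rule: clamp each awarded score to [0, max_score], then
    report the earned points as a percentage of the total."""
    earned = 0.0
    for name, max_score in checklist:
        points = awarded.get(name, 0)  # items not graded count as 0
        earned += min(max(points, 0), max_score)
    return 100.0 * earned / total_max(checklist)
```

Under this assumed rule, a script that satisfies only the first item would earn 15 of 100 points.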

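The phase ordering the criteria ask for (baseline review → validation → change step → post-change review) can be sketched in Python as follows. This is an illustration under stated assumptions, not the skill's actual script: the helper names are hypothetical, file references are assumed to appear as relative Markdown links, and the command runner is injected so that no particular `tessl` output format is assumed.

```python
import ast
import re
from pathlib import Path
from typing import Callable, Dict, List


def review_command(skill_path: str) -> List[str]:
    """Argument list for the `tessl skill review <path>` command."""
    return ["tessl", "skill", "review", skill_path]


def node_check_command(js_file: str) -> List[str]:
    """Argument list for JavaScript syntax validation with `node --check`."""
    return ["node", "--check", js_file]


def python_syntax_ok(source: str) -> bool:
    """Validate Python syntax with ast.parse, without executing the code."""
    try:
        ast.parse(source)
    except (SyntaxError, ValueError):
        return False
    return True


def missing_file_references(skill_md: Path) -> List[str]:
    """Relative Markdown link targets in SKILL.md that do not exist on disk.
    Assumption: file references are written as [text](relative/path)."""
    text = skill_md.read_text(encoding="utf-8")
    targets = re.findall(r"\]\(([^)#\s]+)", text)
    base = skill_md.parent
    return [t for t in targets if "://" not in t and not (base / t).exists()]


def run_phases(skill_path: str,
               run_cmd: Callable[[List[str]], str],
               apply_changes: Callable[[], None]) -> Dict[str, str]:
    """Run the four phases in order and return both review outputs."""
    before = run_cmd(review_command(skill_path))         # 1. baseline review
    missing = missing_file_references(Path(skill_path))  # 2. validation
    if missing:
        raise FileNotFoundError(f"missing references: {missing}")
    apply_changes()                                      # 3. change step
    after = run_cmd(review_command(skill_path))          # 4. post-change review
    return {"before": before, "after": after}
```

In a real script, `run_cmd` could wrap `subprocess.run(cmd, capture_output=True, text=True)` and return its `stdout`, and `skill_path` could come from `sys.argv[1]` rather than being hardcoded. Validation against a command's `--help` output is left out here because it depends on the command being checked.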