
mcollina/skill-optimizer

Optimizes AI skills for activation, clarity, and cross-model reliability. Use when creating or editing skill packs, diagnosing weak skill uptake, reducing regressions, tuning instruction salience, improving examples, shrinking context cost, or setting benchmark and release gates for skills. Trigger terms: skill optimization, activation gap, benchmark skill, with/without skill delta, regression, context budget, prompt salience.

Quality: 87% (Does it follow best practices?)

Impact: 87% (1.14x), average score across 5 eval scenarios

Security by Snyk: Passed, no known issues


evals/scenario-4/criteria.json

{
  "context": "Tests whether the agent correctly applies the regression triage workflow: detecting regressions from benchmark data, classifying the root cause, applying the fix sequence (isolate → locate → rewrite with must/should → add example pair → document), and verifying exit criteria.",
  "type": "weighted_checklist",
  "checklist": [
    {
      "name": "Regressions identified",
      "description": "regression-report.md identifies both module-outputs-scenario and variable-validation-scenario as regressions (negative deltas) on both ModelA and ModelB",
      "max_score": 10
    },
    {
      "name": "Regression cause classified",
      "description": "regression-report.md classifies the cause using one of the standard categories: ambiguous instruction collisions, optional language around mandatory behavior, over-broad rule suppressing detail, or examples implying wrong default",
      "max_score": 10
    },
    {
      "name": "Exact instruction lines cited",
      "description": "regression-report.md quotes or paraphrases the specific instruction lines in SKILL-v1.3.md that are believed to cause the regression (e.g. 'descriptions on outputs are nice to have but not strictly required' or 'you may want to add validation')",
      "max_score": 10
    },
    {
      "name": "Fuzzy language replaced in fix",
      "description": "SKILL-fixed.md does NOT contain: 'may want to add validation', 'sometimes validation adds complexity', 'nice to have but not strictly required', 'feel free to adapt', 'might be nice'",
      "max_score": 10
    },
    {
      "name": "Must/should boundaries in fix",
      "description": "SKILL-fixed.md uses explicit mandatory language for output descriptions and variable validation (e.g. 'must include', 'required', 'do not omit', 'always')",
      "max_score": 10
    },
    {
      "name": "Positive example in fix",
      "description": "SKILL-fixed.md contains a code example showing the CORRECT pattern for output descriptions or variable validation (with description field present)",
      "max_score": 10
    },
    {
      "name": "Negative example in fix",
      "description": "SKILL-fixed.md or regression-report.md contains a counter-example or explicit 'do not do this' pattern showing the incorrect version (missing description, or skipping validation)",
      "max_score": 10
    },
    {
      "name": "Provider-config not broken",
      "description": "SKILL-fixed.md retains the provider configuration section with 'do not omit source' and 'do not omit version constraint' instructions (no regression introduced in non-affected scenario)",
      "max_score": 10
    },
    {
      "name": "Exit criteria documented",
      "description": "regression-report.md explicitly states that the fix should be verified by rerunning the same scenario (not a broad rerun first) before concluding",
      "max_score": 10
    },
    {
      "name": "Root cause + diff summary",
      "description": "regression-report.md includes both a root cause explanation AND a summary of the diff (what was changed in the skill)",
      "max_score": 10
    }
  ]
}
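As a rough illustration of how a weighted checklist like this might be applied, here is a minimal Python sketch. It checks one of the more mechanical items ("Fuzzy language replaced in fix") with a plain substring search and then aggregates points. The aggregation rule (awarded points divided by the sum of `max_score`), the all-or-nothing scoring of the phrase item, and the names `find_banned_phrases` and `weighted_score` are assumptions made for this example; the page does not say how the evaluator actually computes scores.

```python
import json

# Trimmed copy of the checklist above: two items only, to keep the sketch short.
CRITERIA_JSON = """
{
  "type": "weighted_checklist",
  "checklist": [
    {"name": "Regressions identified", "max_score": 10},
    {"name": "Fuzzy language replaced in fix", "max_score": 10}
  ]
}
"""

# Phrases that the "Fuzzy language replaced in fix" item says must not appear.
BANNED_PHRASES = [
    "may want to add validation",
    "sometimes validation adds complexity",
    "nice to have but not strictly required",
    "feel free to adapt",
    "might be nice",
]


def find_banned_phrases(skill_text):
    """Return the banned phrases still present in skill_text (case-insensitive)."""
    lowered = skill_text.lower()
    return [phrase for phrase in BANNED_PHRASES if phrase in lowered]


def weighted_score(criteria, awarded):
    """Assumed rule: clamp each item's points to max_score, then divide by the total."""
    total = sum(item["max_score"] for item in criteria["checklist"])
    earned = 0
    for item in criteria["checklist"]:
        points = awarded.get(item["name"], 0)
        earned += max(0, min(points, item["max_score"]))
    return earned / total


criteria = json.loads(CRITERIA_JSON)

# Hypothetical excerpt from a fixed skill file, written with mandatory language.
fixed_skill = "Every output must include a description. Always add validation blocks."
leftovers = find_banned_phrases(fixed_skill)

# Hypothetical grades: partial credit on one item, all-or-nothing on the phrase check.
awarded = {
    "Regressions identified": 7,
    "Fuzzy language replaced in fix": 10 if not leftovers else 0,
}
score = weighted_score(criteria, awarded)
print(leftovers, score)
```

Items such as "Regression cause classified" require judgment about the report's content, so a substring check like this would only cover the items that name exact phrases.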
