
mcollina/skill-optimizer

Optimizes AI skills for activation, clarity, and cross-model reliability. Use when creating or editing skill packs, diagnosing weak skill uptake, reducing regressions, tuning instruction salience, improving examples, shrinking context cost, or setting benchmark and release gates for skills. Trigger terms: skill optimization, activation gap, benchmark skill, with/without skill delta, regression, context budget, prompt salience.

Quality: 87% (Does it follow best practices?)

Impact: 87%, 1.14x (average score across 5 eval scenarios)

Security (by Snyk): Passed, no known issues


evals/scenario-5/task.md

Prepare a Skill Update for Release

Problem/Feature Description

The DevEx team at Novu has been iterating on their error-handling skill for Node.js services and is ready to ship version 2.0. The skill teaches models how to implement structured error handling patterns (custom error classes, centralized error middleware, logging hooks). A junior engineer ran the benchmarks but produced only raw results — no formatted report, no gate review, no PR documentation. The team lead wants the release properly prepared before the PR goes up.

Your job is to review the benchmark results, determine whether the update meets the bar for shipping, write the release checklist, and prepare the PR description that documents this release. Also flag any gaps that need follow-up after merge.

Output Specification

Produce:

  • release-assessment.md — a structured review document with your analysis of the benchmark results, any gaps or blockers you identified, and a clear go/no-go recommendation with justification
  • pr-description.md — a pull request description ready to be posted, including the PR checklist with all required items

Input Files

The following files are provided as inputs. Extract them before beginning.

=============== FILE: inputs/benchmark-results.json ===============
{
  "skill": "error-handling",
  "version": "2.0",
  "run_date": "2026-04-16",
  "models_tested": ["ModelA", "ModelB", "ModelC"],
  "scenarios": [
    "custom-error-classes",
    "centralized-middleware",
    "logging-integration",
    "async-error-propagation"
  ],
  "results": {
    "ModelA": {
      "custom-error-classes": { "without": 65, "with": 88 },
      "centralized-middleware": { "without": 55, "with": 82 },
      "logging-integration": { "without": 70, "with": 79 },
      "async-error-propagation": { "without": 60, "with": 84 }
    },
    "ModelB": {
      "custom-error-classes": { "without": 72, "with": 85 },
      "centralized-middleware": { "without": 68, "with": 78 },
      "logging-integration": { "without": 74, "with": 77 },
      "async-error-propagation": { "without": 69, "with": 82 }
    },
    "ModelC": {
      "custom-error-classes": { "without": 40, "with": 55 },
      "centralized-middleware": { "without": 35, "with": 52 },
      "logging-integration": { "without": 42, "with": 40 },
      "async-error-propagation": { "without": 38, "with": 0 }
    }
  },
  "context": {
    "skill_md_updated": true,
    "new_rule_file_added": "rules/async-patterns.md",
    "prior_run_log_entry": "2026-03-01: v1.5 baseline run",
    "open_issues": [],
    "validation_commands_run": false
  }
}
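One way to start the review is to tabulate the with/without delta for every model and scenario in the results above. The sketch below is only an illustration: the helper names `deltas` and `flag_regressions` are hypothetical, and treating any negative delta as a regression is an assumption, since this task does not state the skill's actual release gates.

```python
import json

# Scores copied from the "results" section of inputs/benchmark-results.json.
RESULTS = json.loads("""
{
  "ModelA": {
    "custom-error-classes": {"without": 65, "with": 88},
    "centralized-middleware": {"without": 55, "with": 82},
    "logging-integration": {"without": 70, "with": 79},
    "async-error-propagation": {"without": 60, "with": 84}
  },
  "ModelB": {
    "custom-error-classes": {"without": 72, "with": 85},
    "centralized-middleware": {"without": 68, "with": 78},
    "logging-integration": {"without": 74, "with": 77},
    "async-error-propagation": {"without": 69, "with": 82}
  },
  "ModelC": {
    "custom-error-classes": {"without": 40, "with": 55},
    "centralized-middleware": {"without": 35, "with": 52},
    "logging-integration": {"without": 42, "with": 40},
    "async-error-propagation": {"without": 38, "with": 0}
  }
}
""")


def deltas(results):
    """Return {(model, scenario): with - without} for every result cell."""
    return {
        (model, scenario): scores["with"] - scores["without"]
        for model, scenarios in results.items()
        for scenario, scores in scenarios.items()
    }


def flag_regressions(results):
    """List (model, scenario, delta) where the score dropped with the skill.

    Assumption: a negative delta counts as a regression; the real gate
    thresholds are not given in this task.
    """
    return [(m, s, d) for (m, s), d in deltas(results).items() if d < 0]


if __name__ == "__main__":
    for (model, scenario), delta in deltas(RESULTS).items():
        print(f"{model:7} {scenario:26} {delta:+d}")
    print("regressions:", flag_regressions(RESULTS))
```

Run against this data, the only negative deltas are on ModelC: logging-integration (-2) and async-error-propagation (-38, with a "with" score of 0). Whether that 0 reflects a real failure or a broken run is something the assessment would need to establish, not something the numbers alone can settle.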
