Optimizes AI skills for activation, clarity, and cross-model reliability. Use when creating or editing skill packs, diagnosing weak skill uptake, reducing regressions, tuning instruction salience, improving examples, shrinking context cost, or setting benchmark and release gates for skills. Trigger terms: skill optimization, activation gap, benchmark skill, with/without skill delta, regression, context budget, prompt salience.
87%
Does it follow best practices?
Impact
87%
1.14x average score across 5 eval scenarios
Passed
No known issues
The DevEx team at Novu has been iterating on their error-handling skill for Node.js services and is ready to ship version 2.0. The skill teaches models how to implement structured error handling patterns (custom error classes, centralized error middleware, logging hooks). A junior engineer ran the benchmarks but produced only raw results — no formatted report, no gate review, no PR documentation. The team lead wants the release properly prepared before the PR goes up.
Your job is to review the benchmark results, determine whether the update meets the bar for shipping, write the release checklist, and prepare the PR description that documents this release. Also flag any gaps that need follow-up after merge.
Produce:
release-assessment.md: a structured review document with your analysis of the benchmark results, any gaps or blockers you identified, and a clear go/no-go recommendation with justification
pr-description.md: a pull request description ready to be posted, including the PR checklist with all required items

The following files are provided as inputs. Extract them before beginning.
=============== FILE: inputs/benchmark-results.json ===============
{
  "skill": "error-handling",
  "version": "2.0",
  "run_date": "2026-04-16",
  "models_tested": ["ModelA", "ModelB", "ModelC"],
  "scenarios": [
    "custom-error-classes",
    "centralized-middleware",
    "logging-integration",
    "async-error-propagation"
  ],
  "results": {
    "ModelA": {
      "custom-error-classes": { "without": 65, "with": 88 },
      "centralized-middleware": { "without": 55, "with": 82 },
      "logging-integration": { "without": 70, "with": 79 },
      "async-error-propagation": { "without": 60, "with": 84 }
    },
    "ModelB": {
      "custom-error-classes": { "without": 72, "with": 85 },
      "centralized-middleware": { "without": 68, "with": 78 },
      "logging-integration": { "without": 74, "with": 77 },
      "async-error-propagation": { "without": 69, "with": 82 }
    },
    "ModelC": {
      "custom-error-classes": { "without": 40, "with": 55 },
      "centralized-middleware": { "without": 35, "with": 52 },
      "logging-integration": { "without": 42, "with": 40 },
      "async-error-propagation": { "without": 38, "with": 0 }
    }
  },
  "context": {
    "skill_md_updated": true,
    "new_rule_file_added": "rules/async-patterns.md",
    "prior_run_log_entry": "2026-03-01: v1.5 baseline run",
    "open_issues": [],
    "validation_commands_run": false
  }
}
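
One way to start the review is to compute the with/without delta for every model and scenario and list the cells where the skill made things worse. The Python sketch below is illustrative only and is not part of the provided inputs. The scores are copied from inputs/benchmark-results.json; the helper names (`score_deltas`, `find_regressions`, `mean_delta_by_model`) and the `min_delta=0` cutoff are assumptions, since the task brief does not state a numeric release gate.

```python
# Scores copied from inputs/benchmark-results.json as (without, with) pairs.
RESULTS = {
    "ModelA": {
        "custom-error-classes": (65, 88),
        "centralized-middleware": (55, 82),
        "logging-integration": (70, 79),
        "async-error-propagation": (60, 84),
    },
    "ModelB": {
        "custom-error-classes": (72, 85),
        "centralized-middleware": (68, 78),
        "logging-integration": (74, 77),
        "async-error-propagation": (69, 82),
    },
    "ModelC": {
        "custom-error-classes": (40, 55),
        "centralized-middleware": (35, 52),
        "logging-integration": (42, 40),
        "async-error-propagation": (38, 0),
    },
}


def score_deltas(results):
    """Map (model, scenario) to the score change when the skill is enabled."""
    return {
        (model, scenario): with_skill - without_skill
        for model, scenarios in results.items()
        for scenario, (without_skill, with_skill) in scenarios.items()
    }


def find_regressions(deltas, min_delta=0):
    """Return the (model, scenario) cells whose delta falls below min_delta.

    min_delta=0 is a hypothetical cutoff, not a gate defined by the source.
    """
    return sorted(key for key, delta in deltas.items() if delta < min_delta)


def mean_delta_by_model(deltas):
    """Average the per-scenario deltas for each model."""
    per_model = {}
    for (model, _scenario), delta in deltas.items():
        per_model.setdefault(model, []).append(delta)
    return {model: sum(values) / len(values) for model, values in per_model.items()}


if __name__ == "__main__":
    deltas = score_deltas(RESULTS)
    for (model, scenario), delta in sorted(deltas.items()):
        print(f"{model:7} {scenario:26} {delta:+d}")
    print("regressions:", find_regressions(deltas))
    print("mean delta by model:", mean_delta_by_model(deltas))
```

A "with" score of exactly 0 may indicate a failed or incomplete run rather than a genuine result; the input file does not say which, so a cell like that is worth flagging for follow-up rather than treating as a measured score.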