Optimize your skills and tiles: review SKILL.md quality, generate eval scenarios, run evals, compare across models, diagnose gaps, and re-run until scores improve.
88
94%
Does it follow best practices?
Impact
88%
1.07xAverage score across 24 eval scenarios
Passed
No known issues
A developer has a webhook-handler skill that needs improvement. Their team enforces a strict approval gate: no changes may be applied to the SKILL.md file until the tech lead explicitly approves them. The developer has review output and needs to produce a formal proposal document — but the hard part isn't identifying improvements, it's presenting them in a way that earns informed approval.
The tech lead cares about three things:
Given the review output and current SKILL.md below, create a proposal document that the tech lead can act on. Do NOT apply any changes to the SKILL.md itself — that file must remain untouched. The proposal must end with an explicit approval request asking the tech lead to confirm before any edits proceed.
Produce proposal.md — a structured change proposal designed for an approval workflow. Requirements:
The following files are provided as inputs. Extract them before beginning.
=============== FILE: review_output.txt =============== === Skill Review: webhook-handler === Overall Score: 61%
ERRORS (must fix): [ERROR] Description missing "Use when" trigger clause — reduces routing to 0% effectiveness
Dimension Scores: Completeness: 1/3 (33%) - Missing trigger clause Actionability: 2/3 (66%) - Some examples but they show output not commands Conciseness: 2/3 (66%) - Explains HMAC authentication in detail (agent already knows this) Robustness: 2/3 (66%) - No retry or idempotency patterns
Judge Suggestions:
Process incoming webhooks from external services using our internal webhook framework.
Webhooks use HMAC-SHA256 for signature validation. HMAC (Hash-based Message Authentication Code) is a cryptographic technique that uses a shared secret key and a hash function to verify message authenticity. The sender computes an HMAC over the request body using the shared secret, and the receiver recomputes it to verify the signature matches. This prevents tampering and confirms the request came from the expected sender.
To validate:
import hmac
import hashlib
def validate_webhook(body: bytes, signature: str, secret: str) -> bool:
expected = hmac.new(secret.encode(), body, hashlib.sha256).hexdigest()
return hmac.compare_digest(expected, signature)Parse the incoming event:
webhook-cli process --event-type push --payload @payload.jsonEvents are routed based on the event_type field in the payload. Supported types: push, pull_request, release, deployment.
=============== END FILE ===============
evals
scenario-1
scenario-2
scenario-3
scenario-4
scenario-5
scenario-6
scenario-7
scenario-8
scenario-9
scenario-10
scenario-11
scenario-12
scenario-13
scenario-14
scenario-15
scenario-16
scenario-17
scenario-18
scenario-19
scenario-20
scenario-21
scenario-22
scenario-23
scenario-24
skills
compare-skill-model-performance
optimize-skill-instructions
references
optimize-skill-performance
optimize-skill-performance-and-instructions