Audit and improve skill collections with a 9-dimension scoring framework (Knowledge Delta, Mindset, Anti-Patterns, Specification Compliance, Progressive Disclosure, Freedom Calibration, Pattern Recognition, Practical Usability, Eval Validation), duplication detection, remediation planning, baseline comparison, and CI quality gates. Use when evaluating skill quality, generating remediation plans, detecting duplicates, validating artifact conventions, or enforcing publication thresholds.
93
89%
Does it follow best practices?
Impact
99%
1.26x
Average score across 5 eval scenarios
Passed
No known issues
{
  "context": "Tests whether the agent applies the 9-dimension framework correctly to a skill heavy with redundant content (SQL basics, installation steps, generic best-practices) and produces a scored report with actionable remediation.",
  "type": "weighted_checklist",
  "checklist": [
    {
      "name": "9-dimension framework applied",
      "description": "Uses all 9 dimensions: Knowledge Delta, Mindset+Procedures, Anti-Pattern Quality, Specification Compliance, Progressive Disclosure, Freedom Calibration, Pattern Recognition, Practical Usability, Eval Validation",
      "max_score": 12
    },
    {
      "name": "Redundant content identified",
      "description": "Flags SQL basics, installation instructions, and generic best-practices as redundant content not worth expert attention",
      "max_score": 15
    },
    {
      "name": "Knowledge Delta scored low",
      "description": "Assigns D1 a low score (<=10/20) reflecting the high redundancy ratio of the input skill",
      "max_score": 12
    },
    {
      "name": "Numerical scores per dimension",
      "description": "Provides specific numerical scores for each dimension with a brief justification",
      "max_score": 15
    },
    {
      "name": "A-grade threshold referenced",
      "description": "States the A-grade target as >=126/140 or equivalent percentage",
      "max_score": 10
    },
    {
      "name": "Actionable remediation steps",
      "description": "Remediation plan contains specific file-level changes (what to add/remove/rewrite) with S/M/L effort sizing",
      "max_score": 20
    },
    {
      "name": "Specification compliance issues noted",
      "description": "Identifies the weak description field ('Help with SQL queries.') as a D4 compliance failure",
      "max_score": 8
    },
    {
      "name": "Progressive disclosure gap noted",
      "description": "Notes absence of references/ directory and/or content frontloading as a D5 weakness",
      "max_score": 8
    }
  ]
}
}
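The rubric above does not say how a "weighted_checklist" is aggregated, so the following is only a minimal sketch under one plausible assumption: each item earns between 0 and its max_score, and the result is the earned total as a percentage of the summed max_score values. The names CHECKLIST_MAX, checklist_percent, and is_a_grade are hypothetical and do not come from the skill itself; only the weights and the 126/140 A-grade target are taken from the rubric.

```python
# Weights copied from the "max_score" fields of the checklist above.
CHECKLIST_MAX = {
    "9-dimension framework applied": 12,
    "Redundant content identified": 15,
    "Knowledge Delta scored low": 12,
    "Numerical scores per dimension": 15,
    "A-grade threshold referenced": 10,
    "Actionable remediation steps": 20,
    "Specification compliance issues noted": 8,
    "Progressive disclosure gap noted": 8,
}


def checklist_percent(awarded):
    """Return earned points as a percentage of the total max_score.

    ASSUMPTION: aggregation is a plain sum; the source does not define it.
    Missing items count as 0, and each item is clamped to [0, max_score].
    """
    total = sum(CHECKLIST_MAX.values())
    earned = 0
    for name, max_score in CHECKLIST_MAX.items():
        points = awarded.get(name, 0)
        earned += min(max(points, 0), max_score)
    return 100 * earned / total


def is_a_grade(score, max_total=140):
    """Check a framework score against the A-grade target of 126/140.

    Cross-multiplication keeps the comparison in integers, so the
    "equivalent percentage" (90%) is tested without float rounding.
    """
    return score * 140 >= 126 * max_total
```

The weights sum to 100, so under this assumption a checklist result reads directly as a percentage, and the A-grade target of 126/140 corresponds to 90%.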