
pantheon-ai/skill-quality-auditor

Audit and improve skill collections with a 9-dimension scoring framework (Knowledge Delta, Mindset, Anti-Patterns, Specification Compliance, Progressive Disclosure, Freedom Calibration, Pattern Recognition, Practical Usability, Eval Validation), duplication detection, remediation planning, baseline comparison, and CI quality gates; use when evaluating skill quality, generating remediation plans, detecting duplicates, validating artifact conventions, or enforcing publication thresholds.

93 (1.26x)
Quality: 89% (Does it follow best practices?)
Impact: 99%, 1.26x (average score across 5 eval scenarios)
Security by Snyk: Passed (no known issues)


SKILL.md

name: skill-quality-auditor
description: Evaluate, score, and remediate agent skill collections using a 9-dimension quality framework (Knowledge Delta, Mindset, Anti-Patterns, Specification Compliance, Progressive Disclosure, Freedom Calibration, Pattern Recognition, Practical Usability, Eval Validation). Performs duplication detection, generates remediation plans with T-shirt sizing, enforces CI quality gates, validates artifact conventions, tracks score trends, and ensures tessl registry compliance. Use when evaluating skill quality, auditing SKILL.md files, scoring agent skills, generating remediation plans, detecting duplicate skills, validating skill format, enforcing quality gates, optimizing for A-grade publication, comparing audit baselines, running batch skill assessments, or checking tessl compliance. Triggers: 'check my skills', 'skill audit', 'improve my SKILL.md', 'quality check', 'A-grade scoring', 'quality gates', 'eval validation', 'audit all skills', 'remediation plan', 'skill judge', 'dimension scoring'.

Skill Quality Auditor

Navigation hub for evaluating, maintaining, and improving skill quality with a 9-dimension scoring framework.

Quick Start

Build once:

bun run build:skill-auditor

Run audits:

# Single skill
skill-auditor evaluate <domain>/<skill-name> --json --store

# Batch with grade gate
skill-auditor batch <skill1> <skill2> --fail-below B --store

When to Use

  • Evaluate skills before merge or publication using 9-dimension scoring
  • Generate remediation plans, detect duplication (>20% threshold), or enforce CI quality gates
  • Validate eval scenario coverage and artifact conventions

When Not to Use

  • On an unfinished draft: write the skill first, then audit it
  • As a substitute for peer review of logic or domain accuracy

Workflow

  1. Run skill-auditor evaluate <skill> --json --store
  2. Check artifacts and eval coverage using deterministic criteria
  3. Generate a remediation plan with T-shirt sizing and score delta estimates
  4. Run the auditor again to verify improvement; if score is below target, check remediation-plan.md and focus on the lowest-scoring dimension
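Step 4 asks you to focus on the lowest-scoring dimension. A minimal sketch of that selection is below. It assumes a hypothetical input of `name score max` lines and ranks by score-to-max ratio; neither the input format nor the helper name `lowest_dimension` comes from the auditor itself, and the numbers in the usage line are made up.

```shell
# Hypothetical helper: reads "name score max" lines on stdin and prints the
# dimension with the lowest score-to-max ratio. The input format is an
# assumption for illustration, not the auditor's real output format.
lowest_dimension() {
  awk '
    NF == 3 && $3 > 0 {
      ratio = $2 / $3
      if (best == "" || ratio < min) { min = ratio; best = $1 }
    }
    END { if (best != "") print best }
  '
}

# Illustrative values only: ratios are 0.75, 0.40 and 0.90, so this prints D5.
printf 'D1 15 20\nD5 6 15\nD7 9 10\n' | lowest_dimension
```

Ranking by ratio rather than raw score avoids always picking the dimension with the smallest maximum.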

Mindset

  • Use scores as directional signals, not absolute truth.
  • Prefer deterministic, reproducible checks over manual review.
  • Use threshold-based evaluation rather than relative comparisons.
  • Keep audit rules strict for safety and consistency; stay flexible elsewhere.

Anti-Patterns (Summary)

  • NEVER skip baseline comparison in recurring audits — WHY: score regressions go undetected without a prior audit.json
  • NEVER ignore Knowledge Delta below 15/20 — WHY: low D1 means the skill adds no value over LLM baseline
  • NEVER apply subjective scoring — WHY: scores drift between evaluators and cannot be automated in CI
  • NEVER create kitchen-sink skills covering unrelated tasks — WHY: broad scope kills D7 and prevents correct triggering
  • NEVER use harness-specific paths in skill content — WHY: absolute paths break when installed in a different repo
  • NEVER list references without "When to Use" conditions — WHY: unconditional loading bloats context and penalises D5

See Detailed Anti-Patterns for the full set of WHY/BAD/GOOD failure modes, including agent name references and D4 heading rules.
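The rule about harness-specific paths can be spot-checked before an audit. The sketch below is a heuristic, not part of skill-auditor: the function name `find_absolute_paths` is hypothetical, and the pattern only catches `/Users/...` and `/home/...` style paths, so it may miss other absolute paths or flag URLs that contain `/home/`.

```shell
# Heuristic sketch: print "line-number:text" for each line of the given file
# that contains an absolute home-directory path. The pattern is an assumption
# and only covers /Users/... and /home/... style paths.
find_absolute_paths() {
  grep -nE '/(Users|home)/[^[:space:]]+' "$1"
}
```

The function returns a non-zero status when nothing matches, so `if find_absolute_paths SKILL.md; then exit 1; fi` could serve as a simple pre-commit check.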

Examples

Remediation workflow:

skill-auditor evaluate documentation/markdown-authoring --json --store
# Score: 98/140 (C+) -> review remediation-plan.md -> fix -> re-audit -> 128/140 (A)
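To compare the before and after scores on one scale, the totals can be converted to percentages of the 140-point maximum. This is plain shell arithmetic; the helper name `score_percent` is hypothetical and it rounds down.

```shell
# Integer percentage of a score against its maximum (rounded down).
score_percent() {
  echo $(( $1 * 100 / $2 ))
}

score_percent 98 140    # prints 70
score_percent 128 140   # prints 91
```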

PR-scoped triage:

skills=$(git diff --name-only origin/main | grep 'skills/.*/SKILL\.md$' | sed 's|skills/||;s|/SKILL\.md$||' | tr '\n' ' ')
# Skip the audit when the PR touches no SKILL.md files
if [ -n "$skills" ]; then skill-auditor batch $skills --fail-below B --store; fi
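The path-to-name step in the pipeline above can also be written as a small function that is easy to test on its own. This sketch assumes `skills/` sits at the repository root; `paths_to_skills` is a hypothetical name.

```shell
# Sketch: read file paths on stdin, keep only skills/<...>/SKILL.md entries,
# and print one <domain>/<skill-name> per line. Other paths are dropped.
paths_to_skills() {
  sed -n -E 's|^skills/(.+)/SKILL\.md$|\1|p'
}

# Prints: documentation/markdown-authoring
printf 'skills/documentation/markdown-authoring/SKILL.md\nREADME.md\n' | paths_to_skills
```

Piping `git diff --name-only origin/main` into this function yields the same list, and an empty result signals that no skills changed.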

Audit all skills:

skill-auditor batch $(find skills -name "SKILL.md" | sed 's|skills/||;s|/SKILL.md||' | tr '\n' ' ')

See Audit Workflow Examples for input/output pairs and CI quality gate examples.

Self-Audit

skill-auditor evaluate agentic-harness/skill-quality-auditor --json
# Expected: A grade, total >= 126/140
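The expected result above can be turned into a pass/fail check once the total is known. The sketch below takes the total as an argument, because the layout of the auditor's JSON output is not described here; `meets_a_grade` is a hypothetical helper.

```shell
# Exit 0 when the total meets the A-grade floor quoted in the self-audit
# comment (126 out of 140), non-zero otherwise. Extracting the total from the
# auditor's JSON is left out because that schema is not documented here.
meets_a_grade() {
  [ "$1" -ge 126 ]
}
```

For example, `meets_a_grade 128` succeeds and `meets_a_grade 98` fails, which maps directly onto a CI step's exit status.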

References

Framework

| Topic | Reference | When to Use |
| --- | --- | --- |
| Per-dimension criteria and bonus rules | Dimensions | Evaluating any dimension or understanding the rubric |
| Score thresholds and grade bands | Scoring Rubric | Calculating a total score or assigning a grade |
| A-grade checklist and red flags | Quality Standards | Targeting A-grade or reviewing blockers |
| Trigger pattern density and keyword analysis | Pattern Recognition | Scoring D7 or improving description keywords |
| Canonical SKILL.md structure and References table standard | SKILL Template | Authoring or refactoring a skill |

Operations

| Topic | Reference | When to Use |
| --- | --- | --- |
| CI gate configuration and batch pass/fail logic | Quality Thresholds | Setting up CI quality gates |
| NEVER/WHY/BAD/GOOD failure modes per dimension | Anti-Patterns | Explaining low scores or writing remediation guidance |
| T-shirt sizing and remediation roadmaps | Remediation Planning | Writing a remediation plan for a C/D-grade skill |
| Deduplication workflow and aggregation guidance | Duplication Detection | Detecting skill overlap or planning aggregations |
| skill-auditor evaluate/batch usage and output formats | Scripts Workflow | Running audits from the command line |
| Registry publication gates and tessl compliance checks | Tessl Compliance | Preparing a skill for public registry submission |
