uinaf/skill-audit

Audit existing skills with Tessl scoring, metadata and trigger-coverage checks, repo conventions, and skill-authoring best practices. Use when creating or revising a skill, triaging weak self-activation, or comparing a skill against source-repo guidance such as `AGENTS.md`, `CLAUDE.md`, or repo rules, plus external skill guidance. Do not use to verify general application code or to rewrite unrelated docs.

97 (1.05x)

Quality: 98% (Does it follow best practices?)

Impact: 97%, 1.05x (average score across 3 eval scenarios)

Security (by Snyk): Advisory. Suggest reviewing before use.

evals/scenario-1/criteria.json

{
  "context": "Tests whether the agent audits discovery quality (name specificity, description completeness), flags missing overlap boundaries, identifies misplaced content that should be handed off to a different skill, uses Tessl per skill, writes descriptions in third person with what/when/boundary, and avoids calling the optimizer without approval.",
  "type": "weighted_checklist",
  "checklist": [
    {
      "name": "Tessl run per skill",
      "description": "audit-log.sh contains a separate Tessl invocation for each of the four skills (four distinct Tessl commands targeting different skill paths)",
      "max_score": 8
    },
    {
      "name": "Score per skill reported",
      "description": "audit-report.md states a Tessl score for each of the four skills individually",
      "max_score": 7
    },
    {
      "name": "Vague names flagged",
      "description": "audit-report.md identifies at least two of the four skill names (notifier, alerter, comms-router, data-pipeline) as vague, generic, or forgettable",
      "max_score": 8
    },
    {
      "name": "Missing boundaries flagged",
      "description": "audit-report.md flags that the overlapping skills (notifier, alerter, comms-router) lack explicit boundary statements distinguishing them from each other",
      "max_score": 9
    },
    {
      "name": "Third-person descriptions proposed",
      "description": "All four proposed replacement descriptions in rewrite-suggestions.md are written in third person (no first-person pronouns)",
      "max_score": 8
    },
    {
      "name": "What and when in descriptions",
      "description": "Each proposed replacement description in rewrite-suggestions.md states both what the skill does and when to use it",
      "max_score": 9
    },
    {
      "name": "Overlap boundary in descriptions",
      "description": "At least two of the proposed replacement descriptions in rewrite-suggestions.md explicitly state when NOT to use that skill (boundary with overlapping skills)",
      "max_score": 8
    },
    {
      "name": "Misplaced code review content flagged",
      "description": "audit-report.md identifies the Code Review Guidance section in data-pipeline/SKILL.md as out of scope and recommends it be moved or handed off",
      "max_score": 9
    },
    {
      "name": "Correct handoff skill named",
      "description": "audit-report.md names `review` (or a `review` skill) as the appropriate destination for the code review guidance, rather than leaving it in data-pipeline",
      "max_score": 9
    },
    {
      "name": "Evidence over taste",
      "description": "audit-report.md grounds each finding in Tessl output, repo conventions, or the actual file content — not subjective preference alone",
      "max_score": 8
    },
    {
      "name": "Optimizer not invoked",
      "description": "audit-log.sh does NOT contain `--optimize`",
      "max_score": 8
    },
    {
      "name": "Scope respected",
      "description": "The agent does NOT rewrite the skill body or workflow sections — only metadata (name/description) improvements are proposed, and structural body changes are flagged as separate findings",
      "max_score": 9
    }
  ]
}
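
As a rough illustration of how a `weighted_checklist` like the one above might be totaled, the sketch below sums per-item scores against the `max_score` values. The file does not say how the grader actually combines items, so the scoring rule here (each item earns between 0 and its `max_score`, missing items earn 0, and the result is a percentage of the maximum total) is an assumption, and `weighted_percentage` is a hypothetical helper name.

```python
from typing import Mapping

# Item names and max_score values copied from the checklist above.
MAX_SCORES = {
    "Tessl run per skill": 8,
    "Score per skill reported": 7,
    "Vague names flagged": 8,
    "Missing boundaries flagged": 9,
    "Third-person descriptions proposed": 8,
    "What and when in descriptions": 9,
    "Overlap boundary in descriptions": 8,
    "Misplaced code review content flagged": 9,
    "Correct handoff skill named": 9,
    "Evidence over taste": 8,
    "Optimizer not invoked": 8,
    "Scope respected": 9,
}


def weighted_percentage(awarded: Mapping[str, float]) -> float:
    """Return the awarded total as a percentage of the maximum total.

    Assumption (not stated in criteria.json): each item earns a score
    between 0 and its max_score, and items absent from `awarded` earn 0.
    """
    total_max = sum(MAX_SCORES.values())
    total = 0.0
    for name, max_score in MAX_SCORES.items():
        score = awarded.get(name, 0)
        if not 0 <= score <= max_score:
            raise ValueError(f"{name}: score {score} outside 0..{max_score}")
        total += score
    return 100.0 * total / total_max


# Hypothetical run: full marks everywhere except the optimizer check.
awarded = dict(MAX_SCORES)
awarded["Optimizer not invoked"] = 0
print(weighted_percentage(awarded))  # 92.0
```

The twelve `max_score` values sum to 100, so under this assumed rule each point maps directly to one percentage point.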
