skill-judge

tessl install https://github.com/softaworks/agent-toolkit --skill skill-judge

github.com/softaworks/agent-toolkit

Evaluate Agent Skill design quality against official specifications and best practices. Use when reviewing, auditing, or improving SKILL.md files and skill packages. Provides multi-dimensional scoring and actionable improvement suggestions.

Average Score: 86%

Content: 85%

Description: N/A

Generated about 4 hours ago

Criteria	Finding
skill_md_line_count	SKILL.md is long (753 lines); consider splitting into references/ and linking
frontmatter_valid	YAML frontmatter is valid
name_field	'name' field is valid: 'skill-judge'
description_field	'description' field is valid (240 chars)
description_voice	'description' uses third person voice
description_trigger_hint	Description includes an explicit trigger hint
compatibility_field	'compatibility' field not present (optional)
allowed_tools_field	'allowed-tools' field not present (optional)
metadata_version	'metadata' field is not a dictionary
metadata_field	'metadata' field not present (optional)
license_field	'license' field is missing
frontmatter_unknown_keys	No unknown frontmatter keys found
body_present	SKILL.md body is present
body_examples	Examples detected (code fence or 'Example' wording)
body_output_format	Output/return/format terms detected
body_steps	Step-by-step structure detected (ordered list)
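
The checks listed above can be approximated locally before submitting a skill for review. The following is a minimal sketch, not the registry's actual validator: the function names, the naive `key: value` frontmatter parser, the `name` pattern, and the `max_lines` threshold of 500 are all assumptions made for illustration, and the sample document is hypothetical.

```python
import re

# Hypothetical SKILL.md content, loosely modelled on the findings above.
SAMPLE = """---
name: skill-judge
description: Evaluate Agent Skill design quality against official specifications.
---
# Body
1. First step
"""


def split_frontmatter(text):
    """Return (fields, body) using a naive top-level 'key: value' parser.

    This is NOT a real YAML parser; nested values are ignored.
    """
    match = re.match(r"^---\n(.*?)\n---\n(.*)$", text, re.DOTALL)
    if not match:
        return {}, text
    fields = {}
    for line in match.group(1).splitlines():
        # Only consider unindented lines that contain a key separator.
        if ":" in line and not line.startswith((" ", "\t")):
            key, _, value = line.partition(":")
            fields[key.strip()] = value.strip()
    return fields, match.group(2)


def check_skill(text, max_lines=500):
    """Run a few illustrative checks; max_lines is an assumed threshold."""
    fields, body = split_frontmatter(text)
    name = fields.get("name", "")
    return {
        # Assumed naming rule: lowercase words separated by single hyphens.
        "name_field": bool(re.fullmatch(r"[a-z0-9]+(-[a-z0-9]+)*", name)),
        "description_field": bool(fields.get("description")),
        "license_field": "license" in fields,
        "body_present": bool(body.strip()),
        # An ordered list item such as "1. ..." counts as step structure.
        "body_steps": bool(re.search(r"^\s*\d+\.\s", body, re.MULTILINE)),
        "skill_md_line_count": len(text.splitlines()) <= max_lines,
    }


if __name__ == "__main__":
    for criterion, passed in check_skill(SAMPLE).items():
        print(f"{criterion}: {'ok' if passed else 'needs attention'}")
```

Run against the sample, the sketch flags only the missing `license` field, which mirrors the one hard failure in the list above. A real check would use a proper YAML parser so that nested fields such as `metadata` can be type-checked.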

Dimension	Assessment	Score
conciseness	The skill contains valuable expert knowledge but is verbose in places, with some redundant explanations (e.g., explaining what Skills are conceptually, the 'Traditional vs Skill' comparison). The core evaluation criteria are useful but could be more condensed.	2/3
actionability	Provides highly concrete guidance: specific scoring rubrics with point values, explicit evaluation protocol steps, report template format, and detailed checklists. The 8 dimensions have clear scoring criteria with examples.	3/3
workflow_clarity	Clear 5-step evaluation protocol with explicit sequence: First Pass (Knowledge Delta Scan) → Structure Analysis → Score Each Dimension → Calculate Total → Generate Report. Includes validation through the checklist and grade scale.	3/3
progressive_disclosure	Well-organized with clear sections: Core Philosophy → 8 Evaluation Dimensions → NEVER list → Evaluation Protocol → Common Failure Patterns → Quick Reference. Self-contained with no external references needed, appropriate for the skill's scope.	3/3
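
The four dimension scores above can be tallied directly. The sketch below uses a plain unweighted sum, which is an assumption: the page does not say how its 85% Content figure is derived, and an unweighted sum gives a different number, so the site presumably applies a weighting that is not shown here.

```python
# Scores copied from the dimension table above: (awarded, maximum).
DIMENSION_SCORES = {
    "conciseness": (2, 3),
    "actionability": (3, 3),
    "workflow_clarity": (3, 3),
    "progressive_disclosure": (3, 3),
}


def unweighted_total(scores):
    """Sum awarded and maximum points, treating every dimension equally."""
    awarded = sum(a for a, _ in scores.values())
    maximum = sum(m for _, m in scores.values())
    return awarded, maximum, awarded / maximum


awarded, maximum, ratio = unweighted_total(DIMENSION_SCORES)
print(f"{awarded}/{maximum} = {ratio:.1%}")  # prints "11/12 = 91.7%"
```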

Suggestions

Condense or remove the 'Core Philosophy' section (~lines 1-80): Claude already understands the concept of skills, so focus on the evaluation criteria directly.

Merge the 'Traditional vs Skill' cost comparison and the 'Tool vs Skill' table into a brief 2-3 line summary, as these explain concepts rather than guide evaluation.

Overall Assessment

This is a strong, actionable skill that provides genuine expert knowledge about evaluating Agent Skills. The evaluation framework with 8 dimensions, specific scoring criteria, and concrete examples is highly valuable. However, the philosophical preamble about 'what is a Skill' and the paradigm shift explanation adds ~200 tokens that Claude likely doesn't need, reducing token efficiency.
