tessl install https://github.com/softaworks/agent-toolkit --skill skill-judge
github.com/softaworks/agent-toolkit
Evaluate Agent Skill design quality against official specifications and best practices. Use when reviewing, auditing, or improving SKILL.md files and skill packages. Provides multi-dimensional scoring and actionable improvement suggestions.
Average Score
86%
Content
85%
Description
N/A
Generated
Validations
Total score
13/16

| Criteria | Result | Score |
|---|---|---|
| skill_md_line_count | SKILL.md is long (753 lines); consider splitting into references/ and linking | |
| frontmatter_valid | YAML frontmatter is valid | |
| name_field | 'name' field is valid: 'skill-judge' | |
| description_field | 'description' field is valid (240 chars) | |
| description_voice | 'description' uses third person voice | |
| description_trigger_hint | Description includes an explicit trigger hint | |
| compatibility_field | 'compatibility' field not present (optional) | |
| allowed_tools_field | 'allowed-tools' field not present (optional) | |
| metadata_version | 'metadata' field is not a dictionary | |
| metadata_field | 'metadata' field not present (optional) | |
| license_field | 'license' field is missing | |
| frontmatter_unknown_keys | No unknown frontmatter keys found | |
| body_present | SKILL.md body is present | |
| body_examples | Examples detected (code fence or 'Example' wording) | |
| body_output_format | Output/return/format terms detected | |
| body_steps | Step-by-step structure detected (ordered list) | |
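
Checks of this kind are mechanical, so a few of them can be sketched in code. The following is a minimal illustration only, not the validator's actual implementation: the function names `split_frontmatter` and `validate_skill`, the 500-line threshold, the 1024-character description limit, and the name pattern are all assumptions made for the example, and the frontmatter parser is a naive `key: value` reader rather than a real YAML parser.

```python
import re

# All three limits below are assumptions for illustration; the report only
# tells us that 753 lines counts as "long" and that 240 chars is valid.
MAX_SKILL_MD_LINES = 500
MAX_DESCRIPTION_CHARS = 1024
NAME_PATTERN = re.compile(r"^[a-z0-9]+(-[a-z0-9]+)*$")


def split_frontmatter(text):
    """Return (fields, body) using a naive 'key: value' frontmatter reader."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text
    fields = {}
    for index, line in enumerate(lines[1:], start=1):
        if line.strip() == "---":
            # Everything after the closing delimiter is the body.
            return fields, "\n".join(lines[index + 1:])
        key, sep, value = line.partition(":")
        if sep and key.strip():
            fields[key.strip()] = value.strip()
    return {}, text  # no closing delimiter: treat as having no frontmatter


def validate_skill(text):
    """Map a few check names to True (pass) or False (flagged)."""
    fields, body = split_frontmatter(text)
    description = fields.get("description", "")
    return {
        "skill_md_line_count": len(text.splitlines()) <= MAX_SKILL_MD_LINES,
        "name_field": bool(NAME_PATTERN.match(fields.get("name", ""))),
        "description_field": 0 < len(description) <= MAX_DESCRIPTION_CHARS,
        "license_field": "license" in fields,
        "body_present": bool(body.strip()),
    }


# Hypothetical, shortened SKILL.md used only to exercise the checks.
sample = (
    "---\n"
    "name: skill-judge\n"
    "description: Evaluate Agent Skill design quality.\n"
    "---\n"
    "# Skill Judge\n"
)
results = validate_skill(sample)
```

For this sample, every check passes except `license_field`, which is flagged because the frontmatter has no `license` key.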
Content
Total score
11/12

| Dimension | Assessment | Score |
|---|---|---|
| conciseness | The skill contains valuable expert knowledge but is verbose in places, with some redundant explanations (e.g., explaining what Skills are conceptually, the 'Traditional vs Skill' comparison). The core evaluation criteria are useful but could be more condensed. | 2/3 |
| actionability | Provides highly concrete guidance: specific scoring rubrics with point values, explicit evaluation protocol steps, report template format, and detailed checklists. The 8 dimensions have clear scoring criteria with examples. | 3/3 |
| workflow_clarity | Clear 5-step evaluation protocol with explicit sequence: First Pass (Knowledge Delta Scan) → Structure Analysis → Score Each Dimension → Calculate Total → Generate Report. Includes validation through the checklist and grade scale. | 3/3 |
| progressive_disclosure | Well-organized with clear sections: Core Philosophy → 8 Evaluation Dimensions → NEVER list → Evaluation Protocol → Common Failure Patterns → Quick Reference. Self-contained with no external references needed, appropriate for the skill's scope. | 3/3 |
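
The 11/12 total is consistent with summing the four per-dimension scores. A minimal sketch of that arithmetic follows; the variable names are illustrative, and the report does not say how this total maps to the percentage shown in the summary.

```python
# (dimension, score, maximum) as listed in the table above
dimension_scores = [
    ("conciseness", 2, 3),
    ("actionability", 3, 3),
    ("workflow_clarity", 3, 3),
    ("progressive_disclosure", 3, 3),
]

total = sum(score for _, score, _ in dimension_scores)
maximum = sum(limit for _, _, limit in dimension_scores)
print(f"{total}/{maximum}")  # prints 11/12
```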
Suggestions
Condense or remove the 'Core Philosophy' section (~lines 1-80) - Claude understands the concept of skills; focus on the evaluation criteria directly
Merge the 'Traditional vs Skill' cost comparison and 'Tool vs Skill' table into a brief 2-3 line summary, as these explain concepts rather than guide evaluation
Overall Assessment
This is a strong, actionable skill that provides genuine expert knowledge about evaluating Agent Skills. The evaluation framework, with 8 dimensions, specific scoring criteria, and concrete examples, is highly valuable. However, the philosophical preamble about 'what is a Skill' and the paradigm-shift explanation add ~200 tokens that Claude likely doesn't need, which reduces token efficiency.
Description
Total score
N/A
Something went wrong