Audit Claude Code agents, skills, and commands for quality and production readiness. Use when evaluating skill quality, checking production readiness scores, or comparing agents against best-practice templates.
47
51%
Does it follow best practices?
Impact
—
No eval scenarios have been run
Passed
No known issues
Optimize this skill with Tessl
npx tessl skill review --optimize ./examples/skills/audit-agents-skills/SKILL.md
Quality
Discovery
75%
Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
This is a solid description that clearly communicates both what the skill does and when to use it, with an explicit 'Use when...' clause. Its main weakness is that the capability descriptions are somewhat high-level and could benefit from more concrete action verbs, and the trigger terms could include more natural user phrasings. Overall it performs well for skill selection purposes.
Suggestions
Add more concrete actions to improve specificity, e.g., 'Scores skill descriptions, validates YAML frontmatter, checks for missing fields, and generates improvement recommendations.'
Expand trigger terms with natural user phrasings like 'review my skills', 'SKILL.md quality', 'lint agent config', or 'is my skill ready for production'.
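The two suggestions above can be checked mechanically. The following is a minimal sketch, not part of the Tessl tooling: the helper names `missing_triggers` and `has_use_when_clause` are hypothetical, and plain substring matching is an assumed simplification of whatever matching a real agent performs.

```python
# The description under review (copied from the top of this page) and the
# phrasings the review suggests adding. Helper names are hypothetical.
DESCRIPTION = (
    "Audit Claude Code agents, skills, and commands for quality and "
    "production readiness. Use when evaluating skill quality, checking "
    "production readiness scores, or comparing agents against "
    "best-practice templates."
)

SUGGESTED_TRIGGERS = [
    "review my skills",
    "SKILL.md quality",
    "lint agent config",
    "is my skill ready for production",
]


def missing_triggers(description: str, triggers: list[str]) -> list[str]:
    """Return the trigger phrases that do not occur in the description."""
    lowered = description.lower()
    return [t for t in triggers if t.lower() not in lowered]


def has_use_when_clause(description: str) -> bool:
    """True if the description contains an explicit 'Use when' clause."""
    return "use when" in description.lower()
```

Run against the current description, `has_use_when_clause` succeeds while all four suggested phrasings are reported as missing, which matches the review's finding.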
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | Names the domain (Claude Code agents, skills, commands) and some actions (audit, evaluate quality, check production readiness scores, compare against templates), but the actions are somewhat high-level rather than listing multiple concrete operations like 'lint skill files, validate YAML frontmatter, score description quality'. | 2 / 3 |
| Completeness | Clearly answers both 'what' (audit Claude Code agents, skills, and commands for quality and production readiness) and 'when' (Use when evaluating skill quality, checking production readiness scores, or comparing agents against best-practice templates) with an explicit 'Use when...' clause. | 3 / 3 |
| Trigger Term Quality | Includes relevant terms like 'audit', 'skill quality', 'production readiness scores', 'agents', 'commands', and 'best-practice templates'. However, it misses common variations users might say such as 'review my skills', 'SKILL.md', 'lint', 'validate', or 'check my agent setup'. | 2 / 3 |
| Distinctiveness Conflict Risk | The description targets a very specific niche, auditing Claude Code agents and skills against best-practice templates, which is unlikely to conflict with other skills. The combination of 'audit', 'production readiness scores', and 'best-practice templates' creates a distinct trigger profile. | 3 / 3 |
| Total | | 10 / 12 Passed |
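As a quick check of the arithmetic, the Total row is the sum of the four dimension scores. The sketch below only reproduces that sum; the function name `tally` is hypothetical, and the page does not state how the 75% headline figure is derived, so no attempt is made to reproduce it.

```python
# Score cells copied from the Discovery table above.
DISCOVERY_SCORES = {
    "Specificity": "2 / 3",
    "Completeness": "3 / 3",
    "Trigger Term Quality": "2 / 3",
    "Distinctiveness Conflict Risk": "3 / 3",
}


def tally(scores: dict[str, str]) -> tuple[int, int]:
    """Sum 'earned / possible' cells into (earned_total, possible_total)."""
    earned = possible = 0
    for cell in scores.values():
        got, out_of = (int(part) for part in cell.split("/"))
        earned += got
        possible += out_of
    return earned, possible
```

Here `tally(DISCOVERY_SCORES)` gives `(10, 12)`. Since 10 / 12 is roughly 83%, the 75% headline evidently uses a weighting that the page does not spell out.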
Implementation
27%
Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
This skill is extremely verbose and tries to be a comprehensive reference document rather than an actionable instruction set. It includes substantial padding (industry context, methodology justifications, changelog, comparison tables) that doesn't help Claude execute the audit task. The core workflow is present but buried under excessive documentation, and critical external dependencies (scoring/criteria.yaml) are referenced but not provided.
Suggestions
Cut the content by 60-70%: remove the Industry Context section, Scoring Philosophy rationale, Comparison table, Changelog, and CI/CD integration examples, none of which help Claude execute the audit.
Inline the actual scoring criteria rather than referencing an external scoring/criteria.yaml that isn't provided in the bundle, or provide that file as a bundle asset.
Move detection patterns, output examples, and CI/CD integration into separate referenced files (e.g., DETECTION.md, EXAMPLES.md, CI-CD.md) to improve progressive disclosure.
Add explicit validation checkpoints in the workflow: e.g., verify criteria.yaml loaded successfully, confirm file classification before scoring, validate JSON output schema before writing.
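The validation checkpoints named in the last suggestion might look roughly like the following. This is a sketch under stated assumptions: the report keys in `REQUIRED_REPORT_KEYS`, the `CheckpointError` class, and all function names are hypothetical, since the skill's real schema and criteria file are not shown in this review. The criteria file is only checked for presence, not parsed.

```python
import json
from pathlib import Path

# Hypothetical report keys; the real output schema is not shown in this review.
REQUIRED_REPORT_KEYS = {"file", "type", "score"}
# The three kinds of file the skill description says it audits.
KNOWN_TYPES = {"agent", "skill", "command"}


class CheckpointError(Exception):
    """Raised when a validation checkpoint fails."""


def check_criteria_file(path: Path) -> None:
    """Checkpoint 1: the criteria file must exist and be non-empty."""
    if not path.is_file() or path.stat().st_size == 0:
        raise CheckpointError(f"criteria file missing or empty: {path}")


def check_classification(file_type: str) -> None:
    """Checkpoint 2: the audited file must be classified as a known type."""
    if file_type not in KNOWN_TYPES:
        raise CheckpointError(f"unknown file type: {file_type!r}")


def check_report(report_json: str) -> dict:
    """Checkpoint 3: the report must be a JSON object with the required keys."""
    try:
        report = json.loads(report_json)
    except json.JSONDecodeError as exc:
        raise CheckpointError(f"report is not valid JSON: {exc}") from exc
    if not isinstance(report, dict):
        raise CheckpointError("report must be a JSON object")
    missing = REQUIRED_REPORT_KEYS - report.keys()
    if missing:
        raise CheckpointError(f"report missing keys: {sorted(missing)}")
    return report
```

Raising a dedicated exception at each checkpoint lets the workflow stop before scoring or writing, rather than producing a report from incomplete inputs.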
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | Extremely verbose at ~500+ lines. Explains concepts Claude already knows (what PDF libraries do, what Jaccard similarity is, what CI/CD is). Includes extensive industry context, changelog, comparison tables, and methodology justifications that don't help Claude execute the task. The 'Industry Context' section alone is pure padding. | 1 / 3 |
| Actionability | Provides some concrete code snippets (frontmatter parsing, keyword detection, overlap detection) and JSON output schemas, but much of the guidance is descriptive rather than executable. The actual audit workflow relies on external files (scoring/criteria.yaml) that aren't provided, and the 'Usage' section shows invocation patterns rather than implementation steps Claude can directly follow. | 2 / 3 |
| Workflow Clarity | The 5-phase workflow (Discovery → Scoring → Comparative → Report → Fix Suggestions) is clearly sequenced, but lacks explicit validation checkpoints between phases. There's no error recovery guidance (e.g., what if criteria.yaml is missing, what if files can't be parsed). For a destructive-adjacent operation (generating reports that gate production), the absence of validation steps is notable. | 2 / 3 |
| Progressive Disclosure | Monolithic wall of text with everything inline. References external files (scoring/criteria.yaml, examples/agents/, etc.) that aren't provided in the bundle. The skill dumps full JSON schemas, Python code, CI/CD configs, industry reports, and comparison tables all in one file when much of this should be in separate referenced documents. | 1 / 3 |
| Total | | 6 / 12 Passed |
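The review mentions overlap detection and Jaccard similarity in the skill without showing either. For readers unfamiliar with the measure, here is a minimal sketch of Jaccard similarity over description tokens. It is not the skill's own code; the function names and the tokenisation rule are assumptions.

```python
import re


def tokens(text: str) -> set[str]:
    """Lowercase alphanumeric word tokens of a description."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a: str, b: str) -> float:
    """Jaccard similarity: size of the intersection over size of the union."""
    ta, tb = tokens(a), tokens(b)
    if not ta and not tb:
        return 0.0  # two empty descriptions: define the overlap as zero
    return len(ta & tb) / len(ta | tb)
```

For example, `jaccard("audit skills", "audit agents")` shares one token out of three distinct ones, giving one third. A high value between two skill descriptions would suggest they compete for the same triggers.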
Validation
81%
Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation — 9 / 11 Passed
Validation for skill structure
| Criteria | Description | Result |
|---|---|---|
| skill_md_line_count | SKILL.md is long (548 lines); consider splitting into references/ and linking | Warning |
| frontmatter_unknown_keys | Unknown frontmatter key(s) found; consider removing or moving to metadata | Warning |
| Total | 9 / 11 Passed | |
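The two warnings in the table above could be reproduced with checks along these lines. This is a sketch only: the threshold `MAX_LINES` and the set `ALLOWED_KEYS` are assumptions for illustration, because the review states neither the line-count limit nor the list of recognised frontmatter keys. The frontmatter scan is line-based, where a real implementation would more likely use a YAML parser.

```python
# Assumptions for illustration only: the review does not state the
# line-count threshold or the list of recognised frontmatter keys.
MAX_LINES = 500
ALLOWED_KEYS = {"name", "description", "metadata"}


def frontmatter_keys(text: str) -> list[str]:
    """Top-level keys in a leading '---' delimited frontmatter block."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return []
    keys = []
    for line in lines[1:]:
        if line.strip() == "---":
            return keys
        # Top-level keys start in column 0, are not comments, and contain a colon.
        if line and not line[0].isspace() and not line.startswith("#") and ":" in line:
            keys.append(line.split(":", 1)[0].strip())
    return []  # no closing delimiter: treat as no frontmatter


def validate(text: str) -> list[str]:
    """Return warning names in the style of the table above."""
    warnings = []
    if len(text.splitlines()) > MAX_LINES:
        warnings.append("skill_md_line_count")
    if set(frontmatter_keys(text)) - ALLOWED_KEYS:
        warnings.append("frontmatter_unknown_keys")
    return warnings
```

A file with an unrecognised key such as `version:` and more than `MAX_LINES` lines would yield both warnings, while a short file using only the allowed keys would yield none.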
60a4372