Audit Claude Code agents, skills, and commands for quality and production readiness. Use when evaluating skill quality, checking production readiness scores, or comparing agents against best-practice templates.
Overall: 67

Quality: 62% (does it follow best practices?)
Impact: Pending (no eval scenarios have been run)
Passed: no known issues
Optimize this skill with Tessl:
`npx tessl skill review --optimize ./examples/skills/audit-agents-skills/SKILL.md`

Quality
Discovery: 89%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
This is a solid description that clearly communicates both what the skill does and when to use it, with a well-defined niche. The trigger terms are natural and relevant. The main weakness is that the specific capabilities could be more granular — listing concrete audit actions (e.g., 'score skills on a rubric, flag missing fields, generate improvement recommendations') would strengthen specificity.
Suggestions

- Add more concrete action verbs describing specific audit operations, e.g., 'score skills against rubrics, flag missing metadata fields, generate improvement recommendations, validate command configurations'.
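As a sketch, the sharpened description might sit in the skill's frontmatter like this; the wording is illustrative, drawn from the suggestion above, and `name` and `description` are the standard SKILL.md fields:

```yaml
---
name: audit-agents-skills
description: >
  Audit Claude Code agents, skills, and commands for quality and production
  readiness: score skills against rubrics, flag missing metadata fields,
  generate improvement recommendations, and validate command configurations.
  Use when evaluating skill quality, checking production readiness scores,
  or comparing agents against best-practice templates.
---
```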
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | Names the domain (Claude Code agents, skills, commands) and some actions (audit, evaluate quality, check production readiness scores, compare against templates), but the actions are somewhat high-level rather than listing multiple concrete operations like 'extract text, fill forms, merge documents'. | 2 / 3 |
| Completeness | Clearly answers both 'what' (audit Claude Code agents, skills, and commands for quality and production readiness) and 'when' (evaluating skill quality, checking production readiness scores, or comparing agents against best-practice templates) with explicit trigger guidance. | 3 / 3 |
| Trigger Term Quality | Includes natural keywords users would say: 'audit', 'skill quality', 'production readiness', 'agents', 'skills', 'commands', 'best-practice templates'. These are terms a user would naturally use when seeking this functionality. | 3 / 3 |
| Distinctiveness / Conflict Risk | Very specific niche: auditing Claude Code agents/skills/commands for production readiness is a distinct domain unlikely to conflict with other skills. The combination of 'audit', 'production readiness scores', and 'best-practice templates' creates a clear, unique identity. | 3 / 3 |
| **Total** | | **11 / 12 Passed** |
Implementation: 35%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
This skill is comprehensive in scope but severely over-engineered for a SKILL.md file. It reads more like a product requirements document or technical specification than actionable guidance for Claude. The core audit workflow is buried under layers of methodology justification, industry statistics, CI/CD integration patterns, and maintenance instructions that should either be in separate files or omitted entirely.
Suggestions

- Cut the content by 60-70%: remove the 'Industry Context', 'Scoring Philosophy' rationale, 'Comparison: Command vs Skill', 'Maintenance', and 'Changelog' sections entirely; they don't help Claude execute the audit.
- Move detection patterns, full JSON schema examples, and CI/CD integration into separate referenced files (e.g., DETECTION.md, SCHEMA.md, CI-CD.md) to keep SKILL.md as a concise overview; see the skeleton after this list.
- Add explicit validation checkpoints to the workflow, e.g., 'Verify scoring/criteria.yaml exists before proceeding' and 'If no files are found in discovery, report an empty scan and stop'.
- Clarify the actual execution mechanism: does Claude read criteria.yaml at runtime? Does it use the Python snippets directly? The skill should make the tooling requirements unambiguous; see the loader sketch after this list.
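For illustration, a trimmed SKILL.md body along these lines would address the first three suggestions; the file names under references/ are hypothetical, not part of the audited skill:

```markdown
# Audit Agents & Skills

Score Claude Code agents, skills, and commands against scoring/criteria.yaml.

## Workflow
1. Discovery: scan for skill, agent, and command files.
   Checkpoint: if no files are found, report an empty scan and stop.
2. Scoring: verify scoring/criteria.yaml exists, then score each file against it.
3. Comparative analysis against best-practice templates.
4. Report generation and fix suggestions.

## References
- references/DETECTION.md: detection patterns
- references/SCHEMA.md: JSON output schema
- references/CI-CD.md: CI/CD integration
```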
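And to pin down the execution mechanism, a minimal loader sketch that reads the rubric at runtime, assuming criteria.yaml maps dimension names to entries with a max_score field (the real schema isn't shown in this report):

```python
from pathlib import Path
import sys

import yaml  # PyYAML; an assumed dependency of the skill's tooling

CRITERIA = Path("scoring/criteria.yaml")

def load_criteria() -> dict:
    # Validation checkpoint: stop before scoring if the rubric is missing.
    if not CRITERIA.exists():
        sys.exit(f"Missing {CRITERIA}: cannot score without the rubric.")
    return yaml.safe_load(CRITERIA.read_text())

def max_total(criteria: dict) -> int:
    # Assumed shape: {dimension: {"max_score": int, ...}, ...}
    return sum(entry["max_score"] for entry in criteria.values())
```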
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | Extremely verbose at 549 lines. Includes extensive industry context (LangChain report statistics), methodology justifications, comparison tables, CI/CD integration examples, maintenance instructions, and a changelog that Claude doesn't need. The 'Why These Criteria?' and 'Industry Context' sections alone are unnecessary padding. Much of this reads like a product specification rather than an actionable skill. | 1 / 3 |
| Actionability | Contains concrete code snippets (Python detection patterns, YAML schemas, JSON output examples), which are helpful, but the actual audit execution relies on external files (scoring/criteria.yaml) and unclear tooling. The 'Usage' section shows invocation patterns, but the core workflow depends on infrastructure that isn't fully specified; it's unclear how Claude should actually perform the scoring without the criteria file. | 2 / 3 |
| Workflow Clarity | The 5-phase workflow (Discovery → Scoring → Comparative → Report → Fix Suggestions) is clearly sequenced and well labeled. However, there are no validation checkpoints between phases: no 'verify files were found before scoring' or 'confirm criteria loaded correctly' steps. For a multi-step process involving file scanning and scoring, the lack of error recovery or validation gates is a gap. | 2 / 3 |
| Progressive Disclosure | References external files (scoring/criteria.yaml, command version, guide lines), which is good progressive disclosure in principle. However, the skill itself is monolithic: massive amounts of detail (detection patterns, full JSON output examples, CI/CD integration, industry context) are inlined rather than split into referenced files. Content that belongs in separate reference docs is all crammed into one enormous file. | 2 / 3 |
| **Total** | | **7 / 12 Passed** |
Validation: 72%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Skill structure validation: 8 / 11 passed
| Criteria | Description | Result |
|---|---|---|
| skill_md_line_count | SKILL.md is long (549 lines); consider splitting into references/ and linking | Warning |
| allowed_tools_field | 'allowed-tools' contains unusual tool name(s) | Warning |
| frontmatter_unknown_keys | Unknown frontmatter key(s) found; consider removing or moving to metadata | Warning |
| **Total** | 8 / 11 Passed | |
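The last two warnings could be cleared with frontmatter along these lines; a sketch assuming the standard Claude Code tool names and the metadata key the validator suggests (the audited skill's actual keys aren't shown in this report):

```yaml
---
name: audit-agents-skills
description: Audit Claude Code agents, skills, and commands ...
allowed-tools: Read, Grep, Glob, Bash  # standard tool names only
metadata:
  version: 1.0.0  # illustrative: formerly an unknown top-level key
---
```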