Audit Claude Code agents, skills, and commands for quality and production readiness. Use when evaluating skill quality, checking production readiness scores, or comparing agents against best-practice templates.
- Score: 61 (55%)
- Does it follow best practices? Passed (no known issues)
- Impact: Pending (no eval scenarios have been run)
Optimize this skill with Tessl:
`npx tessl skill review --optimize ./examples/skills/audit-agents-skills/SKILL.md`

Quality
Discovery: 75%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
The description is well-structured with a clear 'what' and explicit 'when' clause, making it complete and distinctive. Its main weakness is that the actions described are somewhat high-level ('audit', 'evaluate', 'check') rather than enumerating specific concrete operations, and the trigger terms could include more natural user phrasings.
Suggestions
- Add more specific concrete actions, e.g., 'Scores skill descriptions, validates YAML frontmatter, checks for missing fields, and generates improvement recommendations'.
- Expand trigger terms with natural user phrasings like 'review my skill', 'validate SKILL.md', 'is my agent ready for production', or 'lint my commands'.
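One way to act on the trigger-term suggestion is to check a description against a list of expected user phrasings. A minimal sketch, assuming a simple word-overlap heuristic; the phrasing list and the 0.5 threshold are illustrative assumptions, not part of the skill:

```python
import re

def _words(text: str) -> set[str]:
    """Lowercase word set, ignoring punctuation (naive tokenization)."""
    return set(re.findall(r"[a-z]+", text.lower()))

def trigger_coverage(description: str, phrasings: list[str]) -> dict[str, bool]:
    """Mark each candidate user phrasing as covered if most of its words
    appear in the skill description."""
    desc_words = _words(description)
    coverage = {}
    for phrase in phrasings:
        words = _words(phrase)
        overlap = len(words & desc_words) / len(words) if words else 0.0
        coverage[phrase] = overlap >= 0.5  # threshold is an assumption
    return coverage

description = (
    "Audit Claude Code agents, skills, and commands for quality "
    "and production readiness."
)
phrasings = [
    "review my skill",
    "is my agent ready for production",
    "check production readiness",
]
print(trigger_coverage(description, phrasings))
```

Consistent with the critique above, phrasings such as 'review my skill' score as uncovered against the current description, while phrasings that reuse its literal terms score as covered.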
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | Names the domain (Claude Code agents, skills, commands) and some actions (audit, evaluate quality, check production readiness scores, compare against templates), but the actions are somewhat high-level rather than listing multiple concrete operations like 'lint skill files, validate YAML frontmatter, score description quality'. | 2 / 3 |
| Completeness | Clearly answers both 'what' (audit Claude Code agents, skills, and commands for quality and production readiness) and 'when' (Use when evaluating skill quality, checking production readiness scores, or comparing agents against best-practice templates) with an explicit 'Use when...' clause. | 3 / 3 |
| Trigger Term Quality | Includes relevant terms like 'audit', 'skill quality', 'production readiness scores', 'agents', 'commands', and 'best-practice templates', but misses common variations users might say such as 'review my skill', 'is my agent ready', 'skill lint', 'validate skill', or 'SKILL.md'. | 2 / 3 |
| Distinctiveness / Conflict Risk | The niche of auditing Claude Code agents/skills/commands for production readiness is quite specific and unlikely to conflict with other skills. Terms like 'production readiness scores' and 'best-practice templates' create a distinct identity. | 3 / 3 |
| Total | | 10 / 12 (Passed) |
Implementation: 35%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
This skill is extremely comprehensive but suffers from severe verbosity: it reads more like a design document or RFC than an actionable skill for Claude. The industry context, methodology justifications, and extensive inline examples bloat the content far beyond what is needed for execution. The core workflow is reasonable but buried under unnecessary explanation, and a key external dependency (scoring/criteria.yaml) may not exist.
Suggestions
- Reduce content by 60-70%: Remove the Industry Context section entirely, trim Methodology rationale to 2-3 lines, and move Detection Patterns, CI/CD Integration, and Output Examples to separate referenced files.
- Make the workflow self-contained: Either inline the essential scoring criteria directly (instead of referencing scoring/criteria.yaml) or provide clear fallback behavior if the file doesn't exist.
- Add validation checkpoints: Include explicit error handling for missing directories, malformed frontmatter, and missing criteria files in the workflow phases.
- Remove explanations of basic concepts: Drop Python code comments explaining what Jaccard similarity is, how YAML parsing works, and what token estimation does - Claude already knows these.
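The fallback suggested above could be sketched as follows. The default criteria, weights, and file path here are assumptions for illustration, not the skill's actual scoring rules:

```python
import os

# Hypothetical inline defaults, used when the external criteria file is
# absent. Dimension names mirror the report; the max scores are assumed.
DEFAULT_CRITERIA = {
    "conciseness": {"max": 3},
    "actionability": {"max": 3},
    "workflow_clarity": {"max": 3},
    "progressive_disclosure": {"max": 3},
}

def load_criteria(path: str = "scoring/criteria.yaml") -> dict:
    """Load scoring criteria, falling back to inline defaults if the
    file is missing or malformed."""
    if not os.path.exists(path):
        return DEFAULT_CRITERIA
    import yaml  # PyYAML; only needed when the file actually exists
    with open(path) as f:
        data = yaml.safe_load(f)
    return data if isinstance(data, dict) else DEFAULT_CRITERIA
```

With a fallback like this, the workflow degrades gracefully instead of failing when scoring/criteria.yaml is absent.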
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | Extremely verbose at ~400+ lines. Includes extensive industry context (LangChain report statistics), methodology justifications, comparison tables, CI/CD integration examples, maintenance instructions, and changelog that are largely unnecessary for Claude to execute the skill. Explains concepts Claude already knows (what Jaccard similarity is, how YAML parsing works, what pre-commit hooks are). | 1 / 3 |
| Actionability | Provides concrete Python code snippets for detection patterns and JSON output schemas, but the core workflow relies on external files (scoring/criteria.yaml) that may not exist. The actual execution steps are somewhat abstract - it describes what the system should do rather than providing fully executable, self-contained instructions Claude can follow. | 2 / 3 |
| Workflow Clarity | The 5-phase workflow is clearly sequenced (Discovery → Scoring → Comparative → Report → Fix Suggestions), but validation checkpoints are missing. There's no explicit verification step after scoring to confirm results are reasonable, no error handling for missing directories or malformed files, and no feedback loop if the scoring criteria file is absent or malformed. | 2 / 3 |
| Progressive Disclosure | References external files (scoring/criteria.yaml, related commands, reference templates) which is good, but the SKILL.md itself is monolithic with enormous inline content that should be in separate files. The scoring criteria details, detection patterns, industry context, CI/CD integration, and output examples could all be split into referenced documents rather than inlined. | 2 / 3 |
| Total | | 7 / 12 (Passed) |
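The table above mentions Jaccard similarity as one of the skill's detection patterns. A minimal sketch of how it might flag near-duplicate skill descriptions; the 0.8 threshold is an assumption, not the skill's documented value:

```python
def jaccard(a: str, b: str) -> float:
    """Jaccard similarity of two texts as word-set overlap."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    if not sa and not sb:
        return 1.0
    return len(sa & sb) / len(sa | sb)

def near_duplicates(descriptions: dict[str, str], threshold: float = 0.8):
    """Return pairs of skill names whose descriptions look near-identical."""
    names = sorted(descriptions)
    return [
        (x, y)
        for i, x in enumerate(names)
        for y in names[i + 1:]
        if jaccard(descriptions[x], descriptions[y]) >= threshold
    ]
```

A check like this supports the distinctiveness scoring in the Discovery section: two skills whose descriptions exceed the threshold risk being selected interchangeably.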
Validation: 72% (8 / 11 checks passed)

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation for skill structure
| Criteria | Description | Result |
|---|---|---|
| skill_md_line_count | SKILL.md is long (549 lines); consider splitting into references/ and linking | Warning |
| allowed_tools_field | 'allowed-tools' contains unusual tool name(s) | Warning |
| frontmatter_unknown_keys | Unknown frontmatter key(s) found; consider removing or moving to metadata | Warning |
| Total | | 8 / 11 Passed |