Implementation + audit loop using parallel agent teams with structured simplify, harden, and document passes. Spawns implementation agents to do the work, then audit agents to find complexity, security gaps, and spec deviations, then loops until code compiles cleanly, all tests pass, and auditors find zero issues or the loop cap is reached. Use when: implementing features from a spec or plan, hardening existing code, fixing a batch of issues, or any multi-file task that benefits from a build-verify-fix cycle.
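The loop the description defines can be sketched as simple control flow. This is a minimal, illustrative model only: the real skill spawns parallel agent teams, while here each audit round is represented by a precomputed list of findings, and all names are hypothetical placeholders rather than the skill's actual API.

```python
def build_verify_fix(audit_rounds, max_rounds=3):
    """Simulate the implement-audit loop with explicit exit criteria.

    audit_rounds: one list of findings per audit pass (empty list = clean).
    Returns ("clean", round) when auditors find zero issues, or
    ("loop cap reached", max_rounds) when the round budget is exhausted.
    """
    for round_num in range(max_rounds):
        # In the real workflow, compile + test checkpoints run here before
        # the audit pass; we assume they pass in this simplified model.
        findings = audit_rounds[round_num] if round_num < len(audit_rounds) else []
        if not findings:
            return ("clean", round_num + 1)  # exit: zero auditor findings
        # Otherwise a fix pass addresses the findings before re-auditing.
    return ("loop cap reached", max_rounds)

print(build_verify_fix([["too complex"], []]))        # clean on round 2
print(build_verify_fix([["a"], ["b"], ["c"], ["d"]]))  # hits the loop cap
```

The key property to preserve in any real implementation is that both exit paths are explicit: a success condition (clean build, passing tests, zero findings) and a hard cap that prevents the audit-fix cycle from running indefinitely.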
81% — Does it follow best practices?
Impact: — no eval scenarios have been run.
Advisory: suggest reviewing before use.
Quality
Discovery
85% — Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
This is a strong description that clearly articulates a sophisticated multi-agent workflow with explicit exit criteria and a well-defined 'Use when' clause. Its main weakness is that the trigger terms lean toward internal/technical jargon rather than the natural language a user would employ when requesting this kind of work. The description is detailed and distinctive but could benefit from more user-facing vocabulary.
Suggestions
Add more natural user-facing trigger terms such as 'code review', 'refactor', 'quality pass', 'clean up code', or 'implement and verify' to improve discoverability when users phrase requests casually.
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | Lists multiple specific, concrete actions: spawns implementation agents, audit agents, structured simplify/harden/document passes, loops until code compiles, tests pass, and auditors find zero issues. Very detailed about the mechanics. | 3 / 3 |
| Completeness | Clearly answers both 'what' (implementation + audit loop with parallel agents, simplify/harden/document passes, looping until clean) and 'when' (explicit 'Use when:' clause covering implementing features from a spec, hardening code, fixing batches of issues, multi-file tasks needing build-verify-fix cycles). | 3 / 3 |
| Trigger Term Quality | Includes some natural terms like 'implementing features', 'hardening existing code', 'fixing a batch of issues', 'multi-file task', and 'build-verify-fix cycle'. However, it leans heavily on specialized jargon ('parallel agent teams', 'structured simplify passes', 'audit loop', 'spec deviations') that users are unlikely to say naturally. Missing common variations like 'code review', 'refactor', 'quality check'. | 2 / 3 |
| Distinctiveness / Conflict Risk | The description carves out a very specific niche: a multi-agent implementation-then-audit loop pattern with explicit exit criteria. The combination of parallel agent teams, structured passes, and loop-until-clean semantics is highly distinctive and unlikely to conflict with simpler coding or review skills. | 3 / 3 |
| Total | | 11 / 12 Passed |
Implementation
77% — Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
This is a high-quality, deeply actionable skill with excellent workflow clarity — the implement-audit-fix loop is precisely defined with clear exit conditions, validation checkpoints, and drift checks. Its main weakness is length: at 300+ lines, it could benefit from splitting reference material (auditor details, sizing guide, interoperability) into separate files, and the install section adds no value to the skill body. The content is well-organized but pushes the boundary of what should live in a single SKILL.md.
Suggestions
Remove the install section from the skill body — it's not instructional content and wastes tokens.
Move the detailed auditor descriptions, agent sizing guide, and interoperability section into separate reference files (e.g., references/auditor-prompts.md already exists — expand it, add references/sizing.md and references/pipeline.md) and link to them from the main file.
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | The skill is quite long (~300+ lines) with some sections that could be tightened — the pipeline integration diagrams, interoperability section, and some explanatory text are somewhat verbose. However, most content is genuinely instructive and not explaining things Claude already knows. The ASCII diagrams and tables earn their space. The install section at the top is unnecessary filler for the skill body. | 2 / 3 |
| Actionability | The skill provides concrete, executable guidance throughout: specific tool invocations (TeamCreate, TaskCreate, TaskUpdate), exact prompt templates for spawning agents, git commands for collecting file lists, structured finding formats, and a complete worked example. The step-by-step procedure is copy-paste ready with specific parameters and modes. | 3 / 3 |
| Workflow Clarity | The multi-step workflow is exceptionally well-sequenced with explicit validation checkpoints (compile + tests between phases), clear exit conditions (3 distinct criteria), feedback loops (audit → fix → re-audit), drift checks between rounds, a refactor gate for evaluating findings, and budget guidance for scope control. The loop limits section and quality gates are explicit and non-ambiguous. | 3 / 3 |
| Progressive Disclosure | The skill references `references/auditor-prompts.md` for full prompt templates, which is good progressive disclosure. However, the main file itself is quite long and could benefit from splitting — the agent sizing guide, tips, interoperability section, and detailed auditor descriptions could live in separate reference files. The inline content is well-structured with headers but borders on monolithic. | 2 / 3 |
| Total | | 10 / 12 Passed |
Validation
100% — Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation — 11 / 11 Passed
Validation for skill structure
No warnings or errors.
fe0da1c