Implementation + audit loop using parallel agent teams with structured simplify, harden, and document passes. Spawns implementation agents to do the work, then audit agents to find complexity, security gaps, and spec deviations, then loops until code compiles cleanly, all tests pass, and auditors find zero issues or the loop cap is reached. Use when: implementing features from a spec or plan, hardening existing code, fixing a batch of issues, or any multi-file task that benefits from a build-verify-fix cycle.
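For readers skimming the description, the build-verify-fix cycle it promises can be sketched roughly as follows. This is an illustrative stub, not the skill's actual implementation: the build and audit callables stand in for agent spawning, compilation, and test runs.

```python
MAX_ROUNDS = 3  # the "loop cap" the description mentions

def run_loop(build, audit, max_rounds=MAX_ROUNDS):
    """Loop until the build is clean and auditors find zero issues,
    or the loop cap is reached."""
    findings = []
    for round_no in range(1, max_rounds + 1):
        if not build(findings):           # implementation agents do the work
            continue                      # compile or tests failed: go again
        findings = audit()                # simplify / harden / document passes
        if not findings:
            return ("clean", round_no)    # exit: zero issues
    return ("loop cap reached", max_rounds)

# Toy stand-ins: the build always succeeds; auditors find one issue in
# round 1 and none in round 2.
reports = [["unused variable"], []]
result = run_loop(lambda prior: True, lambda: reports.pop(0))
```

The key property the description claims is visible in the sketch: auditing only happens on a green build, and the loop terminates either on a clean audit or at the cap.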
Overall score: 85
Best practices: 81% (does it follow best practices?)
Impact: Pending. No eval scenarios have been run.
Advisory: Suggest reviewing before use.
Quality
Discovery
85%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
This is a strong description that clearly articulates a complex multi-agent workflow with explicit trigger conditions. Its main weakness is that some trigger terms are overly technical ('parallel agent teams', 'spec deviations') rather than using natural language users would employ. The description is thorough and distinctive but could benefit from more user-facing vocabulary.
Suggestions
Add more natural trigger terms users might say, such as 'code review', 'refactor', 'quality check', 'clean up code', or 'improve code quality' to improve discoverability.
Consider simplifying some jargon-heavy phrases (e.g., 'structured simplify, harden, and document passes') into plainer language while keeping the technical detail, to better match how users naturally describe their needs.
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | Lists multiple specific concrete actions: spawns implementation agents, audit agents, structured simplify/harden/document passes, loops until code compiles, tests pass, and auditors find zero issues. Very detailed about the mechanics. | 3 / 3 |
| Completeness | Clearly answers both 'what' (implementation + audit loop with parallel agents, simplify/harden/document passes, looping until clean) and 'when' (explicit 'Use when:' clause covering implementing features from a spec, hardening code, fixing batches of issues, multi-file tasks needing build-verify-fix cycles). | 3 / 3 |
| Trigger Term Quality | Includes some natural terms like 'implementing features', 'hardening existing code', 'fixing a batch of issues', 'multi-file task', and 'build-verify-fix cycle'. However, it leans heavily on specialized jargon ('parallel agent teams', 'structured simplify passes', 'audit loop', 'spec deviations') that users are unlikely to naturally say. Missing common variations like 'code review', 'refactor', 'quality check'. | 2 / 3 |
| Distinctiveness / Conflict Risk | The description carves out a very specific niche: parallel agent-based implementation with audit loops. The combination of spawning implementation agents, audit agents, and the loop-until-clean pattern is highly distinctive and unlikely to conflict with simpler coding or review skills. | 3 / 3 |
| Total | | 11 / 12 Passed |
Implementation
77%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
This is a well-structured, highly actionable skill that clearly defines a complex multi-agent workflow with proper validation checkpoints, exit conditions, and feedback loops. Its main weakness is length — there's meaningful redundancy between the step-by-step procedure, the example walkthrough, and repeated explanations of pipeline position and exit conditions. The content would benefit from trimming duplicated sections and moving reference material (sizing guide, interoperability details) into separate files.
Suggestions
Remove the example walkthrough (steps 0-17) or significantly compress it, since it closely mirrors the step-by-step procedure and adds ~20 lines of near-duplicate content.
Move the 'Interoperability with Other Skills' and 'Agent Sizing Guide' sections into a referenced file (e.g., references/integration.md) to reduce the main file length while preserving discoverability.
Consolidate the exit conditions — they're stated in 'Loop Limits and Exit Conditions', repeated in step 7, and again in 'Quality Gates'. Define once and reference.
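As a hypothetical illustration of that last suggestion, the three exit paths could live in one structure that the procedure, step 7, and the quality gates all reference. The path names and wording below are invented for the sketch, not taken from the skill:

```python
# Hypothetical single source of truth for the exit conditions; each
# section of the skill would cite these entries instead of restating them.
EXIT_CONDITIONS = {
    "clean": "code compiles, all tests pass, auditors report zero issues",
    "loop_cap": "maximum audit rounds reached with findings still open",
    "escalate": "a finding needs a decision outside the agents' scope",
}

def exit_summary(path):
    """Render one exit path for reuse in any section that cites it."""
    return f"Exit ({path}): {EXIT_CONDITIONS[path]}"
```

Defining the conditions once keeps the procedure, step 7, and the quality gates from drifting out of sync when one copy is edited.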
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | The skill is quite long (~350+ lines) and includes some redundancy — exit conditions are stated in multiple places, the pipeline position is explained twice, and the example walkthrough largely repeats the step-by-step procedure. However, most content is genuinely instructive and not explaining things Claude already knows. | 2 / 3 |
| Actionability | The skill provides concrete, copy-paste-ready code blocks for team creation, task creation, agent spawning with full prompt templates, bash commands for file list collection, and structured output formats. The audit dimensions table, severity handling rules, and refactor gate criteria are all specific and executable. | 3 / 3 |
| Workflow Clarity | The multi-step workflow is clearly sequenced (steps 0-9) with explicit validation checkpoints (compile + tests between phases), well-defined exit conditions with three distinct paths, a drift check between rounds, a refactor gate for evaluating findings, and budget guidance for scope control. Feedback loops are explicit and thorough. | 3 / 3 |
| Progressive Disclosure | The skill references `references/auditor-prompts.md` for full prompt templates, which is good progressive disclosure. However, the main file itself is quite long and could benefit from moving the detailed auditor descriptions, the agent sizing guide, or the interoperability section into separate reference files. The inline example walkthrough also duplicates the procedure. | 2 / 3 |
| Total | | 10 / 12 Passed |
Validation
100%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation — 11 / 11 Passed
Validation for skill structure
No warnings or errors.
Version: d6c68fa