Launch multiple sub-agents in parallel to execute tasks across files or targets with intelligent model selection, quality-focused prompting, and meta-judge → LLM-as-a-judge verification
Quality: 36% (Does it follow best practices?)
Impact: Pending (no eval scenarios have been run)
Status: Passed (no known issues)
Optimize this skill with Tessl:

`npx tessl skill review --optimize ./plugins/sadd/skills/do-in-parallel/SKILL.md`

Quality
Discovery
17%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
The description is heavily implementation-focused, describing internal architecture (sub-agents, meta-judge, LLM-as-a-judge) rather than user-facing capabilities and use cases. It lacks natural trigger terms users would employ and has no explicit 'Use when...' guidance, making it difficult for Claude to know when to select this skill over others.
Suggestions
Add a 'Use when...' clause specifying concrete scenarios, e.g., 'Use when the user needs to apply the same operation across many files, perform bulk refactoring, or run parallel code transformations.'
Replace technical jargon like 'sub-agents', 'meta-judge', and 'LLM-as-a-judge verification' with natural user-facing terms like 'parallel processing', 'batch operations', 'bulk file editing', or 'multi-file tasks'.
List specific concrete actions the skill performs from the user's perspective, e.g., 'Applies code changes across multiple files simultaneously, runs batch transformations, and validates results for quality.'
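Taken together, these suggestions could yield frontmatter along the following lines. This is a hypothetical rewrite for illustration, not the skill's actual metadata; the `name` and `description` keys follow the common SKILL.md frontmatter convention:

```yaml
name: do-in-parallel
description: >
  Applies the same operation across many files or targets at once:
  bulk refactoring, batch code transformations, and multi-file edits,
  with each result checked for quality before completion. Use when the
  user needs to apply one change to many files, run parallel code
  transformations, or fan out repetitive tasks across a codebase.
```

The rewrite leads with user-facing actions, adds an explicit "Use when..." clause, and swaps implementation jargon for natural trigger terms, directly addressing the three suggestions above.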
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | It names some actions like 'launch sub-agents in parallel', 'intelligent model selection', 'quality-focused prompting', and 'meta-judge → LLM-as-a-judge verification', but these are more architectural/implementation details than concrete user-facing actions. It doesn't clearly list what end-user tasks it accomplishes. | 2 / 3 |
| Completeness | The description partially addresses 'what' (launching parallel sub-agents) but has no 'Use when...' clause or equivalent explicit trigger guidance. The 'when' is entirely missing, which per the rubric caps completeness at 2, and the 'what' is also vague enough to warrant a 1. | 1 / 3 |
| Trigger Term Quality | The description uses technical jargon like 'sub-agents', 'meta-judge', 'LLM-as-a-judge verification', and 'intelligent model selection', terms users would almost never naturally say. It lacks natural trigger terms a user would use when needing parallel task execution. | 1 / 3 |
| Distinctiveness / Conflict Risk | The parallel sub-agent concept with LLM-as-a-judge is somewhat distinctive, but 'execute tasks across files or targets' is broad enough to overlap with many file-processing or multi-file editing skills. The niche is partially defined but not clearly bounded. | 2 / 3 |
| Total | | 6 / 12 (Passed) |
Implementation
55%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
The skill is exceptionally thorough and actionable, providing complete prompt templates, decision trees, and verification protocols for a complex multi-agent orchestration pattern. However, it is severely undermined by extreme verbosity — the three near-identical examples alone account for roughly 60% of the content and could be condensed dramatically. The monolithic structure with no progressive disclosure means the entire ~1000+ line document must be loaded into context, wasting significant token budget on repeated boilerplate.
Suggestions
Extract the three full examples into a separate EXAMPLES.md file referenced from the main skill, keeping only one abbreviated example inline to illustrate the pattern
Extract prompt templates (meta-judge, implementor, judge, retry) into a TEMPLATES.md file, referencing them by name in the main workflow rather than inlining the full text multiple times
Remove redundant explanations — the examples repeat nearly all process steps verbatim; instead, annotate examples with brief notes showing only what differs from the process description
Cut explanatory text that Claude already knows (e.g., 'The primary benefit is parallel execution - multiple independent tasks run concurrently rather than sequentially, dramatically reducing total execution time') and focus only on the novel orchestration-specific instructions
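Applying the extraction suggestions above could leave SKILL.md with a skeleton like the following. This is a sketch, not the skill's actual structure; `EXAMPLES.md` and `TEMPLATES.md` are the hypothetical bundle files the suggestions propose:

```markdown
## Process
1. Parse the task list and group independent requirements.
2. Dispatch the meta-judge, then implementors, then judges.
   Full prompt templates for each agent type: see TEMPLATES.md.
3. Retry failed items (max 3 attempts), then summarize results.

## Example
One abbreviated walkthrough appears below; the full worked
examples live in EXAMPLES.md.
```

Under this layout the main file carries only the orchestration logic, and the heavyweight templates and examples load into context only when actually needed.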
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | This skill is extremely verbose at ~1000+ lines. It repeats the same prompt templates multiple times across three full examples with near-identical boilerplate. The requirement grouping logic, meta-judge templates, judge templates, and retry logic are explained in the process section and then repeated verbatim in each example. Massive amounts of content could be condensed: the three examples alone could be reduced to one annotated example with brief variations noted. It also explains concepts Claude already understands (e.g., what parallel execution is, what independence means in computing). | 1 / 3 |
| Actionability | The skill provides highly concrete, copy-paste-ready prompt templates for every agent type (meta-judge, implementor, judge), specific decision trees for model selection and requirement grouping, exact structured output formats, and detailed dispatch patterns. Every phase has executable templates with placeholder variables clearly marked. | 3 / 3 |
| Workflow Clarity | The multi-phase workflow is clearly sequenced (Parse → Analyze → Meta-Judge → Implement → Judge → Retry → Summarize) with explicit validation checkpoints at every stage. The independence validation checklist, judge verification protocol, retry logic with max 3 attempts, shared group retry isolation, and failure handling table all provide robust feedback loops. The ASCII flow diagrams clearly illustrate execution patterns. | 3 / 3 |
| Progressive Disclosure | The entire skill is a monolithic wall of text with no references to external files. Content that should be split out, such as the three lengthy examples (each 100+ lines), prompt templates, model selection tables, and the requirement grouping decision framework, is all inline. With no bundle files provided, the skill dumps everything into one massive document, making it extremely difficult to navigate and consuming enormous context window space. | 1 / 3 |
| Total | | 8 / 12 (Passed) |
Validation
81%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation — 9 / 11 Passed
Validation for skill structure
| Criteria | Description | Result |
|---|---|---|
| skill_md_line_count | SKILL.md is long (2209 lines); consider splitting into references/ and linking | Warning |
| frontmatter_unknown_keys | Unknown frontmatter key(s) found; consider removing or moving to metadata | Warning |
| Total | 9 / 11 | Passed |
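The frontmatter_unknown_keys warning is typically resolved by nesting nonstandard keys under a `metadata` block rather than leaving them at the top level. A hypothetical before/after, with the key name invented purely for illustration:

```yaml
# Before: an unrecognized top-level key triggers the warning
# max-parallelism: 8

# After: nonstandard keys are namespaced under metadata
metadata:
  max-parallelism: 8
```

Keys the validator recognizes (such as `name` and `description`) stay at the top level; anything custom moves under `metadata` so the structural check passes without losing the information.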