Delegate coding tasks to Codex, Claude Code, or Pi agents via background process. Use when: (1) building/creating new features or apps, (2) reviewing PRs (spawn in temp dir), (3) refactoring large codebases, (4) iterative coding that needs file exploration. NOT for: simple one-liner fixes (just edit), reading code (use read tool), or any work in ~/clawd workspace (never spawn agents here). Requires a bash tool that supports pty:true.
89
88%
Does it follow best practices?
Impact
Pending
No eval scenarios have been run
Critical
Do not install without reviewing
Quality
Discovery
100%
Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
This is an excellent skill description that clearly communicates what the skill does (delegates coding tasks to background agents), when to use it (four specific scenarios), and when NOT to use it (three exclusions). The inclusion of specific agent names, use-case triggers, anti-patterns, and technical requirements makes it highly actionable and distinctive.
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | Lists multiple specific concrete actions: building/creating new features or apps, reviewing PRs, refactoring large codebases, iterative coding with file exploration. Also specifies anti-patterns (one-liner fixes, reading code) and technical requirements (bash tool with pty:true). | 3 / 3 |
| Completeness | Clearly answers both 'what' (delegate coding tasks to agents via background process) and 'when' with an explicit 'Use when:' clause listing four trigger scenarios, plus a 'NOT for:' section clarifying boundaries. This is exemplary completeness. | 3 / 3 |
| Trigger Term Quality | Includes strong natural trigger terms users would say: 'coding tasks', 'Codex', 'Claude Code', 'agents', 'building', 'creating', 'new features', 'apps', 'reviewing PRs', 'refactoring', 'large codebases', 'spawn', 'background process'. Good coverage of terms a user delegating work would naturally use. | 3 / 3 |
| Distinctiveness / Conflict Risk | Highly distinctive: focuses specifically on delegating to background agents (Codex, Claude Code, Pi), which is a clear niche. The NOT-for clauses further reduce conflict risk by explicitly excluding simple edits and code reading, which other skills would handle. | 3 / 3 |
| Total | | 12 / 12 Passed |
Implementation
77%
Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
This is a highly actionable skill with excellent executable examples and clear multi-step workflows for spawning and managing coding agents. Its main weaknesses are moderate verbosity — including tool parameter tables Claude already knows, repeated PTY reminders, and humor/anecdotes that consume tokens — and a monolithic structure that could benefit from splitting reference material into separate files. The workflow patterns (especially parallel PR reviews and git worktree issue fixing) are genuinely valuable and well-sequenced.
Suggestions
- Move the bash tool parameter table and process action table to a separate REFERENCE.md. Claude already knows its own tool parameters, and this saves ~30 lines of tokens.
- Remove or drastically trim the 'Learnings' section, since most points (PTY, git repo, exec, submit vs write) are already covered inline; the haiku anecdote adds no instructional value.
- Reduce repeated PTY reminders: state the requirement once prominently at the top and in the rules, rather than in comments on nearly every code block.
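
For readers unfamiliar with why the PTY requirement comes up so often, here is a minimal sketch of the general difference between running a child process through a pipe and through a pseudo-terminal. It does not use the skill's bash tool or its `pty:true` parameter; the function names are hypothetical, and it assumes a Unix-like system where Python's `pty` module is available.

```python
import os
import pty
import subprocess
import sys

# Child program: reports whether its stdout looks like a terminal.
CHILD_CODE = "import sys; print(sys.stdout.isatty())"


def isatty_through_pipe() -> str:
    """Run the child with stdout captured through an ordinary pipe."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD_CODE],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


def isatty_through_pty() -> str:
    """Run the child with its standard streams attached to a pseudo-terminal."""
    master_fd, slave_fd = pty.openpty()
    try:
        proc = subprocess.Popen(
            [sys.executable, "-c", CHILD_CODE],
            stdin=slave_fd,
            stdout=slave_fd,
            stderr=slave_fd,
        )
    finally:
        # The parent only reads from the master side.
        os.close(slave_fd)

    chunks = []
    try:
        while True:
            try:
                data = os.read(master_fd, 1024)
            except OSError:
                # Linux raises EIO once the child has closed the slave side.
                break
            if not data:
                break
            chunks.append(data)
    finally:
        os.close(master_fd)
        proc.wait()
    return b"".join(chunks).decode().strip()


if __name__ == "__main__":
    print("pipe:", isatty_through_pipe())
    print("pty: ", isatty_through_pty())
```

Interactive CLIs commonly check `isatty()` and change behavior (or refuse to run) when it is false, which is presumably why the skill insists on a PTY-capable tool.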
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | The skill is mostly efficient but includes unnecessary content: the emoji/humor ('like your soul.md 😅', the haiku anecdote, 'space lobster'), the full bash tool parameter table (Claude already knows its own tools), and some repetitive reminders about PTY. The learnings section partially restates what was already covered. | 2 / 3 |
| Actionability | Excellent actionability throughout: every section provides copy-paste-ready bash commands with correct flags, concrete examples for one-shot tasks, background monitoring, PR reviews, parallel workflows, and git worktree patterns. Commands are fully executable, not pseudocode. | 3 / 3 |
| Workflow Clarity | Multi-step workflows are clearly sequenced (e.g., the parallel issue fixing section has numbered steps: create worktrees → launch agents → monitor → create PRs → cleanup). Background session monitoring with poll/log/kill provides validation checkpoints. The progress updates section adds explicit feedback loops for error recovery and user communication. | 3 / 3 |
| Progressive Disclosure | The content is well-structured with clear headers and sections, but it's a long monolithic file (~200+ lines) with no references to external files for detailed content like the full tool parameter tables, agent-specific guides, or the learnings section. The bash tool parameters and process actions tables could be in a reference file. | 2 / 3 |
| Total | | 10 / 12 Passed |
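
The worktree sequence praised under Workflow Clarity (create worktrees → launch agents → monitor → create PRs → cleanup) could be sketched roughly as follows. This is not the skill's actual code, which the review does not reproduce. The branch naming scheme, the `IssuePlan` and `workflow_steps` names, the caller-supplied agent command, and the use of the GitHub CLI for PR creation are all assumptions made for illustration. The sketch only builds the commands; it does not run them.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class IssuePlan:
    """One issue mapped to its own branch and worktree directory (hypothetical layout)."""
    issue: int
    branch: str
    worktree: Path


def plan_issue(issue: int, root: Path) -> IssuePlan:
    # Assumed naming scheme: fix/issue-<n> branch, <root>/issue-<n> directory.
    return IssuePlan(
        issue=issue,
        branch=f"fix/issue-{issue}",
        worktree=root / f"issue-{issue}",
    )


def workflow_steps(
    plan: IssuePlan, agent_argv: list[str], base: str = "main"
) -> list[tuple[str, list[str]]]:
    """Return the ordered (step name, argv) pairs for one issue.

    Monitoring sits between "launch agent" and "create PR"; it depends on the
    host tool's poll/log actions, so it is not modelled as a command here.
    """
    return [
        (
            "create worktree",
            ["git", "worktree", "add", "-b", plan.branch, str(plan.worktree), base],
        ),
        # The agent command is supplied by the caller; no agent CLI flags are assumed.
        ("launch agent", list(agent_argv)),
        (
            "create PR",  # assumes the branch has been pushed and `gh` is installed
            [
                "gh", "pr", "create",
                "--head", plan.branch,
                "--title", f"Fix issue #{plan.issue}",
                "--body", f"Addresses #{plan.issue}",
            ],
        ),
        ("cleanup", ["git", "worktree", "remove", str(plan.worktree)]),
    ]


if __name__ == "__main__":
    plan = plan_issue(78, Path("/tmp/worktrees"))
    for name, argv in workflow_steps(plan, ["echo", "agent placeholder"]):
        print(f"{name}: {' '.join(argv)}")
```

Giving each issue its own worktree is what allows several agents to run in parallel without stepping on each other's checkout.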
Validation
81%
Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation — 9 / 11 Passed
Validation for skill structure
| Criteria | Description | Result |
|---|---|---|
| metadata_version | 'metadata.version' is missing | Warning |
| metadata_field | 'metadata' should map string keys to string values | Warning |
| Total | 9 / 11 Passed | |
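
The two warnings above suggest checks along the following lines. The validator's real rules are not shown on this page, so this is only a guess at their shape: `check_metadata` is a hypothetical name, and the frontmatter is assumed to be already parsed into a Python dict.

```python
def check_metadata(frontmatter: dict) -> list[str]:
    """Return the names of the metadata criteria that would warn.

    Assumed rules, inferred from the warning messages only:
    - metadata_version: 'metadata' must contain a 'version' key.
    - metadata_field: 'metadata', if present, must be a mapping whose keys
      and values are all strings.
    """
    metadata = frontmatter.get("metadata")
    is_mapping = isinstance(metadata, dict)
    warnings = []

    if not is_mapping or "version" not in metadata:
        warnings.append("metadata_version")

    if metadata is not None:
        all_strings = is_mapping and all(
            isinstance(key, str) and isinstance(value, str)
            for key, value in metadata.items()
        )
        if not all_strings:
            warnings.append("metadata_field")

    return warnings


if __name__ == "__main__":
    # A nested value violates the string-to-string rule and lacks a version.
    print(check_metadata({"metadata": {"requires": {"bins": ["example"]}}}))
    # A flat, versioned mapping passes both checks.
    print(check_metadata({"metadata": {"version": "1.0.0"}}))
```

Under these assumed rules, a skill that nests structured data inside `metadata` would trigger both warnings at once, which matches the pair reported here.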
a5bf5e0