Delegate coding tasks to Codex, Claude Code, or Pi agents via background host sessions. Use when: (1) building or creating new features or apps, (2) reviewing PRs (spawn in temp dir), (3) refactoring large codebases, (4) iterative coding that needs file exploration. NOT for: simple one-liner fixes (just edit), reading code (use read tool), thread-bound ACP harness requests in chat (for example spawn or run Codex or Claude Code in a Discord thread; use sessions_spawn with runtime:"acp"), or any work in ~/clawd workspace (never spawn agents here). Requires OpenClaw host tools with exec_command plus write_stdin.
89
88%: Does it follow best practices?
Impact: Pending. No eval scenarios have been run.
Advisory: Suggest reviewing before use.
Quality
Discovery
100%: Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
This is an excellent skill description that clearly defines its purpose, trigger conditions, and boundaries. The numbered use-cases and explicit exclusions make it highly actionable for skill selection. The inclusion of required tooling (OpenClaw host tools) and negative triggers (NOT for) demonstrates thorough design for disambiguation in a multi-skill environment.
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | Lists multiple specific concrete actions: building/creating features or apps, reviewing PRs (with detail about temp dir), refactoring large codebases, iterative coding with file exploration. Also specifies what NOT to use it for, and names required tools (OpenClaw host tools with exec_command plus write_stdin). | 3 / 3 |
| Completeness | Clearly answers both 'what' (delegate coding tasks to Codex/Claude Code/Pi agents via background host sessions) and 'when' with an explicit 'Use when:' clause listing four numbered scenarios, plus a 'NOT for:' section that further clarifies boundaries. This is comprehensive and explicit. | 3 / 3 |
| Trigger Term Quality | Includes strong natural trigger terms users would say: 'coding tasks', 'Codex', 'Claude Code', 'Pi agents', 'building', 'creating new features', 'apps', 'reviewing PRs', 'refactoring', 'large codebases', 'iterative coding'. Also includes negative triggers to prevent misuse. These are terms users would naturally use when requesting these capabilities. | 3 / 3 |
| Distinctiveness / Conflict Risk | Highly distinctive with clear niche: delegating to specific agent types (Codex, Claude Code, Pi) via background host sessions. The explicit 'NOT for' section with boundary conditions (simple fixes, reading code, ACP harness requests, ~/clawd workspace) significantly reduces conflict risk with other coding-related skills. | 3 / 3 |
| Total | | 12 / 12 Passed |
Implementation
77%: Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
This is a strong, highly actionable skill with excellent executable examples and clear multi-step workflows for spawning and managing coding agents. Its main weaknesses are moderate verbosity — the PTY warning is repeated many times, some explanatory asides are unnecessary for Claude, and the 'Learnings' section largely duplicates earlier content. The file would also benefit from splitting into a concise overview with references to detailed per-agent guides.
Suggestions
- Reduce PTY repetition: state the tty:true requirement once prominently at the top and remove the ~10 inline reminders like '# remember PTY!' and '# with PTY!'.
- Move the 'Learnings' section content into the relevant sections above rather than repeating it, or remove it entirely since it restates earlier guidance.
- Consider splitting per-agent details (Codex flags, Claude Code, Pi, OpenCode) and the PR review workflow into separate referenced files to improve progressive disclosure.
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | Generally efficient but includes some unnecessary explanations (e.g., 'Why git init?' and 'Why workdir matters' are things Claude can infer, the 'Learnings' section repeats earlier content, and some commentary like 'it'll read your soul docs' is fluff). The PTY warning is repeated excessively throughout. | 2 / 3 |
| Actionability | Excellent concrete, executable examples throughout: every agent (Codex, Claude Code, Pi, OpenCode) has copy-paste ready commands with correct parameters. The parallel worktree workflow, PR review flow, and background session monitoring are all fully specified with real commands. | 3 / 3 |
| Workflow Clarity | Multi-step workflows are clearly sequenced (e.g., the parallel issue fixing flow: create worktrees → launch agents → monitor → create PRs → cleanup). Background session lifecycle is well-defined with start/monitor/input/kill steps. Progress update guidelines provide clear checkpoints for user communication. | 3 / 3 |
| Progressive Disclosure | Content is well-structured with clear headers and sections, but it's a fairly long monolithic file (~200 lines) that could benefit from splitting agent-specific details (Codex flags, Pi options, PR review patterns) into separate reference files. No external file references are used despite the content length warranting it. | 2 / 3 |
| Total | | 10 / 12 Passed |
Validation
81%: Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation — 9 / 11 Passed
Validation for skill structure
| Criteria | Description | Result |
|---|---|---|
| metadata_version | 'metadata.version' is missing | Warning |
| metadata_field | 'metadata' should map string keys to string values | Warning |
| Total | 9 / 11 Passed | |
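
The two warnings above concern the shape of the skill's `metadata` field. The following is a minimal sketch of how such checks might be expressed, assuming the skill's frontmatter has already been parsed into a Python dict. The function name `check_metadata`, the sample inputs, and the exact rules are illustrative assumptions, not the validator's actual implementation.

```python
def check_metadata(frontmatter: dict) -> list[str]:
    """Return the names of the checks that would warn (hypothetical helper)."""
    warnings = []
    metadata = frontmatter.get("metadata", {})

    # Assumed rule: 'metadata' must map string keys to string values.
    is_str_map = isinstance(metadata, dict) and all(
        isinstance(key, str) and isinstance(value, str)
        for key, value in metadata.items()
    )
    if not is_str_map:
        warnings.append("metadata_field")

    # Assumed rule: a 'version' entry must be present under 'metadata'.
    if not (isinstance(metadata, dict) and "version" in metadata):
        warnings.append("metadata_version")

    return warnings


# Made-up input with a non-string value and no version: both checks warn.
print(check_metadata({"metadata": {"tags": ["a", "b"]}}))
# → ['metadata_field', 'metadata_version']

# Made-up input with flat string values and a version: no warnings.
print(check_metadata({"metadata": {"version": "1.0.0", "author": "example"}}))
# → []
```

Under these assumptions, clearing both warnings would mean flattening any nested or non-string values in `metadata` and adding a string `version` entry.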
af8bd5f