Delegate coding tasks to Codex, Claude Code, or Pi agents via background host sessions. Use when: (1) building or creating new features or apps, (2) reviewing PRs (spawn in temp dir), (3) refactoring large codebases, (4) iterative coding that needs file exploration. NOT for: simple one-liner fixes (just edit), reading code (use read tool), thread-bound ACP harness requests in chat (for example spawn or run Codex or Claude Code in a Discord thread; use sessions_spawn with runtime:"acp"), or any work in ~/clawd workspace (never spawn agents here). Requires OpenClaw host tools with exec_command plus write_stdin.
89
88%: Does it follow best practices?
Impact: Pending. No eval scenarios have been run.
Advisory: Suggest reviewing before use.
Quality
Discovery
100%
Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
This is an excellent skill description that clearly defines what the skill does, when to use it, and critically, when NOT to use it. The explicit positive and negative trigger conditions make it highly distinguishable from other coding-related skills. The description is detailed yet well-organized with numbered use cases and clear boundary conditions.
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | Lists multiple specific concrete actions: building/creating features or apps, reviewing PRs (with detail about temp dir), refactoring large codebases, iterative coding with file exploration. Also specifies what NOT to use it for, and names required tools (OpenClaw host tools with exec_command plus write_stdin). | 3 / 3 |
| Completeness | Clearly answers both 'what' (delegate coding tasks to Codex/Claude Code/Pi agents via background host sessions) and 'when' with an explicit 'Use when:' clause listing four numbered scenarios, plus a 'NOT for:' section clarifying boundaries. Exceptionally thorough. | 3 / 3 |
| Trigger Term Quality | Includes strong natural trigger terms users would say: 'coding tasks', 'Codex', 'Claude Code', 'Pi agents', 'building', 'creating new features', 'apps', 'reviewing PRs', 'refactoring', 'large codebases', 'iterative coding'. Also includes negative triggers to prevent misuse. Good coverage of terms a user would naturally use. | 3 / 3 |
| Distinctiveness / Conflict Risk | Highly distinctive with clear niche: delegating to specific agent types (Codex, Claude Code, Pi) via background host sessions. The explicit 'NOT for' section with boundary conditions (simple fixes, reading code, ACP harness requests, ~/clawd workspace) strongly reduces conflict risk with other coding-related skills. | 3 / 3 |
| Total | 12 / 12 Passed | |
Implementation
77%
Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
This is a strong, highly actionable skill with excellent executable examples and clear multi-step workflows for delegating coding tasks to various agents. Its main weaknesses are moderate verbosity (repeated PTY warnings, explanatory asides, redundant 'Learnings' section) and a monolithic structure that could benefit from splitting detailed agent-specific content into separate files. The time-sensitive 'Jan 2026' learnings and Pi PR reference are minor concerns.
Suggestions
Remove the 'Learnings (Jan 2026)' section — its points are already covered inline (PTY warnings, git init, exec usage, append_newline). This saves ~15 lines of redundant content.
Move agent-specific details (Codex flags/modes, Pi provider options, batch PR review patterns) into separate reference files and link from the main skill to improve progressive disclosure.
Remove explanatory asides like 'Why git init?', 'Why workdir matters', and humorous comments — Claude can infer these, and they consume tokens without adding actionable value.
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | Generally efficient but includes some unnecessary explanations (e.g., 'Why git init?' and 'Why workdir matters' are things Claude can infer), the 'Learnings' section at the end repeats information already covered above, and some commentary like 'it'll read your soul docs and get weird ideas about the org chart' is flavor text that wastes tokens. | 2 / 3 |
| Actionability | Excellent concrete, executable examples throughout: every agent (Codex, Claude Code, Pi, OpenCode) has copy-paste-ready commands with correct flags, parameters, and realistic workflows including PR review, parallel issue fixing, and background monitoring patterns. | 3 / 3 |
| Workflow Clarity | Multi-step workflows are clearly sequenced (e.g., parallel issue fixing: create worktrees → launch agents → monitor → create PRs → cleanup). Background session lifecycle is well-defined with start/monitor/input/kill steps. The PR review workflow includes explicit safety steps (clone to temp, cleanup after). Progress update guidelines provide clear checkpoints. | 3 / 3 |
| Progressive Disclosure | The content is well-structured with clear sections and headers, but it's a long monolithic file (~200 lines) that could benefit from splitting agent-specific details (Codex flags, Pi options, parallel workflows) into separate reference files. No external file references are provided for deeper content. | 2 / 3 |
| Total | 10 / 12 Passed | |
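The parallel issue-fixing sequence named in the Workflow Clarity row (create worktrees, launch agents, monitor, create PRs, cleanup) can be sketched as a plan builder. This is a minimal illustration, not the skill's actual commands: `plan_issue`, the branch naming scheme, the `/tmp/worktrees` root, and the `<agent-cli>` placeholder are all assumptions. The code only builds the command lists and does not run them.

```python
from dataclasses import dataclass


@dataclass
class IssuePlan:
    issue: int
    branch: str
    worktree: str
    commands: list[list[str]]


def plan_issue(issue: int, base: str = "main", root: str = "/tmp/worktrees") -> IssuePlan:
    """Build (but do not execute) the per-issue command sequence.

    Branch names, paths, and the agent placeholder are hypothetical.
    """
    branch = f"fix/issue-{issue}"
    worktree = f"{root}/issue-{issue}"
    commands = [
        # 1. Create an isolated worktree on a new branch.
        ["git", "worktree", "add", "-b", branch, worktree, base],
        # 2. Launch a coding agent inside the worktree (placeholder, not a real CLI).
        #    Monitoring happens between this step and the next.
        ["<agent-cli>", f"Fix issue #{issue}"],
        # 3. Push the branch and open a PR.
        ["git", "-C", worktree, "push", "-u", "origin", branch],
        ["gh", "pr", "create", "--head", branch, "--fill"],
        # 4. Clean up the worktree.
        ["git", "worktree", "remove", worktree],
    ]
    return IssuePlan(issue, branch, worktree, commands)


if __name__ == "__main__":
    for number in (78, 99):  # hypothetical issue numbers
        for command in plan_issue(number).commands:
            print(" ".join(command))
```

Keeping each issue in its own worktree means agents working in parallel never share a checkout, which is presumably why the reviewed workflow starts with that step.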
Validation
81%
Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation — 9 / 11 Passed
Validation for skill structure
| Criteria | Description | Result |
|---|---|---|
| metadata_version | 'metadata.version' is missing | Warning |
| metadata_field | 'metadata' should map string keys to string values | Warning |
| Total | 9 / 11 Passed | |
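The two warnings in the table above could come from checks along these lines. This is a sketch under stated assumptions: the function name `check_metadata`, the dict-shaped frontmatter, and the exact rule logic are hypothetical. The registry's real validator is not shown in this review.

```python
from typing import Any


def check_metadata(frontmatter: dict[str, Any]) -> list[str]:
    """Return warnings mirroring the two criteria in the validation table.

    Assumption: the skill's frontmatter has already been parsed into a dict.
    """
    warnings: list[str] = []
    metadata = frontmatter.get("metadata")

    # metadata_field: every key and every value must be a string.
    if not isinstance(metadata, dict) or not all(
        isinstance(key, str) and isinstance(value, str)
        for key, value in metadata.items()
    ):
        warnings.append("metadata_field: 'metadata' should map string keys to string values")

    # metadata_version: a 'version' entry must exist under 'metadata'.
    if not isinstance(metadata, dict) or "version" not in metadata:
        warnings.append("metadata_version: 'metadata.version' is missing")

    return warnings


if __name__ == "__main__":
    # Nested value and no version: both warnings fire.
    print(check_metadata({"metadata": {"requires": {"bins": ["git"]}}}))
    # Flat string map with a version: no warnings.
    print(check_metadata({"metadata": {"version": "1.0.0"}}))
```

Under these assumed rules, a skill clears both warnings by flattening `metadata` to string values and adding a `version` key.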
fcc550d