Delegate coding tasks to Codex, Claude Code, or Pi agents via background host sessions. Use when: (1) building or creating new features or apps, (2) reviewing PRs (spawn in temp dir), (3) refactoring large codebases, (4) iterative coding that needs file exploration. NOT for: simple one-liner fixes (just edit), reading code (use read tool), thread-bound ACP harness requests in chat (for example spawn or run Codex or Claude Code in a Discord thread; use sessions_spawn with runtime:"acp"), or any work in ~/clawd workspace (never spawn agents here). Requires OpenClaw host tools with exec_command plus write_stdin.
83
Quality: 81%. Does it follow best practices?
Impact: Pending. No eval scenarios have been run.
Advisory: Suggest reviewing before use.
Quality
Discovery
85%
Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
This is a strong description that clearly defines what the skill does and when to use it, with helpful exclusion criteria that reduce ambiguity. The main weakness is the use of technical jargon (ACP, OpenClaw, exec_command, write_stdin) that users wouldn't naturally use as trigger terms, and some missing natural language variations for common use cases.
Suggestions
Replace or supplement technical jargon with natural user language; for example, instead of 'ACP harness requests', use terms like 'spawn agent in chat thread' or 'run coding assistant'
Add common natural trigger phrases users might say: 'delegate this task', 'run this in background', 'have another agent handle this', 'parallel coding work'
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | Lists multiple specific concrete actions: 'building or creating new features or apps', 'reviewing PRs', 'refactoring large codebases', 'iterative coding that needs file exploration'. Also specifies what NOT to use it for with concrete examples. | 3 / 3 |
| Completeness | Clearly answers both what ('Delegate coding tasks to Codex, Claude Code, or Pi agents via background host sessions') and when with explicit numbered triggers ('Use when: (1) building... (2) reviewing PRs... (3) refactoring... (4) iterative coding'). Also includes explicit NOT for cases which adds clarity. | 3 / 3 |
| Trigger Term Quality | Includes some relevant terms like 'Codex', 'Claude Code', 'Pi agents', 'PRs', 'refactoring', but uses technical jargon ('ACP harness', 'OpenClaw host tools', 'exec_command', 'write_stdin') that users wouldn't naturally say. Missing common variations like 'delegate work', 'parallel tasks', 'background coding'. | 2 / 3 |
| Distinctiveness Conflict Risk | Very clear niche with distinct triggers around agent delegation and background sessions. The explicit NOT for cases ('simple one-liner fixes', 'reading code', 'thread-bound ACP harness requests', '~/clawd workspace') significantly reduce conflict risk with other coding skills. | 3 / 3 |
| Total | 11 / 12 Passed | |
Implementation
77%
Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
This is a strong, actionable skill with excellent executable examples and clear multi-step workflows. The main weakness is moderate verbosity - the PTY warning is emphasized multiple times, and the learnings section largely repeats earlier content. The document could be more token-efficient by consolidating repeated warnings and potentially splitting agent-specific details into separate files.
Suggestions
Consolidate PTY warnings into a single prominent section rather than repeating throughout the document
Remove or significantly trim the 'Learnings' section since it repeats information already covered in the main content
Consider splitting agent-specific sections (Codex CLI, Claude Code, Pi) into separate reference files with just quick-start examples in the main skill
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | Generally efficient with good use of tables and code blocks, but includes some redundant explanations (PTY warning repeated multiple times, some verbose commentary like 'it'll read your soul docs'). The learnings section at the end repeats information already covered. | 2 / 3 |
| Actionability | Excellent executable examples throughout - copy-paste ready commands with proper flags, clear parameter tables, and concrete workflows for PR reviews, parallel issue fixing, and batch operations. Every major use case has working code. | 3 / 3 |
| Workflow Clarity | Multi-step processes are clearly sequenced with numbered steps (worktree workflow, PR review workflow). Includes validation checkpoints (monitor with write_stdin, check if done) and explicit cleanup steps. The parallel issue fixing section is particularly well-structured. | 3 / 3 |
| Progressive Disclosure | Content is well-organized with clear sections and a logical flow from quick start to advanced patterns. However, it's a long monolithic document (~200 lines) that could benefit from splitting detailed agent-specific sections (Codex, Claude Code, Pi) into separate reference files. | 2 / 3 |
| Total | 10 / 12 Passed | |
Validation
81%
Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation — 9 / 11 Passed
Validation for skill structure
| Criteria | Description | Result |
|---|---|---|
| metadata_version | 'metadata.version' is missing | Warning |
| metadata_field | 'metadata' should map string keys to string values | Warning |
| Total | 9 / 11 Passed | |
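The two warnings in the table above both concern the skill's `metadata` block. As a rough sketch of what such checks could look like, the Python below mirrors the two criteria against an already-parsed frontmatter dictionary. The function name `check_metadata` and the dictionary shape are assumptions made for illustration; this is not the validator's actual implementation.

```python
from typing import Any


def check_metadata(frontmatter: dict[str, Any]) -> list[str]:
    """Return the names of failed checks for a parsed frontmatter dict.

    Hypothetical helper: it only imitates the two criteria named in the table.
    """
    warnings: list[str] = []
    metadata = frontmatter.get("metadata")

    # metadata_field: 'metadata' should map string keys to string values.
    if not isinstance(metadata, dict) or not all(
        isinstance(k, str) and isinstance(v, str) for k, v in metadata.items()
    ):
        warnings.append("metadata_field")

    # metadata_version: a 'version' entry should be present under 'metadata'.
    if not isinstance(metadata, dict) or "version" not in metadata:
        warnings.append("metadata_version")

    return warnings
```

Under these assumptions, a flat block such as `{"metadata": {"version": "1.0.0"}}` yields no warnings, while a block with nested or non-string values and no `version` key yields both.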
50ef2f3