Hand off a task to Codex CLI for autonomous execution. Use when a task would benefit from a capable subagent to implement, fix, investigate, or review code. Codex has full codebase access and can make changes.
83
76%
Does it follow best practices?
Impact
99%
2.67x
Average score across 3 eval scenarios
Advisory
Suggest reviewing before use
Optimize this skill with Tessl
npx tessl skill review --optimize ./data/skills-md/0xbigboss/claude-code/codex/SKILL.md
Quality
Discovery
75%
Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
This is a reasonably well-structured description with a clear 'Use when' clause and a distinct niche around Codex CLI delegation. Its main weaknesses are that the listed actions are somewhat broad rather than highly specific, and it could include more natural trigger terms that users might say when they want to delegate work to a subagent.
Suggestions
Add more specific concrete actions beyond the broad categories, e.g., 'run tests', 'refactor modules', 'debug failing builds', 'generate patches'.
Include additional natural trigger terms users might say, such as 'delegate', 'run in background', 'parallel task', 'offload work', or 'let another agent handle this'.
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | Names the domain (autonomous code execution via Codex CLI) and lists some actions ('implement, fix, investigate, or review code'), but these are fairly broad categories rather than highly specific concrete actions like 'run tests', 'create pull requests', or 'apply patches'. | 2 / 3 |
| Completeness | Clearly answers both 'what' (hand off a task to Codex CLI for autonomous execution, with full codebase access and ability to make changes) and 'when' ('Use when a task would benefit from a capable subagent to implement, fix, investigate, or review code') with an explicit 'Use when' clause. | 3 / 3 |
| Trigger Term Quality | Includes some relevant terms like 'Codex CLI', 'subagent', 'implement', 'fix', 'investigate', 'review code', and 'codebase'. However, it misses natural user phrases like 'run in background', 'delegate', 'parallel task', 'autonomous agent', or 'hand off to another agent'. | 2 / 3 |
| Distinctiveness Conflict Risk | The description is clearly about delegating to Codex CLI specifically, which is a distinct niche. The mention of 'subagent', 'autonomous execution', and 'Codex CLI' makes it unlikely to conflict with other coding or review skills. | 3 / 3 |
| Total | 10 / 12 Passed | |
Implementation
77%
Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
This is a well-crafted skill with excellent actionability and workflow clarity — every step has concrete commands and clear decision points. The main weaknesses are moderate verbosity (model descriptions, complexity assessment heuristics, some explanatory text Claude doesn't need) and the monolithic structure that could benefit from splitting detailed sections into referenced files. Overall it's a strong, production-ready skill that would effectively guide Claude through Codex subagent delegation.
Suggestions
Trim the model descriptions to just model names and one-word descriptors (e.g., 'gpt-5.2-codex - default', 'o3 - deep reasoning') since Claude doesn't need marketing-style descriptions.
Consider extracting the CTCO prompt template and monitoring instructions into separate referenced files to improve progressive disclosure and reduce the main skill's token footprint.
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | The skill is fairly well-structured but includes some unnecessary verbosity. The model descriptions ('Flagship model, best for complex professional tasks', etc.) and the detailed complexity assessment section add tokens that Claude could infer. The CTCO prompt template is somewhat redundant given Claude knows how to structure prompts. However, most content is operationally relevant. | 2 / 3 |
| Actionability | The skill provides concrete, executable bash commands throughout, from git state gathering, to mkdir, to the full codex exec command with heredoc syntax, to monitoring commands. Flag rules are specific and conditional. The output format template is copy-paste ready. | 3 / 3 |
| Workflow Clarity | The multi-step workflow is clearly sequenced: parse arguments → assess complexity → gather context → generate prompt → execute → monitor → return result. Each step has explicit instructions. Monitoring includes validation checkpoints (check if summary exists, wc -l for activity, tail -n 3 for status). Background vs foreground decision criteria are explicit. The 'Do NOT' section provides important guardrails. | 3 / 3 |
| Progressive Disclosure | The content is entirely self-contained in one file with no references to external documentation. While the sections are well-organized with clear headers, the skill is quite long (~200 lines) and some sections like the full CTCO prompt template and detailed monitoring instructions could potentially be split into referenced files. The examples at the end are appropriately brief. | 2 / 3 |
| Total | 10 / 12 Passed | |
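The monitoring checkpoints named under Workflow Clarity (check whether a summary exists, `wc -l` for activity, `tail -n 3` for status) could look roughly like the sketch below. This is only an illustration: the function name `check_run` and both file paths are hypothetical, since the review does not show the skill's actual file names or commands.

```shell
#!/bin/sh
# Minimal sketch of the three monitoring checkpoints described in the review.
# ASSUMPTION: check_run, the log file, and the summary file are hypothetical
# names chosen for illustration; they are not taken from the skill itself.
check_run() {
    log_file="$1"      # assumed: file the subagent appends its output to
    summary_file="$2"  # assumed: file written once the run has finished

    # Checkpoint 1: a summary file means the run has completed.
    if [ -f "$summary_file" ]; then
        echo "done"
        return 0
    fi

    # No log yet means the run has not started writing output.
    if [ ! -f "$log_file" ]; then
        echo "not-started"
        return 0
    fi

    # Checkpoint 2: line count as a rough activity signal.
    lines=$(wc -l < "$log_file" | tr -d ' ')
    echo "running lines=$lines"

    # Checkpoint 3: the last three lines as a quick status view.
    tail -n 3 "$log_file"
}
```

Calling `check_run ./run.log ./summary.md` repeatedly and comparing the reported line count between calls gives a cheap signal of whether a background run is still making progress.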
Validation
81%
Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation — 9 / 11 Passed
Validation for skill structure
| Criteria | Description | Result |
|---|---|---|
| allowed_tools_field | 'allowed-tools' contains unusual tool name(s) | Warning |
| frontmatter_unknown_keys | Unknown frontmatter key(s) found; consider removing or moving to metadata | Warning |
| Total | 9 / 11 Passed | |
f772de4