This skill should be used when the user asks to "fork", "race claude and codex", "dual implement", "run both models", "compare implementations", "implement with both", "fork it", or wants the same task implemented by both Claude Code and OpenAI Codex in parallel to pick the best result. Also triggered by the /fork command.
68
83%
Does it follow best practices?
Impact
—
No eval scenarios have been run
Passed
No known issues
Quality
Discovery
89%Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
The description excels at trigger term coverage and completeness, providing extensive natural language triggers and a clear 'when to use' clause. Its main weakness is that the 'what it does' portion is somewhat thin — it describes the high-level goal (parallel implementation to pick the best result) but doesn't detail the specific mechanics or actions the skill performs (e.g., branching, diffing, merging). The description is also heavily front-loaded with trigger terms, making it read more like a trigger list than a balanced capability description.
Suggestions
Add 2-3 specific concrete actions the skill performs, e.g., 'Creates parallel branches, dispatches the same task to Claude Code and OpenAI Codex, diffs the outputs, and presents a comparison for the user to pick the best result.'
| Dimension | Reasoning | Score |
|---|---|---|
Specificity | The description names the domain (parallel implementation with two models) and the general action (implementing the same task with both Claude Code and OpenAI Codex), but it doesn't list specific concrete actions like 'creates branches', 'runs diffs', 'merges results', etc. The 'what it does' is somewhat vague beyond 'same task implemented by both'. | 2 / 3 |
Completeness | The description explicitly answers both 'what' (same task implemented by both Claude Code and OpenAI Codex in parallel to pick the best result) and 'when' (triggered by specific phrases like 'fork', 'race claude and codex', etc., and the /fork command). The 'Use when' equivalent is clearly stated. | 3 / 3 |
Trigger Term Quality | Excellent coverage of natural trigger terms: 'fork', 'race claude and codex', 'dual implement', 'run both models', 'compare implementations', 'implement with both', 'fork it', and the /fork command. These are varied and natural phrases a user would actually say. | 3 / 3 |
Distinctiveness Conflict Risk | This skill occupies a very clear niche — parallel implementation across two specific AI models (Claude Code and OpenAI Codex). The trigger terms like 'fork', 'race claude and codex', and 'dual implement' are highly distinctive and unlikely to conflict with other skills. | 3 / 3 |
Total | 11 / 12 Passed |
Implementation
77%Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
This is a strong, highly actionable skill with an exceptionally clear multi-step workflow including proper validation, error handling, and user confirmation gates. The main weakness is its length — while most content earns its place, some sections could be tightened, and the monolithic structure could benefit from splitting detailed reference material (like the inline comparison fallback criteria) into separate files. The executable code examples and explicit constraints make this immediately usable.
Suggestions
Consider extracting the inline comparison fallback criteria (Step 8) and the Error Handling section into a separate reference file to reduce the main skill's token footprint.
Tighten the suffix generation step — the hashlib approach could be replaced with a simpler `mktemp`-style note or `date +%s | sha256sum | head -c8` one-liner without the Python explanation.
| Dimension | Reasoning | Score |
|---|---|---|
Conciseness | The skill is fairly detailed and well-structured, but includes some verbosity that could be tightened — e.g., the suffix generation explanation, some redundant restatements of constraints already implied by the workflow. However, it largely avoids explaining concepts Claude already knows and most content is task-specific. | 2 / 3 |
Actionability | Excellent actionability throughout — every step includes concrete, executable bash commands with specific flags, exact variable names, and copy-paste-ready code blocks. The prompt composition, parallel execution, error checking, and cleanup steps are all fully specified with real commands. | 3 / 3 |
Workflow Clarity | The 10-step workflow is clearly sequenced with explicit validation checkpoints: prerequisite checks (Step 1), dirty working tree warnings (Step 2), success/failure handling with branching logic (Step 6), user confirmation before merge (Step 9), and cleanup (Step 10). Error handling is covered in a dedicated section with feedback loops for conflicts and timeouts. | 3 / 3 |
Progressive Disclosure | The content is well-organized with clear section headers and numbered steps, but it's a long monolithic document (~180 lines of substantive content) with no references to external files. The fallback comparison criteria and constraints sections could potentially be split out. However, given no bundle files exist, inline content is the only option, which partially justifies this approach. | 2 / 3 |
Total | 10 / 12 Passed |
Validation
90%Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation — 10 / 11 Passed
Validation for skill structure
| Criteria | Description | Result |
|---|---|---|
frontmatter_unknown_keys | Unknown frontmatter key(s) found; consider removing or moving to metadata | Warning |
Total | 10 / 11 Passed | |
8fe6eb4
Table of Contents
If you maintain this skill, you can claim it as your own. Once claimed, you can manage eval scenarios, bundle related skills, attach documentation or rules, and ensure cross-agent compatibility.