Planning pipeline with multi-mode routing (plan/verify/replan). Session discovery → context gathering (spawn_agent) → conditional conflict resolution → task generation (spawn_agent or N+1 parallel agents) → plan verification → interactive replan. Produces IMPL_PLAN.md, task JSONs, TODO_LIST.md.
Overall quality: 33%

Impact: Pending. No eval scenarios have been run. Validation: Passed. No known issues.
Optimize this skill with Tessl: `npx tessl skill review --optimize ./.codex/skills/workflow-plan/SKILL.md`

Quality
Discovery — 27%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
The description is heavily implementation-focused, reading more like internal architecture documentation than a skill description meant to help Claude select the right skill. While it lists specific pipeline steps and output artifacts, it completely lacks natural trigger terms users would use and has no explicit 'Use when...' guidance. The technical jargon (spawn_agent, N+1 parallel agents, multi-mode routing) would not help Claude match this skill to user requests about planning or breaking down work.
Suggestions
- Add a 'Use when...' clause with natural trigger terms like 'plan implementation', 'break down project', 'create task list', 'plan work', 'implementation plan'.
- Replace implementation jargon (spawn_agent, N+1 parallel agents, multi-mode routing) with user-facing language describing the value, e.g. 'Generates implementation plans by analyzing project context, resolving conflicts, and producing parallel task breakdowns.'
- Include common file/artifact references users might mention: 'implementation plan', 'task breakdown', 'TODO list', 'project plan'.
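To make these suggestions concrete, here is a hedged sketch of what a reworked frontmatter description could look like. The wording is illustrative only, and it assumes the skill's frontmatter uses standard `name`/`description` keys; it is not a drop-in replacement:

```yaml
---
name: workflow-plan
description: >
  Generates implementation plans by analyzing project context, resolving
  conflicts, and producing parallel task breakdowns (IMPL_PLAN.md, task
  JSONs, TODO_LIST.md). Use when asked to plan implementation, break down
  a project, create a task list, or produce a project plan.
---
```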
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | Lists multiple specific concrete actions: session discovery, context gathering, conflict resolution, task generation, plan verification, interactive replan. Also names specific outputs: IMPL_PLAN.md, task JSONs, TODO_LIST.md. | 3 / 3 |
| Completeness | Describes what it does (planning pipeline steps and outputs) but has no 'Use when...' clause or equivalent explicit trigger guidance. Per rubric guidelines, missing 'Use when' caps completeness at 2, and the 'what' is described in implementation-detail terms rather than user-facing terms, making it weak overall. | 1 / 3 |
| Trigger Term Quality | Uses highly technical jargon like 'multi-mode routing', 'spawn_agent', 'N+1 parallel agents', 'conditional conflict resolution' — these are not terms a user would naturally say. Missing natural keywords like 'plan', 'create plan', 'implementation plan', 'break down tasks', 'project planning'. | 1 / 3 |
| Distinctiveness / Conflict Risk | The specific output artifacts (IMPL_PLAN.md, task JSONs, TODO_LIST.md) and the pipeline terminology create some distinctiveness, but the core concept of 'planning' is broad enough to overlap with other planning or task management skills. | 2 / 3 |
| Total | | 7 / 12 (Passed) |
Implementation — 39%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
The skill has excellent workflow clarity with well-defined phases, conditional branching, validation gates, and error recovery. However, it is severely over-verbose — the monolithic structure dumps hundreds of lines of implementation detail into a single file without progressive disclosure. Code actionability is undermined by incomplete snippets and undefined variables in critical phases.
Suggestions
- Split phase implementations into separate files (e.g., phases/phase-2-context.md, phases/phase-4-tasks.md) and keep SKILL.md as a concise overview with the pipeline diagram and links to phase details.
- Fix incomplete code: define the `conflicts` variable in Phase 3, correct the syntax error in Phase 5's conditional, and ensure all code blocks are self-contained and executable.
- Remove either the ASCII pipeline diagram or the data flow diagram — they convey overlapping information. Keep whichever is more compact.
- Trim implementation boilerplate that Claude can infer (argument parsing, slug generation, mkdir commands) and focus on the unique orchestration logic and agent instructions.
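The two code fixes above can be sketched as follows. This is a minimal illustration, not the skill's actual code: the names `contextReport` and `phase4AutoVerify` are assumptions based on the snippets quoted in this review, and the real definitions would come from the skill's Phase 2 and Phase 4 outputs.

```javascript
// Phase 3 fix (sketch): define `conflicts` before use, e.g. derived from
// the context-gathering report rather than appearing out of nowhere.
function getConflicts(contextReport) {
  // Assumed shape: { conflicts: [{ id, severity }, ...] }
  return (contextReport.conflicts ?? []).filter((c) => c.severity !== 'low');
}

// Phase 5 fix (sketch): complete the truncated conditional
// `if (mode === 'verify' || /* auto-verify from Phase 4 */)` with a real
// second operand instead of a comment.
function shouldVerify(mode, phase4AutoVerify) {
  return mode === 'verify' || phase4AutoVerify === true;
}

const context = {
  conflicts: [
    { id: 'C1', severity: 'high' },
    { id: 'C2', severity: 'low' },
  ],
};
console.log(getConflicts(context).length); // 1
console.log(shouldVerify('plan', true));   // true
console.log(shouldVerify('plan', false));  // false
```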
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | The skill is extremely verbose at ~400+ lines. It includes extensive ASCII diagrams, data flow charts, session structure trees, and lengthy code blocks that could be significantly condensed. Much of the implementation detail (argument parsing, session discovery logic, multi-session selection) is boilerplate that Claude can infer. The Phase 3 conflict resolution includes incomplete pseudocode mixed with verbose comments. | 1 / 3 |
| Actionability | The code blocks are mostly concrete JavaScript but several are incomplete or have gaps — Phase 3 references an undefined `conflicts` variable, Phase 5 has incomplete conditional syntax (`if (mode === 'verify' || /* auto-verify from Phase 4 */)`), and CLI commands reference `ccw cli` with unclear tool availability. The spawn_agent/wait_agent APIs appear executable, but the overall code isn't truly copy-paste ready due to these gaps. | 2 / 3 |
| Workflow Clarity | The multi-phase pipeline is exceptionally well-sequenced with clear mode detection, conditional branching (conflict risk gating Phase 3), explicit validation checkpoints (Plan Confirmation Gate, Phase 5 verification with 10 dimensions), backup before replan, and an error recovery table. The ASCII pipeline diagram and data flow chart make the sequence unambiguous. | 3 / 3 |
| Progressive Disclosure | This is a monolithic wall of text with all implementation details inline. The entire multi-phase pipeline with full code for all 6 phases, session structure, data flow diagrams, and error handling is crammed into a single file. Phase implementations, the ASCII diagrams, and the detailed agent instructions could easily be split into separate reference files, but nothing is externalized. | 1 / 3 |
| Total | | 7 / 12 (Passed) |
Validation — 72%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation for skill structure — 8 / 11 Passed
| Criteria | Description | Result |
|---|---|---|
| skill_md_line_count | SKILL.md is long (632 lines); consider splitting into references/ and linking | Warning |
| allowed_tools_field | `allowed-tools` contains unusual tool name(s) | Warning |
| frontmatter_unknown_keys | Unknown frontmatter key(s) found; consider removing or moving to metadata | Warning |
| Total | | 8 / 11 Passed |
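One possible way to clear the `frontmatter_unknown_keys` warning, sketched on the assumption that the spec allows nesting custom keys under a `metadata` block. The `pipeline-version` key and the tool names shown are hypothetical stand-ins, not the skill's actual frontmatter:

```yaml
---
name: workflow-plan
description: Generates implementation plans from project context.
allowed-tools: Read, Write, Bash   # keep only tool names the spec recognizes
metadata:
  pipeline-version: 2              # hypothetical custom key moved out of the top level
---
```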