Converts a PRD or requirements document into a structured, phased implementation plan with individual phase files and granular per-task files written to .context/plans/. Also restructures existing monolithic planning documents into digestible, hierarchical directory structures. Creates a root plan index summarising all phases, a numbered phase file per phase, and a numbered task file per task inside each phase directory.
An implementation plan is a navigable contract between an agent and a codebase. Each file answers a single question: "What do I need to do next, how do I do it, and how do I prove it is done?"
Two failure modes to avoid:
The sweet spot is a tree where every node either navigates (README) or implements (task file), and every leaf has a runnable verification command.
Mode 1 (create a new plan from a PRD):
sh scripts/new-plan.sh url-shortener-service
sh scripts/new-phase.sh url-shortener-service 01 workspace-bootstrap
sh scripts/new-task.sh url-shortener-service 01 01 initialise-npm-package
sh scripts/validate-plan.sh url-shortener-service

Mode 2 (split a monolithic document):
# 1. Create hierarchy manually (steps below) or from JSON:
sh scripts/generate-structure.sh --plan plan.json
# 2. Validate before removing source
sh scripts/validate-structure.sh docs/refactoring/phases

| Signal | Mode |
|---|---|
| User provides a PRD, spec, or requirements description | Mode 1 |
| User provides a single large planning document to split | Mode 2 |
| User provides flat phase files to reorganise | Mode 2 |
| User says "add a phase" to an existing plan | Mode 1 (additive) |
| User says "split", "organise", "refactor this plan" | Mode 2 |
Creating a new plan:
Restructuring an existing plan:
These show how to map real user input to the correct mode and expected output:
| User says | Mode | Expected output |
|---|---|---|
| "Here's the PRD — create a phased plan" | Mode 1 | .context/plans/plan-<slug>/ with README, phase READMEs, and task files |
| "Break this spec into tasks" | Mode 1 | Plan with tasks scoped to individual files + runnable verification commands |
| "I need a project roadmap for this feature" | Mode 1 | Plan with phases matching delivery milestones, gate criteria per phase |
| "Here is my big planning doc — split it into files" | Mode 2 | Hierarchical directory under docs/refactoring/phases/ |
| "Organise these phase files into a proper structure" | Mode 2 | Phase directories with READMEs, activities grouped, validate-structure.sh exits 0 |
| "Add phase 4 to the existing plan" | Mode 1 (additive) | New phase-04-<slug>/ directory with tasks; existing files untouched |
| "The PRD has auth, ingestion, pipeline, storage, query, viz, multi-tenancy, ops, DX" | Mode 1 (guardrail) | STOP — 9 phases detected; message user with A/B/C options; zero files created |
| Input | Description |
|---|---|
| PRD / spec | A document, inline description, or file path describing what to build |
| Phase count | Optional — infer from scope if not provided |
| Output path | Optional — defaults to .context/plans/ |
.context/plans/
  plan-<slug>/
    README.md                        # root index: goal, all phases, status table
    phases/
      phase-01-<slug>/
        README.md                    # phase overview: goal, gate, tasks summary
        tasks/
          task-P01T01-<slug>.md      # task: goal, file, implementation, verification
          task-P01T02-<slug>.md
      phase-02-<slug>/
        README.md
        tasks/
          task-P02T01-<slug>.md
          task-P02T02-<slug>.md

Each file is self-contained: an agent can work on a single task file without reading the rest of the plan.
Read the PRD in full. Identify:
FIRST: count the natural phases before designing anything.
Each distinct domain, capability area, or labelled section in the PRD counts as one phase unless two sections share a single atomic deliverable that cannot be tested independently. When in doubt, count them separately — err on the side of a higher count to avoid missing the guardrail.
Scan the PRD, count every distinct section or domain, and record the total. If the count is 9 or more, STOP immediately — do not design phases, do not run any scripts, do not create any files. Instead, message the user with the count and 2–3 concrete options:
I've identified N natural phases from the requirements. Before I create any files,
please choose one of these approaches:
A. Split into two plans: plan-<core-slug> (phases 1–5) and plan-<surface-slug> (phases 6–N)
B. Consolidate to 7 phases by merging [phase X] and [phase Y] into one
C. Proceed with all N phases in a single plan
Which would you prefer?

Wait for the user's answer before doing anything else. Do not silently cap at 8 and omit scope. When N ≥ 9, do not proceed with any number of phases until the user has chosen.
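The guardrail above reduces to one comparison that runs before any scaffold script. A minimal sketch, assuming the phases have already been counted; `phase_guardrail` is a hypothetical helper and not one of the skill's scripts:

```sh
# Hypothetical helper: decide whether scaffolding may start.
# $1 = number of natural phases counted from the PRD.
phase_guardrail() {
  count=$1
  if [ "$count" -ge 9 ]; then
    # 9 or more: stop and ask the user to choose option A, B, or C.
    echo "STOP: $count phases counted; ask the user before creating files"
    return 1
  fi
  echo "OK: $count phases; continue designing"
}

# Scaffold only when the guardrail passes.
phase_guardrail 3 && echo "safe to run new-plan.sh"
```

A non-zero return keeps the check usable as a precondition in front of `new-plan.sh`.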
Once the phase count is confirmed to be ≤ 8 (either naturally or after consolidation), continue designing:
Group work into sequential phases where each phase delivers a testable, deployable increment. Each phase must have:
Typical phase progression for a greenfield project:
Adapt freely — fewer phases for small projects, more for large ones.
Each task must be:
Use the identifier format P{phase_number}T{task_number}, both zero-padded.
Example: P02T03 = phase 2, task 3. Use 1-based numbering (01, 02, …) for
consistent alphabetical sorting.
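As an illustration of the identifier format, the sketch below builds zero-padded IDs and shows why padding matters for sorting. `task_id` is a hypothetical helper introduced for this example only:

```sh
# Hypothetical helper: build a task identifier such as P02T03.
# Pass plain integers (2, not 02): printf reads a leading 0 as octal.
task_id() {
  printf 'P%02dT%02d\n' "$1" "$2"
}

task_id 2 3     # P02T03
task_id 1 10    # P01T10

# Unpadded names sort incorrectly: task-P1T10 is listed before task-P1T2.
printf '%s\n' task-P1T2 task-P1T10 | LC_ALL=C sort
```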
MUST use the scaffold scripts — never create directories or files manually.
The scripts stamp the correct stubs that validate-plan.sh expects. Using mkdir
or writing files from scratch will produce structures that fail validation.
MUST NOT modify existing files — all operations are additive only. When
appending phases to an existing plan, run new-phase.sh and new-task.sh for
the new content only. Never edit, rename, or delete files that already exist in
the plan. Existing phase directories, task files, and the root README must remain
byte-for-byte identical after the operation, except that the root README may have
new phase entries appended to its phase listing.
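One way to check the byte-for-byte requirement is to checksum the plan before and after the additive operation. This is a sketch under stated assumptions: `snapshot_plan` and `existing_files_unchanged` are hypothetical helpers, not part of the skill's scripts, and they rely only on the standard `find`, `cksum`, `sort`, and `grep` utilities.

```sh
# Hypothetical helpers for checking the additive-only rule.

# Print "checksum size path" for every file under plan directory $1.
snapshot_plan() {
  (cd "$1" && find . -type f -exec cksum {} + | sort -k 3)
}

# $1 = snapshot file taken before the change, $2 = snapshot file taken after.
# Succeeds only if every line of the old snapshot is still present,
# i.e. no pre-existing file was edited, renamed, or deleted.
existing_files_unchanged() {
  while IFS= read -r line; do
    grep -Fxq -- "$line" "$2" || { echo "changed or missing: $line"; return 1; }
  done < "$1"
}
```

Take one snapshot before running `new-phase.sh`, one after, and compare the two files. The root README will be reported as changed when a phase entry is appended to it, which is the one permitted exception.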
# 1. Create the plan root — MUST run first (skip if plan already exists)
sh scripts/new-plan.sh <plan-slug>
# 2. Create each phase directory — MUST use this for every phase
sh scripts/new-phase.sh <plan-slug> <phase-number> <phase-slug>
# 3. Create each task file inside its phase — MUST use this for every task
sh scripts/new-task.sh <plan-slug> <phase-number> <task-number> <task-slug>

Slugs are lowercase kebab-case summaries of the title. Examples:
- plan-ecommerce-checkout-redesign
- phase-01-workspace-bootstrap
- task-P01T02-root-package-json.md

After scaffolding, fill in the generated stub files following the structure
defined in references/templates/plan.yaml, references/templates/phase.yaml,
and references/templates/task.yaml.
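The slug convention used in the scaffold commands above can be sketched as a small filter. `slugify` is a hypothetical helper for illustration; the scaffold scripts may derive or validate slugs differently.

```sh
# Hypothetical helper: turn a title into a lowercase kebab-case slug.
slugify() {
  printf '%s\n' "$1" \
    | tr '[:upper:]' '[:lower:]' \
    | sed -e 's/[^a-z0-9]\{1,\}/-/g' -e 's/^-//' -e 's/-$//'
}

slugify "Root package.json"     # root-package-json
slugify "Workspace Bootstrap"   # workspace-bootstrap
```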
Do not embed implementation detail in the root README.md; keep it as a navigation index only. Detail belongs in phase and task files.
MUST run before reporting to the user — no exceptions.
sh scripts/validate-plan.sh <plan-slug>

The script checks each file against its schema in references/schemas/:

| File | Schema |
|---|---|
| plan-<slug>/README.md | references/schemas/plan.schema.json |
| phases/phase-NN-<slug>/README.md | references/schemas/phase.schema.json |
| phases/phase-NN-<slug>/tasks/task-*.md | references/schemas/task.schema.json |
If any file fails: fix the violation and re-run validate-plan.sh. Repeat
until exit 0. Do not report completion until exit 0 is confirmed.
Every task file MUST include a verification section with a concrete, runnable shell command. Gates MUST be exit-code-based (exit 0 / non-zero, file exists, URL returns 200). Gates MUST NOT use vague language like "works correctly" or "tests pass" without specifying the exact command.
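To make "exit-code-based" concrete, the sketch below wraps a gate command and reports purely on its exit status. `run_gate` is a hypothetical helper, and the commented commands are placeholders for the three gate shapes named above.

```sh
# Hypothetical runner: execute a gate command and report by exit code only.
run_gate() {
  if "$@"; then
    echo "PASS: $*"
  else
    status=$?
    echo "FAIL (exit $status): $*"
    return "$status"
  fi
}

# The three gate shapes; paths and URL are placeholders.
# run_gate npx tsc --noEmit                                      # exit 0 / non-zero
# run_gate test -f tsconfig.json                                 # file exists
# run_gate curl -sf -o /dev/null http://localhost:3000/health    # fails on HTTP 4xx/5xx
```

Because the wrapper returns the command's own status, it can be chained with `&&` in a task's verification section.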
After writing all files, output a summary:
Created implementation plan at .context/plans/plan-<slug>/
README.md
phases/phase-01-workspace-bootstrap/README.md (N tasks)
phases/phase-01-workspace-bootstrap/tasks/task-P01T01-*.md
...

Use this mode when the source is a monolithic document or flat set of phase files that needs to be reorganised into a navigable hierarchy.
| Source type | Approach | Automation |
|---|---|---|
| Flat .md phase files | Manual workflow (steps 1–7 below) | Run validate-structure.sh after |
| JSON plan definition | Automated | Run generate-structure.sh, then validate |
| Existing structure | Validation only | Run validate-structure.sh |
Load scripts and templates only when needed:
- scripts/validate-structure.sh after completing a manual split or when asked to validate
- scripts/generate-structure.sh when the user provides a JSON plan file
- references/templates/*.yaml only when using automation or customising output
- references/schemas/*.json only when debugging validation errors

When to flatten vs subdivide:
| Signal | Action |
|---|---|
| Item has <3 children | Flatten — merge with sibling or parent |
| All children fit on one README screen (<20 lines) | Flatten |
| Navigation depth would exceed 4 levels | Flatten intermediate level |
| Item has >10 children | Subdivide |
| Children cluster into distinct semantic groups | Subdivide |
| Parallel work streams or different team ownership | Subdivide |
Rule of thumb: 3–7 items per group is the sweet spot.
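The count-based rows of the table can be expressed as a simple decision function. This is only a sketch: `group_action` is a hypothetical helper, and it ignores the non-numeric signals (semantic clusters, team ownership, navigation depth), which still need judgement.

```sh
# Hypothetical helper: suggest an action from the number of children alone.
group_action() {
  children=$1
  if [ "$children" -lt 3 ]; then
    echo "flatten"      # fewer than 3 children: merge with sibling or parent
  elif [ "$children" -gt 10 ]; then
    echo "subdivide"    # more than 10 children: split into groups
  else
    echo "keep"         # 3 to 10 children: leave as is
  fi
}

group_action 2    # flatten
group_action 5    # keep
group_action 12   # subdivide
```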
Naming conventions that scale:
step-1-extract-movement-logic/ # GOOD: survives refactors
step-1-refactor-game-code/ # BAD: vague, becomes meaningless
activity-1-analysis-complete/ # GOOD: outcome-oriented
step-1-initial-setup/ # BAD: "initial" becomes misleading later
step-1-project-bootstrap/ # GOOD: timeless

Before splitting, confirm:
step-1.1, step-1.2 belong together)

Ask yourself:
docs/refactoring/phases/
  phase-{number}-{name}/
    README.md
    activities/                        # OR steps/
      README.md
      activity-{number}-{description}/
        README.md
        activity-{number}.{sub}-*.md

Max depth: 4 levels (phase → activities/steps → group → leaf). Flatten if deeper.
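The depth limit can be checked mechanically. A minimal sketch, assuming the four levels are counted from the phases directory downwards; `too_deep` is a hypothetical helper and is not claimed to be what validate-structure.sh does:

```sh
# Hypothetical check: list anything nested deeper than 4 levels
# below the phases directory given as $1.
# -mindepth is supported by GNU and BSD find but is not in POSIX.
too_deep() {
  find "$1" -mindepth 5 -print
}

# Usage (path taken from the layout above):
# [ -z "$(too_deep docs/refactoring/phases)" ] && echo "depth ok"
```

Empty output means every file sits at depth 4 or shallower; any printed path is a candidate for flattening.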
Never use numeric-only directory names (step-1/, activity-2/) and never use generic names (step-1-stuff/): contributors cannot navigate without opening every file.
Each leaf file must contain: title, description, checklist, acceptance criteria, status.
See references/templates/step-file.yaml for the exact structure.
Every non-leaf directory needs a README explaining its purpose and listing its
children. Minimum 3 lines. See references/templates/phase-readme.yaml,
references/templates/group-readme.yaml, and references/templates/intermediate-readme.yaml.
Format: {type}-{number}-{kebab-description}. Use naming heuristics above.
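The format rule lends itself to a pattern check. The sketch below is an assumption about how the format could be tested, not the validator's actual rule; `valid_name` is a hypothetical helper, and it expects the name without a trailing slash or file extension.

```sh
# Hypothetical check for the {type}-{number}-{kebab-description} format.
# Rejects numeric-only names such as step-1; it cannot judge whether a
# description is meaningful (step-1-stuff still passes).
valid_name() {
  printf '%s\n' "$1" \
    | grep -Eq '^[a-z]+-[0-9]+(\.[0-9]+)?-[a-z0-9]+(-[a-z0-9]+)*$'
}

valid_name step-1-extract-movement-logic && echo "ok"
valid_name step-1 || echo "rejected: no description"
```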
Items with the same prefix (1.x, 2.x) go in the same parent directory.
Never mix groups (no step-2.1 inside step-1-extract/).
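The grouping rule can be checked by comparing the numeric prefix of each item with the number of its parent. A sketch with hypothetical helpers (`group_prefix`, `same_group`) and illustrative file names:

```sh
# Hypothetical helper: extract the group prefix from an item name,
# e.g. "3" from activity-3.1-create-schema.md.
group_prefix() {
  printf '%s\n' "$1" | sed -n 's/^[a-z]*-\([0-9][0-9]*\)\.[0-9].*$/\1/p'
}

# $1 = item name, $2 = number of the parent group.
# Succeeds only when the item's prefix matches the parent.
same_group() {
  [ "$(group_prefix "$1")" = "$2" ]
}

same_group step-1.2-extract-input 1 && echo "ok: 1.x item in group 1"
same_group step-2.1-wire-events 1 || echo "misplaced: 2.x item in group 1"
```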
Update all README links after restructuring. Verify with validate-structure.sh.
Create the new hierarchy first, validate, then remove the old flat files — never delete source before the new structure is confirmed valid.
sh scripts/validate-structure.sh docs/refactoring/phases
# Exit 0 = valid, 1 = invalid

Before marking complete:
README.md
1.1, 1.2 together; not 1.x with 2.x)
sh scripts/validate-structure.sh <phases-dir>
# Exit 0 = valid, 1 = one or more violations (details printed to stdout)

For automation (JSON plan generation), error recovery recipes, and legacy naming conventions see references/mode2-advanced.md.
<!-- BAD: unverifiable -->
### Gate
The API works correctly and all tests pass.
<!-- GOOD: runnable, exit-code-based -->
### Gate
```sh
npm test -- --reporter=tap | tap-parser --ok
curl -sf http://localhost:3000/health | jq -e '.status == "ok"'
```
#### Over-bundled tasks
```markdown
<!-- BAD: one task, five unrelated files, no isolation -->
# P01T03 — Set up project
Implement src/server.ts, src/routes/users.ts, src/db/migrations/001.sql,
package.json, and tsconfig.json.
<!-- GOOD: one task, one file, independently verifiable -->
# P01T03 — Initialise tsconfig.json
File: tsconfig.json
Verification: npx tsc --noEmit && echo "ok"
```
ALWAYS: one task = one independently verifiable unit of work.
<!-- BAD: implementation steps in the navigation index -->
# Plan: URL Shortener
## Phase 1
Install dependencies with `npm install`. Then create src/index.ts with the
following content: ...
<!-- GOOD: navigation index only -->
# Plan: URL Shortener
| Phase | Goal | Status |
|---|---|---|
| [01 — Bootstrap](phases/phase-01-workspace-bootstrap/README.md) | Runnable skeleton | pending |

# BAD: breaks alphabetical sort, inconsistent
task-P1T1-setup.md
task-P1T10-final.md ← sorts before P1T2
# GOOD: consistent alphabetical sort
task-P01T01-setup.md
task-P01T10-final.md

NEVER report completion without running sh scripts/validate-plan.sh <slug> first.
ALWAYS fix every schema violation and re-run until exit 0 before reporting to the user.
WHY: Schema violations caught here prevent downstream agents from parsing task files correctly. A plan that looks complete but fails validation is unusable.
WHY: Plans with 9+ phases become unmanageable. Splitting or consolidating up-front is far cheaper than restructuring after files exist. Silently capping scope hides requirements from the user — a correctness bug, not a style preference.
When the PRD yields 9 or more natural phases, the ONLY correct action is to stop and ask. There is no other valid path.
NEVER silently cap at 8 phases — omitting scope without telling the user is a bug. NEVER silently create 9+ phases — proceeding without the user's choice is a bug. ALWAYS count phases as the very first action in Step 2, before designing anything. ALWAYS stop and message the user the moment the count reaches 9. ALWAYS wait for the user's answer before running any scripts or creating any files.
# BAD: silently limits plan to 8 phases without telling the user
# (agent designs 8 phases, leaves out the 9th domain entirely)
sh scripts/new-phase.sh my-plan 08 ... ← should have asked first
# BAD: creates all 12 phases without asking
sh scripts/new-phase.sh my-plan 12 ... ← should have stopped at count=9
# GOOD: counts first, stops immediately when count ≥ 9, asks before any files
"I've identified 9 phases. Before I create any files, which do you prefer?
A. Split into plan-core (phases 1–5) and plan-surface (phases 6–9)
B. Consolidate to 7 phases by merging ops and DX into one
C. Proceed with all 9 phases in a single plan"

NEVER use mkdir -p to build the plan tree. ALWAYS use new-plan.sh, new-phase.sh,
and new-task.sh: they stamp the correct file stubs and naming conventions that
validate-plan.sh expects.
WHY: Hand-rolled directories miss required sections, use wrong naming conventions, and fail schema validation. The scripts are the single source of truth for the file contract.
WHY: Pre-existing files may be in active use by other agents or humans. Editing them causes conflicts, corrupts in-progress work, and breaks the additive-only contract that makes plans safe to extend incrementally.
NEVER edit, rename, or delete existing phase directories, task files, or the root README's existing phase entries when adding a new phase to a plan. ALWAYS treat all pre-existing files as read-only. The only permitted write to the root README is appending the new phase entry at the end of the phases list.
# BAD: editing an existing task file while adding phase-03
edit phases/phase-02-data-model/tasks/task-P02T01-schema.md ← MUST NOT touch
# GOOD: only new files are written
sh scripts/new-phase.sh my-plan 03 api-layer
sh scripts/new-task.sh my-plan 03 01 user-endpoints
# existing phase-01, phase-02 files are untouched

# BAD: opaque; must open file to understand contents
phases/1/
phases/2/
phases/2/1/
# GOOD: self-documenting
phases/phase-1-codebase-analysis/
phases/phase-2-service-extraction-prep/

# BAD: 2 items don't need their own group directory
phase-3-user-service/
  activities/
    group-a-database/
      activity-3.1-create-schema.md     # only 2 items: flatten!
      activity-3.2-run-migrations.md
    group-b-api/
      activity-3.3-implement-crud.md    # only 1 item: definitely flatten!
# GOOD: flat under activities/ when <3 children
phase-3-user-service/
  activities/
    activity-3.1-create-schema.md
    activity-3.2-run-migrations.md
    activity-3.3-implement-crud.md

NEVER delete the source document before validate-structure.sh exits 0.
ALWAYS treat the source as the ground truth until the new hierarchy is confirmed valid.
# BAD: data loss if hierarchy is invalid
rm -rf docs/old-plan.md
sh scripts/validate-structure.sh docs/refactoring/phases # too late
# GOOD: validate first, delete after
sh scripts/validate-structure.sh docs/refactoring/phases && rm docs/old-plan.md

NEVER apply Mode 2 restructuring to a new PRD, or Mode 1 scaffolding to an existing flat document. ALWAYS check the signal table in "When to use each mode" before deciding which mode applies.
Given: PRD for a URL shortener with REST API, SQLite storage, and a health check.
Scope analysis: 3 natural phases (bootstrap → core logic → API + health).
sh scripts/new-plan.sh url-shortener-service
sh scripts/new-phase.sh url-shortener-service 01 workspace-bootstrap
sh scripts/new-task.sh url-shortener-service 01 01 initialise-npm-package
sh scripts/new-task.sh url-shortener-service 01 02 configure-typescript
sh scripts/new-phase.sh url-shortener-service 02 database-layer
sh scripts/new-task.sh url-shortener-service 02 01 create-sqlite-schema
sh scripts/new-task.sh url-shortener-service 02 02 implement-url-repository
sh scripts/new-phase.sh url-shortener-service 03 http-api
sh scripts/new-task.sh url-shortener-service 03 01 post-shorten-endpoint
sh scripts/new-task.sh url-shortener-service 03 02 get-redirect-endpoint
sh scripts/new-task.sh url-shortener-service 03 03 health-check-endpoint
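sh scripts/validate-plan.sh url-shortener-service

Task task-P03T01-post-shorten-endpoint.md verification:
npm start &
SERVER_PID=$!    # remember the background server's PID
sleep 1          # give the server a moment to start listening
curl -sf -X POST http://localhost:3000/shorten \
  -H 'Content-Type: application/json' \
  -d '{"url":"https://example.com"}' | jq -e '.code | length == 6'
STATUS=$?        # keep the check's exit code; kill would otherwise overwrite it
kill "$SERVER_PID"
exit "$STATUS"

PRD covers auth, ingestion, pipeline, storage, query, viz, multi-tenancy, ops, DX (9 domains).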
Step 2, action 1: Count natural phases = 9. Count ≥ 9 → guardrail triggered.
Correct behaviour: Stop immediately. Zero scripts run. Zero files created. Send this message to the user and wait:
I've counted 9 natural phases from the requirements. I need your input before I create any files:
A. Split into two plans: plan-platform-core (auth, ingestion, pipeline, storage, query) and plan-platform-surface (viz, multi-tenancy, ops, DX)
B. Consolidate to 7 phases: merge storage+query into one phase and ops+DX into one (→ 7 phases total)
C. Proceed with all 9 phases in a single plan
Which would you prefer?
Wrong behaviour — do not do any of these:
new-plan.sh before receiving the user's answer

Source: docs/migration-plan.md with 5 numbered sections, ~35 items.
Scope analysis: 5 phases, 4–9 items each — within the 3–7 sweet spot for most.
docs/refactoring/phases/
  phase-1-codebase-analysis/              # 4 items → flat under activities/
    README.md
    activities/
      activity-1.1-dependency-graph.md
      activity-1.2-bounded-contexts.md
      activity-1.3-shared-libraries.md
      activity-1.4-data-ownership.md
  phase-3-user-service-extraction/        # 9 items → consider grouping
    README.md
    activities/
      group-a-implementation/             # 5 items (3.1–3.5)
        activity-3.1-copy-domain.md
        ...
      group-b-rollout/                    # 4 items (3.6–3.9)
        activity-3.6-deploy-staging.md
        ...

| Topic | Location |
|---|---|
| Mode 2 automation, error recovery, legacy naming | references/mode2-advanced.md |
| Before/after structure transformation example | references/example-transformation.md |
| File format templates and schemas | below |
| Template | Schema | Purpose |
|---|---|---|
| references/templates/plan.yaml | references/schemas/plan.schema.json | Root index structure (Mode 1) |
| references/templates/phase.yaml | references/schemas/phase.schema.json | Phase overview structure (Mode 1) |
| references/templates/task.yaml | references/schemas/task.schema.json | Individual task structure (Mode 1) |
| references/templates/phase-readme.yaml | references/schemas/readme-file.schema.json | Phase directory README (Mode 2) |
| references/templates/group-readme.yaml | references/schemas/readme-file.schema.json | Group/intermediate README (Mode 2) |
| references/templates/intermediate-readme.yaml | references/schemas/readme-file.schema.json | Activities/steps dir README (Mode 2) |
| references/templates/step-file.yaml | references/schemas/step-file.schema.json | Leaf step/activity file (Mode 2) |
See references/example-transformation.md for a before/after structure comparison. See references/mode2-advanced.md for automation, error recovery, and legacy naming guidance.
.context/plans/ content is never deleted; new files are additive.