giuseppe-trisciuoglio/developer-kit

Comprehensive developer toolkit providing reusable skills for Java/Spring Boot, TypeScript/NestJS/React/Next.js, Python, PHP, AWS CloudFormation, AI/RAG, DevOps, and more.

plugins/developer-kit-specs/docs/ralph-loop-guide.md

Ralph Loop — Multi-Agent Automation Guide

The Ralph Loop automates the Specification-Driven Development (SDD) implementation cycle across multiple tasks, optionally using a different AI agent for each task. It applies Geoffrey Huntley's "Ralph Wiggum as a Software Engineer" technique: one step per invocation, with all state persisted to disk.

Why Ralph Loop?

The problem: Implementing a 10-task specification in a single Claude Code session exhausts the context window. After 3-4 tasks, the agent loses track of earlier decisions and implementation details.

The solution: The Ralph Loop executes exactly one step per invocation and persists all state to fix_plan.json. Each invocation starts fresh with only the context it needs.

Traditional approach (single session):
  Session 1: TASK-001 → TASK-002 → TASK-003 → [context limit reached]

Ralph Loop approach:
  Invocation 1: choose_task → TASK-001
  Invocation 2: implement TASK-001
  Invocation 3: review TASK-001
  Invocation 4: cleanup TASK-001
  Invocation 5: choose_task → TASK-002
  ... (unlimited, state in fix_plan.json)
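
The one-step-per-invocation idea can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the actual ralph_loop.py: the `STEPS` list, the `run_one_step` helper, and the two-field state are hypothetical and much simpler than the real state file.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical, simplified step order; the real loop has more states.
STEPS = ["choose_task", "implementation", "review", "cleanup"]


def run_one_step(state_file: Path) -> str:
    """Load state, advance exactly one step, persist it, return the new state."""
    if state_file.exists():
        state = json.loads(state_file.read_text())
    else:
        state = {"state": STEPS[0], "invocations": 0}

    # Advance a single step; wrap around so the next task can be chosen.
    index = STEPS.index(state["state"])
    state["state"] = STEPS[(index + 1) % len(STEPS)]
    state["invocations"] += 1

    # Everything the next invocation needs lives on disk, not in memory.
    state_file.write_text(json.dumps(state, indent=2))
    return state["state"]
```

Each call is independent of the previous one: it could run in a new process, or a new agent session, and still continue from where the last call stopped.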

State Machine

init → choose_task → implementation → review → cleanup → sync → update_done → choose_task
                          ↑             │
                          └──── fix ←───┘ (review failed, max 3 retries)

| State | Action | Next State |
| --- | --- | --- |
| init | Load spec, validate prerequisites | choose_task |
| choose_task | Pick next pending task | implementation |
| implementation | Execute task with assigned agent | review |
| review | Run task-review | cleanup (pass) or fix (fail) |
| fix | Apply review feedback | implementation (retry ≤3) |
| cleanup | Run code-cleanup | sync |
| sync | Update Knowledge Graph and context | update_done |
| update_done | Mark task completed, commit | choose_task |
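
The transitions in the table can be expressed as a lookup plus one branch for the review outcome. The sketch below is only an illustration of the table; the names `NEXT_STATE` and `next_state` are hypothetical and are not taken from ralph_loop.py.

```python
# Unconditional transitions, copied from the state table.
NEXT_STATE = {
    "init": "choose_task",
    "choose_task": "implementation",
    "implementation": "review",
    "fix": "implementation",
    "cleanup": "sync",
    "sync": "update_done",
    "update_done": "choose_task",
}


def next_state(state: str, review_passed: bool = True) -> str:
    """Return the state that follows `state`.

    `review` is the only state with two outgoing edges, so it is the
    only one that looks at `review_passed`.
    """
    if state == "review":
        return "cleanup" if review_passed else "fix"
    return NEXT_STATE[state]
```

Keeping the unconditional edges in a dictionary makes it easy to compare the code with the table row by row.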

Getting Started

Initialize

python3 plugins/developer-kit-specs/skills/ralph-loop/scripts/ralph_loop.py \
  --action=start \
  --spec=docs/specs/001-user-auth/

This creates docs/specs/001-user-auth/_ralph_loop/fix_plan.json with initial state.

Options:

# Process only a specific range of tasks
python3 plugins/developer-kit-specs/skills/ralph-loop/scripts/ralph_loop.py \
  --action=start \
  --spec=docs/specs/001-user-auth/ \
  --from-task=TASK-003 \
  --to-task=TASK-007

# Specify default agent
python3 plugins/developer-kit-specs/skills/ralph-loop/scripts/ralph_loop.py \
  --action=start \
  --spec=docs/specs/001-user-auth/ \
  --agent=codex

# Skip git commits (for testing)
python3 plugins/developer-kit-specs/skills/ralph-loop/scripts/ralph_loop.py \
  --action=start \
  --spec=docs/specs/001-user-auth/ \
  --no-commit

Run the Loop

python3 plugins/developer-kit-specs/skills/ralph-loop/scripts/ralph_loop.py \
  --action=loop \
  --spec=docs/specs/001-user-auth/

Each invocation:

  1. Reads fix_plan.json to determine current state
  2. Executes exactly one step
  3. Updates fix_plan.json with new state
  4. Prints a command for the user to execute next

Example output:

[ralph-loop] State: choose_task
[ralph-loop] Selected: TASK-003 (Implement JWT token service)
[ralph-loop] Agent: claude
[ralph-loop] Next: Execute the following command, then run loop again:

claude --print "/specs:task-implementation --lang=spring --task=docs/specs/001-user-auth/tasks/TASK-003.md"

After executing the shown command, run the loop again:

python3 plugins/developer-kit-specs/skills/ralph-loop/scripts/ralph_loop.py \
  --action=loop \
  --spec=docs/specs/001-user-auth/

Check Status

python3 plugins/developer-kit-specs/skills/ralph-loop/scripts/ralph_loop.py \
  --action=status \
  --spec=docs/specs/001-user-auth/

Example output:

[ralph-loop] Status for docs/specs/001-user-auth/
[ralph-loop] Current state: review
[ralph-loop] Current task: TASK-003
[ralph-loop] Retries: 0/3
[ralph-loop] Progress: 2/8 tasks completed
[ralph-loop] Completed: TASK-001 ✓, TASK-002 ✓, TASK-003 (in review)
[ralph-loop] Remaining: TASK-004, TASK-005, TASK-006, TASK-007, TASK-008

Multi-Agent Support

The Ralph Loop can dispatch different tasks to different AI agents. This is useful when:

  • Some tasks need deep reasoning (use Claude)
  • Some tasks are boilerplate (use Codex or Copilot)
  • You want to compare agent outputs

Per-Task Agent Assignment

Set the agent field in task frontmatter:

---
id: TASK-003
title: Implement JWT token service
agent: claude       # Use Claude for complex security logic
---

---
id: TASK-004
title: Create REST DTOs
agent: codex        # Use Codex for straightforward DTO generation
---

---
id: TASK-005
title: Write unit tests
agent: copilot      # Use Copilot for test generation
---

Supported Agents

| Agent | CLI | Best For |
| --- | --- | --- |
| claude | Claude Code | Complex logic, security, architecture |
| codex | Codex CLI | Code generation, boilerplate, straightforward tasks |
| copilot | GitHub Copilot CLI | Test generation, code completion |
| gemini | Gemini CLI | Large-context analysis, documentation |
| glm4 | GLM-4 CLI | General-purpose coding |
| kimi | Kimi CLI | Long-context reasoning |
| minimax | MiniMax CLI | General-purpose coding |

Default Agent

If no agent is specified in task frontmatter, the default is used:

# Set default agent at initialization
python3 plugins/developer-kit-specs/skills/ralph-loop/scripts/ralph_loop.py \
  --action=start \
  --spec=docs/specs/001-user-auth/ \
  --agent=codex
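
The precedence described here (task frontmatter first, then the default) might look like the following sketch. It assumes the frontmatter is flat `key: value` text between two `---` lines, as in the examples above; `read_frontmatter` and `resolve_agent` are hypothetical helpers, not functions from the plugin.

```python
def read_frontmatter(text: str) -> dict:
    """Parse flat `key: value` frontmatter between two `---` lines."""
    lines = text.strip().splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    fields = {}
    for line in lines[1:]:
        if line.strip() == "---":
            break
        key, sep, value = line.partition(":")
        if sep:
            # Drop a trailing `# comment` and surrounding whitespace.
            fields[key.strip()] = value.split("#", 1)[0].strip()
    return fields


def resolve_agent(task_text: str, default_agent: str) -> str:
    """Prefer the task's own `agent` field, else the loop-wide default."""
    return read_frontmatter(task_text).get("agent") or default_agent
```

For example, a task file whose frontmatter contains `agent: claude` resolves to `claude` even when the loop was started with `--agent=codex`.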

Real-World Scenario: Spring Boot Auth System

Here's a complete walkthrough for a 6-task specification:

# 1. Initialize with task range
python3 plugins/developer-kit-specs/skills/ralph-loop/scripts/ralph_loop.py \
  --action=start \
  --spec=docs/specs/001-user-auth/ \
  --from-task=TASK-001 \
  --to-task=TASK-006

# Output:
# [ralph-loop] Initialized fix_plan.json
# [ralph-loop] Tasks: TASK-001 through TASK-006
# [ralph-loop] Default agent: claude

# 2. Run loop (iteration 1: choose_task)
python3 .../ralph_loop.py --action=loop --spec=docs/specs/001-user-auth/
# → Selects TASK-001 (Create User entity)
# → Shows command: /specs:task-implementation --lang=spring --task=...TASK-001.md

# 3. Execute the shown command (manually or via script)
# ... implement TASK-001 ...

# 4. Run loop (iteration 2: review)
python3 .../ralph_loop.py --action=loop --spec=docs/specs/001-user-auth/
# → Reviews TASK-001
# → If PASSED: proceeds to cleanup
# → Shows command: /developer-kit-specs:specs-code-cleanup --lang=spring --task=...TASK-001.md

# 5. Execute cleanup
# ... cleanup TASK-001 ...

# 6. Run loop (iteration 3: sync + choose next)
python3 .../ralph_loop.py --action=loop --spec=docs/specs/001-user-auth/
# → Syncs Knowledge Graph
# → Marks TASK-001 completed
# → Commits changes
# → Selects TASK-002

# ... continue for remaining tasks ...

Review Failure Handling

When a task review fails, the Ralph Loop enters the fix state:

implementation → review (FAILED) → fix → implementation (retry) → review → ...

  • Max retries: 3 per task
  • On retry: The loop provides review feedback to the next implementation attempt
  • After 3 failures: The loop pauses and asks for manual intervention
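
The retry bookkeeping can be sketched as a counter keyed by task id. This is an assumption-laden illustration: the `"paused"` return value and the `record_review_failure` helper are hypothetical, and the sketch treats the third failed review as the point where the loop stops.

```python
MAX_RETRIES = 3  # per task, as stated above


def record_review_failure(plan: dict, task_id: str) -> str:
    """Count one failed review and return the next state.

    Returns "fix" while retries remain, or the hypothetical marker
    "paused" once the task has failed MAX_RETRIES times.
    """
    retries = plan.setdefault("retries", {})
    retries[task_id] = retries.get(task_id, 0) + 1
    if retries[task_id] >= MAX_RETRIES:
        return "paused"  # wait for manual intervention
    return "fix"
```

Storing the counter per task means a failure in one task does not consume the retry budget of the next one.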

State File Reference

The fix_plan.json file stores all loop state:

{
  "spec_path": "docs/specs/001-user-auth/",
  "state": "choose_task",
  "current_task": null,
  "task_range": {
    "from": "TASK-001",
    "to": "TASK-006"
  },
  "completed_tasks": ["TASK-001", "TASK-002"],
  "failed_tasks": [],
  "retries": {
    "TASK-003": 2
  },
  "default_agent": "claude",
  "no_commit": false,
  "started_at": "2026-04-10T10:00:00Z",
  "last_updated": "2026-04-10T11:30:00Z"
}

Important: Do not edit fix_plan.json manually. The Python script manages all state transitions.
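
Reading the file is safe even though editing it is not. The sketch below builds a one-line summary from the fields shown in the example; `summarize` is a hypothetical helper, and the field names are taken from the JSON above rather than from a published schema.

```python
import json

# A trimmed copy of the example state file shown above.
EXAMPLE = """
{
  "state": "choose_task",
  "current_task": null,
  "completed_tasks": ["TASK-001", "TASK-002"],
  "failed_tasks": [],
  "retries": {"TASK-003": 2}
}
"""


def summarize(plan: dict) -> str:
    """Build a short, read-only summary of the loop state."""
    current = plan["current_task"] or "none"
    retry_note = ", ".join(
        f"{task}: {count}" for task, count in sorted(plan["retries"].items())
    ) or "none"
    return (
        f"state={plan['state']} current={current} "
        f"completed={len(plan['completed_tasks'])} "
        f"failed={len(plan['failed_tasks'])} retries=[{retry_note}]"
    )


summary = summarize(json.loads(EXAMPLE))
```

For day-to-day use, `--action=status` remains the supported way to inspect the loop; a script like this is only useful for custom dashboards or CI checks.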

Fully Automated Orchestration with agents_loop.py

The manual Ralph Loop requires you to run ralph_loop.py and execute each command yourself. The agents_loop.py script in scripts/ fully automates this cycle: it calls ralph_loop.py to get the next command, executes it with the chosen AI agent, advances the state, and repeats until all tasks are done.

Manual Ralph Loop:
  You: ralph_loop.py --action=loop  →  see command
  You: execute command manually
  You: ralph_loop.py --action=next
  ...repeat...

Automated agents_loop.py:
  Script: ralph_loop.py → get command → execute with agent → advance state → repeat
  You: sit back and monitor
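
The automated cycle is, at its core, a bounded loop around two operations: ask for the next command, then run it. The sketch below shows that shape with injected callables so it stays independent of any real CLI; `drive`, `get_next_command`, and `execute` are hypothetical names, and the defaults mirror the documented defaults of `--max-iterations` and `--delay`.

```python
import time
from typing import Callable, Optional


def drive(
    get_next_command: Callable[[], Optional[str]],
    execute: Callable[[str], None],
    max_iterations: int = 20,
    delay: float = 10.0,
) -> int:
    """Repeat get-command / execute until done or the iteration cap is hit.

    Returns the number of commands that were executed.
    """
    iterations = 0
    while iterations < max_iterations:
        command = get_next_command()
        if command is None:  # nothing left to do
            break
        execute(command)
        iterations += 1
        time.sleep(delay)  # pause between iterations
    return iterations
```

The iteration cap matters because a task that keeps failing review would otherwise keep the loop running indefinitely.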

Basic Usage

# Fully automated with a single agent
python3 scripts/agents_loop.py \
  --spec=docs/specs/001-user-auth/ \
  --agent=claude

# Auto-select the best agent per workflow phase
python3 scripts/agents_loop.py \
  --spec=docs/specs/001-user-auth/ \
  --agent=auto

# Use a specific reviewer agent
python3 scripts/agents_loop.py \
  --spec=docs/specs/001-user-auth/ \
  --agent=claude \
  --reviewer=glm4

Supported Agents

| Agent | CLI | Best For |
| --- | --- | --- |
| claude | Claude Code | Complex logic, security, architecture |
| codex | Codex CLI | Code generation, boilerplate |
| gemini | Gemini CLI | Large-context analysis |
| kimi | Kimi CLI | Long-context reasoning |
| glm4 | GLM-4 CLI | General-purpose coding |
| minimax | MiniMax CLI | General-purpose coding |
| openrouter | OpenRouter CLI | Access to multiple models |
| copilot | GitHub Copilot CLI | Test generation, code review |
| qwen | Qwen Code | Coding tasks with Qwen models |
| auto | Dynamic selection | Best agent per workflow phase |

Key Parameters

| Parameter | Default | Description |
| --- | --- | --- |
| --spec | required | Path to specification folder |
| --agent | codex | AI agent to use (or auto) |
| --delay | 10 | Seconds between iterations |
| --max-iterations | 20 | Safety limit on iterations |
| --fast | false | Skip cleanup and sync steps |
| --verbose | false | Enable debug output with real-time streaming |
| --model | agent default | Model override (e.g. sonnet, opus, gpt-5.4) |
| --kpi-check | true | Enable KPI quality gates after review |
| --kpi-threshold | 7.5 | Quality score threshold (0-10) |
| --max-quality-iterations | 5 | Max fix cycles based on KPI score |
| --reviewer | none | Dedicated agent for review steps |
| --agent-timeout | 1200 | Timeout per agent execution (seconds) |
| --dry-run | false | Print commands without executing |
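
For readers who want to wrap or mirror this interface, the parameter table maps naturally onto an `argparse` declaration. The sketch below is a reconstruction from the table only, not the parser used by agents_loop.py. In particular, `BooleanOptionalAction` (Python 3.9+) also generates a `--no-kpi-check` form, which is a by-product of this sketch and not a documented flag.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Declare the documented parameters with their documented defaults."""
    parser = argparse.ArgumentParser(prog="agents_loop.py")
    parser.add_argument("--spec", required=True, help="Path to specification folder")
    parser.add_argument("--agent", default="codex", help="AI agent to use (or auto)")
    parser.add_argument("--delay", type=int, default=10)
    parser.add_argument("--max-iterations", type=int, default=20)
    parser.add_argument("--fast", action="store_true")
    parser.add_argument("--verbose", action="store_true")
    parser.add_argument("--model", default=None)  # None stands for "agent default"
    # Assumption: an on/off switch that defaults to on.
    parser.add_argument("--kpi-check", action=argparse.BooleanOptionalAction, default=True)
    parser.add_argument("--kpi-threshold", type=float, default=7.5)
    parser.add_argument("--max-quality-iterations", type=int, default=5)
    parser.add_argument("--reviewer", default=None)
    parser.add_argument("--agent-timeout", type=int, default=1200)
    parser.add_argument("--dry-run", action="store_true")
    return parser


args = build_parser().parse_args(
    ["--spec=docs/specs/001-user-auth/", "--kpi-threshold=8.0"]
)
```

Parsing the two arguments above leaves every other option at the default listed in the table.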

Auto Mode (--agent=auto)

When using --agent=auto, the script selects the best agent for each workflow phase:

| Phase | Agent | Rationale |
| --- | --- | --- |
| review | codex | Code review specialist |
| sync | gemini | Powerful context analysis |
| implementation | Rotates: claude, kimi, glm4 | Diversity of approach |
| fix | Rotates: glm4, minimax, openrouter | Alternative perspectives |
| cleanup | Rotates: claude, kimi, codex | General cleanup |
| Other steps | glm4 | Default fallback |
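
The selection rules in the table can be modelled as fixed assignments, rotation pools, and a fallback. The source does not say how the rotation advances, so the sketch below assumes a simple round-robin driven by a caller-supplied counter; `pick_agent` and the three constants are hypothetical names.

```python
# Phases that always use the same agent.
FIXED = {"review": "codex", "sync": "gemini"}

# Phases that rotate through a pool of agents.
ROTATIONS = {
    "implementation": ["claude", "kimi", "glm4"],
    "fix": ["glm4", "minimax", "openrouter"],
    "cleanup": ["claude", "kimi", "codex"],
}

FALLBACK = "glm4"


def pick_agent(phase: str, turn: int = 0) -> str:
    """Choose an agent for `phase`; `turn` counts earlier runs of that phase."""
    if phase in FIXED:
        return FIXED[phase]
    if phase in ROTATIONS:
        pool = ROTATIONS[phase]
        # Assumption: round-robin, wrapping at the end of the pool.
        return pool[turn % len(pool)]
    return FALLBACK
```

With this rule, the second implementation attempt for a task goes to `kimi` and the fourth goes back to `claude`.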

Fast Mode (--fast)

Skip cleanup and sync steps for rapid iteration cycles:

python3 scripts/agents_loop.py \
  --spec=docs/specs/001-user-auth/ \
  --agent=claude \
  --fast

  • Normal flow: review → cleanup → sync → update_done
  • Fast flow: review → update_done

Use fast mode when you want rapid implementation-review cycles and will sync later.
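
The difference between the two flows amounts to filtering two steps out of the normal sequence. A minimal sketch, with the hypothetical names `NORMAL_FLOW`, `SKIPPED_IN_FAST_MODE`, and `flow`:

```python
NORMAL_FLOW = ["review", "cleanup", "sync", "update_done"]
SKIPPED_IN_FAST_MODE = {"cleanup", "sync"}


def flow(fast: bool = False) -> list:
    """Return the post-implementation steps for normal or fast mode."""
    if not fast:
        return list(NORMAL_FLOW)
    return [step for step in NORMAL_FLOW if step not in SKIPPED_IN_FAST_MODE]
```

Because the skipped steps are simply omitted, the Knowledge Graph and context are not updated in fast mode and need a later sync.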

KPI Quality Gates (--kpi-check)

When enabled (default), the script checks quality KPIs after each review step:

  1. Reads TASK-XXX--kpi.json (auto-generated by hooks)
  2. Compares overall_score against threshold (default: 7.5)
  3. If passed: proceeds normally
  4. If failed: forces state to fix for another iteration
  5. After max quality iterations (default: 5): marks task as failed

This enables data-driven iteration — the script keeps fixing until quality meets the threshold.

# Custom quality threshold
python3 scripts/agents_loop.py \
  --spec=docs/specs/001-user-auth/ \
  --agent=claude \
  --kpi-threshold=8.0 \
  --max-quality-iterations=3
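
The gate logic described in the numbered steps can be sketched as a single decision function. Several details are assumptions: the sketch reads `overall_score` as a top-level key, treats a score equal to the threshold as a pass, and counts iterations from 1. `kpi_gate` is a hypothetical name.

```python
def kpi_gate(
    kpi: dict,
    threshold: float = 7.5,
    iteration: int = 1,
    max_iterations: int = 5,
) -> str:
    """Decide what happens after one review, based on the KPI score.

    Returns "pass" to proceed, "fix" to run another fix cycle, or
    "failed" once the quality-iteration budget is used up.
    """
    score = kpi["overall_score"]
    if score >= threshold:
        return "pass"
    if iteration >= max_iterations:
        return "failed"
    return "fix"
```

In practice the `kpi` dictionary would come from parsing the task's KPI file with `json.load`. Raising the threshold while lowering the iteration budget, as in the command above, makes the gate stricter and quicker to give up.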

Reviewer Override (--reviewer)

Use a different agent specifically for review steps, regardless of the main agent:

# Implementation with Claude, review with GLM-4
python3 scripts/agents_loop.py \
  --spec=docs/specs/001-user-auth/ \
  --agent=claude \
  --reviewer=glm4

This also works with --agent=auto, overriding the auto-selection for review phases only.
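
The override rule is small enough to state in code. In this sketch, `agent_for_step` is a hypothetical helper and `main_agent` stands for whatever `--agent` resolved to for the step, including an auto-selected agent.

```python
from typing import Optional


def agent_for_step(step: str, main_agent: str, reviewer: Optional[str] = None) -> str:
    """Use the reviewer for review steps when one is set, else the main agent."""
    if step == "review" and reviewer:
        return reviewer
    return main_agent
```

Only the review step is affected; implementation, fix, cleanup, and sync keep using the main agent.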

Real-World Example

# 1. Initialize the Ralph Loop first
python3 plugins/developer-kit-specs/skills/ralph-loop/scripts/ralph_loop.py \
  --action=start \
  --spec=docs/specs/001-user-auth/ \
  --from-task=TASK-001 \
  --to-task=TASK-006

# 2. Run fully automated with auto mode and KPI checks
python3 scripts/agents_loop.py \
  --spec=docs/specs/001-user-auth/ \
  --agent=auto \
  --kpi-check \
  --verbose

# The script will:
# - Auto-select agents per phase
# - Execute implementation, review, cleanup, sync
# - Check quality KPIs after each review
# - Fix issues if KPIs below threshold
# - Create git checkpoints after each iteration
# - Stop when all tasks are complete or failed

Monitoring and Debugging

# Enable verbose output for real-time agent streaming
python3 scripts/agents_loop.py \
  --spec=docs/specs/001-user-auth/ \
  --agent=claude \
  --verbose

# Dry run to see what would be executed
python3 scripts/agents_loop.py \
  --spec=docs/specs/001-user-auth/ \
  --agent=claude \
  --dry-run

# Logs are saved automatically to .agents_loop_logs/

Log files are written to .agents_loop_logs/<spec-name>/ with timestamps, agent output, and execution metrics.

Graceful Shutdown

Press Ctrl+C to stop the loop. The script then:

  1. Finishes the current agent execution
  2. Prints a summary
  3. Preserves fix_plan.json state

You can resume later by running the same command.
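
A common way to get this behavior in Python is to have the SIGINT handler set a flag that the loop checks between steps, instead of letting the interrupt abort the step in progress. The sketch below illustrates that pattern; it is an assumption about the general technique, not a description of how agents_loop.py implements it, and `StopFlag`, `install`, and `run` are hypothetical names.

```python
import signal


class StopFlag:
    """Remembers that a stop was requested instead of exiting at once."""

    def __init__(self) -> None:
        self.requested = False

    def handler(self, signum, frame) -> None:
        self.requested = True


def install(flag: StopFlag) -> None:
    """Route Ctrl+C (SIGINT) to the flag; call once from the main thread."""
    signal.signal(signal.SIGINT, flag.handler)


def run(steps, execute, flag: StopFlag) -> list:
    """Run steps in order, checking the stop flag only between steps."""
    completed = []
    for step in steps:
        if flag.requested:
            break
        execute(step)  # a step that has started always finishes
        completed.append(step)
    print(f"summary: {len(completed)}/{len(steps)} steps completed")
    return completed
```

A caller would create a `StopFlag`, pass it to `install`, and then call `run`. Because the flag is only consulted between steps, the persisted state always corresponds to a finished step, which is what makes resuming with the same command safe.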

Best Practices

  1. Start with clean git state — Uncommitted changes can cause conflicts
  2. One step per invocation — Never combine implementation + review + sync (manual mode)
  3. Check status between runs — Use --action=status to verify state (manual mode)
  4. Assign agents wisely — Use Claude for complex logic, Codex for boilerplate
  5. Monitor retries — If a task fails 3 times, investigate manually
  6. Commit between tasks — Each task completion triggers a git commit
  7. Use task ranges — Start with a small range to validate the workflow
  8. Use agents_loop.py for automation — Fully automates the manual loop cycle
  9. Use --verbose for debugging — See real-time agent output and execution metrics
  10. Leverage KPI quality gates — Let the script iterate on quality automatically

Troubleshooting

"fix_plan.json not found"

Run --action=start to initialize:

python3 .../ralph_loop.py --action=start --spec=docs/specs/001-user-auth/

"State is wrong"

Resume or reset the loop:

# Resume from current state
python3 .../ralph_loop.py --action=resume --spec=docs/specs/001-user-auth/

# Or reset and start over
rm -rf docs/specs/001-user-auth/_ralph_loop/
python3 .../ralph_loop.py --action=start --spec=docs/specs/001-user-auth/

"Max retries exceeded"

The task has failed review 3 times. Options:

  1. Implement the task manually
  2. Split the task into smaller subtasks: /specs:task-manage --action=split --task=...
  3. Review the task yourself and fix the issues
