
sharaf/agentic-harness-architect

Design, build, or audit a coding agent, agentic loop, tool-use harness, or autonomous coding system — covering loop architecture, action space, context strategy, observation formatting, evaluation, error handling, prompt engineering, and task decomposition. Use when the user wants to design an agent, build a coding agent, scaffold an agentic system, architect a tool-use loop, review an existing agent harness for improvements, fix context bloat or compaction problems, tune observation formatting or tool output handling, debug agent loop or termination issues, design a system prompt or evaluator prompt for an agent, set up or redesign an agent evaluation pipeline, plan multi-agent orchestration, or specify how an agent should manage context, tools, prompts, evaluation, or recovery (greenfield design or audit mode).


references/phase-03-loop-design.md

Phase 3: Loop Design

Select the execution loop pattern based on task profile and duration. Layer multiple termination mechanisms.

Loop architecture selection

| Pattern | Best for | Duration | Cost | Key mechanism |
| --- | --- | --- | --- | --- |
| ReAct | Simple tasks, 1-15 tool calls | Minutes | Low | Thought-Action-Observation cycle |
| Generator-Critic | Quality-critical output | Minutes-hours | Medium | Separate evaluator with rubrics; cap at 3 iterations |
| Ralph Loop | Mechanical/verifiable tasks (refactoring, migration) | Hours | $5-150 | Outer verification wrapper; fresh context per iteration |
| Magentic-One Dual-Loop | Complex/unpredictable multi-tool tasks | Hours | Medium-High | Task Ledger + Progress Ledger; strategic replanning |
| Orchestrator-Worker | Parallelizable subtasks | Hours | High (15x) | Fan-out with isolated contexts; Opus orchestrator, Sonnet workers |
| Build-Verify-Fix | Any coding task with test infrastructure | Minutes-hours | Medium | Planning-Build-Verify-Fix phases with middleware interception |
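As a rough illustration, the "Best for" column can be turned into a small selection helper. This is a sketch only: the `TaskProfile` fields and the order in which the checks run are assumptions made for the example, since the table does not rank the patterns against each other.

```python
from dataclasses import dataclass


@dataclass
class TaskProfile:
    """Hypothetical task descriptor; the field names are illustrative."""
    parallelizable: bool = False
    unpredictable: bool = False
    mechanically_verifiable: bool = False
    has_test_infrastructure: bool = False
    quality_critical: bool = False


def select_loop_pattern(task: TaskProfile) -> str:
    """Return a pattern name from the table for the given profile.

    The precedence of the checks below is an assumption, not a rule
    stated in the table.
    """
    if task.parallelizable:
        return "Orchestrator-Worker"
    if task.unpredictable:
        return "Magentic-One Dual-Loop"
    if task.mechanically_verifiable:
        return "Ralph Loop"
    if task.has_test_infrastructure:
        return "Build-Verify-Fix"
    if task.quality_critical:
        return "Generator-Critic"
    # Simple tasks with no special requirements fall through to ReAct.
    return "ReAct"
```

For example, `select_loop_pattern(TaskProfile(mechanically_verifiable=True))` returns `"Ralph Loop"`. A real selector would likely also weigh the Duration and Cost columns.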

Termination — layer all applicable mechanisms

  1. Hard iteration cap (always): 15-25 iterations for the inner loop, 5-50 for a Ralph Loop
  2. Wall-clock timeout (always): ~300 seconds per step
  3. Objective verification: tests, lint, and type-check for coding tasks
  4. Quality threshold: stop when the score is stable across 2 consecutive iterations
  5. Convergence detection: stop when consecutive iterations are more than 85% semantically similar
  6. Loop fingerprinting: 3 identical (tool, result) hashes mean the agent is stuck; change strategy
  7. Cost budget (production deployments): stop when the spend limit is reached
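A minimal sketch of how several of these mechanisms might be layered in one guard object follows. It covers the iteration cap, per-step timeout, loop fingerprinting, and cost budget. Verification, quality threshold, and convergence detection are left out because they depend on an evaluator or embedding model. The `TerminationGuard` class, its field names, and the returned reason strings are hypothetical, not taken from the skill.

```python
import hashlib
from collections import deque
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class TerminationGuard:
    """Layered stop checks; defaults follow the ranges in the list above."""
    max_iterations: int = 20                  # hard cap (15-25 for an inner loop)
    step_timeout_s: float = 300.0             # wall-clock limit per step
    stuck_repeats: int = 3                    # identical (tool, result) hashes
    cost_budget_usd: Optional[float] = None   # optional spend limit
    iterations: int = 0
    spent_usd: float = 0.0
    _recent: deque = field(init=False, repr=False)

    def __post_init__(self) -> None:
        # Keep only the last `stuck_repeats` fingerprints.
        self._recent = deque(maxlen=self.stuck_repeats)

    def record_step(self, tool: str, result: str, elapsed_s: float,
                    cost_usd: float = 0.0) -> Optional[str]:
        """Record one loop step; return a stop reason, or None to continue."""
        self.iterations += 1
        self.spent_usd += cost_usd
        digest = hashlib.sha256(f"{tool}\x00{result}".encode()).hexdigest()
        self._recent.append(digest)

        # Note: this only detects an overrun after the step has finished;
        # enforcing the timeout would need cancellation in the caller.
        if elapsed_s > self.step_timeout_s:
            return "step_timeout"
        if (self.cost_budget_usd is not None
                and self.spent_usd >= self.cost_budget_usd):
            return "cost_budget"
        if (len(self._recent) == self.stuck_repeats
                and len(set(self._recent)) == 1):
            return "stuck"  # caller should change strategy, not just retry
        if self.iterations >= self.max_iterations:
            return "iteration_cap"
        return None
```

The loop would call `record_step` after every tool call and branch on the result: `"stuck"` signals a strategy change, while the other reasons end the run.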

Reasoning budget allocation (Reasoning Sandwich)

  • xhigh reasoning for planning and verification phases
  • high reasoning during implementation
  • Uniform xhigh reasoning causes timeouts and lower scores (53.9%, vs. 66.5% with the sandwich allocation)
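The allocation can be expressed as a simple phase-to-effort lookup. This is a sketch under assumptions: the phase keys and the `reasoning_effort` helper are illustrative, and how the returned label is passed to a model API is left to the harness.

```python
# Reasoning Sandwich: heavier reasoning at the edges, lighter in the middle.
REASONING_BY_PHASE = {
    "planning": "xhigh",
    "implementation": "high",
    "verification": "xhigh",
}


def reasoning_effort(phase: str) -> str:
    """Return the reasoning level for a phase (hypothetical helper)."""
    try:
        return REASONING_BY_PHASE[phase]
    except KeyError:
        raise ValueError(f"unknown phase: {phase!r}") from None
```

Failing loudly on an unknown phase avoids silently falling back to uniform xhigh, which is the configuration the figures above warn against.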
