sharaf/agentic-harness-architect

Design, build, or audit a coding agent, agentic loop, tool-use harness, or autonomous coding system — covering loop architecture, action space, context strategy, observation formatting, evaluation, error handling, prompt engineering, and task decomposition. Use when the user wants to design an agent, build a coding agent, scaffold an agentic system, architect a tool-use loop, review an existing agent harness for improvements, fix context bloat or compaction problems, tune observation formatting or tool output handling, debug agent loop or termination issues, design a system prompt or evaluator prompt for an agent, set up or redesign an agent evaluation pipeline, plan multi-agent orchestration, or specify how an agent should manage context, tools, prompts, evaluation, or recovery (greenfield design or audit mode).

100

1.23x

Quality

100%

Does it follow best practices?

Impact

100%

1.23x

Average score across 4 eval scenarios

Securityby

Passed

No known issues

Phase 7: Error Handling & Recovery

Name: sharaf/agentic-harness-architect
Rating: 100 (1 reviews)
Author: sharaf

Design error handling architecturally, not through prompts alone.

Core components

1. LoopGuard

Monitor for repetitive behavior across three dimensions:

Action signature fingerprints (deduplicate tool+args)
Normalized error pattern matching (cosine similarity on stripped messages)
Semantic similarity via embeddings
Graduated intervention: inject reflection prompt → force different approach → escalate/terminate
Reduces iteration counts from 30+ to ~8

2. Checkpoint strategy by deployment type

Deployment	Strategy
Multi-session/long-running	Stateless restore via progress files
Real-time recovery	Stateful restore (LangGraph, CRIU)
Code generation	Shadow git checkpoint (commit before every modification)

3. Retry vs. pivot decision

Retry when: transient error, actionable error info, quality trending up, within budget (max 3 for infrastructure)
Pivot when: quality flat/declining across 3+ iterations, same semantic error repeats, token consumption escalating without quality gain
Escalate to human when: irreversible side effects, 2-3 failed pivots, ambiguous error, security-sensitive operations

4. Context poisoning prevention

Failed code accumulates in context, causing probability drift toward the failing pattern
On doom loop detection: revert code, flush failed-attempt context, restart with original task + lessons learned only
Preserve error traces during compaction (they serve as implicit negative examples)

5. Multi-agent error isolation

Schema validation at every agent boundary (typed schemas, not natural language)
Circuit breakers between agent clusters using 95th percentile response times
SagaLLM compensation agents for rollback in multi-step workflows

6. Premature completion prevention

Never trust agent self-assessment for completion
Maintain structured feature list with items initially marked failing
QA agent validates against the list before accepting completion
Pre-completion checklist middleware intercepts exit and forces verification