
sharaf/agentic-harness-architect

Design, build, or audit a coding agent, agentic loop, tool-use harness, or autonomous coding system — covering loop architecture, action space, context strategy, observation formatting, evaluation, error handling, prompt engineering, and task decomposition. Use when the user wants to design an agent, build a coding agent, scaffold an agentic system, architect a tool-use loop, review an existing agent harness for improvements, fix context bloat or compaction problems, tune observation formatting or tool output handling, debug agent loop or termination issues, design a system prompt or evaluator prompt for an agent, set up or redesign an agent evaluation pipeline, plan multi-agent orchestration, or specify how an agent should manage context, tools, prompts, evaluation, or recovery (greenfield design or audit mode).



references/phase-04-action-space.md

Phase 4: Action Space Design

Design the tool set the agent will use. Tool granularity should match model capability.

Granularity by model tier

| Model tier | Strategy | Example |
|---|---|---|
| Frontier (Opus 4+, GPT-5) | Coarse: bash + file editor | Claude Code minimal scaffold |
| Strong general (Sonnet 4, GPT-4o) | Medium: 5-10 curated tools | SWE-agent ACI |
| Mid-tier (Haiku, GPT-4o-mini) | Fine-grained with guardrails | Structured tools with validation |
| Small/open-source (<70B) | Maximum structure, atomic operations | Strictly typed, no CodeAct |
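As a rough illustration, the tier table can be captured as configuration that the harness consults at start-up. This is a minimal sketch, not a prescribed layout: `ModelTier`, `ToolProfile`, `PROFILES`, every tool name, and the `allow_free_form_code` flag are hypothetical. Only the frontier row (bash + file editor) and the "no CodeAct" rule for small models come directly from the table. Disallowing free-form code for the mid-tier is an assumption.

```python
from dataclasses import dataclass
from enum import Enum


class ModelTier(Enum):
    FRONTIER = "frontier"
    STRONG_GENERAL = "strong_general"
    MID_TIER = "mid_tier"
    SMALL_OPEN = "small_open"


@dataclass(frozen=True)
class ToolProfile:
    granularity: str
    tools: tuple[str, ...]       # hypothetical tool names, for illustration only
    allow_free_form_code: bool   # False means no CodeAct-style arbitrary code actions


PROFILES = {
    # Coarse: bash + file editor
    ModelTier.FRONTIER: ToolProfile("coarse", ("bash", "file_editor"), True),
    # Medium: 5-10 curated tools
    ModelTier.STRONG_GENERAL: ToolProfile(
        "medium",
        ("open_file", "search_dir", "find_file", "edit_file", "run_tests"),
        True,
    ),
    # Fine-grained with guardrails (disallowing free-form code here is an assumption)
    ModelTier.MID_TIER: ToolProfile(
        "fine",
        ("read_file", "list_dir", "search_text", "replace_text", "create_file", "run_tests"),
        False,
    ),
    # Maximum structure, atomic operations, strictly typed, no CodeAct
    ModelTier.SMALL_OPEN: ToolProfile(
        "atomic",
        ("read_file", "replace_text", "create_file", "run_tests"),
        False,
    ),
}
```

Keeping the mapping in data rather than in branching logic makes it straightforward to revisit a profile when a model moves between tiers.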

Tool count ceiling

Keep each agent context to 8-12 tools. Accuracy drops from ~95% with 4 tools to ~71% with 46 tools. Tool definitions must stay under 20% of the context budget. For broader needs, use dynamic tool loading or a sub-agent architecture.
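Both limits can be checked mechanically when the harness assembles a context. The sketch below assumes tool definitions are JSON-serializable dicts and uses a crude four-characters-per-token estimate. `check_tool_budget` and `estimate_tokens` are hypothetical helpers, and a real harness would substitute its model's tokenizer.

```python
import json

MAX_TOOLS = 12                # upper end of the 8-12 range
MAX_DEFINITION_SHARE = 0.20   # definitions must stay under 20% of the context budget


def estimate_tokens(text: str) -> int:
    """Rough token estimate (assumption: about 4 characters per token)."""
    return max(1, len(text) // 4)


def check_tool_budget(tool_definitions: list[dict], context_budget_tokens: int) -> list[str]:
    """Return human-readable problems; an empty list means both limits hold."""
    problems = []
    if len(tool_definitions) > MAX_TOOLS:
        problems.append(
            f"{len(tool_definitions)} tools exceeds the ceiling of {MAX_TOOLS}; "
            "consider dynamic tool loading or sub-agents"
        )
    definition_tokens = sum(estimate_tokens(json.dumps(t)) for t in tool_definitions)
    share = definition_tokens / context_budget_tokens
    if share >= MAX_DEFINITION_SHARE:
        problems.append(
            f"tool definitions use about {share:.0%} of the context budget; "
            f"the limit is under {MAX_DEFINITION_SHARE:.0%}"
        )
    return problems
```

Running this at configuration time, rather than per request, keeps the check out of the agent loop's hot path.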

File editing strategy

| Condition | Strategy |
|---|---|
| File < 200 lines | str_replace with exact unique match |
| File > 200 lines, change < 20 lines | str_replace with expanded context |
| Change > 20 lines, AST available | Semantic editing via fully qualified name (FQN) or AST node |
| Change > 20 lines, no AST | Whole-file replacement with syntax/lint validation |
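The table above can be read as a first-match decision rule, and the "exact unique match" requirement as a count check before replacing. The following is a sketch under stated assumptions: `choose_edit_strategy` and `str_replace` are hypothetical helpers, rows are evaluated top to bottom, and the boundary cases the table leaves open (exactly 200 lines, exactly 20 changed lines) fall into the larger bucket.

```python
def choose_edit_strategy(file_lines: int, changed_lines: int, ast_available: bool) -> str:
    """Pick an editing strategy by evaluating the table rows in order."""
    if file_lines < 200:
        return "str_replace_exact"
    if changed_lines < 20:
        return "str_replace_expanded_context"
    return "semantic_edit" if ast_available else "whole_file_replace"


def str_replace(content: str, old: str, new: str) -> str:
    """Replace `old` with `new` only if `old` occurs exactly once in `content`."""
    if not old:
        raise ValueError("old text must be non-empty")
    count = content.count(old)
    if count == 0:
        raise ValueError("old text not found; re-read the file and copy the text exactly")
    if count > 1:
        raise ValueError(
            f"old text matches {count} locations; include more surrounding lines "
            "so the match is unique"
        )
    return content.replace(old, new, 1)
```

For example, `str_replace("a = 1\nb = 2\n", "b = 2", "b = 3")` returns `"a = 1\nb = 3\n"`, while an ambiguous `old` raises an error that tells the model to expand its context, which is the second row of the table in practice.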

Poka-yoke (error-proofing) principles

  • Require absolute filepaths (eliminates working-directory errors)
  • Use content-based addressing, not line numbers (line numbers break between read and edit)
  • Validate parameters with schemas before execution
  • Return actionable errors with expected format, constraints, and correction examples
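One way these four principles might combine in a single pre-execution check for an edit tool is sketched below. `ToolError`, `validate_edit_call`, and the parameter names `path`, `old_text`, and `new_text` are hypothetical, and the path check assumes a POSIX sandbox.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath


@dataclass
class ToolError:
    message: str   # what was wrong
    expected: str  # the format or constraint the parameter must satisfy
    example: str   # a corrected call the model can imitate


EXAMPLE_CALL = (
    '{"path": "/workspace/src/app.py", '
    '"old_text": "retries = 3", "new_text": "retries = 5"}'
)


def validate_edit_call(params: dict) -> list[ToolError]:
    """Validate parameters before execution; an empty list means the call may run."""
    errors: list[ToolError] = []

    # Absolute paths only, so the result never depends on the working directory.
    path = params.get("path")
    if not isinstance(path, str) or not PurePosixPath(path).is_absolute():
        errors.append(ToolError(
            f"'path' must be an absolute filepath, got {path!r}",
            "a string starting with '/'",
            EXAMPLE_CALL,
        ))

    # Content-based addressing: reject line numbers outright.
    for key in ("line", "line_number"):
        if key in params:
            errors.append(ToolError(
                f"'{key}' is not accepted; line numbers can shift between read and edit",
                "locate the edit with 'old_text' instead",
                EXAMPLE_CALL,
            ))

    # Minimal type checks standing in for a full schema.
    for key in ("old_text", "new_text"):
        if not isinstance(params.get(key), str):
            errors.append(ToolError(f"'{key}' is required", "a string", EXAMPLE_CALL))
    if params.get("old_text") == "":
        errors.append(ToolError("'old_text' must not be empty", "a non-empty string", EXAMPLE_CALL))

    return errors
```

Returning every problem at once, each with its expected format and a corrected example, lets the model repair the call in one retry instead of discovering errors one at a time.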

Sandboxing strategy

| Context | Strategy | Trade-off |
|---|---|---|
| Trusted code, internal team | Hardened Docker + seccomp + AppArmor | Low overhead |
| Untrusted code, single-tenant | gVisor containers | 10-30% I/O overhead, strong isolation |
| Multi-tenant, untrusted | Firecracker microVMs | ~125ms boot, <5 MiB/VM, hardware isolation |
| Regulated, zero-trust | Kata Containers on Kubernetes | ~200ms boot |

Read-write separation

Separate observation tools (read-only) from mutation tools (write). This separation enables an observation-only mode, graduated permission escalation under an ask/allow/deny model, and audit trails.
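A minimal sketch of how the read/write split could drive an ask/allow/deny decision follows. The tool names, `PermissionPolicy`, and the rule that unknown tools are denied are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"
    ASK = "ask"
    DENY = "deny"


# Hypothetical tool names, split by whether they can change state.
READ_ONLY_TOOLS = frozenset({"read_file", "search", "list_dir"})
MUTATION_TOOLS = frozenset({"edit_file", "run_bash"})


@dataclass
class PermissionPolicy:
    observation_only: bool = False
    approved: set[str] = field(default_factory=set)               # mutation tools the user has approved
    audit_log: list[tuple[str, str]] = field(default_factory=list)

    def decide(self, tool: str) -> Decision:
        if tool in READ_ONLY_TOOLS:
            decision = Decision.ALLOW        # observation is always permitted
        elif tool not in MUTATION_TOOLS or self.observation_only:
            decision = Decision.DENY         # unknown tool, or writes are disabled
        elif tool in self.approved:
            decision = Decision.ALLOW        # previously escalated
        else:
            decision = Decision.ASK          # escalate to the user
        self.audit_log.append((tool, decision.value))  # every decision is recorded
        return decision

    def approve(self, tool: str) -> None:
        """Record a user approval so later calls to this mutation tool are allowed."""
        if tool not in MUTATION_TOOLS:
            raise ValueError(f"{tool!r} is not a mutation tool")
        self.approved.add(tool)
```

With a default policy, `decide("edit_file")` returns `Decision.ASK` until `approve("edit_file")` is called, and with `observation_only=True` the same call returns `Decision.DENY`.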
