
sharaf/agentic-harness-architect

Design, build, or audit a coding agent, agentic loop, tool-use harness, or autonomous coding system — covering loop architecture, action space, context strategy, observation formatting, evaluation, error handling, prompt engineering, and task decomposition. Use when the user wants to design an agent, build a coding agent, scaffold an agentic system, architect a tool-use loop, review an existing agent harness for improvements, fix context bloat or compaction problems, tune observation formatting or tool output handling, debug agent loop or termination issues, design a system prompt or evaluator prompt for an agent, set up or redesign an agent evaluation pipeline, plan multi-agent orchestration, or specify how an agent should manage context, tools, prompts, evaluation, or recovery (greenfield design or audit mode).



Phase 1: Requirements Analysis

Gather or extract these inputs before making any architecture decisions. Every downstream phase builds on these answers, so skipping this phase produces a design that fails to fit its constraints.

Greenfield inputs

  • Task profile: What will agents do? (code generation, bug fixing, research, multi-file features, full-stack builds)
  • Quality target: Acceptable failure rate? Is "almost right" acceptable or must output be production-ready?
  • Duration profile: Seconds, minutes, hours, or multi-session?
  • Cost constraints: Per-task budget? Total monthly budget?
  • Model access: Which models are available? Frontier only, or tiered?
  • Verification infrastructure: Tests, linters, type checkers, CI pipelines available?
  • Security requirements: Sandboxing needs? Multi-tenant? Untrusted code execution?
  • Human-in-the-loop: Fully autonomous, approval gates, or pair programming?
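
One way to keep these answers usable by later phases is to record them in a single structured object and check which inputs are still unanswered. The sketch below is an illustration only, not part of the skill: the `Requirements` class, its field names, and the enum values are all hypothetical, chosen to mirror the bullets above.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional


class Duration(Enum):
    SECONDS = "seconds"
    MINUTES = "minutes"
    HOURS = "hours"
    MULTI_SESSION = "multi-session"


class Oversight(Enum):
    AUTONOMOUS = "fully autonomous"
    APPROVAL_GATES = "approval gates"
    PAIR_PROGRAMMING = "pair programming"


@dataclass
class Requirements:
    """Hypothetical record of the Phase 1 greenfield inputs."""
    task_profile: list[str] = field(default_factory=list)   # e.g. ["bug fixing"]
    acceptable_failure_rate: Optional[float] = None          # quality target
    production_ready: Optional[bool] = None                  # or is "almost right" OK?
    duration: Optional[Duration] = None
    per_task_budget_usd: Optional[float] = None
    monthly_budget_usd: Optional[float] = None
    models: list[str] = field(default_factory=list)          # available models
    verification: list[str] = field(default_factory=list)    # tests, linters, CI, ...
    needs_sandbox: Optional[bool] = None
    multi_tenant: Optional[bool] = None
    runs_untrusted_code: Optional[bool] = None
    oversight: Optional[Oversight] = None

    def missing(self) -> list[str]:
        """Return the names of inputs not yet gathered (None or empty list)."""
        return [name for name, value in vars(self).items()
                if value is None or value == []]


# Usage: two inputs answered so far; the rest are reported as missing.
reqs = Requirements(task_profile=["bug fixing"], duration=Duration.MINUTES)
print(reqs.missing())
```

Treating unanswered inputs as explicit `None` values, rather than defaults, makes it harder to start architecture work with a silently assumed constraint.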

Audit-mode additions

When auditing an existing harness, also gather:

  • Current architecture description or codebase
  • Known failure modes and pain points
  • Performance metrics if available (completion rate, token usage, cost per task)
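
If raw run logs exist but aggregate metrics do not, the three metrics named above can be derived from per-task records. This is a minimal sketch under stated assumptions: the `TaskRun` record and its `completed`, `tokens`, and `cost_usd` fields are hypothetical, and a real harness's logs will have a different shape.

```python
from dataclasses import dataclass


@dataclass
class TaskRun:
    """Hypothetical per-task log record; field names are assumptions."""
    completed: bool
    tokens: int
    cost_usd: float


def summarize(runs: list[TaskRun]) -> dict[str, float]:
    """Aggregate completion rate, mean token usage, and mean cost per task."""
    if not runs:
        raise ValueError("no runs to summarize")
    n = len(runs)
    return {
        "completion_rate": sum(r.completed for r in runs) / n,
        "mean_tokens_per_task": sum(r.tokens for r in runs) / n,
        "mean_cost_per_task_usd": sum(r.cost_usd for r in runs) / n,
    }


# Usage with two illustrative (made-up) runs.
summary = summarize([TaskRun(True, 1000, 0.5), TaskRun(False, 3000, 1.5)])
print(summary)
```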

How requirements drive later phases

  • Task profile + quality target → Phase 2 (single vs. multi-agent) and Phase 6 (evaluator architecture)
  • Duration profile → Phase 3 (loop selection) and Phase 5 (context management strategy)
  • Cost constraints → Phase 2 (multi-agent designs can use roughly 15x more tokens) and Phase 3 (iteration caps)
  • Model access → Phase 4 (tool granularity tier) and Phase 9 (decomposition intensity)
  • Verification infrastructure → Phase 6 (deterministic vs. LLM evaluation)
  • Security requirements → Phase 4 (sandboxing strategy)
  • Human-in-the-loop posture → Phase 7 (escalation thresholds)
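
The mapping above can also be written as a lookup table, which helps when a requirement changes mid-design and you need to know which phases to revisit. The phase numbers are transcribed from the list; the key names and the `phases_to_revisit` helper are hypothetical.

```python
# Phase numbers come from the list above; key names are assumptions.
PHASES_BY_INPUT: dict[str, list[int]] = {
    "task_profile": [2, 6],
    "quality_target": [2, 6],
    "duration_profile": [3, 5],
    "cost_constraints": [2, 3],
    "model_access": [4, 9],
    "verification_infrastructure": [6],
    "security_requirements": [4],
    "human_in_the_loop": [7],
}


def phases_to_revisit(changed_inputs: list[str]) -> list[int]:
    """Return the sorted, de-duplicated phases affected by the changed inputs."""
    affected: set[int] = set()
    for name in changed_inputs:
        affected.update(PHASES_BY_INPUT[name])  # KeyError on an unknown input
    return sorted(affected)


# Usage: a tighter budget and a change in available models.
print(phases_to_revisit(["cost_constraints", "model_access"]))  # [2, 3, 4, 9]
```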
