
sharaf/agentic-harness-architect

Design, build, or audit a coding agent, agentic loop, tool-use harness, or autonomous coding system — covering loop architecture, action space, context strategy, observation formatting, evaluation, error handling, prompt engineering, and task decomposition. Use when the user wants to design an agent, build a coding agent, scaffold an agentic system, architect a tool-use loop, review an existing agent harness for improvements, fix context bloat or compaction problems, tune observation formatting or tool output handling, debug agent loop or termination issues, design a system prompt or evaluator prompt for an agent, set up or redesign an agent evaluation pipeline, plan multi-agent orchestration, or specify how an agent should manage context, tools, prompts, evaluation, or recovery (greenfield design or audit mode).

Quality: 100% (Does it follow best practices?)
Impact: 100%, 1.23x (average score across 4 eval scenarios)
Security (by Snyk): Passed, no known issues


Agentic Harness Architect

A Tessl skill for designing, building, or auditing agentic coding harnesses — the surrounding architecture (not the model) that determines an agent's quality ceiling.

What this skill does

Walks through the full architecture of a coding agent across 11 phases (including loop design, action space, context management, evaluation, error handling, prompt engineering, decomposition, and simplification) and produces a structured design document. It runs in one of two modes:

  • greenfield — design a new harness from stated requirements
  • audit — analyze an existing harness and propose targeted improvements

Every decision is grounded in benchmarks from 781 sources across 10 sub-domains (agent loop design, action space design, observation formatting, context window management, multi-agent orchestration, evaluation, prompt engineering, error handling, task decomposition, iterative simplification).

When to use

Trigger phrases that activate the skill:

  • "Design an agent" / "build a coding agent" / "scaffold an agentic system"
  • "Architect a tool-use loop" / "review an agent harness"
  • "Fix context bloat or compaction problems"
  • "Tune observation formatting" / "fix tool output handling"
  • "Debug agent loop or termination issues"
  • "Specify how an agent should manage context, tools, or recovery"

Install

tessl install sharaf/agentic-harness-architect

Or pin a version:

tessl install sharaf/agentic-harness-architect@0.3.6

How it's organized

| Layer | Where | Purpose |
|---|---|---|
| Entry point | SKILL.md | Quick Decision Triage + phase index + output format |
| Phase detail | references/phase-*.md | One reference file per architectural phase (1-10) |
| Guardrails | references/guardrails.md | Must-not-do rules across all phases |
| Decision flowcharts | references/decision-flowcharts.md | Architecture sizing, loop selection, context strategy, evaluation, decomposition, error recovery, simplification |
| Success criteria | references/success-criteria.md | Benchmarks + quality gates the design must hit |

The SKILL.md itself is intentionally short (~1.5K tokens) — progressive disclosure delegates depth to references, so the on-demand context cost stays modest.
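The progressive-disclosure layout described above can be sketched as a loader that always reads the entry point and pulls in reference files only when asked. This is a minimal illustration, not how Tessl or the skill actually loads files: the function `load_skill_context` and its parameters are hypothetical, and only the file names `SKILL.md` and `references/guardrails.md` come from the table.

```python
from pathlib import Path


def load_skill_context(root: Path, wanted: list[str]) -> str:
    """Return the entry point plus only the explicitly requested references.

    Hypothetical helper: `root` is the skill directory and `wanted` holds
    paths relative to it, e.g. "references/guardrails.md".
    """
    # The short entry point is always loaded.
    parts = [(root / "SKILL.md").read_text(encoding="utf-8")]
    # Reference files are read only on demand, which keeps context cost low.
    for rel in wanted:
        parts.append((root / rel).read_text(encoding="utf-8"))
    return "\n\n".join(parts)


# Example call (assumes the skill has been unpacked into ./skill):
# context = load_skill_context(Path("skill"), ["references/guardrails.md"])
```

With an empty `wanted` list, only the entry point is returned, which is the case the short `SKILL.md` is designed for.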

Output format

The skill produces a structured design document. For greenfield mode, sections include Architecture Decision, Loop Design, Action Space, Observation Formatting, Context Management, Evaluation Design, Error Handling, Prompt Architecture, Decomposition Strategy, Complexity Budget, and Key Metrics. For audit mode, sections include Current Architecture Summary, Identified Issues, Improvement Sequence, Components to Remove, and Migration Path.

Each section includes the decision, the rationale citing benchmarks, alternatives considered, and conditions under which to revisit.
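One way to picture the document shape described above is as a list of section records, each carrying the four parts named here. The section titles below are copied from this README. The `Section` class, its field names, and the `missing_sections` helper are hypothetical and are not part of the skill's actual output schema.

```python
from dataclasses import dataclass, field

# Section titles as listed in this README.
GREENFIELD_SECTIONS = [
    "Architecture Decision", "Loop Design", "Action Space",
    "Observation Formatting", "Context Management", "Evaluation Design",
    "Error Handling", "Prompt Architecture", "Decomposition Strategy",
    "Complexity Budget", "Key Metrics",
]
AUDIT_SECTIONS = [
    "Current Architecture Summary", "Identified Issues",
    "Improvement Sequence", "Components to Remove", "Migration Path",
]


@dataclass
class Section:
    """Hypothetical record for one section of the design document."""
    title: str
    decision: str
    rationale: str  # expected to cite benchmarks
    alternatives: list[str] = field(default_factory=list)
    revisit_when: list[str] = field(default_factory=list)


def missing_sections(mode: str, sections: list[Section]) -> list[str]:
    """List the expected section titles that the document does not contain."""
    if mode == "greenfield":
        expected = GREENFIELD_SECTIONS
    elif mode == "audit":
        expected = AUDIT_SECTIONS
    else:
        raise ValueError(f"unknown mode: {mode!r}")
    present = {s.title for s in sections}
    return [title for title in expected if title not in present]
```

A check like `missing_sections("audit", doc)` could serve as a quick completeness gate before a design document is reviewed.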

Benchmarks the skill targets

| Metric | Target |
|---|---|
| Tool definitions | <20% of context budget |
| Context utilization | 40-60% (FIC target) |
| Generator-Critic iterations | 2-3 max |
| Agent count | 2-4 (saturation threshold) |
| Decomposition depth | Max 3 levels |
| Error recovery rate | >70% |
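As a rough sketch, the targets above can be turned into a pass/fail check over measured values. The function name and metric keys are hypothetical. Two readings are assumptions rather than statements from the skill: "2-3 max" and "2-4" are both treated as upper bounds (at most 3 iterations, at most 4 agents).

```python
def check_targets(m: dict) -> dict:
    """Compare measured metrics against the README targets.

    Hypothetical helper; shares and rates are fractions between 0 and 1.
    """
    return {
        # Tool definitions should take under 20% of the context budget.
        "tool_definitions": m["tool_definition_share"] < 0.20,
        # Context utilization should sit in the 40-60% band.
        "context_utilization": 0.40 <= m["context_utilization"] <= 0.60,
        # Assumption: "2-3 max" is read as an upper bound of 3.
        "generator_critic_iterations": m["generator_critic_iterations"] <= 3,
        # Assumption: the saturation threshold is read as at most 4 agents.
        "agent_count": m["agent_count"] <= 4,
        "decomposition_depth": m["decomposition_depth"] <= 3,
        "error_recovery_rate": m["error_recovery_rate"] > 0.70,
    }


# Illustrative values only, not measurements from any real harness.
sample = {
    "tool_definition_share": 0.12,
    "context_utilization": 0.65,
    "generator_critic_iterations": 3,
    "agent_count": 2,
    "decomposition_depth": 2,
    "error_recovery_rate": 0.80,
}
failed = [name for name, ok in check_targets(sample).items() if not ok]
print(failed)
```

For the sample values, only context utilization falls outside its band, so it is the one entry reported as failed.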

Eval results

The latest published registry run tested claude:claude-sonnet-4-6 across 4 scenarios:

| Metric | Result |
|---|---|
| Baseline average | 81% |
| With-context average | 99% |
| Uplift | 1.22x |

The current bundle passes local Tessl lint and skill review at 100%.
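The uplift figure is consistent with dividing the with-context average by the baseline average. The snippet below is a small worked check that assumes this is how the ratio is computed. The published averages are rounded, so a ratio computed from unrounded scores could differ slightly.

```python
baseline = 0.81      # baseline average from the table
with_context = 0.99  # with-context average from the table

# Assumed definition: uplift is the ratio of the two averages.
uplift = with_context / baseline
print(f"{uplift:.2f}x")  # prints 1.22x
```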
