sharaf/agentic-harness-architect

Design, build, or audit a coding agent, agentic loop, tool-use harness, or autonomous coding system — covering loop architecture, action space, context strategy, observation formatting, evaluation, error handling, prompt engineering, and task decomposition. Use when the user wants to design an agent, build a coding agent, scaffold an agentic system, architect a tool-use loop, review an existing agent harness for improvements, fix context bloat or compaction problems, tune observation formatting or tool output handling, debug agent loop or termination issues, design a system prompt or evaluator prompt for an agent, set up or redesign an agent evaluation pipeline, plan multi-agent orchestration, or specify how an agent should manage context, tools, prompts, evaluation, or recovery (greenfield design or audit mode).

Score: 100 (1.23x)

Quality: 100% (Does it follow best practices?)
Impact: 100%, 1.23x (average score across 4 eval scenarios)
Security: Passed (no known issues)

Agentic Harness Architect

Name: sharaf/agentic-harness-architect
Rating: 100 (1 review)
Author: sharaf

A Tessl skill for designing, building, or auditing agentic coding harnesses — the surrounding architecture (not the model) that determines an agent's quality ceiling.

What this skill does

Walks through the full architecture of a coding agent across 11 phases — loop design, action space, context management, evaluation, error handling, prompt engineering, decomposition, and simplification — and produces a structured design document. Two modes:

greenfield — design a new harness from stated requirements
audit — analyze an existing harness and propose targeted improvements

Every decision is grounded in benchmarks from 781 sources across 10 sub-domains (agent loop design, action space design, observation formatting, context window management, multi-agent orchestration, evaluation, prompt engineering, error handling, task decomposition, iterative simplification).

When to use

Trigger phrases that activate the skill:

"Design an agent" / "build a coding agent" / "scaffold an agentic system"
"Architect a tool-use loop" / "review an agent harness"
"Fix context bloat or compaction problems"
"Tune observation formatting" / "fix tool output handling"
"Debug agent loop or termination issues"
"Specify how an agent should manage context, tools, or recovery"

Install

tessl install sharaf/agentic-harness-architect

Or pin a version:

tessl install sharaf/agentic-harness-architect@0.3.6

How it's organized

| Layer | Where | Purpose |
| --- | --- | --- |
| Entry point | `SKILL.md` | Quick Decision Triage + phase index + output format |
| Phase detail | `references/phase-*.md` | One reference file per architectural phase (1-10) |
| Guardrails | `references/guardrails.md` | Must-not-do rules across all phases |
| Decision flowcharts | `references/decision-flowcharts.md` | Architecture sizing, loop selection, context strategy, evaluation, decomposition, error recovery, simplification |
| Success criteria | `references/success-criteria.md` | Benchmarks + quality gates the design must hit |

The SKILL.md itself is intentionally short (~1.5K tokens) — progressive disclosure delegates depth to references, so the on-demand context cost stays modest.
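
The progressive-disclosure layout described here could be consumed lazily along the following lines. This is a minimal sketch, not the skill's actual loader: the function names `load_entry_point`, `list_phase_references`, and `load_reference` are hypothetical, and only the `SKILL.md` and `references/phase-*.md` paths come from the table.

```python
from pathlib import Path


def load_entry_point(bundle_root: Path) -> str:
    """Read only SKILL.md, the short entry point that is always loaded."""
    return (bundle_root / "SKILL.md").read_text(encoding="utf-8")


def list_phase_references(bundle_root: Path) -> list[Path]:
    """List phase reference files without reading them (sorted by path)."""
    return sorted((bundle_root / "references").glob("phase-*.md"))


def load_reference(path: Path) -> str:
    """Read one reference file on demand, when a phase needs the detail."""
    return path.read_text(encoding="utf-8")
```

Listing the reference files without reading them keeps the up-front cost to the entry point alone; each phase file is paid for only when it is opened.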

Output format

The skill produces a structured design document. For greenfield mode, sections include Architecture Decision, Loop Design, Action Space, Observation Formatting, Context Management, Evaluation Design, Error Handling, Prompt Architecture, Decomposition Strategy, Complexity Budget, and Key Metrics. For audit mode, sections include Current Architecture Summary, Identified Issues, Improvement Sequence, Components to Remove, and Migration Path.

Each section includes the decision, the rationale citing benchmarks, alternatives considered, and conditions under which to revisit.
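
One way to picture the per-section structure is as a small record with the four parts named above. This is an illustrative sketch only: the `DesignSection` class, its field names, and the example values are assumptions, not the skill's real output schema.

```python
from dataclasses import dataclass, field


@dataclass
class DesignSection:
    """One section of the design document (field names are illustrative)."""

    title: str
    decision: str
    rationale: str
    alternatives: list[str] = field(default_factory=list)
    revisit_when: list[str] = field(default_factory=list)

    def render(self) -> str:
        """Render the section as plain text, one part per line."""
        lines = [
            self.title,
            "",
            f"Decision: {self.decision}",
            f"Rationale: {self.rationale}",
        ]
        lines += [f"Alternative considered: {alt}" for alt in self.alternatives]
        lines += [f"Revisit when: {cond}" for cond in self.revisit_when]
        return "\n".join(lines)


# Made-up example values; not output produced by the skill.
section = DesignSection(
    title="Loop Design",
    decision="Single loop with an explicit step limit",
    rationale="(benchmark citation goes here)",
    alternatives=["Planner plus executor split"],
    revisit_when=["Tasks routinely hit the step limit"],
)
print(section.render())
```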

Benchmarks the skill targets

| Metric | Target |
| --- | --- |
| Tool definitions | <20% of context budget |
| Context utilization | 40-60% (FIC target) |
| Generator-Critic iterations | 2-3 max |
| Agent count | 2-4 (saturation threshold) |
| Decomposition depth | Max 3 levels |
| Error recovery rate | >70% |
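
The targets in this table can be expressed as simple threshold checks. The sketch below is an assumption-laden illustration, not part of the skill: the function `check_design` and the metric key names are hypothetical, fractions are used in place of percentages, and the iteration and agent-count rows are treated as upper bounds only (on the assumption that fewer iterations or a single agent is acceptable).

```python
def check_design(metrics: dict[str, float]) -> list[str]:
    """Return one message for each supplied metric that misses its target."""
    checks = [
        ("tool_definition_share", lambda v: v < 0.20,
         "tool definitions should stay under 20% of the context budget"),
        ("context_utilization", lambda v: 0.40 <= v <= 0.60,
         "context utilization should sit between 40% and 60%"),
        ("generator_critic_iterations", lambda v: v <= 3,
         "generator-critic iterations should not exceed 3"),
        ("agent_count", lambda v: v <= 4,
         "agent count should not exceed 4 (assumed upper bound)"),
        ("decomposition_depth", lambda v: v <= 3,
         "decomposition depth should not exceed 3 levels"),
        ("error_recovery_rate", lambda v: v > 0.70,
         "error recovery rate should be above 70%"),
    ]
    failures = []
    for key, within_target, message in checks:
        # Metrics that were not supplied are skipped rather than failed.
        if key in metrics and not within_target(metrics[key]):
            failures.append(f"{key}={metrics[key]}: {message}")
    return failures


# Hypothetical measurements for an existing harness.
for problem in check_design({"context_utilization": 0.85, "agent_count": 6}):
    print(problem)
```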

Eval results

The latest published registry run tested claude:claude-sonnet-4-6 across 4 scenarios:

| Metric | Result |
| --- | --- |
| Baseline average | 81% |
| With-context average | 99% |
| Uplift | 1.22x |
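
The uplift figure is consistent with a simple ratio of the two averages. The snippet below assumes that definition (with-context average divided by baseline average) and works from the rounded percentages in the table, so it is a sanity check rather than the registry's actual calculation.

```python
baseline_avg = 0.81      # baseline average from the table
with_context_avg = 0.99  # with-context average from the table

# Assumed definition: uplift is the ratio of the two averages.
uplift = with_context_avg / baseline_avg
print(f"{uplift:.2f}x")  # prints 1.22x
```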

The current bundle passes local Tessl lint and skill review at 100%.

Workspace: sharaf
Visibility: Public
Created: 2 months ago
Last updated: about 1 month ago
Publish Source: CLI