
sharaf/agentic-harness-architect

Design, build, or audit a coding agent, agentic loop, tool-use harness, or autonomous coding system — covering loop architecture, action space, context strategy, observation formatting, evaluation, error handling, prompt engineering, and task decomposition. Use when the user wants to design an agent, build a coding agent, scaffold an agentic system, architect a tool-use loop, review an existing agent harness for improvements, fix context bloat or compaction problems, tune observation formatting or tool output handling, debug agent loop or termination issues, design a system prompt or evaluator prompt for an agent, set up or redesign an agent evaluation pipeline, plan multi-agent orchestration, or specify how an agent should manage context, tools, prompts, evaluation, or recovery (greenfield design or audit mode).

100

1.23x
Quality: 100%
Does it follow best practices?

Impact: 100% (1.23x)
Average score across 4 eval scenarios

Security by Snyk: Passed
No known issues


Quality

Discovery

100%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

This is an excellent skill description that thoroughly covers the domain of coding agent design and development. It provides comprehensive specificity with numerous concrete actions, includes a rich set of natural trigger terms, and has an explicit and detailed 'Use when...' clause. The only minor concern is that the description is quite long, but the verbosity is justified by the breadth of the domain and each term adds genuine discriminative value.

| Dimension | Reasoning | Score |
| --- | --- | --- |
| Specificity | Lists multiple specific concrete actions: 'design, build, or audit a coding agent', 'loop architecture', 'action space', 'context strategy', 'observation formatting', 'evaluation', 'error handling', 'prompt engineering', 'task decomposition'. Very comprehensive enumeration of capabilities. | 3 / 3 |
| Completeness | Clearly answers both 'what' (design/build/audit coding agents covering loop architecture, action space, context strategy, etc.) and 'when' with an explicit 'Use when...' clause listing numerous specific trigger scenarios like debugging agent loops, fixing context bloat, designing evaluator prompts, and planning multi-agent orchestration. | 3 / 3 |
| Trigger Term Quality | Excellent coverage of natural terms users would say: 'coding agent', 'agentic loop', 'tool-use harness', 'context bloat', 'compaction', 'agent loop', 'termination issues', 'system prompt', 'multi-agent orchestration', 'greenfield design', 'audit'. These are terms practitioners naturally use when discussing agent development. | 3 / 3 |
| Distinctiveness / Conflict Risk | Highly distinctive niche focused specifically on coding agents and agentic systems. The triggers are domain-specific enough (agent loops, tool-use harness, context compaction, observation formatting) that they would be unlikely to conflict with general coding, prompt engineering, or architecture skills. | 3 / 3 |

Total: 12 / 12 (Passed)
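The arithmetic behind the total can be sketched as follows. This is a minimal illustration, assuming each dimension is scored out of 3 and that the percentage shown for the section is simply the summed score divided by the maximum; the page does not state the formula, and `rubric_total` is a hypothetical helper, not part of any Tessl tooling.

```python
def rubric_total(scores: dict[str, int], max_per_dimension: int = 3) -> tuple[int, int, float]:
    """Sum per-dimension scores and express the total as a percentage.

    Assumption: the percentage is total / maximum * 100 (not confirmed by the page).
    """
    for name, value in scores.items():
        if not 0 <= value <= max_per_dimension:
            raise ValueError(f"{name}: score {value} outside 0..{max_per_dimension}")
    total = sum(scores.values())
    maximum = max_per_dimension * len(scores)
    return total, maximum, 100.0 * total / maximum


# Dimension scores as reported in the Discovery review.
discovery_scores = {
    "Specificity": 3,
    "Completeness": 3,
    "Trigger Term Quality": 3,
    "Distinctiveness / Conflict Risk": 3,
}

total, maximum, percent = rubric_total(discovery_scores)
print(f"Total: {total} / {maximum} ({percent:.0f}%)")
```

Under these assumptions, four dimensions at 3 / 3 give the 12 / 12 total reported for the section.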

Implementation

100%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

This is an exceptionally well-structured skill that serves as a masterclass in progressive disclosure and actionable architecture guidance. The quick decision triage provides dense, benchmark-grounded heuristics that respect Claude's intelligence while adding genuinely novel decision frameworks. The dual-mode workflow (greenfield/audit) with clear phase navigation and explicit output templates makes this immediately actionable for complex agentic system design.

| Dimension | Reasoning | Score |
| --- | --- | --- |
| Conciseness | Every section earns its place. The quick decision triage provides dense, benchmark-backed heuristics (e.g., tool accuracy drops, context utilization targets, self-evaluation bias percentages) without explaining concepts Claude already knows. No padding or unnecessary exposition. | 3 / 3 |
| Actionability | Provides concrete decision thresholds (e.g., '8-12 tools per agent, <20% of context budget'), specific loop selection criteria by duration, exact output document templates for both modes, and explicit checklists of inputs to gather. The guidance is specific and directly executable as a design process. | 3 / 3 |
| Workflow Clarity | Clear 11-phase sequence with explicit mode-dependent navigation (greenfield: phases 1-11 in order; audit: start at Phase 1, jump to issue-targeted phases, always consult Phase 10 before additions). The workflow includes validation via cross-cutting guardrails, success criteria gates, and the Removal Test as a checkpoint before adding components. | 3 / 3 |
| Progressive Disclosure | Exemplary structure: the SKILL.md serves as a concise overview with a phase index table linking to 10 phase-specific reference files plus 3 cross-cutting references, all one level deep under references/. Explicit instruction to 'load only the phase file needed for the current decision' demonstrates intentional progressive disclosure. Navigation is clear via the table and inline links. | 3 / 3 |

Total: 12 / 12 (Passed)
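The mode-dependent navigation described under Workflow Clarity can be sketched as a small planning function. This is only an illustration of the rule as the review summarizes it; the function name `plan_phases`, its parameters, and the constants are hypothetical and do not come from the skill itself.

```python
from typing import Iterable

TOTAL_PHASES = 11        # the review describes an 11-phase sequence
PRE_ADDITION_PHASE = 10  # phase the review says to consult before additions


def plan_phases(
    mode: str,
    issue_phases: Iterable[int] = (),
    adding_components: bool = False,
) -> list[int]:
    """Return the phases to visit, in ascending order, for the given mode."""
    if mode == "greenfield":
        # Greenfield: walk phases 1 through 11 in order.
        return list(range(1, TOTAL_PHASES + 1))
    if mode == "audit":
        # Audit: always start at Phase 1, then visit only issue-targeted phases.
        phases = {1} | {p for p in issue_phases if 1 <= p <= TOTAL_PHASES}
        if adding_components:
            # Assumed reading: Phase 10 joins the plan whenever additions are proposed.
            phases.add(PRE_ADDITION_PHASE)
        return sorted(phases)
    raise ValueError(f"unknown mode: {mode!r}")


print(plan_phases("greenfield"))
print(plan_phases("audit", issue_phases=[4, 7], adding_components=True))
```

An audit that targets phases 4 and 7 and proposes new components would therefore visit phases 1, 4, 7, and 10, while a greenfield design visits all eleven.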

Validation

100%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation: 11 / 11 Passed

Validation for skill structure

No warnings or errors.

Reviewed
