
benpiper-workspace/planning-execution-harness

Break down goals into multiple tasks and coordinate execution with gates and recovery. Based on Claw Code's agentic harness.

92

1.09x
Quality

90%

Does it follow best practices?

Impact

100%

1.09x

Average score across 3 eval scenarios

Security by Snyk

Passed

No known issues


Quality

Discovery

92%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

This is a strong description that clearly communicates both what the skill does and when to use it, with natural trigger terms placed prominently at the beginning. The specificity of capabilities is excellent, covering planning, approval gating, failure classification, and logging. The main weakness is that its scope is broad enough (covering workflows, pipelines, failure recovery) that it could potentially overlap with other orchestration or error-handling skills.

Dimension | Reasoning | Score

Specificity

The description lists multiple specific concrete actions: breaks down goals into tasks, presents plan for approval, enforces mandatory approval gate, separates planning from execution, classifies failures by type, applies type-specific recovery strategies, and produces timestamped event logs.

3 / 3

Completeness

Clearly answers both 'what' (breaks down goals, presents plan, enforces approval gate, classifies failures, produces event logs) and 'when' with explicit triggers at the start ('Use when you need to ask before executing...') and an 'Applies to' section listing specific scenarios.

3 / 3

Trigger Term Quality

Includes strong natural trigger terms users would say: 'ask before executing', 'don't run without permission', 'review steps before proceeding', 'confirm before executing', 'approval', 'human-in-the-loop'. These cover natural variations of how users would request this behavior.

3 / 3

Distinctiveness Conflict Risk

While the approval-gate and human-in-the-loop concept is fairly specific, terms like 'step-by-step workflows', 'agentic pipelines', and 'failure recovery' could overlap with general workflow orchestration or error-handling skills. The core niche of mandatory approval gating is distinct, but the scope is broad enough to potentially conflict.

2 / 3

Total

11 / 12

Passed

Implementation

85%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

This is a well-structured skill that clearly defines a planning-execution pattern with strong workflow clarity and good progressive disclosure. Its main weakness is verbosity in the execution format section: it shows multiple equivalent notation examples and over-emphasizes flexibility, when Claude doesn't need this much hand-holding on output formatting. The recovery table and overall workflow design are excellent.

Suggestions

Consolidate the execution format section: show one canonical format example instead of multiple 'acceptable' alternatives, and trust Claude to adapt the notation style as needed.

Dimension | Reasoning | Score

Conciseness

The skill is reasonably well-structured but includes redundant content: it shows multiple acceptable format examples for progress notation when one would suffice, and the flexibility note at the end of the example repeats what was already stated. The recovery table is efficient, but the execution format section is over-explained for Claude's capabilities.

2 / 3

Actionability

The skill provides concrete, specific guidance at every stage: exact notation formats, a detailed recovery classification table with detection patterns and max attempts, specific commands in the example (curl, grep, SQL), and clear rules for behavior. Claude would know exactly what to do.

3 / 3

Workflow Clarity

The five-stage workflow (PLAN → GATE → EXECUTE → RECOVER → LOG) is clearly sequenced with explicit validation checkpoints. The GATE stage is an explicit approval gate that blocks execution, recovery includes a classify-then-fix feedback loop with escalation paths, and the full example demonstrates the complete flow including a retry scenario.

3 / 3

Progressive Disclosure

The skill provides a clear overview with well-organized sections, then points to one-level-deep references (PROMPT.md, EXAMPLES.md, IMPLEMENTATION.md, REFERENCES.md) for detailed content. The main file stays focused on the pattern itself without becoming monolithic.

3 / 3

Total

11 / 12

Passed
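
For readers who have not opened the skill's files, the five-stage flow named under Workflow Clarity (PLAN → GATE → EXECUTE → RECOVER → LOG) can be pictured with a minimal sketch. This is not the skill's implementation: the `Harness` class, the `classify` function, the two failure types, and the `MAX_ATTEMPTS` values are all hypothetical names and numbers chosen for illustration, and the real recovery table is not reproduced in this review.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Callable

# Hypothetical recovery table: failure type -> maximum attempts.
# The skill's real table is not shown in this review.
MAX_ATTEMPTS = {"transient": 3, "permanent": 1}


def classify(exc: Exception) -> str:
    """Assumed classifier: network-style errors are retryable, others are not."""
    if isinstance(exc, (TimeoutError, ConnectionError)):
        return "transient"
    return "permanent"


@dataclass
class Harness:
    approve: Callable[[list[str]], bool]  # the GATE: must return True to proceed
    log: list[str] = field(default_factory=list)

    def _event(self, stage: str, detail: str) -> None:
        # Every event is timestamped, mirroring the event log the review mentions.
        stamp = datetime.now(timezone.utc).isoformat()
        self.log.append(f"{stamp} {stage} {detail}")

    def run(self, tasks: dict[str, Callable[[], None]]) -> bool:
        plan = list(tasks)
        self._event("PLAN", ", ".join(plan))

        # GATE: nothing executes unless the plan is approved.
        if not self.approve(plan):
            self._event("GATE", "rejected")
            return False
        self._event("GATE", "approved")

        for name in plan:
            attempt = 1
            while True:
                try:
                    tasks[name]()
                    self._event("EXECUTE", f"{name} ok (attempt {attempt})")
                    break
                except Exception as exc:
                    # RECOVER: classify first, then retry or escalate.
                    kind = classify(exc)
                    self._event("RECOVER", f"{name} failed: {kind} (attempt {attempt})")
                    if attempt >= MAX_ATTEMPTS[kind]:
                        self._event("LOG", f"escalated {name}")
                        return False
                    attempt += 1

        self._event("LOG", "all tasks completed")
        return True
```

In this sketch the gate is an injected callable, so a caller could pass a function that prompts a human, and a rejected plan returns before any task runs.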

Validation

100%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation: 11 / 11 Passed

Validation for skill structure

No warnings or errors.

Reviewed
