Break down goals into multiple tasks and coordinate execution with gates and recovery. Based on Claw Code's agentic harness.
92
90%
Does it follow best practices?
Impact
100%
1.09x
Average score across 3 eval scenarios
Passed
No known issues
Quality
Discovery
92%
Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
This is a strong description that clearly communicates both what the skill does and when to use it, with natural trigger terms placed prominently at the beginning. The specificity of capabilities is excellent, covering planning, approval gating, failure classification, and logging. The main weakness is that its scope is broad enough (covering workflows, pipelines, failure recovery) that it could potentially overlap with other orchestration or error-handling skills.
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | The description lists multiple specific concrete actions: breaks down goals into tasks, presents plan for approval, enforces mandatory approval gate, separates planning from execution, classifies failures by type, applies type-specific recovery strategies, and produces timestamped event logs. | 3 / 3 |
| Completeness | Clearly answers both 'what' (breaks down goals, presents plan, enforces approval gate, classifies failures, produces event logs) and 'when' with explicit triggers at the start ('Use when you need to ask before executing...') and an 'Applies to' section listing specific scenarios. | 3 / 3 |
| Trigger Term Quality | Includes strong natural trigger terms users would say: 'ask before executing', 'don't run without permission', 'review steps before proceeding', 'confirm before executing', 'approval', 'human-in-the-loop'. These cover natural variations of how users would request this behavior. | 3 / 3 |
| Distinctiveness Conflict Risk | While the approval-gate and human-in-the-loop concept is fairly specific, terms like 'step-by-step workflows', 'agentic pipelines', and 'failure recovery' could overlap with general workflow orchestration or error-handling skills. The core niche of mandatory approval gating is distinct, but the scope is broad enough to potentially conflict. | 2 / 3 |
| Total | | 11 / 12 Passed |
Implementation
85%
Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
This is a well-structured skill that clearly defines a planning-execution pattern with strong workflow clarity and good progressive disclosure. Its main weakness is some verbosity in the execution format section, where multiple equivalent notation examples are shown and the flexibility is over-emphasized—Claude doesn't need this much hand-holding on output formatting. The recovery table and overall workflow design are excellent.
Suggestions
Consolidate the execution format section: show one canonical format example instead of multiple 'acceptable' alternatives, and trust Claude to adapt the notation style as needed.
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | The skill is reasonably well-structured but includes redundant content: multiple acceptable format examples for progress notation are shown when one would suffice, and the flexibility note at the end of the example repeats what was already stated. The recovery table is efficient, but the execution format section is over-explained for Claude's capabilities. | 2 / 3 |
| Actionability | The skill provides concrete, specific guidance at every stage: exact notation formats, a detailed recovery classification table with detection patterns and max attempts, specific commands in the example (curl, grep, SQL), and clear rules for behavior. Claude would know exactly what to do. | 3 / 3 |
| Workflow Clarity | The five-stage workflow (PLAN → GATE → EXECUTE → RECOVER → LOG) is clearly sequenced with explicit validation checkpoints. The GATE stage is an explicit approval gate that blocks execution, recovery includes a classify-then-fix feedback loop with escalation paths, and the full example demonstrates the complete flow including a retry scenario. | 3 / 3 |
| Progressive Disclosure | The skill provides a clear overview with well-organized sections, then points to one-level-deep references (PROMPT.md, EXAMPLES.md, IMPLEMENTATION.md, REFERENCES.md) for detailed content. The main file stays focused on the pattern itself without becoming monolithic. | 3 / 3 |
| Total | | 11 / 12 Passed |
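
For readers who have not seen the skill itself, the five-stage flow named under Workflow Clarity (PLAN → GATE → EXECUTE → RECOVER → LOG) can be sketched roughly as follows. This is a minimal illustration, not the skill's actual implementation: `Task`, `Run`, `classify`, `orchestrate`, the `approve` callback, the two failure classes, and the retry budget of 2 are all hypothetical names and values chosen for the example. The skill's real recovery table is not reproduced in this review.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Callable


@dataclass
class Task:
    name: str
    action: Callable[[], None]
    max_attempts: int = 2  # assumed retry budget, not taken from the skill


@dataclass
class Run:
    events: list[str] = field(default_factory=list)

    def log(self, message: str) -> None:
        # Each event is prefixed with a UTC timestamp.
        stamp = datetime.now(timezone.utc).isoformat(timespec="seconds")
        self.events.append(f"{stamp} {message}")


def classify(error: Exception) -> str:
    """Hypothetical classifier: network-style errors are retried, others escalate."""
    if isinstance(error, (TimeoutError, ConnectionError)):
        return "transient"
    return "fatal"


def orchestrate(tasks: list[Task], approve: Callable[[list[str]], bool]) -> Run:
    run = Run()
    plan = [task.name for task in tasks]
    run.log(f"PLAN {plan}")

    # GATE: nothing executes unless the plan is explicitly approved.
    if not approve(plan):
        run.log("GATE rejected")
        return run
    run.log("GATE approved")

    for task in tasks:
        for attempt in range(1, task.max_attempts + 1):
            try:
                task.action()
                run.log(f"EXECUTE {task.name} ok (attempt {attempt})")
                break
            except Exception as error:
                # RECOVER: classify first, then retry or escalate.
                kind = classify(error)
                run.log(f"RECOVER {task.name} {kind} (attempt {attempt})")
                if kind == "fatal" or attempt == task.max_attempts:
                    run.log(f"LOG escalated at {task.name}")
                    return run
    run.log("LOG complete")
    return run
```

In this sketch the approval callback receives only the task names, so planning stays separate from execution: a rejected plan returns before any `action` is called.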
Validation
100%
Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation — 11 / 11 Passed
Validation for skill structure
No warnings or errors.
Reviewed