Content
64%
Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
This is a solid, actionable skill with good executable examples for both tmux and PTY-based CLI testing harnesses. Its main weaknesses are the lack of explicit validation/error-recovery steps in the workflow and some content that could be tightened (the use-case list and profiling recipes are somewhat generic). The code examples are the strongest aspect, being complete and ready to adapt.
Suggestions
Add explicit validation checkpoints and error recovery to the Harness Loop (e.g., 'If the CLI does not produce the expected prompt within the deadline, capture the current screen state and report the timeout before killing the session').
Tighten or remove the 'What It Is Used For' section since it largely restates the skill description and doesn't add actionable guidance.
Make the Profiling Recipes more concrete with executable code snippets rather than prose descriptions (e.g., show the actual Node inspector commands for taking a heap snapshot).
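As an illustration of the first suggestion, a validation checkpoint with a deadline might look like the sketch below. It is not taken from the reviewed skill: the names `wait_for_prompt` and `checkpoint`, the 200-byte tail, and the assumption that the harness exposes the CLI's output as a readable file descriptor (for example a PTY master) are all hypothetical.

```python
import os
import select
import time


def wait_for_prompt(fd, prompt, deadline_s):
    """Read from fd until `prompt` (bytes) appears or the deadline passes.

    Returns (found, captured); captured holds everything read so far, so the
    caller can report the screen state on a timeout.
    """
    captured = b""
    end = time.monotonic() + deadline_s
    while True:
        if prompt in captured:
            return True, captured
        remaining = end - time.monotonic()
        if remaining <= 0:
            return False, captured
        # Block until output is available or the remaining time runs out.
        ready, _, _ = select.select([fd], [], [], remaining)
        if not ready:
            return False, captured
        try:
            chunk = os.read(fd, 4096)
        except OSError:
            # A PTY master can raise an error once the child has exited.
            return False, captured
        if not chunk:
            # End of stream: the prompt can no longer arrive.
            return False, captured
        captured += chunk


def checkpoint(fd, prompt, deadline_s):
    """Return None on success, or a timeout report for the caller to log."""
    found, captured = wait_for_prompt(fd, prompt, deadline_s)
    if found:
        return None
    tail = captured[-200:]  # hypothetical limit: last 200 bytes of output
    return f"timeout after {deadline_s}s waiting for {prompt!r}; last output: {tail!r}"
```

A harness step would call `checkpoint(...)` after sending input and, if it returns a report, log that report before killing the session, so the failure is recorded rather than silent.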
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | The skill is mostly efficient but includes some sections that could be tightened. The 'What It Is Used For' bullet list is somewhat redundant given the description, and the profiling recipes section is fairly high-level without adding much beyond what Claude would already know. However, the code examples and harness loop are reasonably lean. | 2 / 3 |
| Actionability | The skill provides fully executable bash and Python code examples for both tmux and PTY harnesses. The tmux harness is copy-paste ready with concrete commands, and the PTY script is a complete, runnable Python program with clear placeholders for customization. | 3 / 3 |
| Workflow Clarity | The 'Harness Loop' provides a clear 8-step sequence, but it lacks explicit validation checkpoints and error-recovery feedback loops. For operations involving terminal sessions and profiling (which can hang or fail silently), there's no 'if X fails, do Y' guidance. The guardrails section partially compensates but doesn't integrate into the workflow steps. | 2 / 3 |
| Progressive Disclosure | The content is reasonably well-structured with clear section headers, but everything is inline in a single file. The profiling recipes and PTY harness could benefit from being split into referenced files. With no bundle files provided, there's no progressive disclosure structure, though for a skill of this size (~100 lines of content) it's borderline acceptable. | 2 / 3 |
| Total | | 9 / 12 (Passed) |