Build or adapt a local harness to drive, inspect, and profile an interactive CLI or TUI without external services. Use for CLI UX checks, startup regressions, memory leaks, hangs, prompt flows, or terminal demos.
72
88%: Does it follow best practices?
Impact: not yet measured. No eval scenarios have been run.
Passed: no known issues.
Quality
Discovery
100%
Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
This is a strong description that concisely communicates both what the skill does and when to use it. It uses third-person voice, lists concrete actions and specific trigger scenarios, and carves out a distinct niche around interactive CLI/TUI testing harnesses. The description is well-crafted with natural keywords that users would actually use when needing this capability.
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | Lists multiple specific concrete actions: 'drive, inspect, and profile an interactive CLI or TUI', and enumerates specific use cases like 'CLI UX checks, startup regressions, memory leaks, hangs, prompt flows, terminal demos'. | 3 / 3 |
| Completeness | Clearly answers both 'what' (build/adapt a local harness to drive, inspect, and profile interactive CLI/TUI) and 'when' (explicitly states 'Use for CLI UX checks, startup regressions, memory leaks, hangs, prompt flows, or terminal demos'). | 3 / 3 |
| Trigger Term Quality | Includes strong natural keywords users would say: 'CLI', 'TUI', 'memory leaks', 'hangs', 'startup regressions', 'terminal demos', 'prompt flows'. These cover a good range of terms a user working with interactive command-line tools would naturally use. | 3 / 3 |
| Distinctiveness / Conflict Risk | Occupies a clear niche: local harness for interactive CLI/TUI testing and profiling without external services. The combination of 'CLI/TUI', 'harness', 'profile', and specific triggers like 'startup regressions' and 'memory leaks' makes it highly distinctive and unlikely to conflict with general testing or coding skills. | 3 / 3 |
| Total | | 12 / 12 (Passed) |
Implementation
77%
Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
This is a solid, actionable skill with executable code examples and a clear workflow sequence. Its main weakness is moderate verbosity: the 'What It Is Used For' section and some explanatory text could be trimmed, since Claude can infer these use cases. With no bundle files, all content is inline. That is acceptable at this length, but the profiling recipes and the PTY harness would be better split into separate references.
Suggestions
Remove or significantly trim the 'What It Is Used For' section—Claude can infer appropriate use cases from the skill title and harness instructions.
Consider extracting the PTY harness and profiling recipes into separate referenced files to improve progressive disclosure and reduce the main file's token footprint.
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | The skill is mostly efficient but includes some unnecessary sections like 'What It Is Used For' which lists use cases Claude could infer. The 'Profiling Recipes' section is somewhat terse but the overall document could be tightened; for example, the harness options list partially duplicates what's shown in the code examples. | 2 / 3 |
| Actionability | Provides fully executable bash and Python code examples for both tmux and PTY harnesses. The tmux harness is copy-paste ready, the PTY script is complete and runnable with minimal placeholder substitution, and profiling recipes give concrete steps. | 3 / 3 |
| Workflow Clarity | The 'Harness Loop' provides a clear 8-step sequence with explicit validation checkpoints (step 6: wait for concrete screen pattern before next action) and clean teardown (step 8). The guardrails section adds validation constraints like preferring deterministic waits and cleaning up sessions. | 3 / 3 |
| Progressive Disclosure | The content is well-structured with clear section headers and progresses from overview to specific harnesses to profiling recipes. However, for a skill of this length (~100 lines of substantive content), the profiling recipes and PTY harness could be split into referenced files. No bundle files exist to offload detail, making this somewhat monolithic. | 2 / 3 |
| Total | | 10 / 12 (Passed) |
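For readers who have not seen the skill itself, the pattern the review credits under Workflow Clarity (drive the program through a PTY, and wait for a concrete screen pattern before sending the next input) can be sketched roughly as below. This is not the skill's script, which is not reproduced on this page. The names `PtyHarness`, `wait_for`, `send`, and `close` are hypothetical, and the sketch assumes a POSIX system with only the Python standard library.

```python
import os
import pty
import re
import select
import subprocess
import sys
import time


class PtyHarness:
    """Hypothetical minimal harness: run a command on a pseudo-terminal."""

    def __init__(self, argv):
        self.master, slave = pty.openpty()
        # The child sees a real terminal on stdin, stdout, and stderr.
        self.proc = subprocess.Popen(
            argv, stdin=slave, stdout=slave, stderr=slave, close_fds=True
        )
        os.close(slave)  # the parent only needs the master side
        self.output = b""  # everything the child has written so far
        self._pos = 0      # search position just past the last match

    def wait_for(self, pattern, timeout=5.0):
        """Block until `pattern` (a bytes regex) appears in new output."""
        regex = re.compile(pattern)
        deadline = time.monotonic() + timeout
        while True:
            match = regex.search(self.output, self._pos)
            if match:
                self._pos = match.end()
                return match
            remaining = deadline - time.monotonic()
            if remaining <= 0:
                raise TimeoutError(f"pattern {pattern!r} not seen in {timeout}s")
            ready, _, _ = select.select([self.master], [], [], remaining)
            if not ready:
                continue  # loop once more to raise the timeout
            try:
                chunk = os.read(self.master, 4096)
            except OSError:
                chunk = b""  # Linux reports EIO once the child has exited
            if not chunk:
                raise EOFError(f"child exited before {pattern!r} appeared")
            self.output += chunk

    def send(self, keys):
        """Write raw bytes to the child as if they were typed."""
        os.write(self.master, keys)

    def close(self):
        """Teardown: stop the child if it is still running, free the PTY."""
        if self.proc.poll() is None:
            self.proc.terminate()
            try:
                self.proc.wait(timeout=2)
            except subprocess.TimeoutExpired:
                self.proc.kill()
                self.proc.wait()
        os.close(self.master)


if __name__ == "__main__":
    # Toy child program standing in for the CLI under test.
    child = [sys.executable, "-c", "n = input('name? '); print('hello ' + n)"]
    harness = PtyHarness(child)
    try:
        harness.wait_for(rb"name\? ")      # deterministic wait, no sleep()
        harness.send(b"world\n")
        harness.wait_for(rb"hello world")
    finally:
        harness.close()
```

Waiting on a pattern rather than sleeping for a fixed interval is what makes each step deterministic: the harness proceeds as soon as the screen shows the expected text and fails with a clear timeout when it never does.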
Validation
100%
Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation — 11 / 11 Passed
Validation for skill structure
No warnings or errors.
b8f2564