honeybadge/virtui

Automatically control and record TUI application sessions from the terminal

4.45x

Quality

94%

Does it follow best practices?

Impact

89%

4.45x

Average score across 3 eval scenarios

Security

Risky

Do not use without reviewing

Quality

Discovery

100%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

This is a strong skill description that clearly articulates specific capabilities (pseudo-TTY launching, screen capture, keystroke sending, asciicast recording) and provides comprehensive trigger guidance with both a 'Use when' clause and a detailed 'Trigger on' list. It occupies a distinct niche that is unlikely to conflict with other skills, and uses appropriate third-person voice throughout.

Dimension	Reasoning	Score
Specificity	Lists multiple specific concrete actions: 'Launches programs via pseudo-TTY, captures screen output, sends keystrokes and control sequences, and records sessions to asciicast format.' These are clearly defined, actionable capabilities.	3 / 3
Completeness	Clearly answers both 'what' (launches programs via pseudo-TTY, captures screen output, sends keystrokes, records to asciicast) and 'when' with an explicit 'Use when' clause at the start and a detailed 'Trigger on:' list covering multiple scenarios.	3 / 3
Trigger Term Quality	Excellent coverage of natural trigger terms: 'interactive terminal programs', 'automating CLI workflows', 'testing TUI apps', 'recording terminal sessions', 'sending keystrokes', 'waiting for terminal output', 'controlling a terminal application non-interactively'. These cover many natural ways a user might describe this need.	3 / 3
Distinctiveness Conflict Risk	Highly distinctive niche — pseudo-TTY control, TUI testing, asciicast recording, and non-interactive terminal automation are very specific domains unlikely to overlap with general CLI or coding skills. Terms like 'pseudo-TTY', 'asciicast', and 'TUI' clearly carve out a unique space.	3 / 3
	Total	12 / 12 Passed

Implementation

85%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

This is a high-quality skill document that provides comprehensive, actionable guidance for automating TUI applications with virtui. The workflow is clearly sequenced with concrete examples at every step, and the pipeline section addresses a real gotcha with Claude Code's bash tool. Minor verbosity in some explanatory notes and caveats prevents a perfect conciseness score, but the content is well-structured and practically useful.

Dimension	Reasoning	Score
Conciseness	The content is mostly efficient and well-structured, but includes some unnecessary explanations (e.g., explaining what the daemon does with Unix sockets, the note about proto3 JSON encoding) and some sections could be tightened. The Claude Code caveats are valuable but slightly verbose. Overall it's reasonably lean for the complexity of the tool.	2 / 3
Actionability	Excellent actionability throughout — every command is fully executable with concrete examples, flags are documented in tables, JSON output examples are provided, and the pipeline section includes copy-paste ready JSON payloads. The skill provides specific commands for every step of the workflow.	3 / 3
Workflow Clarity	The workflow is clearly sequenced (start daemon → run session → interact → screenshot → kill session) with numbered steps, explicit validation patterns (check daemon status before starting, use screen_hash for change detection, take screenshot on timeout), and error recovery guidance (check retryable field, screenshot on timeout). The pipeline section provides a reliable alternative workflow with stop_on_error support.	3 / 3
Progressive Disclosure	Content is well-organized with clear sections progressing from prerequisites to core workflow to advanced features. References to external files (references/keys-and-errors.md) are one level deep and clearly signaled. The JSON output fields reference table at the end serves as a quick-reference without cluttering the main workflow sections.	3 / 3
	Total	11 / 12 Passed
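
The Workflow Clarity row mentions using screen_hash for change detection and taking a screenshot on timeout. The skill's actual commands and hashing scheme are not shown in this review, so the following is only a generic sketch of that pattern. The `capture` callable, the `wait_for_change` helper, and the use of SHA-256 are all assumptions made for illustration, not virtui's API.

```python
import hashlib
import time
from typing import Callable, Tuple


def screen_hash(screen_text: str) -> str:
    """Hash the screen contents (SHA-256 is an assumption, not virtui's scheme)."""
    return hashlib.sha256(screen_text.encode("utf-8")).hexdigest()


def wait_for_change(
    capture: Callable[[], str],
    previous_hash: str,
    timeout_s: float = 2.0,
    poll_interval_s: float = 0.05,
) -> Tuple[bool, str, str]:
    """Poll a hypothetical capture() until the screen hash changes or time runs out.

    Returns (changed, latest_hash, latest_screen). On timeout the last captured
    screen is returned so the caller can inspect it, mirroring the
    "screenshot on timeout" advice.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        screen = capture()
        current = screen_hash(screen)
        if current != previous_hash:
            return True, current, screen
        if time.monotonic() >= deadline:
            return False, current, screen
        time.sleep(poll_interval_s)
```

A caller would record the hash before sending input, then call `wait_for_change` with it. If `changed` is false, the returned screen text is the snapshot to examine before deciding whether to retry.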

Validation

100%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation — 11 / 11 Passed

Validation for skill structure

No warnings or errors.

Reviewed

25 days ago
