Use this skill to interactively debug Antithesis test runs using the multiverse debugger. Open a debugging-session URL, inspect container filesystems and runtime state, run shell commands, and extract evidence from inside the Antithesis environment. Supports both the simplified debugger (default) and the advanced notebook mode.
94
92%
Does it follow best practices?
Impact
Pending
No eval scenarios have been run
Advisory
Suggest reviewing before use
Quality
Discovery
100%
Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
This is a strong skill description that clearly identifies a specific domain (Antithesis multiverse debugger), lists concrete actions (inspect filesystems, run shell commands, extract evidence), and provides clear trigger context. The only minor issue is the 'Use this skill' phrasing, which borders on second person, though it reads more as an imperative instruction than as direct address. The description is otherwise well constructed.
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | Lists multiple specific concrete actions: open a debugging-session URL, inspect container filesystems and runtime state, run shell commands, extract evidence from inside the Antithesis environment. Also mentions two modes (simplified debugger and advanced notebook mode). | 3 / 3 |
| Completeness | Clearly answers both 'what' (inspect filesystems, run shell commands, extract evidence, open debugging URLs) and 'when' ('Use this skill to interactively debug Antithesis test runs using the multiverse debugger'). The opening clause serves as explicit trigger guidance. | 3 / 3 |
| Trigger Term Quality | Includes strong natural trigger terms: 'Antithesis', 'debug', 'multiverse debugger', 'debugging-session URL', 'container filesystems', 'shell commands', 'notebook mode'. A user needing this skill would likely mention 'Antithesis', 'debug', or 'multiverse debugger'. | 3 / 3 |
| Distinctiveness Conflict Risk | Highly distinctive due to the specific domain of 'Antithesis test runs' and 'multiverse debugger'. This is a very clear niche that is unlikely to conflict with other debugging or testing skills. | 3 / 3 |
| Total | | 12 / 12 Passed |
Implementation
85%
Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
This is a well-structured skill that provides clear, actionable guidance for debugging Antithesis test runs. Its strengths are excellent workflow clarity with three distinct paths, strong progressive disclosure via reference file tables, and fully executable code examples. The main weakness is moderate redundancy — runtime injection and sequential execution guidance are repeated across multiple sections, which could be tightened.
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | The skill is reasonably efficient but includes some redundancy: runtime injection instructions are repeated across sections (general guidance, runtime injection, page loading checks), and the mode comparison table restates information that appears elsewhere. Some guidance like 'run commands sequentially' appears multiple times. | 2 / 3 |
| Actionability | The skill provides concrete, executable bash commands throughout: runtime injection, mode detection, mode switching, waitForReady calls, simplified command execution, and session management are all copy-paste ready with specific `agent-browser` invocations. | 3 / 3 |
| Workflow Clarity | Three distinct workflows (simplified command, file extraction, advanced investigation) are clearly sequenced with numbered steps. The skill includes validation checkpoints (waitForReady, loadingFinished, loadingStatus), error recovery (retry missing-runtime errors by reinjecting), and a self-review checklist at the end. | 3 / 3 |
| Progressive Disclosure | The SKILL.md serves as a clear overview with well-organized tables pointing to one-level-deep reference files (setup-session.md, simplified-debugger.md, notebook.md, actions.md, common-inspections.md). Each reference is clearly signaled with 'when to read' guidance, and the main file contains just enough to orient without inlining detailed content. | 3 / 3 |
| Total | | 11 / 12 Passed |
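
The error-recovery pattern credited under Workflow Clarity (check readiness, run one command, reinject the runtime on a missing-runtime error, then retry) can be sketched roughly as follows. This is an illustrative Python sketch, not the skill's actual code: `MissingRuntimeError`, `run_with_reinjection`, and the three callables are hypothetical names standing in for whatever the skill's `agent-browser` commands really do.

```python
from typing import Callable, Optional


class MissingRuntimeError(Exception):
    """Hypothetical error: the injected runtime is absent from the page."""


def run_with_reinjection(
    inject_runtime: Callable[[], None],
    wait_for_ready: Callable[[], None],
    run_command: Callable[[], str],
    max_attempts: int = 3,
) -> str:
    """Run one command, reinjecting the runtime if it has gone missing."""
    last_error: Optional[Exception] = None
    for _ in range(max_attempts):
        try:
            wait_for_ready()      # validation checkpoint before the command
            return run_command()  # a single command per call, run sequentially
        except MissingRuntimeError as err:
            last_error = err
            inject_runtime()      # recovery step: reinject, then retry
    raise RuntimeError(
        f"runtime still missing after {max_attempts} attempts"
    ) from last_error
```

Bounding the number of attempts keeps a persistently broken session from retrying forever; the limit of three is an arbitrary choice for the sketch.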
Validation
100%
Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation — 11 / 11 Passed
Validation for skill structure
No warnings or errors.
a851a75