Test UI system (PanelUI, ScreenSpace) against the poke example using the iwsdk CLI.
Quality score: 58% (Passed)
Impact: — (no eval scenarios have been run)
Known issues: none
Optimize this skill with Tessl:

    npx tessl skill review --optimize ./.claude/skills/test-ui/SKILL.md

Quality
Discovery
40%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
The description identifies a very specific niche (testing PanelUI/ScreenSpace with iwsdk CLI against the poke example), which gives it strong distinctiveness. However, it lacks an explicit 'Use when...' clause, provides only a single action rather than enumerating capabilities, and relies on project-specific jargon without natural language trigger terms that a user might employ.
Suggestions
- Add an explicit 'Use when...' clause, e.g., 'Use when the user wants to run UI tests, validate PanelUI or ScreenSpace behavior, or test against the poke example using iwsdk.'
- List more specific actions the skill performs, such as 'Runs iwsdk CLI commands, validates PanelUI rendering, checks ScreenSpace layout, reports test results against the poke example.'
- Include natural language trigger terms a user might say, such as 'run UI tests', 'test panels', 'screen space validation', 'poke example test'.
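Folding all three suggestions into the skill's frontmatter might look like the sketch below. The surrounding keys and exact wording are assumptions for illustration, not taken from the skill itself:

```yaml
# Hypothetical SKILL.md frontmatter; only the description wording is the point.
name: test-ui
description: >-
  Test the UI system (PanelUI, ScreenSpace) against the poke example using
  the iwsdk CLI. Use when the user wants to run UI tests, test panels,
  validate ScreenSpace layout, or run the poke example test. Runs iwsdk CLI
  commands, validates PanelUI rendering, checks ScreenSpace layout, and
  reports test results.
```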
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | Names a specific domain (UI testing with PanelUI, ScreenSpace) and mentions concrete tools (iwsdk CLI, poke example), but doesn't list multiple distinct actions—it's essentially one action: 'test UI system against the poke example.' | 2 / 3 |
| Completeness | Describes what it does (test UI system against the poke example) but has no explicit 'Use when...' clause or equivalent trigger guidance, which per the rubric should cap completeness at 2—and since the 'what' is also fairly thin, this lands at 1. | 1 / 3 |
| Trigger Term Quality | Includes some relevant keywords like 'PanelUI', 'ScreenSpace', 'iwsdk CLI', and 'poke example', but these are highly technical/project-specific terms. Missing natural language variations a user might say (e.g., 'run UI tests', 'test panels', 'screen space testing'). | 2 / 3 |
| Distinctiveness / Conflict Risk | The description is highly specific to a particular system (PanelUI, ScreenSpace, iwsdk CLI, poke example), making it very unlikely to conflict with other skills. It occupies a clear, narrow niche. | 3 / 3 |
| Total | | 8 / 12 Passed |
Implementation
77%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
This is a well-structured, highly actionable test procedure with clear step-by-step workflows, explicit assertions, and robust error recovery. Its main weakness is length — the inline MCPCALL helper and some explanatory sections add bulk that could be offloaded to supporting files. The hardcoded absolute paths reduce portability but are appropriate for a project-specific test skill.
Suggestions
- Extract the MCPCALL shell function into a separate helper script file (e.g., mcpcall.sh) and reference it from SKILL.md to reduce inline verbosity.
- Move the Known Issues & Workarounds section to a separate KNOWN_ISSUES.md file, keeping only a brief reference in the main skill.
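The extraction could look like the sketch below. The file name and function body are illustrative assumptions: the real MCPCALL in SKILL.md issues actual MCP calls, while this stub only echoes the request so the shape of the refactor is clear.

```shell
#!/usr/bin/env sh
# mcpcall.sh -- hypothetical helper extracted from SKILL.md.
# Takes an MCP tool name and a JSON argument string; the stub body below
# echoes the request instead of contacting a real MCP server.
MCPCALL() {
  tool="$1"   # MCP tool name
  args="$2"   # JSON arguments as a string
  echo "MCPCALL $tool $args"
}
```

SKILL.md would then source the helper (`. ./mcpcall.sh`) instead of defining the function inline.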
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | The skill is fairly long but most content is necessary for the multi-step test workflow. However, the MCPCALL shell function is verbose (could be provided as a separate file), the Known Issues section includes some explanatory context Claude doesn't need (e.g., explaining what async loading means), and some assertions could be more compact (e.g., table format instead of bullet lists). | 2 / 3 |
| Actionability | Every step has concrete, executable bash commands with exact tool names, JSON arguments, and specific assertion values. The MCPCALL helper is fully executable, assertions specify exact expected values (e.g., `maxWidth = 0.5`, `height = "50%"`), and the output format is clearly defined with a summary table template. | 3 / 3 |
| Workflow Clarity | The workflow is clearly sequenced (install → start server → verify connectivity → run suites → cleanup) with explicit validation checkpoints at each step, sleep/wait instructions between async operations, a dedicated recovery section with retry logic, and clear fail-fast conditions (e.g., if server doesn't start in 60s, skip to Step 5). The pre-test setup includes verification of no error logs before proceeding. | 3 / 3 |
| Progressive Disclosure | The content is a monolithic document with no references to external files. The MCPCALL shell function (~30 lines) and the Known Issues section could be split into separate files. However, for a test procedure skill, having everything in one place has some justification. No bundle files are provided, so there's no external structure to leverage. | 2 / 3 |
| Total | | 10 / 12 Passed |
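The fail-fast checkpoint praised above (give the server a bounded window, then skip straight to cleanup) can be sketched as a small polling helper. The function name and the probe command passed to it are illustrative, not part of the skill:

```shell
#!/usr/bin/env sh
# wait_for_server CMD TIMEOUT -- run CMD once per second until it succeeds
# or TIMEOUT seconds elapse. Returns 1 on timeout so the caller can fail
# fast and jump to the cleanup step instead of hanging.
wait_for_server() {
  cmd="$1"
  timeout="$2"
  elapsed=0
  until sh -c "$cmd" >/dev/null 2>&1; do
    elapsed=$((elapsed + 1))
    if [ "$elapsed" -ge "$timeout" ]; then
      return 1
    fi
    sleep 1
  done
}
```

A caller might use it as `wait_for_server "curl -sf http://localhost:8080/" 60 || exit 1` (the URL is an assumed placeholder for the dev server's address).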
Validation
90%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation — 10 / 11 Passed
Validation for skill structure
| Criteria | Description | Result |
|---|---|---|
| frontmatter_unknown_keys | Unknown frontmatter key(s) found; consider removing or moving to metadata | Warning |
| Total | | 10 / 11 Passed |
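The remedy the warning suggests (moving nonstandard keys under metadata) might look like the sketch below; the offending key name here is a hypothetical stand-in, since the report doesn't say which key triggered the warning:

```yaml
# Before: a nonstandard top-level key trips frontmatter_unknown_keys.
# After (sketch): only spec-defined keys stay at the top level.
name: test-ui
description: Test UI system (PanelUI, ScreenSpace) against the poke example using the iwsdk CLI.
metadata:
  owner: ui-team   # hypothetical key moved out of the top level
```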