Test level system (LevelRoot, LevelTag, default lighting, scene hierarchy) against the poke example using the iwsdk CLI.
52
58%: Does it follow best practices?
Impact: not yet scored (no eval scenarios have been run)
Passed: No known issues
Optimize this skill with Tessl
`npx tessl skill review --optimize ./.claude/skills/test-level/SKILL.md`
Quality
Discovery
40%
Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
The description targets a very specific niche (testing level systems with the iwsdk CLI), which gives it strong distinctiveness but limits its accessibility. Its main weaknesses are the lack of explicit 'Use when...' trigger guidance and the absence of natural user-facing keywords that would help Claude match this skill to user requests.
Suggestions
Add an explicit 'Use when...' clause, e.g., 'Use when the user asks to test or validate level systems, scene hierarchies, or lighting setups using the iwsdk CLI.'
Include natural trigger terms a user might say, such as 'level testing', 'scene validation', 'poke test', 'iwsdk test', or 'level hierarchy check'.
Expand the 'what' portion to list more specific actions, e.g., 'Validates LevelRoot and LevelTag configurations, checks default lighting setup, verifies scene hierarchy structure, and runs tests against the poke example.'
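Taken together, the three suggestions above might produce frontmatter along the following lines. This is only a sketch: the `name` value is assumed from the skill's directory (`test-level`), and the `description` wording is assembled from the suggestions rather than taken from the actual SKILL.md.

```markdown
---
# Assumed from the path ./.claude/skills/test-level/SKILL.md; the real value may differ.
name: test-level
# Illustrative wording assembled from the suggestions above, not the skill's actual description.
description: >-
  Validates LevelRoot and LevelTag configurations, checks default lighting setup,
  verifies scene hierarchy structure, and runs tests against the poke example
  using the iwsdk CLI. Use when the user asks to test or validate level systems,
  scene hierarchies, or lighting setups, e.g. "level testing", "scene validation",
  "poke test", "iwsdk test", or "level hierarchy check".
---
```

The first sentence covers the "what" with several distinct actions, and the second supplies the explicit "Use when..." trigger together with natural user phrasings, which are the two gaps the Completeness and Trigger Term Quality rows point at.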
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | The description names a specific domain (test level system) and mentions concrete components (LevelRoot, LevelTag, default lighting, scene hierarchy) and a specific tool (iwsdk CLI), but the actions are limited to 'test...against the poke example' without listing multiple distinct concrete actions. | 2 / 3 |
| Completeness | The description addresses 'what' (test level system components against the poke example) but completely lacks a 'Use when...' clause or any explicit trigger guidance for when Claude should select this skill. Per the rubric, a missing 'Use when...' clause caps completeness at 2, and the 'what' itself is also somewhat weak, so this scores a 1. | 1 / 3 |
| Trigger Term Quality | Includes some relevant technical keywords like 'LevelRoot', 'LevelTag', 'iwsdk CLI', 'scene hierarchy', and 'poke example', but these are highly domain-specific jargon. A user might naturally say 'test level system' or 'iwsdk', but common variations or more natural phrasings are missing. | 2 / 3 |
| Distinctiveness / Conflict Risk | The description is highly specific to a particular domain (iwsdk CLI, LevelRoot, LevelTag, poke example), making it very unlikely to conflict with other skills. It occupies a clear, narrow niche. | 3 / 3 |
| Total | | 8 / 12 Passed |
Implementation
77%
Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
This is a well-structured, highly actionable test procedure skill with clear sequential steps, explicit assertions, and robust error recovery. Its main weakness is length — the document is quite long and could be more concise by trimming explanatory notes and potentially splitting test suites into referenced files. The Known Issues section adds useful operational context but partially duplicates information already embedded in the test assertions.
Suggestions
Trim inline explanations that repeat assertion logic (e.g., remove 'The LevelSystem enforces identity transform on the level root every frame' since the test itself demonstrates this).
Consider moving the Known Issues section to a separate KNOWN_ISSUES.md file and referencing it, to reduce the main skill's token footprint.
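If the Known Issues section were moved out as suggested, the main skill file could keep a short pointer in its place. The excerpt below is a sketch under assumptions: the heading level and the relative link are guesses, and `KNOWN_ISSUES.md` is the filename proposed above, not a file that currently exists in the bundle.

```markdown
## Known Issues

See [KNOWN_ISSUES.md](KNOWN_ISSUES.md) for operational caveats that are not already covered by the test assertions.
```

Keeping only the pointer inline means the details are loaded when an agent actually needs them, which is the token-footprint saving the suggestion is after.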
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | The skill is fairly efficient but includes some unnecessary explanations (e.g., 'The LevelSystem enforces identity transform on the level root every frame', 'Entity 0 wraps the Three.js Scene object') that Claude could infer or doesn't need inline. The Known Issues section adds useful context but some items repeat what's already stated in the test assertions. | 2 / 3 |
| Actionability | Every step provides concrete, executable CLI commands with specific assertion criteria (exact values, entity counts, component names). The commands are copy-paste ready with clear placeholders like `<root>` that are defined in prior steps, and expected outputs are precisely specified. | 3 / 3 |
| Workflow Clarity | The workflow is clearly sequenced with explicit validation checkpoints (verify connectivity before testing, check for errors before proceeding, reload/sleep between steps). It includes a recovery section with retry logic, failure handling at each stage, and clear instructions to report FAIL and skip ahead when prerequisites fail. | 3 / 3 |
| Progressive Disclosure | The content is a long monolithic document (~180 lines) that could benefit from splitting test suites into separate files or referencing external test definitions. However, for a test procedure skill, having everything inline is somewhat justified since it needs to be followed sequentially. No bundle files are provided to offload content to. | 2 / 3 |
| Total | | 10 / 12 Passed |
Validation
90%
Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation — 10 / 11 Passed
Validation for skill structure
| Criteria | Description | Result |
|---|---|---|
| frontmatter_unknown_keys | Unknown frontmatter key(s) found; consider removing or moving to metadata | Warning |
| Total | 10 / 11 Passed | |
e8f9df1