
test-ui

Test UI system (PanelUI, ScreenSpace) against the poke example using the iwsdk CLI.

52

Quality

58%

Does it follow best practices?

Impact

No eval scenarios have been run

Security by Snyk

Passed

No known issues

Optimize this skill with Tessl

npx tessl skill review --optimize ./.claude/skills/test-ui/SKILL.md

Quality

Discovery

40%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

The description identifies a very specific niche (testing PanelUI/ScreenSpace with iwsdk CLI against the poke example), which makes it distinctive but also quite narrow and jargon-heavy. It lacks an explicit 'Use when...' clause and doesn't enumerate multiple concrete actions, limiting its usefulness for skill selection. The technical specificity helps avoid conflicts but hurts discoverability for users who might phrase requests differently.

Suggestions

Add an explicit 'Use when...' clause, e.g., 'Use when the user wants to run UI tests, verify PanelUI or ScreenSpace behavior, or test against the poke example using iwsdk.'

Include natural language trigger terms a user might say, such as 'run UI tests', 'test panels', 'screen space rendering', 'poke example test'.

Expand the 'what' portion to list specific actions, e.g., 'Runs UI tests for PanelUI and ScreenSpace components, validates rendering output, and reports test results using the iwsdk CLI against the poke example.'
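The suggestions above can be checked mechanically. The following is a minimal sketch, not Tessl's actual scoring logic: the function `check_description` and the trigger-term list are hypothetical, and the revised description is simply the two suggested sentences joined together.

```python
import re


def check_description(description: str, trigger_terms: list[str]) -> dict:
    """Report whether a description has a 'Use when' clause and which trigger terms it lacks.

    Hypothetical helper for illustration; this is not how Tessl scores discovery.
    """
    text = description.lower()
    has_use_when = re.search(r"\buse when\b", text) is not None
    missing = [term for term in trigger_terms if term.lower() not in text]
    return {"has_use_when": has_use_when, "missing_terms": missing}


# The description as it appears at the top of this review.
ORIGINAL = "Test UI system (PanelUI, ScreenSpace) against the poke example using the iwsdk CLI."

# The expanded 'what' sentence plus the suggested 'Use when...' clause.
REVISED = (
    "Runs UI tests for PanelUI and ScreenSpace components, validates rendering output, "
    "and reports test results using the iwsdk CLI against the poke example. "
    "Use when the user wants to run UI tests, verify PanelUI or ScreenSpace behavior, "
    "or test against the poke example using iwsdk."
)

# Assumed trigger terms, taken from the suggestions above.
TERMS = ["run UI tests", "PanelUI", "ScreenSpace", "poke example"]

print(check_description(ORIGINAL, TERMS))
print(check_description(REVISED, TERMS))
```

With these inputs the original description is flagged for the missing clause and the missing phrase 'run UI tests', while the revised one passes both checks.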

| Dimension | Reasoning | Score |
| --- | --- | --- |
| Specificity | Names a specific domain (UI testing with PanelUI, ScreenSpace) and mentions concrete tools (iwsdk CLI, poke example), but doesn't list multiple distinct actions; it's essentially one action: 'test UI system against the poke example.' | 2 / 3 |
| Completeness | Describes what it does (test UI system against poke example) but has no explicit 'Use when...' clause or equivalent trigger guidance. Per the rubric, a missing 'Use when...' clause caps completeness at 2, and the 'what' is also fairly thin, so this scores a 1. | 1 / 3 |
| Trigger Term Quality | Includes some relevant keywords like 'PanelUI', 'ScreenSpace', 'iwsdk CLI', and 'poke example', but these are highly technical/project-specific terms. Missing natural language variations a user might say (e.g., 'run UI tests', 'test panels', 'screen space testing'). | 2 / 3 |
| Distinctiveness / Conflict Risk | The description is highly specific to a particular UI system (PanelUI, ScreenSpace), a specific example (poke), and a specific CLI tool (iwsdk). This is unlikely to conflict with other skills due to its narrow, project-specific scope. | 3 / 3 |
| Total | | 8 / 12 |

Passed

Implementation

77%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

This is a well-crafted test execution skill with highly actionable, concrete commands and specific assertions. The workflow is clearly sequenced with proper validation checkpoints, error recovery procedures, and failure handling. The main weakness is its length—the detailed assertion values for each test suite make it verbose, and some of the 'Known Issues' content is informational rather than directly actionable for the test workflow.

Suggestions

Consider moving the detailed per-suite assertion values into a separate TEST_ASSERTIONS.md reference file, keeping SKILL.md as a concise workflow overview with links to detailed specs.

Trim the 'Known Issues' section to only items that directly affect test execution (e.g., keep 'PanelDocument loading is async' and 'Entity indices change on reload', but remove 'Panel interaction' and 'ScreenSpace re-parenting in XR' which are informational).

| Dimension | Reasoning | Score |
| --- | --- | --- |
| Conciseness | The skill is mostly efficient and avoids explaining concepts Claude already knows, but it's quite long (~180 lines) with some redundancy. The 'Known Issues & Workarounds' section adds useful context, but some items (like 'Panel interaction' and 'ScreenSpace re-parenting in XR') are informational rather than actionable for the test workflow. The repeated 'IMPORTANT' callouts and some verbose assertion descriptions could be tightened. | 2 / 3 |
| Actionability | Every step has concrete, executable CLI commands with exact flags and arguments. Assertions are specific with expected values (e.g., `maxWidth = 0.5`, `height = "50%"`). The commands are copy-paste ready with clear JSON input parameters, and the output format is specified for parsing. | 3 / 3 |
| Workflow Clarity | The workflow is clearly sequenced with numbered steps, explicit validation checkpoints (verify connectivity before testing, check for errors in pre-test setup, poll for server readiness), feedback loops (retry logic in Recovery section), and clear failure handling (skip to Step 5 if server fails). The pre-test setup includes sleep commands and assertion checks before proceeding to test suites. | 3 / 3 |
| Progressive Disclosure | The content is well-structured with clear section headers and logical grouping of test suites, but it's a monolithic document that could benefit from splitting detailed test assertions into a separate reference file. For a skill of this length (~180 lines of dense test specifications), the inline approach makes it harder to navigate. However, no bundle files exist to reference, and the section organization is reasonable. | 2 / 3 |
| Total | | 10 / 12 |

Passed
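The validation checkpoints and failure handling credited under Workflow Clarity (poll for server readiness, retry, skip the suites if the server never comes up) follow a common pattern. The sketch below illustrates that pattern only; it does not reproduce the skill's actual commands, and `wait_until_ready`, `run_workflow`, the probe callable, and the suite names are all hypothetical.

```python
import time
from typing import Callable, Dict


def wait_until_ready(
    probe: Callable[[], bool],
    attempts: int = 5,
    delay_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call probe() up to `attempts` times, sleeping between failures.

    Returns True as soon as probe() succeeds, False if every attempt fails.
    """
    for attempt in range(1, attempts + 1):
        try:
            if probe():
                return True
        except OSError:
            pass  # treat connection errors as "not ready yet"
        if attempt < attempts:
            sleep(delay_s)
    return False


def run_workflow(
    probe: Callable[[], bool],
    suites: Dict[str, Callable[[], str]],
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Run suites only if the server becomes ready; otherwise go straight to the report."""
    results: Dict[str, str] = {}
    if not wait_until_ready(probe, sleep=sleep):
        return {"server": "unreachable", "suites": results}
    for name, suite in suites.items():
        results[name] = suite()
    return {"server": "ready", "suites": results}


# Demo with a fake probe that succeeds on the third call and a no-op sleep.
answers = iter([False, False, True])
report = run_workflow(lambda: next(answers), {"panel-ui": lambda: "pass"}, sleep=lambda _: None)
print(report)
```

Injecting `probe` and `sleep` keeps the retry logic testable without a running dev server or real delays.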

Validation

90%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation: 10 / 11 Passed

Validation for skill structure

| Criteria | Description | Result |
| --- | --- | --- |
| frontmatter_unknown_keys | Unknown frontmatter key(s) found; consider removing or moving to metadata | Warning |
| Total | | 10 / 11 |

Passed
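A check like `frontmatter_unknown_keys` can be approximated locally before resubmitting. The sketch below rests on stated assumptions: the allowed-key set is a guess at what the skill spec permits, the parser only handles simple top-level `key: value` lines, and the `owner` key in the sample is invented, since the review does not say which key triggered the warning.

```python
# Assumed set of recognized frontmatter keys; confirm against the skill spec before relying on it.
ALLOWED_KEYS = {"name", "description", "license", "compatibility", "metadata", "allowed-tools"}


def frontmatter_keys(markdown: str) -> list[str]:
    """Return the top-level keys of a leading '---' delimited frontmatter block."""
    lines = markdown.splitlines()
    if not lines or lines[0].strip() != "---":
        return []
    keys = []
    for line in lines[1:]:
        if line.strip() == "---":
            break  # end of frontmatter
        # Top-level keys start in column 0; indented lines, list items, and comments are skipped.
        if line and not line[0].isspace() and line[0] not in "#-" and ":" in line:
            keys.append(line.split(":", 1)[0].strip())
    return keys


def unknown_keys(markdown: str, allowed: set[str] = ALLOWED_KEYS) -> list[str]:
    """List frontmatter keys that are not in the allowed set."""
    return [key for key in frontmatter_keys(markdown) if key not in allowed]


# Hypothetical SKILL.md; 'owner' is an invented key used only to trigger the warning.
SAMPLE = """---
name: test-ui
description: Test UI system (PanelUI, ScreenSpace) against the poke example using the iwsdk CLI.
owner: hypothetical-team
---
# test-ui
"""

print(unknown_keys(SAMPLE))  # ['owner']
```

Following the warning's own advice, a flagged key would either be removed or moved under `metadata`.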

Repository
facebook/immersive-web-sdk
Reviewed
