Run SkyWalking E2E tests locally
63
51%
Does it follow best practices?
Impact
79%
1.58x
Average score across 3 eval scenarios
Advisory
Suggest reviewing before use
Optimize this skill with Tessl
npx tessl skill review --optimize ./.claude/skills/run-e2e/SKILL.md
Quality
Discovery
32%
Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
The description is very terse: it identifies the tool (SkyWalking) and the action (run E2E tests locally) but lacks a 'Use when...' clause, detailed capabilities, and sufficient trigger terms. It would benefit significantly from elaboration on what the skill does and explicit guidance on when Claude should select it.
Suggestions
Add an explicit 'Use when...' clause, e.g., 'Use when the user asks to run, debug, or configure SkyWalking end-to-end tests in a local environment.'
List more specific actions the skill covers, such as setting up test environments, configuring test profiles, interpreting test results, or troubleshooting failures.
Include natural trigger term variations like 'end-to-end tests', 'e2e testing', 'SkyWalking test suite', 'local test execution' to improve matching.
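As a rough illustration of the three suggestions above, a revised frontmatter description might look like the sketch below. This is a hypothetical example, not the skill's actual frontmatter: the `name` value is an assumption inferred from the skill's directory path, and the description wording is assembled from the actions and trigger terms proposed in the suggestions.

```yaml
---
# Hypothetical frontmatter sketch; the name is assumed from ./.claude/skills/run-e2e/
name: run-e2e
# Illustrative wording only: states what the skill does, then when to use it.
description: >-
  Run, debug, and troubleshoot SkyWalking end-to-end (E2E, e2e) tests in a
  local environment, including setting up the test environment and
  interpreting test results. Use when the user asks to run the SkyWalking
  test suite locally or to investigate a failing end-to-end test.
---
```

The folded scalar (`>-`) keeps the description on one logical line while letting it wrap in the file; the 'Use when...' sentence carries the explicit selection guidance that the Completeness dimension scores.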
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | Names the domain (SkyWalking E2E tests) and a single action (run), but does not list multiple concrete actions or elaborate on what running these tests entails. | 2 / 3 |
| Completeness | Provides a brief 'what' (run SkyWalking E2E tests locally) but completely lacks any 'when' clause or explicit trigger guidance, which caps this at a low score per the rubric. | 1 / 3 |
| Trigger Term Quality | Includes relevant keywords like 'SkyWalking', 'E2E tests', and 'locally', which a user might say, but misses common variations such as 'end-to-end', 'integration tests', 'test runner', or specific test commands. | 2 / 3 |
| Distinctiveness / Conflict Risk | 'SkyWalking' is a fairly specific project name which helps distinguish it, but the description is so terse that it could overlap with other testing-related skills without clearer scoping. | 2 / 3 |
| Total | 7 / 12 Passed | |
Implementation
70%
Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
This is a highly actionable and well-sequenced skill for running SkyWalking E2E tests, with excellent concrete commands and thoughtful validation checkpoints. Its main weakness is poor progressive disclosure — the extensive authoring traps and edge cases (steps 7-9) should be split into separate reference files rather than inlined, making the main skill significantly shorter and more scannable. The content is project-specific enough to justify its existence but could be ~40% shorter with better file organization.
Suggestions
Extract steps 7-9 (UI template changes, expected-file authoring, and authoring traps) into a separate reference file like AUTHORING_GUIDE.md and link to it from the main skill with a one-line summary of each topic.
Add a brief 'Quick Reference' or 'TL;DR' section at the top summarizing the core workflow (determine case → check rebuild → run → debug if failed) before diving into details.
The 'Expected-file authoring traps' section (step 9) is extremely detailed with 6+ sub-items — consider restructuring as a checklist or moving to a dedicated TRAPS.md file referenced from the main skill.
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | The skill is quite long (~200+ lines) and includes substantial detail on edge cases and traps (steps 7-9) that, while valuable, make it verbose. Some sections like the OTLP payload drift and nested contains traps are very specialized and could be split into a reference file. However, it avoids explaining basic concepts Claude already knows and most content is project-specific knowledge Claude wouldn't have. | 2 / 3 |
| Actionability | Excellent actionability throughout: every step has concrete, executable bash commands with real paths, specific environment variables, and exact tool invocations. The troubleshooting section provides copy-paste-ready docker and swctl commands, and the common pitfalls table maps broken flags to working alternatives. | 3 / 3 |
| Workflow Clarity | The workflow is clearly sequenced (steps 1-9) with explicit validation checkpoints: step 2 checks if rebuild is needed before running, step 3 requires user confirmation, step 5 has a clear 'do NOT cleanup immediately' directive with ordered debugging steps, and step 6 provides a feedback loop for iterating on verify cases. The 'only cleanup when done debugging' instruction is a good safety checkpoint. | 3 / 3 |
| Progressive Disclosure | The skill is a monolithic wall of text with no references to supporting files for the extensive trap/pitfall documentation in steps 7-9. Steps 8-9 alone contain ~80 lines of detailed edge cases that would be better in a separate TRAPS.md or AUTHORING.md file. The only external reference is to `test/e2e-v2/CLAUDE.md` in step 8. No bundle files are provided to offset this. | 1 / 3 |
| Total | 9 / 12 Passed | |
Validation
90%
Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation — 10 / 11 Passed
Validation for skill structure
| Criteria | Description | Result |
|---|---|---|
| frontmatter_unknown_keys | Unknown frontmatter key(s) found; consider removing or moving to metadata | Warning |
| Total | 10 / 11 Passed | |
bf0fe4b