webapp-testing

Toolkit for interacting with and testing local web applications using Playwright. Supports verifying frontend functionality, debugging UI behavior, capturing browser screenshots, and viewing browser logs.

90

Quality: 71% (Does it follow best practices?)

Impact: 100%, 2.17x (average score across 7 eval scenarios)

Security by Snyk: Passed. No known issues.

Optimize this skill with Tessl

npx tessl skill review --optimize ./skills/anthropic-webapp-testing/SKILL.md

Quality

Discovery

57%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

The description adequately identifies its domain (Playwright-based local web app testing) and lists several capabilities, making it reasonably distinctive. However, it lacks an explicit 'Use when...' clause and could benefit from more natural trigger terms that users would actually say when needing this skill. Adding explicit trigger guidance and more user-facing keywords would significantly improve skill selection accuracy.

Suggestions

Add an explicit 'Use when...' clause, e.g., 'Use when the user asks to test a web app, open a browser, take a screenshot of a page, check UI elements, or debug frontend issues on localhost.'

Include more natural trigger terms users would say, such as 'e2e testing', 'end-to-end', 'localhost', 'open browser', 'click button', 'web page', 'check element', '.html'.
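The trigger-term gap described above can be made concrete with a rough sketch. This is not how the review computes its score; it only checks the skill's current description (quoted at the top of this page) against the phrases suggested above. The helper names `missing_terms` and `has_use_when` are hypothetical and exist only for this illustration.

```python
# Current description, copied from the top of this page.
DESCRIPTION = (
    "Toolkit for interacting with and testing local web applications using "
    "Playwright. Supports verifying frontend functionality, debugging UI "
    "behavior, capturing browser screenshots, and viewing browser logs."
)

# Trigger terms suggested by the review.
SUGGESTED_TERMS = [
    "e2e testing", "end-to-end", "localhost", "open browser",
    "click button", "web page", "check element", ".html",
]


def missing_terms(description, terms):
    """Return the terms that do not appear in the description (case-insensitive)."""
    text = description.lower()
    return [term for term in terms if term.lower() not in text]


def has_use_when(description):
    """Return True if the description contains an explicit 'Use when' clause."""
    return "use when" in description.lower()


if __name__ == "__main__":
    print("Missing terms:", missing_terms(DESCRIPTION, SUGGESTED_TERMS))
    print("Has 'Use when' clause:", has_use_when(DESCRIPTION))
```

A plain substring check like this is only a crude proxy for discovery quality, but it shows why the reviewer flags the description: none of the suggested phrases appear in it, and it has no "Use when" clause.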

| Dimension | Reasoning | Score |
| --- | --- | --- |
| Specificity | Names the domain (local web applications, Playwright) and several actions (verifying frontend functionality, debugging UI behavior, capturing screenshots, viewing browser logs), but the actions are somewhat general rather than highly concrete operations. | 2 / 3 |
| Completeness | Clearly describes what the skill does (interacting with and testing local web apps via Playwright, screenshots, logs), but lacks an explicit 'Use when...' clause specifying when Claude should select this skill. Per rubric guidelines, missing 'Use when' caps completeness at 2. | 2 / 3 |
| Trigger Term Quality | Includes useful terms like 'Playwright', 'browser screenshots', 'browser logs', 'frontend', and 'UI behavior', but misses common user phrases like 'test my app', 'open browser', 'click button', 'e2e testing', 'end-to-end', 'localhost', or 'web page'. | 2 / 3 |
| Distinctiveness Conflict Risk | The combination of 'Playwright', 'local web applications', 'browser screenshots', and 'browser logs' creates a clear niche that is unlikely to conflict with other skills. This is distinctly about browser-based testing of local apps. | 3 / 3 |
| Total | | 9 / 12 |

Passed

Implementation

85%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

This is a well-structured skill that provides clear, actionable guidance for web application testing with Playwright. The decision tree is an effective way to guide approach selection, and the executable examples are concrete and useful. Minor verbosity in repeated advice about black-box script usage and some obvious best practices slightly reduce token efficiency.

Suggestions

Remove the redundant 'use bundled scripts as black boxes' bullet from Best Practices since it's already emphasized in the intro paragraph.

Trim Best Practices items that Claude already knows (e.g., 'Always close the browser when done', 'Use sync_playwright() for synchronous scripts') to improve conciseness.

| Dimension | Reasoning | Score |
| --- | --- | --- |
| Conciseness | Mostly efficient but has some redundancy: the 'use scripts as black boxes' advice is repeated in the intro and Best Practices, and the Best Practices section includes some items Claude already knows (close browser, use sync_playwright). The decision tree and examples are well-structured though. | 2 / 3 |
| Actionability | Provides fully executable bash commands for with_server.py, complete Python Playwright code snippets that are copy-paste ready, and concrete examples for both single and multi-server setups. The reconnaissance pattern includes specific method calls. | 3 / 3 |
| Workflow Clarity | The decision tree clearly sequences the approach based on context (static vs dynamic, server running vs not). The reconnaissance-then-action pattern is a clear 3-step workflow. The common pitfall section serves as a validation checkpoint for the critical networkidle wait. For a non-destructive testing skill, this level of workflow clarity is appropriate. | 3 / 3 |
| Progressive Disclosure | The skill provides a concise overview with clear references to examples/ directory listing specific files and their purposes. Helper scripts are referenced but not inlined. Content is well-organized into logical sections with appropriate depth for a SKILL.md. | 3 / 3 |
| Total | | 11 / 12 |

Passed
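The SKILL.md itself is not reproduced on this page, so the following is only a guess at what the reconnaissance-then-action pattern and the networkidle wait mentioned under Workflow Clarity might look like with Playwright's synchronous Python API. The function names `find_label` and `recon_then_act`, the `button` selector, and the screenshot path are assumptions made for illustration, not taken from the skill.

```python
def find_label(labels, keyword):
    """Return the first discovered label containing keyword (case-insensitive), or None."""
    for label in labels:
        if keyword.lower() in label.lower():
            return label
    return None


def recon_then_act(url, keyword, screenshot_path="recon.png"):
    """Inspect the rendered page first, then click a button chosen from what was found."""
    # Imported lazily so find_label stays usable without Playwright installed.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        try:
            page = browser.new_page()
            page.goto(url)
            # Wait for network activity to settle before inspecting a dynamic page.
            page.wait_for_load_state("networkidle")

            # Step 1: reconnaissance (screenshot plus the visible button labels).
            page.screenshot(path=screenshot_path, full_page=True)
            labels = [b.inner_text().strip() for b in page.locator("button").all()]

            # Step 2: choose a target from what was actually discovered.
            target = find_label(labels, keyword)

            # Step 3: act on the chosen target, if any.
            if target is not None:
                page.get_by_role("button", name=target).first.click()
            return target
        finally:
            # Always release the browser, even if a step above fails.
            browser.close()
```

Calling `recon_then_act("http://localhost:5173", "submit")` against a running dev server would be one way to try it; the URL and keyword are placeholders. Keeping the label-matching logic in a pure helper means the selection step can be checked without launching a browser.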

Validation

100%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation: 11 / 11 Passed

Validation for skill structure

No warnings or errors.

Repository
boisenoise/skills-collections
Reviewed
