webapp-testing

Toolkit for interacting with and testing local web applications using Playwright. Supports verifying frontend functionality, debugging UI behavior, capturing browser screenshots, and viewing browser logs.

Quality

73%

Does it follow best practices?

Impact

100%

2.17x

Average score across 7 eval scenarios

Security

Passed

No findings from the security scan

Fix and improve this skill with Tessl

tessl review fix ./skills/anthropic-webapp-testing/SKILL.md

The canonical home for this skill is webapp-testing in anthropics/skills

Quality

Content

65%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

The body is highly actionable with executable code and a useful decision tree, but it is held back by mild verbosity, a missing validation feedback loop, and a Reference Files section pointing to a non-existent examples/ directory.

Suggestions

Add an explicit validation/retry checkpoint to the workflow (e.g., screenshot or assert expected state, and if it fails, adjust selectors and re-run).

Fix the broken examples/ reference — either create the listed example files (element_discovery.py, static_html_automation.py, console_logging.py) or remove the Reference Files section.

Tighten the 'DO NOT read the source' paragraph to a single sentence and correct the 'abslutely' typo.
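One way the suggested validation/retry checkpoint could look is sketched below. It is an illustration, not the skill's actual workflow: the helper names (first_passing_selector, verify_page), the candidate selector list, and the /tmp/state.png screenshot path are all hypothetical. The idea is to assert the expected state, fall back to the next candidate selector when the check fails, and capture a screenshot for inspection if nothing passes.

```python
from typing import Callable, Iterable, Optional, Sequence


def first_passing_selector(
    selectors: Iterable[str],
    check: Callable[[str], bool],
) -> Optional[str]:
    """Try candidate selectors in order; return the first whose check passes."""
    for selector in selectors:
        try:
            if check(selector):
                return selector
        except Exception as exc:
            # A failed attempt is reported, then the next candidate is tried.
            print(f"selector {selector!r} failed: {exc}")
    return None


def verify_page(
    url: str,
    selectors: Sequence[str],
    screenshot_path: str = "/tmp/state.png",  # hypothetical path
) -> Optional[str]:
    """Validate that at least one candidate selector is visible on the page."""
    # Imported lazily so the helper above works without Playwright installed.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        page = browser.new_page()
        page.goto(url)
        page.wait_for_load_state("networkidle")

        def is_visible(selector: str) -> bool:
            return page.locator(selector).first.is_visible()

        found = first_passing_selector(selectors, is_visible)
        if found is None:
            # Nothing matched: save the rendered state so selectors can be adjusted.
            page.screenshot(path=screenshot_path, full_page=True)
        browser.close()
        return found
```

A caller might pass something like verify_page("http://localhost:5173", ["text=Dashboard", "#dashboard"]), where both the URL and the selectors are placeholders, and treat a None result as the signal to inspect the screenshot and re-run with revised selectors.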

Dimension	Reasoning	Score
Conciseness	Mostly efficient with concrete code and a decision tree, but the 'DO NOT read the source ...' paragraph restates the context-window-pollution idea twice and contains a typo ('abslutely'), so it could be tightened.	2 / 3
Actionability	Provides fully executable, copy-paste-ready Playwright code and concrete with_server.py invocations with specific flags, matching the 'fully executable code/commands' anchor.	3 / 3
Workflow Clarity	The decision tree and reconnaissance-then-action pattern give a clear sequence, but there is no explicit validate→fix→retry feedback loop for testing operations, which caps workflow clarity at 2.	2 / 3
Progressive Disclosure	Content is well organized with a one-level-deep Reference Files section, but the referenced examples/ directory does not exist (only scripts/ is present), so the signaled references are partly broken.	2 / 3
	Total	9 / 12 Passed
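The broken examples/ reference noted under Progressive Disclosure is the kind of issue a small check can catch before review. The sketch below is an assumption about how such a check might be written, not part of the review tooling: the function name missing_references and the regular expression are hypothetical, and it only looks for relative paths under examples/ or scripts/ mentioned in the SKILL.md text.

```python
import re
from pathlib import Path
from typing import List

# Hypothetical pattern: relative paths such as examples/foo.py or scripts/bar.py.
REFERENCE_PATTERN = re.compile(r"\b((?:examples|scripts)/[\w./-]+\.\w+)")


def missing_references(skill_md: Path) -> List[str]:
    """Return referenced paths that do not exist next to the given SKILL.md."""
    text = skill_md.read_text(encoding="utf-8")
    root = skill_md.parent
    referenced = sorted(set(REFERENCE_PATTERN.findall(text)))
    return [rel for rel in referenced if not (root / rel).is_file()]


if __name__ == "__main__":
    # Path taken from the review metadata; adjust to the local checkout.
    skill = Path("skills/anthropic-webapp-testing/SKILL.md")
    if skill.is_file():
        for rel in missing_references(skill):
            print(f"missing: {rel}")
```

An empty result would mean every examples/ or scripts/ path mentioned in the file resolves to a real file; any entries returned point at references to create or remove.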

Description

82%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

A strong, specific description with concrete capabilities and natural trigger terms, but it lacks an explicit 'Use when...' clause, which caps completeness. Adding a trigger sentence would round it out.

Suggestions

Add an explicit 'Use when ...' clause naming natural triggers (e.g., 'Use when testing local web apps, verifying frontend behavior, or debugging UI with Playwright').

Include common user phrasings like 'test my web app' or 'browser automation' to broaden natural trigger coverage.

Dimension	Reasoning	Score
Specificity	Lists multiple concrete actions — 'verifying frontend functionality, debugging UI behavior, capturing browser screenshots, and viewing browser logs' — matching the anchor for listing several specific concrete actions.	3 / 3
Completeness	The 'what' is clearly stated but there is no 'Use when...' trigger clause, so 'when' is only implied; per the judging guidelines a missing explicit trigger caps completeness at 2.	2 / 3
Trigger Term Quality	Natural terms a user would say are well covered: 'web applications', 'Playwright', 'frontend functionality', 'UI behavior', 'browser screenshots', and 'browser logs'.	3 / 3
Distinctiveness Conflict Risk	The Playwright/local-web-app niche is distinct and unlikely to trigger for the wrong skill, matching the 'clear niche with distinct triggers' anchor.	3 / 3
	Total	11 / 12 Passed

Validation

100%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation — 16 / 16 Passed

Validation for skill structure

No warnings or errors.

Repository: boisenoise/skills-collections
Path: skills/anthropic-webapp-testing/SKILL.md
Commit: 04c2ce5

Reviewed: about 12 hours ago
