Complete browser automation with Playwright. Auto-detects dev servers, writes clean test scripts to /tmp. Test pages, fill forms, take screenshots, check responsive design, validate UX, test login flows, check links, automate any browser task. Use when user wants to test websites, automate browser interactions, validate web functionality, or perform any browser-based testing. Do NOT use for quick page debugging or network inspection (use chrome-devtools instead).
85
81%
Does it follow best practices?
Impact
Pending
No eval scenarios have been run
Advisory
Suggest reviewing before use
Quality
Discovery
100%
Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
This is an excellent skill description that covers all key dimensions well. It lists numerous specific actions, includes natural trigger terms users would use, explicitly states both what the skill does and when to use it, and even includes a negative boundary to prevent conflicts with a related skill (chrome-devtools). The description is comprehensive yet concise.
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | Lists multiple specific concrete actions: auto-detects dev servers, writes test scripts to /tmp, test pages, fill forms, take screenshots, check responsive design, validate UX, test login flows, check links, automate browser tasks. | 3 / 3 |
| Completeness | Clearly answers both 'what' (browser automation with Playwright, specific actions listed) and 'when' (explicit 'Use when user wants to test websites, automate browser interactions, validate web functionality'). Also includes a 'Do NOT use' clause for additional clarity. | 3 / 3 |
| Trigger Term Quality | Excellent coverage of natural terms users would say: 'test websites', 'browser automation', 'Playwright', 'screenshots', 'responsive design', 'login flows', 'fill forms', 'check links', 'browser interactions'. Also includes a negative trigger distinguishing from chrome-devtools. | 3 / 3 |
| Distinctiveness Conflict Risk | Clearly carved niche around Playwright browser automation with explicit boundary against chrome-devtools for quick debugging/network inspection. The mention of writing scripts to /tmp and the specific tool name 'Playwright' make it highly distinctive. | 3 / 3 |
| Total | | 12 / 12 Passed |
Implementation
62%
Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
The skill provides excellent actionability with fully executable code examples and a clear workflow for server detection and script execution. However, it is significantly over-verbose with redundant patterns (responsive design shown twice, screenshots in multiple places), repeated instructions across sections, and too much inline content that should be split into reference files. The core workflow is strong but buried under excessive examples.
Suggestions
Move the 6+ common patterns (responsive, login, form, broken links, screenshot, etc.) to a separate PATTERNS.md or EXAMPLES.md file, keeping only 1-2 representative examples inline in SKILL.md
Remove the 'Notes' section entirely as it repeats information from 'How It Works' and 'Tips'; consolidate the 'Tips' section to remove duplicates of rules already stated in the critical workflow
Merge the two responsive design examples into one and remove the duplicate screenshot pattern to eliminate redundancy
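As a rough illustration of the last suggestion, a single merged responsive pattern could loop over a viewport list and take one screenshot per size. This is only a sketch: `captureResponsive`, the `VIEWPORTS` values, and the file naming are hypothetical and not taken from the skill. The only Playwright calls assumed are `page.setViewportSize` and `page.screenshot`.

```javascript
// Viewport sizes are illustrative assumptions, not values from the skill.
const VIEWPORTS = [
  { name: 'mobile', width: 375, height: 667 },
  { name: 'tablet', width: 768, height: 1024 },
  { name: 'desktop', width: 1920, height: 1080 },
];

// `page` is expected to be a Playwright Page that has already navigated
// to the target URL. Only setViewportSize and screenshot are called on it.
async function captureResponsive(page, outDir = '/tmp') {
  const paths = [];
  for (const { name, width, height } of VIEWPORTS) {
    // Resize first, then capture, so each screenshot reflects its viewport.
    await page.setViewportSize({ width, height });
    const path = `${outDir}/responsive-${name}.png`;
    await page.screenshot({ path, fullPage: true });
    paths.push(path);
  }
  return paths;
}
```

Because one loop covers both the responsive check and the screenshot capture, a pattern like this would replace the duplicated examples the review points out rather than add a third.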
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | The skill is extremely verbose at ~300+ lines. Many patterns are near-duplicates (responsive design appears twice, screenshot appears multiple times). The 'How It Works', 'Notes', and 'Example Usage' sections repeat information already covered. The tips section restates rules from the critical workflow. Claude doesn't need this much repetition or the explanatory 'Notes' section. | 1 / 3 |
| Actionability | All code examples are fully executable, copy-paste ready JavaScript with proper imports, async patterns, and concrete selectors. Commands are specific with exact syntax for execution, environment variables, and setup. The patterns cover a wide range of real tasks with complete, runnable scripts. | 3 / 3 |
| Workflow Clarity | The critical workflow is clearly numbered (1-4) with explicit decision points (1 server vs multiple vs none). The execution pattern has clear Step 1/2/3 sequencing. Error handling patterns are shown with try-catch-finally. The server detection step serves as a validation checkpoint before writing test code. | 3 / 3 |
| Progressive Disclosure | There is a good reference to API_REFERENCE.md for advanced usage, and helpers are summarized with a pointer to the full file. However, the main SKILL.md contains far too much inline content - the 6+ common patterns could be in a separate PATTERNS.md or EXAMPLES.md file, keeping the main skill lean. The inline content is a wall of code examples. | 2 / 3 |
| Total | | 9 / 12 Passed |
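The decision points praised under Workflow Clarity (one server, multiple servers, or none) can be sketched as a small pure function. The review does not say what the skill does in each branch, so the action names and the `{ url }` server shape below are assumptions made for illustration, and `decideServerAction` is a hypothetical helper rather than part of the skill.

```javascript
// Hypothetical helper: the action names and return shape are illustrative
// assumptions, not the skill's actual behavior.
// `servers` is an array of detected dev servers, each shaped like { url }.
function decideServerAction(servers) {
  if (servers.length === 0) {
    // Nothing detected: assume the agent asks the user for a URL.
    return { action: 'ask-for-url' };
  }
  if (servers.length === 1) {
    // Exactly one candidate: assume it is used without prompting.
    return { action: 'use', url: servers[0].url };
  }
  // Several candidates: assume the agent asks the user which one to test.
  return { action: 'ask-which', choices: servers.map((s) => s.url) };
}
```

Keeping the branch logic separate from the detection itself means this checkpoint can be verified before any test code is written, which is the property the review credits the workflow with.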
Validation
100%
Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation — 11 / 11 Passed
Validation for skill structure
No warnings or errors.
906a57d