Browser automation and E2E testing with Playwright. Auto-detects dev servers, writes clean test scripts. Test pages, fill forms, take screenshots, check responsive design, validate UX, test login flows, check links, automate any browser task. Use for cross-browser testing, visual regression, API testing, component testing in TypeScript/JavaScript and Python projects.
84
81%
Does it follow best practices?
Impact
Pending
No eval scenarios have been run
Advisory
Suggest reviewing before use
Quality
Discovery
100%
Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
This is a strong skill description that covers specific capabilities comprehensively, includes abundant natural trigger terms, and clearly delineates both what the skill does and when to use it. The Playwright focus gives it a distinct identity. The only minor weakness is that 'automate any browser task' is slightly vague/over-claiming, but it's balanced by the many specific actions listed.
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | Lists multiple specific concrete actions: 'test pages, fill forms, take screenshots, check responsive design, validate UX, test login flows, check links, automate any browser task.' Also mentions auto-detecting dev servers and writing clean test scripts. | 3 / 3 |
| Completeness | Clearly answers 'what' (browser automation, E2E testing, screenshots, form filling, etc.) and 'when' with an explicit 'Use for cross-browser testing, visual regression, API testing, component testing in TypeScript/JavaScript and Python projects.' The 'Use for' clause serves as explicit trigger guidance. | 3 / 3 |
| Trigger Term Quality | Excellent coverage of natural terms users would say: 'Playwright', 'E2E testing', 'browser automation', 'screenshots', 'responsive design', 'login flows', 'cross-browser testing', 'visual regression', 'API testing', 'component testing', 'TypeScript', 'JavaScript', 'Python'. These are terms users would naturally use when requesting these capabilities. | 3 / 3 |
| Distinctiveness / Conflict Risk | Clearly scoped to Playwright-based browser automation and E2E testing, which is a distinct niche. The mention of Playwright, browser automation, and specific testing types (visual regression, cross-browser) makes it unlikely to conflict with other skills such as general testing or code generation skills. | 3 / 3 |
| Total | | 12 / 12 Passed |
Implementation
62%
Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
The skill excels at actionability and workflow clarity with concrete, executable examples and a well-defined critical workflow. However, it suffers significantly from verbosity - it inlines extensive Playwright API reference material (selectors, assertions, actions, POM patterns, auth state, network mocking) that Claude already knows and that should either be omitted or moved to the referenced API_REFERENCE.md file. The core workflow (detect servers → write to /tmp → execute via run.js) is the genuinely novel content and could be conveyed in roughly one-third the current length.
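The three-step workflow named above (detect servers, write to /tmp, execute via run.js) could be sketched roughly as follows. This is only an illustration under stated assumptions: the review does not show the skill's actual code, so the port list, the script file name, and the `node run.js <script>` invocation form are all hypothetical.

```python
import socket
import tempfile
from pathlib import Path

# Assumed list of common dev-server ports; the skill's real list is not shown in the review.
CANDIDATE_PORTS = (3000, 5173, 8000, 8080)


def detect_dev_servers(ports=CANDIDATE_PORTS, host="127.0.0.1", timeout=0.2):
    """Return the ports on `host` that accept a TCP connection."""
    open_ports = []
    for port in ports:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            sock.settimeout(timeout)
            # connect_ex returns 0 when the connection succeeds.
            if sock.connect_ex((host, port)) == 0:
                open_ports.append(port)
    return open_ports


def write_script(source, directory=None):
    """Write a generated test script to a temp directory, never the skill directory."""
    directory = Path(directory or tempfile.gettempdir())
    # Hypothetical file name, chosen only for this sketch.
    path = directory / "playwright-test-example.js"
    path.write_text(source, encoding="utf-8")
    return path


def build_command(script_path, runner="run.js"):
    """Build (but do not run) the command that hands the script to the runner."""
    # Assumed invocation form; the review only says scripts are executed via run.js.
    return ["node", runner, str(script_path)]
```

Keeping the command as a list that is built but not executed makes the sketch easy to inspect before anything touches a real browser.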
Suggestions
- Move the Selectors, Assertions, Actions, Network Mocking, Visual Testing, Authentication State, and Page Object Model sections to references/API_REFERENCE.md - these are standard Playwright patterns Claude already knows
- Keep the SKILL.md focused on the novel workflow: server detection, /tmp file writing convention, run.js execution, helper utilities, and custom headers - the parts that are specific to this skill installation
- Remove the E2E Testing Patterns section (running tests, writing tests, configuration), as this is standard Playwright knowledge that doesn't relate to the skill's custom workflow
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | The skill is extremely verbose at ~350+ lines. It includes extensive reference material (selectors, assertions, actions, page object model, authentication state, network mocking, visual testing) that Claude already knows well. These are standard Playwright API patterns documented everywhere - they don't add novel knowledge. The skill tries to be both a workflow guide AND a comprehensive API reference, resulting in significant bloat. | 1 / 3 |
| Actionability | The skill provides fully executable, copy-paste ready code examples throughout - from responsive testing to login flows to broken link checking. Commands are specific with exact syntax, and the workflow steps (detect servers, write to /tmp, execute via run.js) are concrete and executable. | 3 / 3 |
| Workflow Clarity | The critical workflow section clearly sequences the steps: detect dev servers → write scripts to /tmp → use visible browser → parameterize URLs → execute via run.js. It includes decision points (1 server vs multiple vs none) and explicit constraints (never write to skill directory). The troubleshooting section provides error recovery guidance. | 3 / 3 |
| Progressive Disclosure | The skill references `references/API_REFERENCE.md` at the end with clear guidance on when to load it, which is good. However, the main file itself contains extensive inline content (selectors, assertions, actions, POM, auth state, network mocking, visual testing) that should be in that reference file instead. The skill is a monolithic document with content that belongs in separate files. | 2 / 3 |
| Total | | 9 / 12 Passed |
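The decision points noted under Workflow Clarity (one server vs multiple vs none) might look something like the sketch below. The review names the three cases but not what the skill does in each, so the action labels (`use`, `ask_which`, `ask_for_url`) and the `http://localhost:<port>` URL form are assumptions made purely for illustration.

```python
def choose_target(detected_ports):
    """Map the list of detected dev-server ports to a (action, payload) pair.

    The action names are hypothetical; the review only states that the
    workflow branches on none, one, or several detected servers.
    """
    if not detected_ports:
        # No server found: assume the user has to supply a URL.
        return ("ask_for_url", None)
    if len(detected_ports) == 1:
        # Exactly one server: assume it is used directly.
        return ("use", f"http://localhost:{detected_ports[0]}")
    # Several servers: assume the user is asked to pick one.
    return ("ask_which", [f"http://localhost:{p}" for p in detected_ports])
```

Returning the chosen URL as data, rather than hard-coding it in a generated script, matches the review's note that URLs are parameterized.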
Validation
90%
Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation — 10 / 11 Passed
Validation for skill structure
| Criteria | Description | Result |
|---|---|---|
| allowed_tools_field | 'allowed-tools' contains unusual tool name(s) | Warning |
| Total | 10 / 11 Passed | |
88da5ff