Use when writing Playwright tests, fixing flaky tests, debugging failures, implementing Page Object Model, configuring CI/CD, optimizing performance, mocking APIs, handling authentication or OAuth, testing accessibility (axe-core), file uploads/downloads, date/time mocking, WebSockets, geolocation, permissions, multi-tab/popup flows, mobile/responsive layouts, touch gestures, GraphQL, error handling, offline mode, multi-user collaboration, third-party services (payments, email verification), console error monitoring, global setup/teardown, test annotations (skip, fixme, slow), test tags (@smoke, @fast, @critical, filtering with --grep), project dependencies, security testing (XSS, CSRF, auth), performance budgets (Web Vitals, Lighthouse), iframes, component testing, canvas/WebGL, service workers/PWA, test coverage, i18n/localization, Electron apps, or browser extension testing. Covers E2E, component, API, visual, accessibility, security, Electron, and extension testing.
Overall score: 87
Quality: 83% (Does it follow best practices?)
Impact: Pending (no eval scenarios have been run)
Checks: Passed (no known issues)
Discovery
82%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
This description excels at specificity and trigger-term coverage, providing an exhaustive list of Playwright testing scenarios in natural developer terminology. However, it reads more like a feature list than a skill description: although it opens with 'Use when writing Playwright tests...', it lacks an explicit trigger clause of the form 'Use when the user mentions...' that would help Claude decide when to select this skill. The description is also extremely long; while comprehensive, it could be more concise.
Suggestions
Add an explicit 'Use when...' clause at the end, e.g., 'Use when the user mentions Playwright, E2E testing, browser automation, or any of the specific testing scenarios listed above.'
Consider condensing the exhaustive list into categories with key examples, then add explicit trigger guidance for skill selection scenarios.
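Combining both suggestions, a condensed frontmatter description might look like the sketch below. The wording is entirely hypothetical — it collapses the original list into categories and ends with an explicit trigger clause:

```yaml
description: >
  Comprehensive Playwright testing guidance covering E2E, component, API,
  visual, accessibility, security, Electron, and browser-extension testing,
  plus flaky-test debugging, Page Object Model, CI/CD, API mocking, and
  performance budgets. Use when the user mentions Playwright, E2E testing,
  browser automation, or any specific browser-testing scenario.
```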
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | Lists numerous specific, concrete actions, including 'writing Playwright tests', 'fixing flaky tests', 'debugging failures', 'implementing Page Object Model', 'mocking APIs', 'testing accessibility (axe-core)', and many more specific testing scenarios. | 3 / 3 |
| Completeness | The 'what' is extensively covered with detailed capabilities, but the 'when' is only implied through 'Use when writing Playwright tests...' at the start. There is no explicit 'Use when the user mentions...' clause with trigger guidance for skill selection. | 2 / 3 |
| Trigger Term Quality | Excellent coverage of natural terms users would say: 'Playwright tests', 'flaky tests', 'Page Object Model', 'CI/CD', 'OAuth', 'file uploads', 'WebSockets', '@smoke', '@critical', '--grep', 'Lighthouse', 'Web Vitals', 'Electron apps'. These are terms developers naturally use. | 3 / 3 |
| Distinctiveness / Conflict Risk | Highly distinctive, with a clear focus on Playwright specifically. The explicit mention of 'Playwright' throughout and specific testing terminology (axe-core, Page Object Model, --grep flags) make it unlikely to conflict with generic testing or other automation-framework skills. | 3 / 3 |
| Total | | 11 / 12 (Passed) |
Implementation
85%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
This is a well-structured index/navigation skill that excels at organizing a large body of Playwright knowledge into discoverable categories. Its strengths are excellent organization, efficient token usage, and clear progressive disclosure. The main weakness is that actionability is deferred to linked files rather than provided through inline executable examples, though the validation-loop section partially addresses this.
Suggestions
Consider adding 2-3 inline quick-reference code snippets for the most common operations (e.g., basic test structure, common locator patterns) to improve immediate actionability without requiring file navigation
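To make the suggestion concrete, here is one possible shape for such a quick-reference snippet: a minimal Page Object using Playwright's real `getByRole`/`getByTestId` locator names, but backed by a stubbed `PageLike` interface so the pattern is visible without a browser. The stub, the `LoginPage` class, and its selector strings are inventions for this sketch, not part of the skill being reviewed:

```typescript
// Minimal stand-in for Playwright's Locator, for illustration only.
interface Locator {
  selector: string;
}

// Stubbed subset of Playwright's Page, exposing the two locator styles
// most quick-references lead with: role-based and test-id-based.
interface PageLike {
  getByRole(role: string, opts: { name: string }): Locator;
  getByTestId(id: string): Locator;
}

// The Page Object groups locators behind intent-revealing names, so tests
// read as user actions rather than raw selectors.
class LoginPage {
  constructor(private page: PageLike) {}

  get submitButton(): Locator {
    return this.page.getByRole('button', { name: 'Sign in' });
  }

  get emailField(): Locator {
    return this.page.getByTestId('email-input');
  }
}

// Trivial stub implementation so the sketch can be exercised directly.
const stubPage: PageLike = {
  getByRole: (role, opts) => ({ selector: `role=${role}[name="${opts.name}"]` }),
  getByTestId: (id) => ({ selector: `data-testid=${id}` }),
};

const login = new LoginPage(stubPage);
console.log(login.submitButton.selector); // role=button[name="Sign in"]
console.log(login.emailField.selector);   // data-testid=email-input
```

In a real skill file, the stub would be dropped and the same class would take Playwright's actual `Page`; the point of an inline snippet like this is only to show the structure at a glance.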
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | The skill is highly efficient: it is essentially a well-organized reference index with no unnecessary explanations. Every section serves as a navigation aid, and there is no padding or explanation of concepts Claude already knows. | 3 / 3 |
| Actionability | The skill provides clear navigation to reference files and includes a concrete validation loop with executable commands, but the main content is a directory/index rather than directly actionable guidance. The actual executable code and specific examples are deferred to linked files. | 2 / 3 |
| Workflow Clarity | The Test Validation Loop section provides a clear, explicit workflow with validation steps and feedback loops (run → fail → fix → re-run → pass → proceed). The decision tree also provides clear sequencing for choosing the right approach. | 3 / 3 |
| Progressive Disclosure | Excellent progressive disclosure: the skill serves as a clear overview/index with well-organized tables and a decision tree, pointing to one-level-deep reference files. Navigation is intuitive, with activity-based categorization and clear file references. | 3 / 3 |
| Total | | 11 / 12 (Passed) |
Validation
100%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
11 / 11 Passed
Validation for skill structure reported no warnings or errors.
Revision: ef329e7