Use when writing E2E tests with Playwright, setting up test infrastructure, or debugging flaky browser tests. Invoke to write test scripts, create page objects, configure test fixtures, set up reporters, add CI integration, implement API mocking, or perform visual regression testing. Trigger terms: Playwright, E2E test, end-to-end, browser testing, automation, UI testing, visual testing, Page Object Model, test flakiness.
89
86%
Does it follow best practices?
Impact
90%
1.23x
Average score across 6 eval scenarios
Passed
No known issues
Quality
Discovery
100%
Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
This is a strong skill description that excels across all dimensions. It provides specific concrete actions, comprehensive trigger terms that users would naturally use, explicit 'Use when' and 'Invoke to' clauses covering both what and when, and a clear niche focused on Playwright E2E testing that distinguishes it from other skills.
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | Lists multiple specific concrete actions: write test scripts, create page objects, configure test fixtures, set up reporters, add CI integration, implement API mocking, perform visual regression testing. | 3 / 3 |
| Completeness | Clearly answers both 'what' (write test scripts, create page objects, configure fixtures, etc.) and 'when' ('Use when writing E2E tests with Playwright, setting up test infrastructure, or debugging flaky browser tests') with explicit trigger guidance. | 3 / 3 |
| Trigger Term Quality | Excellent coverage of terms users would naturally use when requesting help with Playwright testing: 'Playwright', 'E2E test', 'end-to-end', 'browser testing', 'automation', 'UI testing', 'visual testing', 'Page Object Model', 'test flakiness'. | 3 / 3 |
| Distinctiveness / Conflict Risk | Clearly scoped to Playwright and E2E browser testing specifically, with distinct triggers like 'Playwright', 'Page Object Model', 'visual regression testing', and 'test flakiness' that are unlikely to conflict with other testing or general coding skills. | 3 / 3 |
| Total | | 12 / 12 Passed |
Implementation
72%
Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
This is a well-structured Playwright skill with strong actionability through complete, executable code examples and excellent progressive disclosure via the reference table. The main weaknesses are some unnecessary filler sections (Knowledge Reference, Output Templates) and a core workflow that lacks explicit validation checkpoints between steps. The constraints (MUST DO / MUST NOT DO) are practical and well-chosen.
Suggestions
Remove the 'Knowledge Reference' keyword list and the vague 'Output Templates' section — they add no actionable value and waste tokens.
Add validation checkpoints to the Core Workflow, e.g., after 'Setup' verify config with `npx playwright test --list`, and after 'Write tests' run a smoke check before CI integration.
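One way to picture the suggested checkpoints is as a short list of gates, each a command that must exit with code 0 before the workflow moves on. The sketch below is illustrative only and is not taken from the skill: the `Checkpoint` type, the `runCheckpoints` helper, and the `@smoke` tag are assumptions. `--list`, `--grep`, and `--repeat-each` are existing Playwright CLI flags, but which gates a project needs is a judgment call.

```typescript
import { spawnSync } from "node:child_process";

// One gate in the workflow: a command that must exit 0 before moving on.
interface Checkpoint {
  afterStep: string; // workflow step this gate follows
  command: string;
  args: string[];
}

// Returns the exit code of a command. Injectable so gates can be dry-run.
type Runner = (command: string, args: string[]) => number;

// Hypothetical gates. The "@smoke" tag is an assumed naming convention.
const checkpoints: Checkpoint[] = [
  // After setup: confirm the config loads and tests are discovered.
  { afterStep: "setup", command: "npx", args: ["playwright", "test", "--list"] },
  // After writing tests: run a small tagged subset before wiring up CI.
  { afterStep: "write tests", command: "npx", args: ["playwright", "test", "--grep", "@smoke"] },
  // After debugging: repeat each test 10 times to check that flakiness is gone.
  { afterStep: "debug", command: "npx", args: ["playwright", "test", "--repeat-each=10"] },
];

// Default runner: executes the command and treats a missing status as failure.
const shellRunner: Runner = (command, args) =>
  spawnSync(command, args, { stdio: "inherit" }).status ?? 1;

// Runs gates in order and stops at the first failure.
// Returns the name of the failing step, or null if every gate passed.
function runCheckpoints(gates: Checkpoint[], run: Runner = shellRunner): string | null {
  for (const gate of gates) {
    if (run(gate.command, gate.args) !== 0) {
      return gate.afterStep;
    }
  }
  return null;
}
```

Calling `runCheckpoints(checkpoints)` in a project with Playwright installed would run the gates for real. Passing a stub runner as the second argument allows a dry run without launching any browser.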
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | Generally efficient but has some unnecessary elements: the 'Knowledge Reference' section at the bottom is just a keyword list that adds no value, the 'Output Templates' section is vague filler, and the opening description line restates what the skill title already conveys. The code examples and constraints are well-justified though. | 2 / 3 |
| Actionability | Provides fully executable TypeScript code examples including a complete Page Object Model class, test file with assertions, debugging workflow with specific CLI commands, and correct/incorrect selector comparisons. All code is copy-paste ready and uses real Playwright APIs. | 3 / 3 |
| Workflow Clarity | The core workflow is listed but is high-level and lacks validation checkpoints. The debugging workflow for flaky tests is well-sequenced with a clear verify step (run 10x), but the main workflow (analyze → setup → write → debug → integrate) has no validation gates between steps. For a skill involving test infrastructure setup, explicit verification steps would strengthen this. | 2 / 3 |
| Progressive Disclosure | Excellent use of a reference table with clear 'Load When' triggers pointing to separate files for selectors, POM, API mocking, configuration, and debugging. The main skill provides a concise overview with actionable examples while deferring detailed guidance to one-level-deep references. | 3 / 3 |
| Total | | 10 / 12 Passed |
Validation
100%
Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation — 11 / 11 Passed
Validation for skill structure
No warnings or errors.
3d95bb1