jbvc/e2e-testing

Playwright E2E testing patterns, Page Object Model, configuration, CI/CD integration, artifact management, and flaky test strategies.

44

Quality

56%

Does it follow best practices?

Impact

No eval scenarios have been run

Security by Snyk

Passed

No known issues

Quality

Content

57%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

This skill provides excellent, actionable Playwright code examples covering a wide range of E2E testing concerns. However, it suffers from being a monolithic reference document that could benefit significantly from splitting into focused sub-files with a concise overview. Some sections (test report template, Web3 testing) add bulk without being essential to the core skill, and the lack of an overarching workflow connecting the pieces reduces its effectiveness as a guide.

Suggestions

Split into a concise SKILL.md overview with references to separate files: POM.md, CONFIGURATION.md, FLAKY-TESTS.md, CI-CD.md, and optionally WEB3.md for domain-specific patterns.

Add an overarching workflow section at the top: 'When creating a new E2E test: 1. Create spec file in appropriate directory, 2. Build page objects, 3. Write tests with proper waits, 4. Run with --repeat-each=5 to verify stability, 5. Add to CI pipeline.'

Remove the test report template section — Claude can generate report formats on demand without needing a template stored in context.

Trim explanatory comments in good/bad comparisons (e.g., '// Bad: assumes element is ready' is self-evident from the code contrast) to save tokens.
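Step 4 of the suggested workflow (running with `--repeat-each=5`) could be scripted along these lines. This is only a minimal sketch: the helper name `stabilityRunArgs` and the spec path are hypothetical, and the only Playwright detail it relies on is the `--repeat-each` flag quoted in the suggestion.

```typescript
// Hypothetical helper: builds the argument list for a stability run of one spec file.
// The default of 5 repeats mirrors the number used in the suggestion.
function stabilityRunArgs(specPath: string, repeats: number = 5): string[] {
  if (!Number.isInteger(repeats) || repeats < 1) {
    throw new RangeError("repeats must be a positive integer");
  }
  return ["playwright", "test", specPath, `--repeat-each=${repeats}`];
}

// Assumed spec location; substitute the real path of the new test.
const command = ["npx", ...stabilityRunArgs("tests/e2e/login.spec.ts")].join(" ");
console.log(command);
// npx playwright test tests/e2e/login.spec.ts --repeat-each=5
```

A test that passes every repeat is a reasonable candidate for the CI pipeline in step 5; one that fails intermittently should go back to step 3.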

| Dimension | Reasoning | Score |
| --- | --- | --- |
| Conciseness | The skill is mostly efficient with good code examples, but it's quite long (~250 lines) and includes some sections that feel like padding (e.g., the test report template is a markdown template Claude could generate on its own, the Web3/wallet testing section is domain-specific and could be a separate file). Some patterns like the basic POM explanation are things Claude already knows well. | 2 / 3 |
| Actionability | Nearly all guidance is concrete and executable: full TypeScript code examples for POM, test structure, configuration, flaky test patterns with good/bad comparisons, CI/CD YAML, and artifact management. Code is copy-paste ready with realistic patterns. | 3 / 3 |
| Workflow Clarity | Individual sections are clear, but there's no overarching workflow tying the pieces together (e.g., 'when writing a new E2E test, follow these steps...'). The flaky test section has good diagnostic steps (repeat-each, retries) but lacks an explicit validation/feedback loop for the overall test creation process. The content reads more like a reference catalog than a guided workflow. | 2 / 3 |
| Progressive Disclosure | Everything is in a single monolithic file with no references to supporting files, despite the content being long enough to warrant splitting (POM patterns, CI/CD config, Web3 testing, flaky test strategies could each be separate files). The file organization diagram at the top suggests a multi-file structure but the skill itself doesn't leverage progressive disclosure at all. | 1 / 3 |
| Total | | 8 / 12 (Passed) |
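The total in the table is the sum of the four dimension scores, each out of 3. A small sketch of that arithmetic, with the scores copied from the table; the names `contentScores` and `summarize` are illustrative only. The percentage shown for the section is not reproduced here, since the page does not state how it is derived.

```typescript
// Each dimension in the review is scored out of 3.
const MAX_PER_DIMENSION = 3;

// Scores copied from the Content table.
const contentScores: Record<string, number> = {
  "Conciseness": 2,
  "Actionability": 3,
  "Workflow Clarity": 2,
  "Progressive Disclosure": 1,
};

// Sums the dimension scores and computes the maximum attainable total.
function summarize(scores: Record<string, number>): { total: number; max: number } {
  const values = Object.values(scores);
  const total = values.reduce((sum, score) => sum + score, 0);
  return { total, max: values.length * MAX_PER_DIMENSION };
}

const { total, max } = summarize(contentScores);
console.log(`${total} / ${max}`); // 8 / 12
```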

Description

32%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

The description identifies the domain (Playwright E2E testing) and lists relevant topic areas, but it reads more like a table of contents than an actionable skill description. It lacks concrete action verbs describing what the skill does and entirely omits a 'Use when...' clause, making it difficult for Claude to know when to select this skill over others.

Suggestions

Add a 'Use when...' clause with explicit triggers, e.g., 'Use when the user asks about writing Playwright tests, debugging flaky E2E tests, setting up Playwright configuration, or integrating browser tests into CI/CD pipelines.'

Replace topic nouns with concrete action phrases, e.g., 'Generates Playwright E2E tests using Page Object Model patterns, configures Playwright for CI/CD pipelines, manages test artifacts, and diagnoses flaky test failures.'

Include additional natural trigger terms users might say, such as 'end-to-end tests', 'browser testing', 'test automation', 'playwright.config.ts', or 'test retries'.
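The suggestions above can be turned into a quick self-check before resubmitting a description. The sketch below is a hypothetical lint, not the reviewer's actual scoring logic: it only tests for a literal "Use when" phrase and for the trigger terms named in the last suggestion, and the function and type names are assumptions.

```typescript
// Hypothetical report shape for a skill description check.
interface DescriptionReport {
  hasUseWhenClause: boolean;
  missingTriggerTerms: string[];
}

// Trigger terms taken from the suggestion above.
const SUGGESTED_TRIGGER_TERMS: string[] = [
  "end-to-end tests",
  "browser testing",
  "test automation",
  "playwright.config.ts",
  "test retries",
];

// Case-insensitive check for a "Use when" clause and for each trigger term.
function checkDescription(
  description: string,
  triggerTerms: string[] = SUGGESTED_TRIGGER_TERMS,
): DescriptionReport {
  const text = description.toLowerCase();
  return {
    hasUseWhenClause: /\buse when\b/.test(text),
    missingTriggerTerms: triggerTerms.filter((term) => !text.includes(term.toLowerCase())),
  };
}

// The skill's current description, as shown at the top of the page.
const currentDescription =
  "Playwright E2E testing patterns, Page Object Model, configuration, " +
  "CI/CD integration, artifact management, and flaky test strategies.";

const report = checkDescription(currentDescription);
console.log(report.hasUseWhenClause);           // false
console.log(report.missingTriggerTerms.length); // 5
```

A plain substring match is deliberately crude; it flags what is literally absent and leaves the judgment about wording to the author.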

| Dimension | Reasoning | Score |
| --- | --- | --- |
| Specificity | Names the domain (Playwright E2E testing) and lists several topic areas (Page Object Model, configuration, CI/CD integration, artifact management, flaky test strategies), but these are more like categories than concrete actions. No verbs describing what the skill actually does (e.g., 'generates tests', 'configures pipelines'). | 2 / 3 |
| Completeness | Describes the 'what' at a topic level but completely lacks any 'when' guidance: there is no 'Use when...' clause or equivalent explicit trigger guidance. Per the rubric, a missing 'Use when...' clause should cap completeness at 2, and since the 'what' is also weak (topics rather than actions), this scores a 1. | 1 / 3 |
| Trigger Term Quality | Includes relevant keywords like 'Playwright', 'E2E testing', 'Page Object Model', 'CI/CD', and 'flaky test' that users might naturally use. However, it misses common variations like 'end-to-end tests', 'browser testing', 'test automation', '.spec.ts', or 'Playwright config'. | 2 / 3 |
| Distinctiveness / Conflict Risk | Mentioning 'Playwright' specifically helps distinguish it from generic testing skills, but the broad terms like 'CI/CD integration' and 'configuration' could overlap with other CI/CD or testing-related skills. | 2 / 3 |
| Total | | 7 / 12 (Passed) |
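The Completeness reasoning applies a cap: without a 'Use when...' clause the score cannot exceed 2, and a weak 'what' can pull it lower still. A minimal sketch of that rule follows; the idea of a separate "raw" score for the 'what' is an assumption made for illustration, and the function name is hypothetical.

```typescript
// Cap stated in the review's reasoning for descriptions without a "Use when..." clause.
const CAP_WITHOUT_USE_WHEN = 2;

// rawScore: assumed 1-3 rating of how well the description states the "what".
function completenessScore(rawScore: number, hasUseWhenClause: boolean): number {
  return hasUseWhenClause ? rawScore : Math.min(rawScore, CAP_WITHOUT_USE_WHEN);
}

console.log(completenessScore(3, false)); // 2 (capped)
console.log(completenessScore(1, false)); // 1 (already below the cap, as in this review)
console.log(completenessScore(3, true));  // 3 (no cap applies)
```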

Validation

100%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation: 11 / 11 Passed

Validation for skill structure

No warnings or errors.

Reviewed
