
jbvc/e2e-testing

Playwright E2E testing patterns, Page Object Model, configuration, CI/CD integration, artifact management, and flaky test strategies.

Quality: 56%

Does it follow best practices?

Impact: Pending

No eval scenarios have been run.

Security (by Snyk): Passed

No known issues.


Quality

Discovery

32%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

The description identifies the domain (Playwright E2E testing) and lists relevant topic areas, but reads more like a table of contents than an actionable skill description. It lacks concrete action verbs describing what the skill does and entirely omits 'when to use' guidance, making it difficult for Claude to reliably select this skill from a large pool.

Suggestions

Add a 'Use when...' clause with explicit triggers, e.g., 'Use when the user asks about writing Playwright tests, debugging flaky E2E tests, setting up browser test automation, or configuring Playwright in CI/CD pipelines.'

Replace topic nouns with concrete action phrases, e.g., 'Generates Playwright E2E tests using Page Object Model patterns, configures playwright.config.ts, sets up test artifact collection, and diagnoses flaky test failures.'

Include common user-facing keyword variations such as 'end-to-end tests', 'browser testing', 'test automation', '.spec.ts', and 'playwright.config' to improve trigger term coverage.
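The three suggestions above can be checked mechanically. The following is a minimal sketch, assuming a simple substring check; it is not the registry's actual scoring logic, and `checkDescription`, `TRIGGER_TERMS`, and the `revised` wording are hypothetical names and text introduced only for illustration.

```typescript
// Hypothetical lint for a skill description; NOT the registry's real scoring logic.
interface DescriptionCheck {
  hasUseWhenClause: boolean;
  missingTriggerTerms: string[];
}

// Keyword variations named in the third suggestion above.
const TRIGGER_TERMS: string[] = [
  "end-to-end tests",
  "browser testing",
  "test automation",
  ".spec.ts",
  "playwright.config",
];

function checkDescription(description: string): DescriptionCheck {
  const text = description.toLowerCase();
  return {
    // A plain substring test; a real check might accept equivalent phrasings.
    hasUseWhenClause: text.includes("use when"),
    missingTriggerTerms: TRIGGER_TERMS.filter((term) => !text.includes(term)),
  };
}

// The description as currently published.
const current =
  "Playwright E2E testing patterns, Page Object Model, configuration, " +
  "CI/CD integration, artifact management, and flaky test strategies.";

// One possible rewrite (assumed wording) combining the three suggestions.
const revised =
  "Generates Playwright E2E tests using Page Object Model patterns, " +
  "configures playwright.config.ts, sets up test artifact collection, " +
  "and diagnoses flaky test failures. Use when the user asks about " +
  "end-to-end tests, browser testing, test automation, or .spec.ts files.";

console.log(checkDescription(current));
console.log(checkDescription(revised));
```

Under this check the current description has no 'Use when' clause and misses all five keyword variations, while the revised wording covers them. Substring matching is deliberately crude; it only illustrates the kind of gap the reviewer is pointing at.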

Dimension | Reasoning | Score

Specificity

Names the domain (Playwright E2E testing) and lists several topic areas (Page Object Model, configuration, CI/CD integration, artifact management, flaky test strategies), but these are more like categories than concrete actions. No verbs describing what the skill actually does (e.g., 'generates tests', 'configures pipelines').

2 / 3

Completeness

Describes the 'what' at a topic level but gives no guidance on when Claude should use it. There is no 'Use when...' clause or equivalent explicit trigger guidance, which per the rubric caps completeness at 2. Because the 'what' is also weak (topics rather than actions), this scores a 1.

1 / 3

Trigger Term Quality

Includes relevant keywords like 'Playwright', 'E2E testing', 'Page Object Model', 'CI/CD', and 'flaky test' that users might naturally mention. However, it misses common variations like 'end-to-end tests', 'browser testing', 'test automation', '.spec.ts', or 'playwright.config'.

2 / 3

Distinctiveness Conflict Risk

The mention of 'Playwright' specifically helps distinguish it from generic testing skills, but terms like 'E2E testing', 'CI/CD integration', and 'configuration' are broad enough to overlap with other testing or CI/CD-related skills.

2 / 3

Total: 7 / 12 (Passed)

Implementation

57%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

This skill provides highly actionable, executable Playwright patterns with strong code examples covering POM, configuration, CI/CD, and flaky test handling. However, it is a monolithic document that covers too many topics inline, with no progressive disclosure and no clear overarching workflow. Moving domain-specific sections (Web3, financial flows) into separate files and adding a sequenced getting-started workflow would significantly improve it.

Suggestions

Add a brief 'Quick Start' section at the top that sequences the key steps (configure → create page objects → write tests → run → review artifacts) with validation checkpoints.

Split domain-specific sections (Wallet/Web3 Testing, Financial/Critical Flow Testing) and reference-heavy sections (Artifact Management, CI/CD Integration) into separate linked files to improve progressive disclosure.

Remove the Test Report Template section — Claude can generate report templates without explicit instruction, and this consumes tokens without adding unique value.

Add a validation step after the CI/CD section, e.g., 'Verify the workflow runs: check GitHub Actions tab for green status and confirm artifacts are uploaded.'
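To make the suggested Quick Start sequence more concrete, here is a minimal sketch of its "create page objects, then write tests" steps with a validation checkpoint. It is not taken from the skill under review: `LoginPage`, `FakePage`, the selectors, and the `/login` route are all hypothetical. `PageLike` is an assumed stand-in for the few Playwright `Page` methods used here (`goto`, `fill`, `click`), so the sketch runs without a browser; in a real suite the page object would receive Playwright's `Page`.

```typescript
// Structural stand-in for the subset of Playwright's `Page` used below (assumption).
interface PageLike {
  goto(url: string): Promise<unknown>;
  fill(selector: string, value: string): Promise<unknown>;
  click(selector: string): Promise<unknown>;
}

// Hypothetical page object: selectors and route are illustrative only.
class LoginPage {
  private readonly page: PageLike;

  constructor(page: PageLike) {
    this.page = page;
  }

  async open(): Promise<void> {
    await this.page.goto("/login");
  }

  async login(email: string, password: string): Promise<void> {
    await this.page.fill("#email", email);
    await this.page.fill("#password", password);
    await this.page.click("button[type=submit]");
  }
}

// Recording fake used as the checkpoint: it captures each call in order
// so the page object can be verified before any browser is involved.
class FakePage implements PageLike {
  readonly calls: string[] = [];

  async goto(url: string): Promise<void> {
    this.calls.push(`goto ${url}`);
  }

  async fill(selector: string, _value: string): Promise<void> {
    this.calls.push(`fill ${selector}`);
  }

  async click(selector: string): Promise<void> {
    this.calls.push(`click ${selector}`);
  }
}

async function demo(): Promise<string[]> {
  const fake = new FakePage();
  const loginPage = new LoginPage(fake);
  await loginPage.open();
  await loginPage.login("user@example.com", "not-a-real-password");
  return fake.calls;
}

void demo().then((calls) => console.log(calls.join("\n")));
```

Keeping selectors inside the page object means a test reads as a sequence of user actions, and a checkpoint like the recorded call list gives each step of the workflow something explicit to verify.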

Dimension | Reasoning | Score

Conciseness

The skill is mostly efficient with good code examples, but it's quite long (~250 lines) and includes sections like Wallet/Web3 testing and Financial/Critical Flow testing that are domain-specific and could be split out. The test report template section is boilerplate Claude can generate on its own. Some sections like the file organization tree are useful but the overall document could be tightened.

2 / 3

Actionability

Excellent actionability throughout — every section provides fully executable TypeScript code, complete configuration files, working CI/CD YAML, and concrete bash commands. The code examples are copy-paste ready with realistic patterns like POM classes, config files, and GitHub Actions workflows.

3 / 3

Workflow Clarity

While individual sections are clear, there's no overarching workflow tying the pieces together (e.g., 'first set up config, then create page objects, then write tests, then run and validate'). The flaky test section has good before/after patterns but lacks explicit validation checkpoints. For a skill involving test infrastructure setup, a sequenced workflow with verification steps would strengthen this.

2 / 3

Progressive Disclosure

This is a monolithic wall of content — everything is inline in a single file with no references to external files for detailed topics. The Wallet/Web3 testing, CI/CD integration, artifact management, and flaky test patterns could each be separate referenced documents. There's no quick-start overview followed by pointers to deeper content.

1 / 3

Total: 8 / 12 (Passed)

Validation

100%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation: 11 / 11 Passed

Validation for skill structure

No warnings or errors.

Reviewed
