e2e-testing-patterns

Master end-to-end testing with Playwright and Cypress to build reliable test suites that catch bugs, improve confidence, and enable fast deployment. Use when implementing E2E tests, debugging flaky tests, or establishing testing standards.

Quality: 70% (Does it follow best practices?)

Impact: 93%, 1.27x (average score across 3 eval scenarios)

Security: Passed. No known issues.

Optimize this skill with Tessl

npx tessl skill review --optimize ./tests/ext_conformance/artifacts/agents-wshobson/developer-essentials/skills/e2e-testing-patterns/SKILL.md

Quality

Discovery

89%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

This is a solid skill description that clearly identifies its domain (E2E testing with Playwright and Cypress) and includes an explicit 'Use when' clause with relevant trigger scenarios. Its main weakness is that the capability descriptions lean toward aspirational outcomes ('catch bugs, improve confidence, enable fast deployment') rather than listing concrete technical actions the skill enables. The trigger terms are well-chosen and naturally match what users would say.

Suggestions

Replace outcome-oriented phrases like 'catch bugs, improve confidence, and enable fast deployment' with concrete actions such as 'write page object models, configure test retries, set up CI integration, handle test selectors'.

Dimension	Reasoning	Score
Specificity	Names the domain (end-to-end testing) and tools (Playwright, Cypress), and mentions some actions like 'build reliable test suites', 'debugging flaky tests', 'establishing testing standards', but these are somewhat high-level rather than listing multiple concrete specific actions like 'write page object models, configure test retries, set up parallel execution'.	2 / 3
Completeness	Clearly answers both 'what' (build reliable test suites with Playwright and Cypress) and 'when' with an explicit 'Use when' clause covering implementing E2E tests, debugging flaky tests, and establishing testing standards.	3 / 3
Trigger Term Quality	Includes strong natural keywords users would say: 'Playwright', 'Cypress', 'E2E tests', 'flaky tests', 'testing standards', 'end-to-end testing'. These cover common variations of how users would describe their needs in this domain.	3 / 3
Distinctiveness Conflict Risk	The combination of specific frameworks (Playwright, Cypress) and the E2E testing focus creates a clear niche. Terms like 'flaky tests', 'E2E tests', and the named tools make it unlikely to conflict with unit testing, API testing, or other skill types.	3 / 3
	Total	11 / 12 Passed

Implementation

50%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

The skill provides excellent, executable code examples across both Playwright and Cypress, making it highly actionable. However, it is significantly bloated with conceptual explanations Claude doesn't need (testing pyramid, what to test and what not to test, generic best practices), and the monolithic structure, with ~400 lines of inline content, undermines both conciseness and progressive disclosure. The bundle files it references are not provided, so those references cannot be verified.

Suggestions

Remove the 'Core Concepts' section entirely (testing pyramid, what to test/not test, test philosophy) — Claude already knows these fundamentals. This alone would cut ~40 lines.

Move Cypress patterns and Advanced patterns into separate referenced files to reduce the main SKILL.md to a concise overview with Playwright as the primary example.

Add an explicit workflow sequence for setting up E2E testing from scratch (e.g., 1. Configure → 2. Write first test → 3. Run and validate → 4. Add to CI → 5. Debug failures) with validation checkpoints.

Consolidate the 'Best Practices' and 'Common Pitfalls' sections into a single compact checklist — they currently overlap significantly (e.g., selectors, flaky tests, test independence are mentioned in both).
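The workflow sequence suggested above (Configure, write first test, run and validate, add to CI, debug failures) could be captured as data, so each step carries its own validation checkpoint. The following is only a minimal sketch under stated assumptions: the step names come from the suggestion, while the `WorkflowStep` type, the `renderChecklist` helper, the choice of Playwright CLI commands, and the checkpoint wording are hypothetical and not taken from the skill itself.

```typescript
// Hypothetical sketch: step names follow the reviewer's suggestion; the
// commands and checkpoints are assumptions chosen for a Playwright project.
interface WorkflowStep {
  name: string;
  command: string;    // CLI command to run at this step
  checkpoint: string; // condition to confirm before moving on
}

const e2eSetupWorkflow: WorkflowStep[] = [
  {
    name: "Configure",
    command: "npm init playwright@latest",
    checkpoint: "a Playwright config file exists in the project root",
  },
  {
    name: "Write first test",
    command: "npx playwright test --list",
    checkpoint: "the new spec appears in the listed tests",
  },
  {
    name: "Run and validate",
    command: "npx playwright test",
    checkpoint: "the run exits with code 0 locally",
  },
  {
    name: "Add to CI",
    command: "npx playwright install --with-deps && npx playwright test",
    checkpoint: "the pipeline passes on a clean runner",
  },
  {
    name: "Debug failures",
    command: "npx playwright show-report",
    checkpoint: "the failing step is identified in the HTML report",
  },
];

// Render the steps as a numbered checklist, one line per step.
function renderChecklist(steps: WorkflowStep[]): string {
  return steps
    .map(
      (step, index) =>
        `${index + 1}. ${step.name}: run "${step.command}"; checkpoint: ${step.checkpoint}`
    )
    .join("\n");
}

console.log(renderChecklist(e2eSetupWorkflow));
```

Keeping the checkpoint next to the command makes it harder to advance past a step that has not actually been verified, which is the gap the suggestion points at.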

Dimension	Reasoning	Score
Conciseness	Significant verbosity throughout: the 'When to Use This Skill' list, 'Core Concepts' section explaining the testing pyramid, 'What to Test / What NOT to Test' lists, and 'Best Practices' bullet points all explain concepts Claude already knows well. The ASCII testing pyramid and philosophical guidance add no actionable value. The document is ~400 lines when it could be under 150.	1 / 3
Actionability	The code examples are fully executable and copy-paste ready: complete Playwright config, Page Object Model implementation, fixtures, waiting strategies, network mocking, Cypress commands, visual regression, accessibility testing, and debugging commands are all concrete and specific.	3 / 3
Workflow Clarity	Individual patterns are clear, but there's no overarching workflow for setting up an E2E test suite from scratch. The debugging section provides a sequence but lacks validation checkpoints. There's no feedback loop for when tests fail in CI or guidance on iterating from flaky to stable tests.	2 / 3
Progressive Disclosure	The Resources section references several supporting files (references/playwright-best-practices.md, scripts/test-analyzer.ts, etc.), but no bundle files are provided, so these references are unverifiable. The main document itself is monolithic with extensive inline content that could be split into referenced files (e.g., Cypress patterns, advanced patterns).	2 / 3
	Total	8 / 12 Passed
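The Workflow Clarity row notes that there is no guidance on iterating from flaky to stable tests. One possible starting point, sketched here as an assumption rather than as anything the skill prescribes, is to run a test repeatedly and classify it by pass rate before deciding how to fix it. The `measureFlakiness` helper, the `FlakinessReport` shape, and the three verdict labels are hypothetical names introduced for illustration; the code is framework-agnostic and does not call Playwright or Cypress.

```typescript
// Hypothetical helper: repeats an async test body and reports how often it passes.
type TestFn = () => Promise<void>;

interface FlakinessReport {
  runs: number;
  passes: number;
  failures: number;
  passRate: number;
  verdict: "stable" | "flaky" | "broken";
}

async function measureFlakiness(test: TestFn, runs = 10): Promise<FlakinessReport> {
  if (runs < 1) {
    throw new Error("runs must be at least 1");
  }
  let passes = 0;
  for (let i = 0; i < runs; i++) {
    try {
      await test();
      passes++;
    } catch {
      // A thrown error or rejected promise counts as a failed run.
    }
  }
  const failures = runs - passes;
  // All passes: stable. All failures: broken. Anything in between: flaky.
  const verdict: FlakinessReport["verdict"] =
    failures === 0 ? "stable" : passes === 0 ? "broken" : "flaky";
  return { runs, passes, failures, passRate: passes / runs, verdict };
}

// Usage: a stand-in test body that fails on every fourth call.
let demoCalls = 0;
const demoTest: TestFn = async () => {
  demoCalls++;
  if (demoCalls % 4 === 0) {
    throw new Error("simulated intermittent failure");
  }
};

measureFlakiness(demoTest, 8).then((report) => {
  console.log(`${report.passes}/${report.runs} passed, verdict: ${report.verdict}`);
});
```

In a Playwright project, a comparable signal can be obtained without custom code by running `npx playwright test --repeat-each=10`; the sketch above only illustrates the classification step that would feed a fix-and-rerun loop.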

Validation

90%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation — 10 / 11 Passed

Validation for skill structure

Criteria	Description	Result
skill_md_line_count	SKILL.md is long (545 lines); consider splitting into references/ and linking	Warning
	Total	10 / 11 Passed

Repository: Dicklesworthstone/pi_agent_rust
Commit: 99da384

Reviewed: 4 days ago
