Browser automation powers web testing, scraping, and AI agent interactions. The difference between a flaky script and a reliable system comes down to understanding selectors, waiting strategies, and anti-detection patterns.
37%
Does it follow best practices?
Impact
—
No eval scenarios have been run
Advisory
Suggest reviewing before use
Optimize this skill with Tessl
npx tessl skill review --optimize ./skills/browser-automation/SKILL.md
Quality
Discovery
32%
Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
The description reads more like an introductory paragraph for a tutorial than a skill description for selection purposes. It lacks concrete actions the skill performs, omits a 'Use when...' clause entirely, and uses an editorial/explanatory tone ('The difference between a flaky script and a reliable system...') rather than directly stating capabilities and triggers.
Suggestions
Add an explicit 'Use when...' clause with trigger scenarios, e.g., 'Use when the user asks about browser automation, Puppeteer, Playwright, Selenium, web scraping, headless browsers, or automated testing.'
Replace the editorial second sentence with concrete actions, e.g., 'Automates browser interactions including clicking elements, filling forms, navigating pages, handling authentication, and extracting page content.'
Include common tool names and file/technology references users would mention (Puppeteer, Playwright, Selenium, Cypress, headless Chrome) to improve trigger term coverage.
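The three suggestions above can be illustrated with a small TypeScript sketch. It is a hypothetical local pre-check, not how Tessl computes the Discovery score: the `checkDescription` helper, the `TOOL_TERMS` list, and the `revised` wording are assumptions assembled from the reviewer's own example sentences. It only tests for a "Use when" clause and for the tool names the review says are missing.

```typescript
// Hypothetical pre-check for a skill description. This is an illustration
// of the reviewer's suggestions, NOT Tessl's scoring logic.

// Tool names the review lists as missing trigger terms.
const TOOL_TERMS = ["Puppeteer", "Playwright", "Selenium", "Cypress", "headless Chrome"];

interface DescriptionCheck {
  hasUseWhen: boolean;    // does the description contain a "Use when" clause?
  missingTerms: string[]; // tool names that never appear in the description
}

function checkDescription(description: string): DescriptionCheck {
  const lower = description.toLowerCase();
  return {
    hasUseWhen: lower.includes("use when"),
    missingTerms: TOOL_TERMS.filter((term) => !lower.includes(term.toLowerCase())),
  };
}

// The description as it appears at the top of this review.
const current =
  "Browser automation powers web testing, scraping, and AI agent interactions. " +
  "The difference between a flaky script and a reliable system comes down to " +
  "understanding selectors, waiting strategies, and anti-detection patterns.";

// One possible rewrite (an assumption), built from the suggested sentences.
const revised =
  "Automates browser interactions including clicking elements, filling forms, " +
  "navigating pages, handling authentication, and extracting page content. " +
  "Use when the user asks about browser automation, Puppeteer, Playwright, " +
  "Selenium, Cypress, headless Chrome, web scraping, or automated testing.";

const currentResult = checkDescription(current);
const revisedResult = checkDescription(revised);

console.log("current:", currentResult);
console.log("revised:", revisedResult);
```

A check like this only catches the mechanical gaps (missing clause, missing keywords); whether the description states concrete actions in a direct tone still needs a human read.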
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | Names the domain (browser automation) and some actions (web testing, scraping, AI agent interactions), and mentions technical concepts (selectors, waiting strategies, anti-detection patterns), but doesn't list concrete specific actions the skill performs; it reads more like a topic introduction than a capability list. | 2 / 3 |
| Completeness | The description addresses 'what' at a high level but completely lacks a 'Use when...' clause or any explicit trigger guidance for when Claude should select this skill. Per the rubric, a missing 'Use when...' clause caps completeness at 2, and the 'what' is also weak, so this scores a 1. | 1 / 3 |
| Trigger Term Quality | Includes some relevant keywords like 'browser automation', 'web testing', 'scraping', 'selectors', and 'anti-detection', but misses common user terms like 'Puppeteer', 'Playwright', 'Selenium', 'headless browser', 'web crawling', 'click', 'navigate', or 'fill form'. | 2 / 3 |
| Distinctiveness Conflict Risk | The mention of browser automation, selectors, and anti-detection patterns provides some distinctiveness, but 'web testing' and 'scraping' are broad enough to overlap with general web development or data extraction skills. | 2 / 3 |
| Total | | 7 / 12 Passed |
Implementation
42%
Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
The skill is highly actionable with excellent, executable code examples covering a wide range of browser automation scenarios. However, it is severely bloated — repeating concepts across Patterns, Sharp Edges, and Validation Checks sections, and including metadata-like content (Capabilities, Scope, When to Use) in the body. The monolithic structure with no progressive disclosure makes it a poor fit for token-efficient context loading.
Suggestions
Eliminate redundancy by consolidating repeated topics (waitForTimeout, stealth, locator priority) into single authoritative sections rather than restating them across Patterns, Sharp Edges, and Validation Checks.
Move detailed patterns (stealth, network interception, parallel execution, error recovery) into separate referenced files (e.g., STEALTH.md, NETWORK.md) and keep SKILL.md as a concise overview with links.
Move Capabilities, Scope, When to Use, Limitations, and Collaboration metadata into YAML frontmatter or a separate metadata file — these are not instructional content.
Add an explicit end-to-end workflow section (e.g., 'Setting up a Playwright test suite from scratch') with numbered steps and validation checkpoints to improve workflow clarity.
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | The skill is extremely verbose at ~600+ lines. It repeats the same concepts multiple times (e.g., waitForTimeout anti-pattern appears in Patterns, Sharp Edges, and Validation Checks; stealth pattern is duplicated similarly; locator priority is stated three times). Sections like Capabilities, Scope, When to Use, and Limitations are metadata that belong in frontmatter, not body content. Much of this content (what CSS selectors are, how iframes work, basic retry patterns) is knowledge Claude already possesses. | 1 / 3 |
| Actionability | The skill provides extensive, executable code examples throughout; Playwright test configs, stealth setup, retry patterns, network interception, parallel execution, and iframe handling are all copy-paste ready with real imports and complete function signatures. Good and bad examples are clearly contrasted. | 3 / 3 |
| Workflow Clarity | Individual patterns are well-explained with clear steps, but there's no overarching workflow connecting them (e.g., 'setting up a new browser automation project' or 'debugging a flaky test'). The Sharp Edges section provides good problem→symptom→fix sequences, but lacks explicit validation checkpoints for multi-step processes like scraping pipelines or test suite setup. | 2 / 3 |
| Progressive Disclosure | This is a monolithic wall of text with no references to external files and no bundle files to support it. All content (patterns, sharp edges, validation checks, collaboration notes) is inlined into a single massive document. Content like the detailed stealth patterns, network interception examples, and validation checks would benefit greatly from being split into separate referenced files. | 1 / 3 |
| Total | | 7 / 12 Passed |
Validation
81%
Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation — 9 / 11 Passed
Validation for skill structure
| Criteria | Description | Result |
|---|---|---|
| skill_md_line_count | SKILL.md is long (1117 lines); consider splitting into references/ and linking | Warning |
| frontmatter_unknown_keys | Unknown frontmatter key(s) found; consider removing or moving to metadata | Warning |
| Total | 9 / 11 Passed | |
8854d4e