
browser

Web browser automation with AI-optimized snapshots for claude-flow agents

Browser Automation Skill

Web browser automation using agent-browser with AI-optimized snapshots. Snapshots address elements by short refs (@e1, @e2) instead of returning the full DOM, which reduces context by 93%.

Core Workflow

# 1. Navigate to page
agent-browser open <url>

# 2. Get accessibility tree with element refs
agent-browser snapshot -i    # -i = interactive elements only

# 3. Interact using refs from snapshot
agent-browser click @e2
agent-browser fill @e3 "text"

# 4. Re-snapshot after page changes
agent-browser snapshot -i
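
The four steps above can be chained so that a failed step stops the run. This is a minimal sketch, not part of the skill itself: `fill_and_click` is a hypothetical helper name, and it assumes the refs passed in were read from an earlier snapshot of the same page.

```shell
# fill_and_click URL INPUT_REF TEXT BUTTON_REF   (hypothetical helper)
# Runs the workflow steps in order; && stops at the first failing command.
fill_and_click() {
  agent-browser open "$1" &&         # 1. navigate to the page
  agent-browser snapshot -i &&       # 2. list interactive elements and their refs
  agent-browser fill "$2" "$3" &&    # 3. interact using refs
  agent-browser click "$4" &&
  agent-browser snapshot -i          # 4. re-snapshot after the page changes
}

# Usage:
#   fill_and_click https://example.com @e3 "text" @e2
```

The function returns the exit status of the first command that fails, so a caller can tell whether the final snapshot reflects a completed interaction.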

Quick Reference

Navigation

| Command | Description |
|---|---|
| `open <url>` | Navigate to URL |
| `back` | Go back |
| `forward` | Go forward |
| `reload` | Reload page |
| `close` | Close browser |

Snapshots (AI-Optimized)

| Command | Description |
|---|---|
| `snapshot` | Full accessibility tree |
| `snapshot -i` | Interactive elements only (buttons, links, inputs) |
| `snapshot -c` | Compact (remove empty elements) |
| `snapshot -d 3` | Limit depth to 3 levels |
| `screenshot [path]` | Capture screenshot (base64 if no path) |

Interaction

| Command | Description |
|---|---|
| `click <sel>` | Click element |
| `fill <sel> <text>` | Clear and fill input |
| `type <sel> <text>` | Type with key events |
| `press <key>` | Press key (Enter, Tab, etc.) |
| `hover <sel>` | Hover element |
| `select <sel> <val>` | Select dropdown option |
| `check/uncheck <sel>` | Toggle checkbox |
| `scroll <dir> [px]` | Scroll page |

Get Info

| Command | Description |
|---|---|
| `get text <sel>` | Get text content |
| `get html <sel>` | Get innerHTML |
| `get value <sel>` | Get input value |
| `get attr <sel> <attr>` | Get attribute |
| `get title` | Get page title |
| `get url` | Get current URL |

Wait

| Command | Description |
|---|---|
| `wait <selector>` | Wait for element |
| `wait <ms>` | Wait milliseconds |
| `wait --text "text"` | Wait for text |
| `wait --url "pattern"` | Wait for URL |
| `wait --load networkidle` | Wait for load state |

Sessions

| Command | Description |
|---|---|
| `--session <name>` | Use isolated session |
| `session list` | List active sessions |

Selectors

Element Refs (Recommended)

# Get refs from snapshot
agent-browser snapshot -i
# Output: button "Submit" [ref=e2]

# Use ref to interact
agent-browser click @e2
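
In a script, the ref can be pulled out of the snapshot text rather than copied by hand. The sketch below assumes every element line follows the sample output shown above (`button "Submit" [ref=e2]`), possibly indented; real snapshot output may carry extra attributes, so treat the pattern as a starting point. `ref_for` is a hypothetical helper name.

```shell
# ref_for ROLE NAME   (hypothetical helper)
# Reads snapshot text on stdin and prints the first matching ref as @eN.
# Assumes element lines look like: button "Submit" [ref=e2]
ref_for() {
  awk -v role="$1" -v name="$2" '
    {
      line = $0
      sub(/^[ \t-]+/, "", line)              # drop indentation or list dashes
      prefix = role " \"" name "\""
      if (index(line, prefix) == 1 && match(line, /\[ref=e[0-9]+\]/)) {
        # RSTART points at "[": skip "[ref=" (5 chars) and drop the closing "]"
        print "@" substr(line, RSTART + 5, RLENGTH - 6)
        exit
      }
    }'
}

# Usage:
#   agent-browser click "$(agent-browser snapshot -i | ref_for button Submit)"
```

If no line matches, the helper prints nothing, so a caller should check for an empty result before passing it to `click`.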

CSS Selectors

agent-browser click "#submit"
agent-browser fill ".email-input" "test@test.com"

Semantic Locators

agent-browser find role button click --name "Submit"
agent-browser find label "Email" fill "test@test.com"
agent-browser find testid "login-btn" click

Examples

Login Flow

agent-browser open https://example.com/login
agent-browser snapshot -i
agent-browser fill @e2 "user@example.com"
agent-browser fill @e3 "password123"
agent-browser click @e4
agent-browser wait --url "**/dashboard"
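
The same flow can read credentials from the environment instead of hardcoding them in the script. This is a sketch under assumptions: `LOGIN_USER` and `LOGIN_PASS` are hypothetical variable names, and @e2, @e3, @e4 are taken to be the email, password, and submit refs as in the example above.

```shell
# login URL   (hypothetical helper)
# Assumes LOGIN_USER and LOGIN_PASS are set by the caller, and that
# @e2, @e3, @e4 are the email, password, and submit refs from the snapshot.
login() {
  if [ -z "${LOGIN_USER:-}" ] || [ -z "${LOGIN_PASS:-}" ]; then
    echo "login: set LOGIN_USER and LOGIN_PASS first" >&2
    return 2
  fi
  agent-browser open "$1" &&
  agent-browser snapshot -i &&
  agent-browser fill @e2 "$LOGIN_USER" &&
  agent-browser fill @e3 "$LOGIN_PASS" &&
  agent-browser click @e4 &&
  agent-browser wait --url "**/dashboard"
}

# Usage:
#   LOGIN_USER=user@example.com LOGIN_PASS=... login https://example.com/login
```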

Form Submission

agent-browser open https://example.com/contact
agent-browser snapshot -i
agent-browser fill @e1 "John Doe"
agent-browser fill @e2 "john@example.com"
agent-browser fill @e3 "Hello, this is my message"
agent-browser click @e4
agent-browser wait --text "Thank you"

Data Extraction

agent-browser open https://example.com/products
agent-browser snapshot -i
# Iterate through product refs
agent-browser get text @e1  # Product name
agent-browser get text @e2  # Price
agent-browser get attr @e3 href  # Link
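
Iterating through product refs can be written as a loop. A minimal sketch, assuming the refs of interest have already been read from the snapshot; `get_texts` is a hypothetical helper name.

```shell
# get_texts REF...   (hypothetical helper)
# Prints one "ref<TAB>text" line per ref, using the get text command.
get_texts() {
  for ref in "$@"; do
    printf '%s\t%s\n' "$ref" "$(agent-browser get text "$ref")"
  done
}

# Usage:
#   get_texts @e1 @e2 > products.tsv
```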

Multi-Session (Swarm)

# Session 1: Navigator
agent-browser --session nav open https://example.com
agent-browser --session nav state save auth.json

# Session 2: Scraper (uses same auth)
agent-browser --session scrape state load auth.json
agent-browser --session scrape open https://example.com/data
agent-browser --session scrape snapshot -i
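
Because each session is isolated, several scraper sessions can run concurrently from one saved state. The sketch below is one way to do that with background jobs; the session names (`scrape1`, `scrape2`, ...) and output file names (`snapshot-1.txt`, ...) are hypothetical, and it assumes `state load` can be called once per session as in the example above.

```shell
# scrape_parallel STATE_FILE URL...   (hypothetical helper)
# Starts one isolated session per URL and runs them all concurrently.
scrape_parallel() {
  state=$1
  shift
  i=0
  for url in "$@"; do
    i=$((i + 1))
    (
      agent-browser --session "scrape$i" state load "$state" &&
      agent-browser --session "scrape$i" open "$url" &&
      agent-browser --session "scrape$i" snapshot -i > "snapshot-$i.txt"
    ) &
  done
  wait   # block until every background session has finished
}

# Usage:
#   scrape_parallel auth.json https://example.com/data https://example.com/products
```

Each snapshot is written to its own file so that output from concurrent sessions does not interleave.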

Integration with Claude Flow

MCP Tools

All browser operations are available as MCP tools with the browser/ prefix:

  • browser/open
  • browser/snapshot
  • browser/click
  • browser/fill
  • browser/screenshot
  • etc.

Memory Integration

# Store successful patterns
npx @claude-flow/cli memory store --namespace browser-patterns --key "login-flow" --value "snapshot->fill->click->wait"

# Retrieve before similar task
npx @claude-flow/cli memory search --query "login automation"

Hooks

# Pre-browse hook (get context)
npx @claude-flow/cli hooks pre-edit --file "browser-task.ts"

# Post-browse hook (record success)
npx @claude-flow/cli hooks post-task --task-id "browse-1" --success true

Tips

  1. Always use snapshots: they are optimized for AI and expose element refs.
  2. Prefer the -i flag: it returns only interactive elements, so the output is smaller.
  3. Use refs, not selectors: refs are more reliable and deterministic.
  4. Re-snapshot after navigation: the page state changes.
  5. Use sessions for parallel work: each session is isolated.

Repository: ruvnet/ruv-FANN