Fetches web content with intelligent content extraction, converting HTML to clean markdown. Use for documentation, articles, and reference pages at http/https URLs.
Fetch web content using curl | html2markdown with CSS selectors for clean, complete markdown output.
Use site-specific selectors for best results:
# Anthropic docs
curl -s "<url>" | html2markdown --include-selector "#content-container"
# MDN Web Docs
curl -s "<url>" | html2markdown --include-selector "article"
# GitHub docs
curl -s "<url>" | html2markdown --include-selector "article" --exclude-selector "nav,.sidebar"
# Generic article pages
curl -s "<url>" | html2markdown --include-selector "article,main,[role=main]" --exclude-selector "nav,header,footer"

| Site | Include Selector | Exclude Selector |
|---|---|---|
| platform.claude.com | #content-container | - |
| docs.anthropic.com | #content-container | - |
| developer.mozilla.org | article | - |
| github.com (docs) | article | nav,.sidebar |
| Generic | article,main | nav,header,footer,script,style |
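
The table above can be turned into a lookup so the right selectors are chosen from the URL's host. This is a minimal sketch, not part of the skill itself: `selectors_for_url` and `fetch_markdown` are hypothetical helper names, and treating `docs.github.com` as the "github.com (docs)" row is an assumption.

```shell
#!/bin/sh
# Hypothetical helper: map a URL's host to the selectors in the table above.
# Prints "<include-selector>|<exclude-selector>"; the exclude part may be empty.
selectors_for_url() {
  host=${1#*://}    # drop the scheme, e.g. "https://"
  host=${host%%/*}  # drop the path, leaving only the host
  case "$host" in
    platform.claude.com|docs.anthropic.com)
      printf '%s|%s\n' '#content-container' '' ;;
    developer.mozilla.org)
      printf '%s|%s\n' 'article' '' ;;
    github.com|docs.github.com)  # docs.github.com is an assumption
      printf '%s|%s\n' 'article' 'nav,.sidebar' ;;
    *)                           # the "Generic" row
      printf '%s|%s\n' 'article,main' 'nav,header,footer,script,style' ;;
  esac
}

# Hypothetical wrapper: fetch a URL and convert it with the selectors chosen above.
fetch_markdown() {
  sel=$(selectors_for_url "$1")
  include=${sel%%|*}  # text before the "|"
  exclude=${sel#*|}   # text after the "|"
  if [ -n "$exclude" ]; then
    curl -s "$1" | html2markdown --include-selector "$include" --exclude-selector "$exclude"
  else
    curl -s "$1" | html2markdown --include-selector "$include"
  fi
}
```

Calling `fetch_markdown "<url>"` then runs the same pipeline as the per-site commands, with any unlisted host falling back to the generic selectors.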
For sites without known patterns, use the Bun script which auto-detects content:
bun ~/.claude/skills/web-fetch/fetch.ts "<url>"
Install the script's dependencies once before first use:
cd ~/.claude/skills/web-fetch && bun install
When a site isn't in the patterns list:
# Check what content containers exist
curl -s "<url>" | grep -o '<article[^>]*>\|<main[^>]*>\|id="[^"]*content[^"]*"' | head -10
# Test a selector
curl -s "<url>" | html2markdown --include-selector "<selector>" | head -30
# Check line count
curl -s "<url>" | html2markdown --include-selector "<selector>" | wc -l

--include-selector "CSS" # Only include matching elements
--exclude-selector "CSS" # Remove matching elements
--domain "https://..." # Convert relative links to absolute

| Method | Anthropic Docs | Code Blocks | Output Quality |
|---|---|---|---|
| Full page | 602 lines | Yes | Noisy |
| --include-selector "#content-container" | 385 lines | Yes | Clean |
| Bun script (universal) | 383 lines | Yes | Clean |
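
The `--domain` option listed above takes a base URL for resolving relative links. A minimal sketch of deriving that value from the page URL follows, assuming the site origin (scheme plus host) is what should be passed; `origin_of` is a hypothetical helper name, not part of the skill.

```shell
#!/bin/sh
# Hypothetical helper: print the origin (scheme://host) of a URL.
# Returns non-zero when the argument has no "scheme://" prefix.
origin_of() {
  case "$1" in
    *://*) ;;
    *) return 1 ;;
  esac
  scheme=${1%%://*}  # text before the first "://"
  rest=${1#*://}     # text after the first "://"
  host=${rest%%/*}   # cut at the first "/", dropping the path
  printf '%s://%s\n' "$scheme" "$host"
}

# Example usage (not executed here):
#   url="https://developer.mozilla.org/en-US/docs/Web/HTML"
#   curl -s "$url" | html2markdown --include-selector "article" --domain "$(origin_of "$url")"
```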
Wrong content selected: The site may have multiple articles. Inspect the HTML:
curl -s "<url>" | grep -o '<article[^>]*>'
Empty output: The selector doesn't match. Try broader selectors like main or body.
Missing code blocks: Check if the site uses non-standard code formatting.
Client-rendered content: If HTML only has "Loading..." placeholders, the content is JS-rendered. Neither curl nor the Bun script can extract it; use browser-based tools.
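
One rough way to spot a client-rendered page before converting it is to count the visible words left once tags, scripts, and styles are removed. The sketch below is a heuristic under stated assumptions: `looks_client_rendered` is a hypothetical name, the default threshold of 50 words is arbitrary, and only lowercase `<script>` and `<style>` tags are recognized.

```shell
#!/bin/sh
# Hypothetical heuristic: read HTML on stdin and succeed (exit 0) when the page
# contains very little visible text, which suggests JS-rendered content.
looks_client_rendered() {
  threshold=${1:-50}  # minimum visible word count; arbitrary assumed default
  words=$(
    tr '\n' ' ' |
      # Put every tag at the start of its own line.
      sed 's/</\
</g' |
      # Drop script and style blocks, then strip the remaining tags.
      sed -e '/^<script/,/^<\/script/d' \
          -e '/^<style/,/^<\/style/d' \
          -e 's/<[^>]*>//g' |
      wc -w | tr -d ' '
  )
  [ "$words" -lt "$threshold" ]
}

# Example usage (not executed here):
#   if curl -s "<url>" | looks_client_rendered; then
#     echo "Likely JS-rendered; use a browser-based tool" >&2
#   fi
```

A page that passes this check can still be missing sections loaded later by JavaScript, so treat the result as a hint rather than a guarantee.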