tessl install tessl/pypi-w3lib@2.3.0
Library of web-related functions for HTML manipulation, HTTP processing, URL handling, and encoding detection
Agent Success: 84% (agent success rate when using this tile)
Improvement: 0.91x (agent success rate improvement when using this tile compared to baseline)
Baseline: 92% (agent success rate without this tile)
Build a utility that extracts and validates meta refresh redirect information from HTML content. The tool should parse HTML documents to find meta refresh tags, extract the delay time and target URL, and handle various edge cases.
Your utility should extract redirect information from HTML meta refresh tags. The meta refresh tag can appear in different formats:
<meta http-equiv="refresh" content="0;url=https://example.com">
<meta http-equiv="refresh" content="5; url=https://example.com/page">
<meta http-equiv="Refresh" content="10;URL='https://example.com'">
The function should:
Return None if no meta refresh tag is found
Given HTML with <meta http-equiv="refresh" content="0;url=https://example.com/redirect"> and base URL https://original.com, returns (0, "https://example.com/redirect") @test
Given HTML with <meta http-equiv="refresh" content="5;url=relative/path.html"> and base URL https://example.com/page/, returns (5, "https://example.com/page/relative/path.html") @test
Given HTML with no meta refresh tag, returns None @test
Given HTML with <meta http-equiv="Refresh" content="3; URL='https://example.com/target'"> (case-insensitive and quoted URL), returns (3, "https://example.com/target") @test
@generates
def extract_meta_refresh(html_content: str, base_url: str) -> tuple[int, str] | None:
    """
    Extracts meta refresh redirect information from HTML content.

    Args:
        html_content: The HTML document as a string
        base_url: The base URL to resolve relative URLs against

    Returns:
        A tuple of (delay_seconds, target_url) if meta refresh is found,
        None otherwise
    """
    pass

Provides web utility functions for HTML processing and URL handling.
@satisfied-by