
tessl/pypi-w3lib

tessl install tessl/pypi-w3lib@2.3.0

Library of web-related functions for HTML manipulation, HTTP processing, URL handling, and encoding detection

Agent Success

Agent success rate when using this tile

84%

Improvement

Agent success rate improvement when using this tile compared to baseline

0.91x

Baseline

Agent success rate without this tile

92%

evals/scenario-3/task.md

URL Sanitizer Service

A Python module that provides URL sanitization functionality to ensure URLs are safe and properly formatted for use in web applications.

Capabilities

URL Sanitization

  • When given the URL http://example.com/path with spaces/file.html, returns a properly encoded URL @test
  • When given the URL http://münchen.de/page, handles the internationalized domain name correctly @test
  • When given the URL http://example.com/search?q=hello%20world&name=test, normalizes the encoding @test
  • When given an invalid input like 123 or not-a-url, returns None @test

Requirements

The module should provide a function that takes a URL string as input and returns a sanitized, safe version of the URL. The function should:

  • Handle URLs with special characters that need proper encoding
  • Support internationalized domain names (IDNA)
  • Handle mixed or incorrect encoding in URL components
  • Return None for invalid inputs (non-string or malformed URLs)
  • Ensure the output URL is valid according to the relevant web standards (RFC 2396, RFC 2732, RFC 3986, and the WHATWG URL Standard)

Implementation

@generates

API

def sanitize_url(url: str) -> str | None:
    """
    Sanitizes a URL to ensure it is safe and properly formatted.

    Args:
        url: The URL string to sanitize

    Returns:
        A sanitized URL string, or None if the input is invalid
    """
    pass
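
One way the stub above could be filled in is sketched below. It is not the reference solution for this scenario. It assumes that "invalid" means a non-string input or a string without an http/https scheme and a host (the spec does not define this precisely), and it delegates escaping to `safe_url_string` from `w3lib.url`. The `except ImportError` branch is a hypothetical, stdlib-only stand-in so the sketch still runs where w3lib is not installed; it drops userinfo and is not equivalent to w3lib.

```python
from typing import Optional
from urllib.parse import quote, urlsplit, urlunsplit

try:
    # The dependency named in this task.
    from w3lib.url import safe_url_string
except ImportError:
    # Hypothetical stand-in, NOT part of w3lib: a rough stdlib approximation
    # that ignores userinfo and only covers the simple cases in this spec.
    def safe_url_string(url: str) -> str:
        parts = urlsplit(url)
        # IDNA-encode the host, e.g. a non-ASCII domain becomes "xn--...".
        host = (parts.hostname or "").encode("idna").decode("ascii")
        if parts.port is not None:
            host = f"{host}:{parts.port}"
        # Keep "%" safe so existing percent-escapes are not double-encoded.
        safe = "/%:@!$&'()*+,;=?~"
        return urlunsplit((
            parts.scheme,
            host,
            quote(parts.path, safe=safe),
            quote(parts.query, safe=safe),
            quote(parts.fragment, safe=safe),
        ))


def sanitize_url(url: str) -> Optional[str]:
    """Return an escaped form of url, or None if the input looks invalid."""
    if not isinstance(url, str):
        return None
    candidate = url.strip()
    try:
        parts = urlsplit(candidate)
        # Assumption: only absolute http(s) URLs with a host count as valid.
        if parts.scheme not in {"http", "https"} or not parts.hostname:
            return None
        return safe_url_string(candidate)
    except ValueError:
        # urlsplit and port parsing raise ValueError on malformed input;
        # UnicodeError from IDNA encoding is a ValueError subclass.
        return None


if __name__ == "__main__":
    for sample in (
        "http://example.com/path with spaces/file.html",
        "http://münchen.de/page",
        "http://example.com/search?q=hello%20world&name=test",
        123,
        "not-a-url",
    ):
        print(repr(sample), "->", repr(sanitize_url(sample)))
```

The validity check is done with `urlsplit` before escaping because `safe_url_string` escapes whatever string it receives and does not itself reject inputs such as `not-a-url`.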

Dependencies { .dependencies }

w3lib { .dependency }

Provides URL manipulation and validation utilities.

Workspace: tessl
Visibility: Public
Describes: pypipkg:pypi/w3lib@2.3.x