tessl/pypi-w3lib

tessl install tessl/pypi-w3lib@2.3.0

Library of web-related functions for HTML manipulation, HTTP processing, URL handling, and encoding detection

Agent Success

Agent success rate when using this tile

84%

Improvement

Agent success rate improvement when using this tile compared to baseline

0.91x

Baseline

Agent success rate without this tile

92%

evals/scenario-9/task.md

File Encoding Detector

A utility that detects a Byte Order Mark (BOM) at the start of binary data and reports the encoding it signals.

Capabilities

Detects UTF-8 BOM

  • When given binary data starting with bytes EF BB BF, returns the encoding "UTF-8" and the BOM bytes @test
  • When given binary data with UTF-8 BOM followed by text content, correctly identifies the BOM without consuming the text @test

Detects UTF-16 BOMs

  • When given binary data starting with bytes FF FE (and not followed by 00 00, which would make it the UTF-32-LE BOM), returns the encoding "UTF-16-LE" and the BOM bytes @test
  • When given binary data starting with bytes FE FF, returns the encoding "UTF-16-BE" and the BOM bytes @test

Detects UTF-32 BOMs

  • When given binary data starting with bytes FF FE 00 00, returns the encoding "UTF-32-LE" and the BOM bytes @test
  • When given binary data starting with bytes 00 00 FE FF, returns the encoding "UTF-32-BE" and the BOM bytes @test
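
The UTF-16 and UTF-32 cases above overlap: the UTF-32-LE BOM (FF FE 00 00) begins with the UTF-16-LE BOM (FF FE), so the order in which candidates are tested matters. The sketch below illustrates this with the standard library's `codecs` constants; `first_match` is a hypothetical helper introduced only for this illustration, not part of w3lib or of the spec.

```python
import codecs

# BOM constants from the standard library, paired with the spec's encoding names.
UTF16_LE = (codecs.BOM_UTF16_LE, "UTF-16-LE")  # FF FE
UTF32_LE = (codecs.BOM_UTF32_LE, "UTF-32-LE")  # FF FE 00 00


def first_match(data, table):
    """Return the name of the first BOM in `table` that prefixes `data`."""
    for bom, name in table:
        if data.startswith(bom):
            return name
    return None


# UTF-32-LE BOM followed by two UTF-32-LE encoded characters.
sample = b"\xff\xfe\x00\x00" + "hi".encode("utf-32-le")

wrong_order = first_match(sample, [UTF16_LE, UTF32_LE])  # shorter BOM tested first
right_order = first_match(sample, [UTF32_LE, UTF16_LE])  # longer BOM tested first
print(wrong_order, right_order)  # UTF-16-LE UTF-32-LE
```

Testing the four-byte marks before the two-byte marks avoids the misclassification.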

Handles data without BOM

  • When given binary data that does not start with any BOM sequence, returns None for encoding and an empty bytes object @test
  • When given an empty bytes object, returns None for encoding and an empty bytes object @test

Implementation

@generates

API

def detect_bom(data: bytes) -> tuple[str | None, bytes]:
    """
    Detects and reads Byte Order Marks (BOM) in binary data.

    Identifies BOM sequences for UTF-8, UTF-16 (BE/LE), and UTF-32 (BE/LE) encodings.

    Args:
        data: Binary data to check for BOM

    Returns:
        A tuple containing:
        - encoding name (str) if BOM detected, None otherwise
        - the BOM bytes that were detected (bytes), empty bytes if no BOM found
    """
    pass
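
One way the stub above could be filled in is sketched here, under the assumption that only the standard library is used. The table name `_BOM_TABLE` and its ordering are illustrative choices, not taken from the spec; the encoding names follow the spelling used in the capabilities list.

```python
from __future__ import annotations

import codecs

# Four-byte BOMs come first because the UTF-32-LE mark starts with the
# UTF-16-LE mark (FF FE). This table is an assumption of the sketch.
_BOM_TABLE = (
    (codecs.BOM_UTF32_LE, "UTF-32-LE"),  # FF FE 00 00
    (codecs.BOM_UTF32_BE, "UTF-32-BE"),  # 00 00 FE FF
    (codecs.BOM_UTF8, "UTF-8"),          # EF BB BF
    (codecs.BOM_UTF16_LE, "UTF-16-LE"),  # FF FE
    (codecs.BOM_UTF16_BE, "UTF-16-BE"),  # FE FF
)


def detect_bom(data: bytes) -> tuple[str | None, bytes]:
    """Return (encoding, bom) for a leading BOM, or (None, b"") if none is found."""
    for bom, encoding in _BOM_TABLE:
        if data.startswith(bom):
            return encoding, bom
    # Covers both empty input and input without a recognized BOM.
    return None, b""


raw = b"\xef\xbb\xbfhello"
encoding, bom = detect_bom(raw)
# The input is left untouched; the caller strips the BOM if it wants the text.
text = raw[len(bom):]
```

Because the function returns the BOM bytes rather than consuming them, the caller decides whether to slice them off before decoding.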

Dependencies { .dependencies }

w3lib { .dependency }

Provides web-related utility functions including encoding detection support.
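
A sketch of how the dependency might be used, assuming that `w3lib.encoding.read_bom` returns lowercase encoding names such as `"utf-8"` and `(None, None)` when no BOM is present; check this against the installed w3lib version. The wrapper name `detect_bom_w3lib` is hypothetical, and the import guard only keeps the snippet loadable where w3lib is not installed.

```python
from __future__ import annotations

try:
    from w3lib.encoding import read_bom
except ImportError:  # w3lib not installed; the wrapper below will refuse to run
    read_bom = None


def detect_bom_w3lib(data: bytes) -> tuple[str | None, bytes]:
    """Adapt read_bom's result to the return shape the spec asks for."""
    if read_bom is None:
        raise RuntimeError("w3lib is required for this wrapper")
    encoding, bom = read_bom(data)
    if encoding is None:
        # Assumed no-BOM result is (None, None); the spec wants empty bytes.
        return None, b""
    # Assumed lowercase names; the spec's test cases use uppercase.
    return encoding.upper(), bom
```

The two adjustments (uppercasing the name and substituting `b""` for `None`) are the only differences this sketch assumes between the library's result and the spec's expected output.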

Workspace: tessl
Visibility: Public
Describes: pypipkg:pypi/w3lib@2.3.x