
tessl/pypi-w3lib

Library of web-related functions for HTML manipulation, HTTP processing, URL handling, and encoding detection

evals/scenario-8/task.md

File Encoding Detector

A utility that detects a Byte Order Mark (BOM) at the start of binary data and reports the encoding that the BOM indicates.

Capabilities

Detects UTF-8 BOM

  • When given binary data starting with bytes EF BB BF, returns the encoding "UTF-8" and the BOM bytes @test
  • When given binary data with a UTF-8 BOM followed by text content, returns only the BOM bytes and leaves the text content out of the result @test
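One way to read the second case, sketched with the standard library's `codecs.BOM_UTF8` constant rather than w3lib: the detector reports only the three BOM bytes, and the caller slices them off before decoding. The variable names here are illustrative and not part of the spec.

```python
import codecs

# A UTF-8 BOM followed by ordinary text content.
data = codecs.BOM_UTF8 + "hello".encode("utf-8")

# Report only the BOM; the input itself is left untouched.
bom = codecs.BOM_UTF8 if data.startswith(codecs.BOM_UTF8) else b""

# The caller decides what to do with the rest, e.g. skip the BOM and decode.
text = data[len(bom):].decode("utf-8")
```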

Detects UTF-16 BOMs

  • When given binary data starting with bytes FF FE that are not followed by 00 00, returns the encoding "UTF-16-LE" and the BOM bytes @test
  • When given binary data starting with bytes FE FF, returns the encoding "UTF-16-BE" and the BOM bytes @test

Detects UTF-32 BOMs

  • When given binary data starting with bytes FF FE 00 00, returns the encoding "UTF-32-LE" and the BOM bytes @test
  • When given binary data starting with bytes 00 00 FE FF, returns the encoding "UTF-32-BE" and the BOM bytes @test
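The UTF-32-LE mark begins with the same two bytes as the UTF-16-LE mark, so the order in which marks are tested matters. A short check using the standard library's `codecs` constants (an illustration, not something this spec requires) shows the overlap:

```python
import codecs

# FF FE 00 00 (UTF-32-LE) starts with FF FE (UTF-16-LE).
overlap = codecs.BOM_UTF32_LE.startswith(codecs.BOM_UTF16_LE)

# The big-endian marks do not overlap: 00 00 FE FF versus FE FF.
be_overlap = codecs.BOM_UTF32_BE.startswith(codecs.BOM_UTF16_BE)
```

A detector that tests FF FE before FF FE 00 00 would report UTF-16-LE for UTF-32-LE input, so the four-byte marks are usually tested first.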

Handles data without BOM

  • When given binary data that does not start with any BOM sequence, returns None for encoding and an empty bytes object @test
  • When given an empty bytes object, returns None for encoding and an empty bytes object @test

Implementation

@generates

API

def detect_bom(data: bytes) -> tuple[str | None, bytes]:
    """
    Detects and reads Byte Order Marks (BOM) in binary data.

    Identifies BOM sequences for UTF-8, UTF-16 (BE/LE), and UTF-32 (BE/LE) encodings.

    Args:
        data: Binary data to check for BOM

    Returns:
        A tuple containing:
        - encoding name (str) if BOM detected, None otherwise
        - the BOM bytes that were detected (bytes), empty bytes if no BOM found
    """
    pass
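The stub above is left for generation. As a point of comparison, here is a minimal sketch that satisfies the listed capabilities using only the standard library's `codecs` constants. It is not the w3lib-based implementation the task asks for, and the `_BOMS` table is a hypothetical helper introduced for illustration. w3lib ships a comparable helper, `w3lib.encoding.read_bom`, which appears to return lower-case codec names and `(None, None)` when no mark is found; check that against the installed version before building on it.

```python
import codecs
from typing import Optional, Tuple

# Hypothetical lookup table. Four-byte marks come first because the
# UTF-32-LE mark (FF FE 00 00) starts with the UTF-16-LE mark (FF FE).
_BOMS = (
    (codecs.BOM_UTF32_LE, "UTF-32-LE"),
    (codecs.BOM_UTF32_BE, "UTF-32-BE"),
    (codecs.BOM_UTF8, "UTF-8"),
    (codecs.BOM_UTF16_LE, "UTF-16-LE"),
    (codecs.BOM_UTF16_BE, "UTF-16-BE"),
)


def detect_bom(data: bytes) -> Tuple[Optional[str], bytes]:
    """Return (encoding name, BOM bytes), or (None, b"") if no BOM is found."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name, bom
    # Covers both empty input and data without a recognised mark.
    return None, b""


utf8_result = detect_bom(b"\xef\xbb\xbfhello")   # ("UTF-8", b"\xef\xbb\xbf")
utf32_result = detect_bom(b"\xff\xfe\x00\x00")   # ("UTF-32-LE", b"\xff\xfe\x00\x00")
plain_result = detect_bom(b"hello")              # (None, b"")
```

Returning the matched constant itself, rather than a slice of `data`, keeps the result independent of the input buffer.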

Dependencies { .dependencies }

w3lib { .dependency }

Provides web-related utility functions including encoding detection support.
