tessl install tessl/pypi-w3lib@2.3.0
Library of web-related functions for HTML manipulation, HTTP processing, URL handling, and encoding detection
Agent Success
Agent success rate when using this tile
84%
Improvement
Agent success rate improvement when using this tile compared to baseline
0.91x
Baseline
Agent success rate without this tile
92%
A utility that detects and identifies Byte Order Marks (BOMs) in binary data files to determine their encoding.
EF BB BF, returns the encoding "UTF-8" and the BOM bytes @test
FF FE, returns the encoding "UTF-16-LE" and the BOM bytes @test
FE FF, returns the encoding "UTF-16-BE" and the BOM bytes @test
FF FE 00 00, returns the encoding "UTF-32-LE" and the BOM bytes @test
00 00 FE FF, returns the encoding "UTF-32-BE" and the BOM bytes @test
@generates
def detect_bom(data: bytes) -> tuple[str | None, bytes]:
    """
    Detects and reads Byte Order Marks (BOM) in binary data.

    Identifies BOM sequences for UTF-8, UTF-16 (BE/LE), and UTF-32 (BE/LE) encodings.

    Args:
        data: Binary data to check for BOM

    Returns:
        A tuple containing:
        - encoding name (str) if BOM detected, None otherwise
        - the BOM bytes that were detected (bytes), empty bytes if no BOM found
    """
    pass
Provides web-related utility functions including encoding detection support.
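One detail the test cases imply but do not state: the UTF-32-LE mark (FF FE 00 00) begins with the UTF-16-LE mark (FF FE), so the order of the prefix checks matters. The following is a minimal sketch of one way to satisfy the stub, not the tile's reference implementation. It uses only the standard-library `codecs` constants. The table name `_BOM_TABLE` is hypothetical, and the annotation is written with `typing` so that it also runs on Python versions older than 3.10.

```python
import codecs
from typing import Optional, Tuple

# Hypothetical lookup table. UTF-32 entries come first because the
# UTF-32-LE BOM (FF FE 00 00) starts with the UTF-16-LE BOM (FF FE).
_BOM_TABLE = [
    (codecs.BOM_UTF32_LE, "UTF-32-LE"),  # FF FE 00 00
    (codecs.BOM_UTF32_BE, "UTF-32-BE"),  # 00 00 FE FF
    (codecs.BOM_UTF8, "UTF-8"),          # EF BB BF
    (codecs.BOM_UTF16_LE, "UTF-16-LE"),  # FF FE
    (codecs.BOM_UTF16_BE, "UTF-16-BE"),  # FE FF
]


def detect_bom(data: bytes) -> Tuple[Optional[str], bytes]:
    """Return (encoding name, BOM bytes), or (None, b"") if no BOM is found."""
    for bom, name in _BOM_TABLE:
        if data.startswith(bom):
            return name, bom
    return None, b""


# Usage: strip the detected BOM before decoding the payload.
raw = codecs.BOM_UTF16_LE + "hi".encode("utf-16-le")
encoding, bom = detect_bom(raw)
text = raw[len(bom):].decode(encoding) if encoding else raw.decode("utf-8")
```

Checking the longer marks first resolves the prefix overlap, but it cannot resolve a genuine ambiguity: UTF-16-LE data whose first character after the BOM is U+0000 has the same leading bytes as a UTF-32-LE BOM. The sketch reports UTF-32-LE in that case, matching the FF FE 00 00 test case above. w3lib itself exposes a similar helper, `w3lib.encoding.read_bom`; its exact return values are not assumed here.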