tessl/pypi-webencodings

Character encoding aliases for legacy web content implementing the WHATWG Encoding standard

Workspace: tessl
Visibility: Public
Created: 3 months ago
Last updated: 3 months ago
Describes: pkg:pypi/webencodings@0.5.x

To install, run

npx @tessl/cli install tessl/pypi-webencodings@0.5.0

webencodings

webencodings is a Python implementation of the WHATWG Encoding standard. It maps the encoding labels found in legacy web content to canonical encodings, detects UTF-8 and UTF-16 byte order marks (BOMs), and resolves declared encodings the way the standard prescribes.

Package Information

Package Name: webencodings
Language: Python
Installation: pip install webencodings
Version: 0.5.1

Core Imports

import webencodings

Common specific imports:

from webencodings import lookup, decode, encode, UTF8

The Encoding class and the streaming interfaces:

from webencodings import Encoding, iter_decode, iter_encode, IncrementalDecoder, IncrementalEncoder

Basic Usage

import webencodings

# Look up an encoding by label
utf8_encoding = webencodings.lookup('utf-8')
windows_encoding = webencodings.lookup('windows-1252')

# Decode bytes with BOM detection
text, encoding_used = webencodings.decode(b'\xef\xbb\xbfHello', 'utf-8')
print(text)  # "Hello"
print(encoding_used.name)  # "utf-8"

# Encode text to bytes
data = webencodings.encode("Hello", webencodings.UTF8)
print(data)  # b'Hello'

# Handle legacy web content encoding
legacy_data = b'caf\xe9'  # "café" encoded as Latin-1
text, encoding = webencodings.decode(legacy_data, 'iso-8859-1')
print(text)  # "café"
print(encoding.name)  # "windows-1252" ('iso-8859-1' is a label for windows-1252)

Architecture

The webencodings package follows the WHATWG Encoding standard architecture:

Encoding Objects: Canonical representations of character encodings with standardized names
Label Lookup: Maps encoding labels (including aliases) to canonical encoding names
BOM Detection: UTF-8/UTF-16 BOM detection that takes precedence over declared encodings
Streaming Interfaces: Both "pull" and "push" based processing for large data
Error Handling: Uses Python's codec error handlers; decoding defaults to 'replace' and encoding to 'strict'

Following the standard keeps the decoding of legacy web content consistent with other implementations of the standard, such as web browsers.
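To make the BOM precedence rule concrete, here is a minimal standard-library sketch. The helper `decode_with_bom_precedence` and the `BOMS` table are hypothetical names used only for illustration; they are not part of webencodings and are not presented as its implementation.

```python
import codecs

# Hypothetical table for illustration only; not part of webencodings.
BOMS = (
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
    (codecs.BOM_UTF8, "utf-8"),
)


def decode_with_bom_precedence(data: bytes, declared: str) -> tuple[str, str]:
    """Return (text, codec_name); a leading BOM overrides the declared codec."""
    for bom, codec_name in BOMS:
        if data.startswith(bom):
            # Strip the BOM and ignore the declared encoding.
            return data[len(bom):].decode(codec_name, "replace"), codec_name
    # No BOM found: fall back to the declared encoding.
    return data.decode(declared, "replace"), declared


# The bytes start with a UTF-8 BOM, so the declared "cp1252" is ignored.
text, used = decode_with_bom_precedence(b"\xef\xbb\xbfcaf\xc3\xa9", "cp1252")
print(text, used)  # café utf-8
```

The sketch works with Python codec names directly; in the package itself the fallback is given as a label or Encoding object and resolved through the label lookup first.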

Capabilities

Encoding Lookup and Core Objects

Core functionality for looking up encodings by label and the fundamental Encoding class that wraps Python codecs with WHATWG-compliant names and behavior.

def lookup(label: str) -> Encoding | None: ...
class Encoding:
    name: str
    codec_info: codecs.CodecInfo
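
The signatures above can be exercised with a short example. It assumes webencodings is installed; the label "not-a-real-label" is made up to show the None return.

```python
import webencodings

# Labels are matched ASCII case-insensitively, ignoring surrounding
# ASCII whitespace, so these all resolve to the same Encoding object.
labels = ("utf-8", "UTF8", " \r\nutf8\t")
same = [webencodings.lookup(label) is webencodings.UTF8 for label in labels]
print(same)  # [True, True, True]

# Legacy labels resolve to the encoding the standard prescribes.
latin1 = webencodings.lookup("latin1")
print(latin1.name)  # windows-1252

# Unknown labels return None instead of raising.
unknown = webencodings.lookup("not-a-real-label")
print(unknown)  # None
```

Because lookup returns None for unknown labels, callers that take labels from untrusted input (for example an HTTP header) should check the result before using it.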

Core Objects

Single String Processing

Simple encoding and decoding functions for processing individual strings with BOM detection and WHATWG-compliant encoding resolution.

def decode(input: bytes, fallback_encoding: Encoding | str, errors: str = 'replace') -> tuple[str, Encoding]: ...
def encode(input: str, encoding: Encoding | str = UTF8, errors: str = 'strict') -> bytes: ...
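
A short sketch of how the fallback encoding, BOM detection, and defaults interact, assuming webencodings is installed. The variable names and the label "not-a-real-label" are illustrative.

```python
from webencodings import decode, encode

# A UTF-8 BOM overrides the fallback; the BOM is not part of the text.
bom_text, bom_used = decode(b"\xef\xbb\xbfcaf\xc3\xa9", "windows-1252")
print(bom_text, bom_used.name)  # café utf-8

# Without a BOM the fallback is used. "latin1" resolves to windows-1252,
# so byte 0x80 decodes to the euro sign.
fb_text, fb_used = decode(b"\x80", "latin1")
print(fb_text, fb_used.name)  # € windows-1252

# encode() accepts a label or an Encoding object and defaults to UTF-8.
default_bytes = encode("é")
latin_bytes = encode("é", "latin1")
print(default_bytes, latin_bytes)  # b'\xc3\xa9' b'\xe9'

# An unknown label is rejected with LookupError.
try:
    decode(b"abc", "not-a-real-label")
    rejected = False
except LookupError:
    rejected = True
print(rejected)  # True
```

Note that decode returns the encoding actually used, which can differ from the fallback passed in.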

String Processing

Streaming Processing

Streaming interfaces for processing large amounts of data incrementally, supporting both "pull"-based (iterator) and "push"-based (incremental) processing patterns.

def iter_decode(input: Iterable[bytes], fallback_encoding: Encoding | str, errors: str = 'replace') -> tuple[Iterator[str], Encoding]: ...
def iter_encode(input: Iterable[str], encoding: Encoding | str = UTF8, errors: str = 'strict') -> Iterator[bytes]: ...
class IncrementalDecoder: ...
class IncrementalEncoder: ...
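
The two styles can be compared on the same chunked input. This is a sketch assuming webencodings is installed; the chunk boundaries are chosen to split both the BOM and a multi-byte character.

```python
from webencodings import IncrementalDecoder, iter_decode, iter_encode

# A UTF-8 BOM followed by "café", split at awkward byte boundaries.
chunks = [b"\xef", b"\xbb", b"\xbf", b"caf\xc3", b"\xa9"]

# Pull style: iter_decode returns an iterator of strings plus the
# encoding that was detected (here from the BOM, not the fallback).
pieces, encoding = iter_decode(chunks, "latin1")
pulled = "".join(pieces)
print(pulled, encoding.name)  # café utf-8

# Push style: feed chunks as they arrive, then flush with final=True.
decoder = IncrementalDecoder("latin1")
out = [decoder.decode(chunk) for chunk in chunks]
out.append(decoder.decode(b"", final=True))
pushed = "".join(out)
print(pushed, decoder.encoding.name)  # café utf-8

# Encoding a stream of strings chunk by chunk.
encoded = b"".join(iter_encode(["caf", "é"], "latin1"))
print(encoded)  # b'caf\xe9'
```

With the push interface, `decoder.encoding` stays None until enough bytes have arrived to rule a BOM in or out, so read it only after some output has been produced or after the final call.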

Streaming Processing

Utilities and Constants

Utility functions and pre-defined constants including the recommended UTF-8 encoding object and ASCII case-insensitive string operations.

def ascii_lower(string: str) -> str: ...
UTF8: Encoding
VERSION: str
LABELS: dict[str, str]
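
A brief sketch of these helpers, assuming webencodings is installed. It contrasts ascii_lower with str.lower on a non-ASCII character.

```python
from webencodings import LABELS, UTF8, VERSION, ascii_lower

# Only ASCII letters A-Z are lowered.
lowered = ascii_lower("UTF-8")
print(lowered)  # utf-8

# U+212A KELVIN SIGN is left alone by ascii_lower, while str.lower()
# maps it to an ASCII "k".
kelvin = "\u212a"
kelvin_unchanged = ascii_lower(kelvin) == kelvin
print(kelvin_unchanged)  # True
print(kelvin.lower() == "k")  # True

# LABELS maps each label to its canonical encoding name.
print(LABELS["latin1"])  # windows-1252
print(LABELS["utf8"])  # utf-8

# UTF8 is the pre-built Encoding object for UTF-8.
print(UTF8.name)  # utf-8

# VERSION is the package version as a string.
print(VERSION)
```

The ASCII-only lowering matters for label matching: a label containing the Kelvin sign does not accidentally match a label containing "k".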

Utilities
