
tessl/pypi-webencodings

Character encoding aliases for legacy web content implementing the WHATWG Encoding standard

Utilities

Utility functions and constants that support the core webencodings functionality. These include ASCII case conversion for label matching and pre-defined encoding objects and mappings.

Capabilities

ASCII Case Conversion

Lowercase only the ASCII letters A-Z, which is the case-insensitive comparison the WHATWG Encoding standard specifies for encoding labels.

def ascii_lower(string: str) -> str:
    """
    Transform ASCII letters A-Z to lowercase a-z, leaving other characters unchanged.
    
    Args:
        string: Unicode string to process
        
    Returns:
        New Unicode string with ASCII letters converted to lowercase
        
    Note:
        This differs from str.lower(), which also lowercases non-ASCII characters.
        Used for ASCII case-insensitive matching of encoding labels and CSS keywords.
    """

This function is used internally for encoding label matching but is also available for applications that need ASCII-only case conversion following web standards.
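To make the ASCII-only behavior concrete, here is a minimal sketch of an equivalent function built on `str.translate`. The name `ascii_lower_sketch` is hypothetical, and this is not necessarily how webencodings implements `ascii_lower`; it only reproduces the behavior documented above.

```python
import string

# Translation table mapping only A-Z to a-z; every other code point is left alone.
_ASCII_LOWER_TABLE = str.maketrans(string.ascii_uppercase, string.ascii_lowercase)


def ascii_lower_sketch(text: str) -> str:
    """Hypothetical stand-in for ascii_lower: lowercase ASCII letters only."""
    return text.translate(_ASCII_LOWER_TABLE)


kelvin = "Bac\N{KELVIN SIGN}ground"  # U+212A looks like "K" but is not ASCII

print(ascii_lower_sketch("UTF-8"))                               # utf-8
print(ascii_lower_sketch(kelvin) == "bac\N{KELVIN SIGN}ground")  # True
print(kelvin.lower() == "background")                            # True: str.lower() maps U+212A to "k"
```

The last two lines show why the distinction matters: with `str.lower()`, a string containing KELVIN SIGN would compare equal to the ASCII keyword `background`, while ASCII-only lowercasing keeps the two distinct.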

Constants

Predefined Encoding Objects

UTF8: Encoding

The UTF-8 encoding object, recommended for new content and formats. This is a pre-constructed Encoding instance for UTF-8.

Version Information

VERSION: str

Package version string (currently '0.5.1').
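Because `VERSION` is a plain string, comparing it directly with another version string compares characters, not numbers. The sketch below shows one way to compare numerically; `version_tuple` is a hypothetical helper (not part of webencodings) and assumes the version consists only of dot-separated integers.

```python
def version_tuple(version: str) -> tuple[int, ...]:
    """Hypothetical helper: turn a dotted numeric version like '0.5.1' into (0, 5, 1)."""
    return tuple(int(part) for part in version.split("."))


current = "0.5.1"  # the value this page reports for webencodings.VERSION

print(version_tuple(current) >= version_tuple("0.5"))   # True
print(version_tuple(current) >= version_tuple("0.10"))  # False
print(current >= "0.10")                                # True (misleading string comparison)
```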

Encoding Mappings

LABELS: dict[str, str]

Complete mapping of encoding labels to canonical names as defined by the WHATWG Encoding standard. This dictionary contains all standard encoding labels and their aliases.
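The following sketch illustrates how a label could be normalized and resolved against such a mapping: strip ASCII whitespace, lowercase ASCII letters only, then look the result up. `LABELS_EXCERPT` is a hand-copied four-entry excerpt and `canonical_name` is a hypothetical helper; neither is part of the package, and the sketch assumes the mapping's keys are stored in lowercase.

```python
# Hand-copied excerpt for illustration only; the real LABELS dictionary is far larger.
LABELS_EXCERPT = {
    "utf-8": "utf-8",
    "latin1": "windows-1252",
    "ascii": "windows-1252",
    "iso-8859-1": "windows-1252",
}

# Tab, line feed, form feed, carriage return, and space.
ASCII_WHITESPACE = "\t\n\f\r "


def canonical_name(label: str):
    """Hypothetical helper: normalize a label, then return its canonical name or None."""
    stripped = label.strip(ASCII_WHITESPACE)
    # Lowercase A-Z only, so non-ASCII look-alikes never match an ASCII label.
    normalized = "".join(
        chr(ord(ch) + 32) if "A" <= ch <= "Z" else ch for ch in stripped
    )
    return LABELS_EXCERPT.get(normalized)


print(canonical_name(" Latin1 "))  # windows-1252
print(canonical_name("UTF-8"))     # utf-8
print(canonical_name("no-such"))   # None
```

Returning `None` for an unknown label keeps the helper simple; callers that need a hard failure can check for `None` and raise their own error.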

Usage Examples

import webencodings

# ASCII case conversion
text = "Content-Type"
lower_text = webencodings.ascii_lower(text)
print(lower_text)  # "content-type"

# Comparison with str.lower() for non-ASCII
keyword = "Bac\N{KELVIN SIGN}ground"  # Contains KELVIN SIGN (U+212A), not ASCII "K"
print(keyword.lower())  # "background" (str.lower() maps U+212A to ASCII "k")
print(webencodings.ascii_lower(keyword))  # "bac\u212aground" (KELVIN SIGN left unchanged)

# Use predefined UTF-8 encoding
text = "Hello World"
data = webencodings.encode(text, webencodings.UTF8)
print(data)  # b'Hello World'

# Check package version
print(webencodings.VERSION)  # '0.5.1'

# Inspect available encoding labels
print(len(webencodings.LABELS))  # Number of supported encoding labels
print('utf-8' in webencodings.LABELS)  # True
print('latin1' in webencodings.LABELS)  # True

# View some common label mappings
common_labels = ['utf-8', 'latin1', 'ascii', 'iso-8859-1']
for label in common_labels:
    canonical = webencodings.LABELS.get(label)
    print(f"{label} -> {canonical}")

# utf-8 -> utf-8
# latin1 -> windows-1252  
# ascii -> windows-1252
# iso-8859-1 -> windows-1252