Character encoding aliases for legacy web content implementing the WHATWG Encoding standard
Utility functions and constants that support the core webencodings functionality. These include ASCII case conversion for label matching and pre-defined encoding objects and mappings.
Transform only ASCII letters to lowercase for encoding label matching according to WHATWG standards.
def ascii_lower(string: str) -> str:
    """
    Transform ASCII letters A-Z to lowercase a-z, leaving other characters unchanged.

    Args:
        string: Unicode string to process

    Returns:
        New Unicode string with ASCII letters converted to lowercase

    Note:
        This differs from str.lower(), which also affects non-ASCII characters.
        Used for ASCII case-insensitive matching of encoding labels and CSS keywords.
    """

This function is used internally for encoding label matching, but it is also available to applications that need ASCII-only case conversion following web standards.
UTF8: Encoding
The UTF-8 encoding object, recommended for new content and formats. This is a pre-constructed Encoding instance for UTF-8.

VERSION: str
Package version string (currently '0.5.1').

LABELS: dict[str, str]
Complete mapping of encoding labels to canonical names as defined by the WHATWG Encoding standard. This dictionary contains all standard encoding labels and their aliases.
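As a rough illustration of how a label-to-name mapping like LABELS can be used for matching, the sketch below normalizes a label and looks it up. `SAMPLE_LABELS` is a hypothetical five-entry excerpt, `canonical_name` is a made-up helper, and the whitespace-stripping step is an assumption based on the WHATWG label-matching rules rather than something stated on this page.

```python
from typing import Optional

# Hypothetical excerpt for illustration; the real LABELS dictionary holds many more entries.
SAMPLE_LABELS = {
    "utf-8": "utf-8",
    "utf8": "utf-8",
    "latin1": "windows-1252",
    "ascii": "windows-1252",
    "iso-8859-1": "windows-1252",
}

# ASCII whitespace characters stripped from both ends of a label (assumed step).
ASCII_WHITESPACE = "\t\n\f\r "


def canonical_name(label: str) -> Optional[str]:
    """Return the canonical encoding name for a label, or None if it is unknown."""
    stripped = label.strip(ASCII_WHITESPACE)
    # Lowercase ASCII letters only, so non-ASCII look-alikes never match.
    lowered = "".join(
        chr(ord(ch) + 32) if "A" <= ch <= "Z" else ch for ch in stripped
    )
    return SAMPLE_LABELS.get(lowered)


print(canonical_name(" Latin1\n"))      # windows-1252
print(canonical_name("UTF-8"))          # utf-8
print(canonical_name("no-such-label"))  # None
```

Returning None for unknown labels keeps the decision about a fallback encoding with the caller.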
import webencodings
# ASCII case conversion
text = "Content-Type"
lower_text = webencodings.ascii_lower(text)
print(lower_text) # "content-type"
# Comparison with str.lower() for non-ASCII
keyword = "Bac\N{GREEK CAPITAL LETTER KAPPA}ground" # Contains Greek capital kappa (Κ)
print(keyword.lower()) # "bacκground" (str.lower() also lowercases Κ to κ)
print(webencodings.ascii_lower(keyword)) # "bacΚground" (Κ unchanged in ASCII-only conversion)
# Use predefined UTF-8 encoding
text = "Hello World"
data = webencodings.encode(text, webencodings.UTF8)
print(data) # b'Hello World'
# Check package version
print(webencodings.VERSION) # '0.5.1'
# Inspect available encoding labels
print(len(webencodings.LABELS)) # Number of supported encoding labels
print('utf-8' in webencodings.LABELS) # True
print('latin1' in webencodings.LABELS) # True
# View some common label mappings
common_labels = ['utf-8', 'latin1', 'ascii', 'iso-8859-1']
for label in common_labels:
    canonical = webencodings.LABELS.get(label)
    print(f"{label} -> {canonical}")
# utf-8 -> utf-8
# latin1 -> windows-1252
# ascii -> windows-1252
# iso-8859-1 -> windows-1252