# Utilities

Utility functions and constants that support the core webencodings functionality. These include ASCII case conversion for label matching, a predefined UTF-8 encoding object, the package version, and the label mapping.

## Capabilities

### ASCII Case Conversion

Transform only ASCII letters to lowercase for encoding label matching according to WHATWG standards.

```python { .api }
def ascii_lower(string: str) -> str:
    """
    Transform ASCII letters A-Z to lowercase a-z, leaving other characters unchanged.

    Args:
        string: Unicode string to process

    Returns:
        New Unicode string with ASCII letters converted to lowercase

    Note:
        This differs from str.lower(), which also changes some non-ASCII characters.
        Used for ASCII case-insensitive matching of encoding labels and CSS keywords.
    """
```

This function is used internally for encoding label matching but is also available for applications that need ASCII-only case conversion following web standards.
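
As a rough illustration of ASCII-only lowercasing, the sketch below round-trips the string through UTF-8 bytes, relying on the fact that `bytes.lower()` only touches the ASCII letters A-Z. The name `ascii_lower_sketch` is hypothetical, and this is not presented as the package's own implementation; it is only one way to obtain the same observable behavior.

```python
def ascii_lower_sketch(string: str) -> str:
    """Lowercase ASCII letters only (hypothetical helper, not part of webencodings)."""
    # bytes.lower() maps b"A"-b"Z" to b"a"-b"z" and nothing else. In UTF-8,
    # every byte of a non-ASCII character is >= 0x80, so those bytes are untouched.
    return string.encode("utf-8").lower().decode("utf-8")


# U+212A KELVIN SIGN looks like "K" but is not an ASCII letter.
keyword = "Bac\N{KELVIN SIGN}ground"

print(ascii_lower_sketch("Content-Type"))  # content-type
print(keyword.lower() == "background")  # True: str.lower() folds the Kelvin sign to "k"
print(ascii_lower_sketch(keyword) == "bac\N{KELVIN SIGN}ground")  # True: Kelvin sign kept
```

The Kelvin sign case is why `str.lower()` is unsuitable here: a label or keyword containing a look-alike character must not be treated as a match for its ASCII spelling.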

## Constants

### Predefined Encoding Objects

```python { .api }
UTF8: Encoding
```

A pre-constructed Encoding instance for UTF-8, the encoding recommended for new content and formats.

### Version Information

```python { .api }
VERSION: str
```

Package version string (currently '0.5.1').

### Encoding Mappings

```python { .api }
LABELS: dict[str, str]
```

Mapping of every encoding label, including aliases, to its canonical encoding name as defined by the WHATWG Encoding standard.
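
To illustrate how a label mapping of this shape is typically consulted, the sketch below normalizes a label in the way the WHATWG Encoding standard describes (strip leading and trailing ASCII whitespace, then ASCII-lowercase) before a plain dictionary lookup. `SAMPLE_LABELS` is a small hand-copied stand-in for `LABELS`, and `canonical_name` is a hypothetical helper, not a function provided by the package.

```python
from typing import Dict, Optional

# Stand-in excerpt (assumption): the real LABELS dictionary has many more entries.
SAMPLE_LABELS: Dict[str, str] = {
    "utf-8": "utf-8",
    "utf8": "utf-8",
    "latin1": "windows-1252",
    "ascii": "windows-1252",
    "iso-8859-1": "windows-1252",
}

# ASCII whitespace: tab, line feed, form feed, carriage return, space.
ASCII_WHITESPACE = "\t\n\f\r "


def canonical_name(label: str, labels: Dict[str, str] = SAMPLE_LABELS) -> Optional[str]:
    """Return the canonical encoding name for a label, or None if it is unknown."""
    stripped = label.strip(ASCII_WHITESPACE)
    # Lowercase ASCII letters only, so look-alike non-ASCII characters never match.
    normalized = stripped.encode("utf-8").lower().decode("utf-8")
    return labels.get(normalized)


print(canonical_name(" UTF-8\n"))  # utf-8
print(canonical_name("Latin1"))    # windows-1252
print(canonical_name("utf-9"))     # None
```

With the package installed, passing `webencodings.LABELS` as the `labels` argument should behave the same way over the full label set. In application code, `webencodings.lookup()` is the more natural entry point, since it returns an Encoding object rather than a bare name.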

## Usage Examples

```python
import webencodings

# ASCII case conversion
text = "Content-Type"
lower_text = webencodings.ascii_lower(text)
print(lower_text)  # "content-type"

# Comparison with str.lower() for non-ASCII
keyword = "Bac\N{KELVIN SIGN}ground"  # Contains U+212A KELVIN SIGN, which looks like "K"
print(keyword.lower())  # "background" (str.lower() maps the Kelvin sign to ASCII "k")
print(webencodings.ascii_lower(keyword) == keyword.lower())  # False
print(webencodings.ascii_lower(keyword) == "bac\N{KELVIN SIGN}ground")  # True (Kelvin sign unchanged)

# Use predefined UTF-8 encoding
text = "Hello World"
data = webencodings.encode(text, webencodings.UTF8)
print(data)  # b'Hello World'

# Check package version
print(webencodings.VERSION)  # '0.5.1'

# Inspect available encoding labels
print(len(webencodings.LABELS))  # Number of supported encoding labels
print('utf-8' in webencodings.LABELS)  # True
print('latin1' in webencodings.LABELS)  # True

# View some common label mappings
common_labels = ['utf-8', 'latin1', 'ascii', 'iso-8859-1']
for label in common_labels:
    canonical = webencodings.LABELS.get(label)
    print(f"{label} -> {canonical}")

# utf-8 -> utf-8
# latin1 -> windows-1252
# ascii -> windows-1252
# iso-8859-1 -> windows-1252
```