# Utilities

Utility functions and constants that support the core webencodings functionality. These include ASCII case conversion for label matching and pre-defined encoding objects and mappings.

## Capabilities

### ASCII Case Conversion

Lowercase only the ASCII letters A-Z, as the WHATWG Encoding standard requires for matching encoding labels.

```python { .api }
def ascii_lower(string: str) -> str:
    """
    Transform ASCII letters A-Z to lowercase a-z, leaving other characters unchanged.

    Args:
        string: Unicode string to process

    Returns:
        New Unicode string with ASCII letters converted to lowercase

    Note:
        This differs from str.lower() which affects non-ASCII characters.
        Used for ASCII case-insensitive matching of encoding labels and CSS keywords.
    """
```

This function is used internally for encoding label matching but is also available for applications that need ASCII-only case conversion following web standards.
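To make the difference from `str.lower()` concrete, here is a minimal stand-in written with the standard library only. The name `ascii_lower_sketch` is hypothetical and the body is one possible way to get ASCII-only lowercasing. It is an illustration of the behavior described above, not necessarily how the package implements it.

```python
def ascii_lower_sketch(string: str) -> str:
    """Lowercase only ASCII letters A-Z (hypothetical stand-in, not the library function)."""
    # bytes.lower() only touches ASCII bytes, so the multi-byte UTF-8
    # sequences of non-ASCII characters pass through unchanged.
    return string.encode("utf-8").lower().decode("utf-8")


keyword = "Bac\N{KELVIN SIGN}ground"

# str.lower() folds KELVIN SIGN (U+212A) into the ASCII letter "k".
print(keyword.lower() == "background")  # True

# The ASCII-only version keeps the Kelvin sign, so the strings differ.
print(ascii_lower_sketch(keyword) == "background")  # False
print(ascii_lower_sketch("Content-Type"))  # content-type
```

The Kelvin sign case shows why Unicode-aware lowercasing is unsafe for label or keyword matching: it can map a non-ASCII character into the ASCII range and produce a false match.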

## Constants

### Predefined Encoding Objects

```python { .api }
UTF8: Encoding
```

The UTF-8 encoding object, recommended for new content and formats. This is a pre-constructed Encoding instance for UTF-8.

### Version Information

```python { .api }
VERSION: str
```

Package version string (currently '0.5.1').

### Encoding Mappings

```python { .api }
LABELS: dict[str, str]
```

Complete mapping of encoding labels to canonical names as defined by the WHATWG Encoding standard. This dictionary contains all standard encoding labels and their aliases.
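As a rough sketch of how a label-to-name mapping like this gets used, the snippet below resolves a user-supplied label by trimming ASCII whitespace and lowercasing ASCII letters before the dictionary lookup, which is the matching the WHATWG Encoding standard describes. `SAMPLE_LABELS`, `ASCII_WHITESPACE`, and `canonical_name` are hypothetical names introduced for illustration, and the dictionary is a small hand-picked subset rather than the real `LABELS` table.

```python
from typing import Optional

# Hypothetical subset of the label table, for illustration only;
# the real LABELS dictionary contains many more entries.
SAMPLE_LABELS = {
    "utf-8": "utf-8",
    "utf8": "utf-8",
    "latin1": "windows-1252",
    "ascii": "windows-1252",
    "iso-8859-1": "windows-1252",
}

# ASCII whitespace characters: tab, line feed, form feed, carriage return, space.
ASCII_WHITESPACE = "\t\n\f\r "


def canonical_name(label: str) -> Optional[str]:
    """Resolve a label to its canonical name, or None if unknown (hypothetical helper)."""
    # Trim ASCII whitespace, then lowercase ASCII letters only.
    trimmed = label.strip(ASCII_WHITESPACE)
    normalized = trimmed.encode("utf-8").lower().decode("utf-8")
    return SAMPLE_LABELS.get(normalized)


print(canonical_name("  Latin1 "))         # windows-1252
print(canonical_name("UTF8"))              # utf-8
print(canonical_name("no-such-encoding"))  # None
```

Several labels deliberately map to the same canonical name, so code that needs a stable identifier should compare canonical names rather than the raw labels.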

## Usage Examples

```python
import webencodings

# ASCII case conversion
text = "Content-Type"
lower_text = webencodings.ascii_lower(text)
print(lower_text)  # "content-type"

# Comparison with str.lower() for non-ASCII
keyword = "Bac\N{KELVIN SIGN}ground"  # Contains the Kelvin sign (U+212A), not an ASCII "K"
print(keyword.lower() == "background")  # True (str.lower() maps the Kelvin sign to ASCII "k")
print(webencodings.ascii_lower(keyword) == "background")  # False (Kelvin sign left unchanged)

# Use predefined UTF-8 encoding
text = "Hello World"
data = webencodings.encode(text, webencodings.UTF8)
print(data)  # b'Hello World'

# Check package version
print(webencodings.VERSION)  # '0.5.1'

# Inspect available encoding labels
print(len(webencodings.LABELS))  # Number of supported encoding labels
print('utf-8' in webencodings.LABELS)  # True
print('latin1' in webencodings.LABELS)  # True

# View some common label mappings
common_labels = ['utf-8', 'latin1', 'ascii', 'iso-8859-1']
for label in common_labels:
    canonical = webencodings.LABELS.get(label)
    print(f"{label} -> {canonical}")

# utf-8 -> utf-8
# latin1 -> windows-1252
# ascii -> windows-1252
# iso-8859-1 -> windows-1252
```