# wcwidth

A Python implementation of the POSIX wcwidth() and wcswidth() C functions for determining the printable width of Unicode strings on terminals. This library addresses the issue that string length doesn't always equal terminal display width due to characters that occupy 0 cells (zero-width/combining), 1 cell (normal), or 2 cells (wide East Asian characters).

The library ships Unicode character width tables for each supported Unicode version. The version can be selected per call or through the `UNICODE_VERSION` environment variable, which makes the library useful for CLI applications, terminal emulators, and other software that needs accurate text formatting and alignment in a terminal.

## Package Information

- **Package Name**: wcwidth
- **Package Type**: pypi
- **Language**: Python
- **Installation**: `pip install wcwidth`
- **Version**: 0.2.13
- **License**: MIT

## Core Imports

```python
import wcwidth
```

Selective imports for commonly used functions:

```python
from wcwidth import wcwidth, wcswidth, list_versions
```

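Import all public names. The package's `__all__` lists only `wcwidth`, `wcswidth`, and `list_versions`, so underscore-prefixed helpers such as `_bisearch` must be imported by name:

```python
from wcwidth import *
```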

## Basic Usage

```python
from wcwidth import wcwidth, wcswidth

# Get width of a single character
char_width = wcwidth('A')  # Returns 1
wide_char_width = wcwidth('コ')  # Returns 2 (Japanese character)
zero_width = wcwidth('\u200d')  # Returns 0 (zero-width joiner)

# Get width of a string
string_width = wcswidth('Hello')  # Returns 5
japanese_width = wcswidth('コンニチハ')  # Returns 10
mixed_width = wcswidth('Hello コ')  # Returns 8 (six narrow cells plus one wide character)

# Use with specific Unicode version
width_unicode_9 = wcwidth('🎉', unicode_version='9.0.0')
```

## Architecture

The wcwidth library is built around Unicode character width tables and binary search algorithms:

- **Character Width Tables**: Pre-computed tables for different Unicode versions containing ranges for zero-width, wide, and special characters
- **Binary Search**: Efficient lookup of character widths using `_bisearch()` function
- **Unicode Version Support**: Configurable support for Unicode versions 4.1.0 through 15.1.0
- **Caching**: LRU caches on core functions for performance optimization
- **Environment Integration**: Automatic Unicode version detection via `UNICODE_VERSION` environment variable
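
The table lookup described above can be sketched as a binary search over sorted, non-overlapping `(start, end)` ranges. This is a minimal illustration and not the library's source: the function name `in_interval_table` and the three-entry `SAMPLE_WIDE` table are hypothetical, and the real tables are far larger.

```python
def in_interval_table(codepoint, table):
    """Return 1 if codepoint falls inside any (start, end) range, else 0.

    Assumes `table` is sorted by start and its ranges do not overlap.
    """
    lo, hi = 0, len(table) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        start, end = table[mid]
        if codepoint < start:
            hi = mid - 1
        elif codepoint > end:
            lo = mid + 1
        else:
            return 1
    return 0


# Hypothetical, heavily abridged table for illustration only.
SAMPLE_WIDE = (
    (0x1100, 0x115F),  # Hangul Jamo initial consonants
    (0x3041, 0x33FF),  # Hiragana, Katakana and neighbouring blocks (approximate)
    (0x4E00, 0x9FFF),  # CJK Unified Ideographs
)

print(1 + in_interval_table(ord('コ'), SAMPLE_WIDE))  # 2
print(1 + in_interval_table(ord('A'), SAMPLE_WIDE))   # 1
```

Returning 0 or 1 rather than a boolean lets the caller add the result directly to a base width of 1, as the last two lines do.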

## Capabilities

### Character Width Calculation

Core functions for determining the printable width of Unicode characters and strings in terminal environments.

```python { .api }
def wcwidth(wc, unicode_version='auto'):
    """
    Given one Unicode character, return its printable length on a terminal.

    Parameters:
    - wc: str, a single Unicode character
    - unicode_version: str, Unicode version ('auto', 'latest', or specific version like '9.0.0')

    Returns:
    int, the width in cells:
    - -1: not printable or indeterminate effect (control characters)
    - 0: does not advance cursor (NULL, combining characters, zero-width)
    - 1: normal width characters
    - 2: wide characters (East Asian full-width)
    """

def wcswidth(pwcs, n=None, unicode_version='auto'):
    """
    Given a unicode string, return its printable length on a terminal.

    Parameters:
    - pwcs: str, unicode string to measure
    - n: int, optional maximum number of characters to measure (for POSIX compatibility)
    - unicode_version: str, Unicode version ('auto', 'latest', or specific version)

    Returns:
    int, total width in cells, or -1 if any character is not printable
    """
```
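
To illustrate how the return values above combine, here is a rough approximation built on the standard library's `unicodedata` module. It is a sketch only: `approx_wcwidth` and `approx_wcswidth` are hypothetical names, they ignore the `unicode_version` parameter, and their results can differ from the library's version-specific tables.

```python
import unicodedata


def approx_wcwidth(ch):
    """Approximate cell width of one character: -1, 0, 1, or 2."""
    cp = ord(ch)
    if cp == 0:
        return 0  # NULL does not advance the cursor
    if cp < 32 or 0x7F <= cp < 0xA0:
        return -1  # C0/C1 control characters
    if unicodedata.category(ch) in ('Mn', 'Me', 'Cf'):
        return 0  # combining marks and format characters
    return 2 if unicodedata.east_asian_width(ch) in ('W', 'F') else 1


def approx_wcswidth(text, n=None):
    """Sum per-character widths over the first n characters, or return -1."""
    total = 0
    for ch in text[:n]:
        width = approx_wcwidth(ch)
        if width < 0:
            return -1  # one unprintable character poisons the whole string
        total += width
    return total


print(approx_wcswidth('Hello'))         # 5
print(approx_wcswidth('コンニチハ'))     # 10
print(approx_wcswidth('コンニチハ', 3))  # 6
print(approx_wcswidth('Hello\x01'))     # -1
```

The `n` argument counts characters, not cells, which is why limiting the Katakana string to three characters gives a width of six.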

### Unicode Version Management

Functions for working with supported Unicode versions and version matching.

```python { .api }
def list_versions():
    """
    Return Unicode version levels supported by this module release.

    Returns:
    tuple of str, supported Unicode version numbers in ascending sorted order
    """
```

### Internal/Advanced Functions

Internal functions exported for advanced usage, but not part of the main public API.

```python { .api }
def _bisearch(ucs, table):
    """
    Auxiliary function for binary search in interval table.

    Parameters:
    - ucs: int, ordinal value of unicode character
    - table: list, list of starting and ending ranges as [(start, end), ...]

    Returns:
    int, 1 if ordinal value ucs is found within lookup table, else 0
    """

def _wcmatch_version(given_version):
    """
    Return nearest matching supported Unicode version level.

    Parameters:
    - given_version: str, version for compare, may be 'auto' or 'latest'

    Returns:
    str, matched unicode version string
    """

def _wcversion_value(ver_string):
    """
    Integer-mapped value of given dotted version string.

    Parameters:
    - ver_string: str, Unicode version string of form 'n.n.n'

    Returns:
    tuple of int, digit tuples representing version components
    """
```
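
The version helpers above can be pictured with a small standalone sketch. The names `version_value` and `nearest_version` and the abridged `SUPPORTED` tuple are hypothetical; the sketch assumes "nearest" means the highest supported version that does not exceed the request, and it leaves out the `'auto'` and `'latest'` handling and the warnings.

```python
def version_value(ver_string):
    """Map a dotted version string such as '9.0.0' to a tuple of ints."""
    return tuple(int(part) for part in ver_string.split('.'))


def nearest_version(given, supported):
    """Pick the highest supported version that does not exceed `given`.

    `supported` must be sorted ascending. Requests below the earliest
    entry fall back to that earliest entry.
    """
    wanted = version_value(given)
    best = supported[0]
    for candidate in supported:
        # Compare only as many components as the caller supplied,
        # so that '13.0' matches '13.0.0'.
        if version_value(candidate)[:len(wanted)] <= wanted:
            best = candidate
    return best


SUPPORTED = ('4.1.0', '5.0.0', '9.0.0', '13.0.0', '15.1.0')  # abridged

print(version_value('9.0.0'))                # (9, 0, 0)
print(nearest_version('13.0', SUPPORTED))    # 13.0.0
print(nearest_version('10.0.0', SUPPORTED))  # 9.0.0
print(nearest_version('99.0', SUPPORTED))    # 15.1.0
```

Comparing tuples of integers rather than raw strings avoids the trap where `'9.0.0'` sorts after `'13.0.0'` lexicographically.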

## Constants and Tables

Character width lookup tables and constants for different character categories.

```python { .api }
ZERO_WIDTH: dict
# Unicode character table for zero-width characters by version
# Format: {'version': [(start, end), ...]}

WIDE_EASTASIAN: dict
# Unicode character table for wide East Asian characters by version
# Format: {'version': [(start, end), ...]}

VS16_NARROW_TO_WIDE: dict
# Unicode character table for variation selector 16 width changes
# Format: {'version': [(start, end), ...]}

__version__: str
# Package version string, currently '0.2.13'
```

## Environment Variables

### UNICODE_VERSION

Controls which Unicode version tables to use when `unicode_version='auto'` is specified.

```python
import os
from wcwidth import wcwidth

# Set this before the first width lookup, because version matching is cached
os.environ['UNICODE_VERSION'] = '13.0'

# Now wcwidth() will use Unicode 13.0 tables by default
width = wcwidth('🎉')  # Uses Unicode 13.0 tables
```

If not set, defaults to the latest supported version (15.1.0).
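
The precedence described here (explicit argument first, then the environment variable, then the latest version) can be sketched as follows. `resolve_version` and `LATEST` are hypothetical names used for illustration, and the optional `environ` parameter exists only so the example does not depend on the real process environment.

```python
import os

LATEST = '15.1.0'  # latest version named in this document


def resolve_version(requested='auto', environ=None):
    """Resolve 'auto' and 'latest' to a concrete version string."""
    environ = os.environ if environ is None else environ
    if requested == 'auto':
        # Fall back to 'latest' when UNICODE_VERSION is unset.
        requested = environ.get('UNICODE_VERSION', 'latest')
    if requested == 'latest':
        return LATEST
    return requested


print(resolve_version('auto', environ={}))                            # 15.1.0
print(resolve_version('auto', environ={'UNICODE_VERSION': '13.0'}))   # 13.0
print(resolve_version('9.0.0', environ={'UNICODE_VERSION': '13.0'}))  # 9.0.0
```

An explicit `unicode_version` argument is never overridden by the environment; the variable only matters for `'auto'`.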

## Supported Unicode Versions

The library supports the following Unicode versions:

- **4.1.0** through **15.1.0**
- Complete list: 4.1.0, 5.0.0, 5.1.0, 5.2.0, 6.0.0, 6.1.0, 6.2.0, 6.3.0, 7.0.0, 8.0.0, 9.0.0, 10.0.0, 11.0.0, 12.0.0, 12.1.0, 13.0.0, 14.0.0, 15.0.0, 15.1.0

## Special Character Handling

### Zero-Width Joiner (ZWJ) Sequences

```python
from wcwidth import wcswidth

# ZWJ sequences are handled specially
# Family emoji: man, woman, girl, boy, each joined by U+200D (ZWJ)
emoji_sequence = '\U0001F468\u200D\U0001F469\u200D\U0001F467\u200D\U0001F466'
width = wcswidth(emoji_sequence)  # Measures the joined sequence, not four separate emoji
```

### Variation Selector 16 (VS16)

```python
from wcwidth import wcswidth

# VS16 can change narrow characters to wide
text_with_vs16 = '\u2640\uFE0F'  # Female sign (narrow on its own) followed by VS16
width = wcswidth(text_with_vs16, unicode_version='9.0.0')
```

### Control Characters

```python
from wcwidth import wcwidth, wcswidth

# Control characters return -1
control_char_width = wcwidth('\x01')  # Returns -1
string_with_control = wcswidth('Hello\x01World')  # Returns -1
```

## Error Handling

The library handles various edge cases:

- **Empty strings**: `wcwidth('')` returns 0, `wcswidth('')` returns 0
- **Control characters**: Return -1 for non-printable characters
- **Invalid Unicode versions**: Issues warnings and falls back to nearest supported version
- **Mixed printable/non-printable**: `wcswidth()` returns -1 if any character is non-printable
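
Because a single control character makes `wcswidth()` return -1, callers that only need a layout width may want to strip such characters before measuring. A minimal sketch, assuming the characters to drop are the C0 and C1 control blocks plus DEL; `strip_controls` is a hypothetical helper and not part of wcwidth.

```python
def strip_controls(text):
    """Remove C0 and C1 control characters (and DEL), keeping everything else."""
    return ''.join(
        ch for ch in text
        if not (ord(ch) < 32 or 0x7F <= ord(ch) < 0xA0)
    )


cleaned = strip_controls('Hello\x01World\n')
print(repr(cleaned))  # 'HelloWorld'
```

The cleaned string can then be passed to `wcswidth()`. Note that this also removes tabs and newlines, so it suits single-line cells rather than multi-line text.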

## Performance Considerations

- **LRU Caching**: `wcwidth()` uses `@lru_cache(maxsize=1000)` for performance
- **Version Matching**: Unicode version matching is cached with `@lru_cache(maxsize=8)`
- **Version Parsing**: Version string parsing is cached with `@lru_cache(maxsize=128)`
- **ASCII Optimization**: Fast path for printable ASCII characters (32-126); DEL (127) is a control character and returns -1
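
The combination of an ASCII fast path and an LRU cache can be sketched with the standard library alone. `cached_width` is a hypothetical toy function: its non-ASCII branch is a placeholder for a table lookup, not the library's logic.

```python
from functools import lru_cache


@lru_cache(maxsize=1000)
def cached_width(ch):
    """Toy width function: ASCII fast path, everything else treated as width 2."""
    if 32 <= ord(ch) < 0x7F:
        return 1  # printable ASCII fast path
    return 2  # placeholder standing in for a table lookup


for ch in 'aab':
    cached_width(ch)

info = cached_width.cache_info()
print(info.hits, info.misses)  # 1 2
```

The second `'a'` is served from the cache, so repeated characters in typical text skip the lookup entirely.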

## Dependencies

- **backports.functools-lru-cache**: Required for Python < 3.2
- **No other runtime dependencies**

## Common Use Cases

### Terminal Text Alignment

```python
from wcwidth import wcswidth

def terminal_center(text, width):
    """Center text in terminal with correct width calculation."""
    text_width = wcswidth(text)
    if text_width < 0:
        return text  # Contains unprintable characters, so the width is unknown
    padding = max(0, width - text_width)
    left_pad = padding // 2
    return ' ' * left_pad + text

# Usage
centered = terminal_center('Hello コンニチハ', 40)
```

### Text Truncation

```python
from wcwidth import wcswidth

def truncate_to_width(text, max_width):
    """Truncate text to fit within specified terminal width."""
    if wcswidth(text) <= max_width:
        return text  # Already fits (or is unmeasurable, width -1)
    # Reserve one cell for the ellipsis, then keep the longest prefix that fits
    for i in range(len(text), -1, -1):
        if 0 <= wcswidth(text[:i]) <= max_width - 1:
            return text[:i] + '…'
    return ''  # No room even for the ellipsis

# Usage
truncated = truncate_to_width('Very long text with unicode コンニチハ', 20)
```

### Column Formatting

```python
from wcwidth import wcswidth

def format_columns(rows, column_widths):
    """Format data in aligned columns considering Unicode width."""
    formatted_rows = []
    for row in rows:
        formatted_row = []
        for cell, width in zip(row, column_widths):
            # wcswidth() returns -1 for unprintable text; treat that as width 0
            cell_width = max(0, wcswidth(str(cell)))
            padding = max(0, width - cell_width)
            formatted_row.append(str(cell) + ' ' * padding)
        formatted_rows.append(''.join(formatted_row))
    return formatted_rows

# Usage
data = [['Name', 'Age', 'City'], ['Alice', '25', 'Tokyo 東京']]
formatted = format_columns(data, [15, 5, 20])
```