tessl/pypi-python-slugify

A Python slugify application that handles Unicode text conversion to URL-friendly slugs

Workspace: tessl
Visibility: Public
Describes: pypipkg:pypi/python-slugify@8.0.x

To install, run

npx @tessl/cli install tessl/pypi-python-slugify@8.0.0

# Python Slugify

A comprehensive Python library for converting Unicode text strings into URL-friendly slugs. Python Slugify handles complex Unicode characters from various languages by transliterating them to ASCII equivalents, while offering extensive customization options including custom separators, stopword filtering, length limits, regex patterns, and character replacements.

## Package Information

- **Package Name**: python-slugify
- **Language**: Python
- **Installation**: `pip install python-slugify`
- **Optional**: `pip install python-slugify[unidecode]` (for advanced Unicode handling)

## Core Imports

```python
from slugify import slugify
```

Additional utilities and special character mappings:

```python
from slugify import slugify, smart_truncate
from slugify import PRE_TRANSLATIONS, CYRILLIC, GERMAN, GREEK
```

Version and metadata information:

```python
from slugify import __version__, __title__, __author__, __description__
```

Regex patterns and constants:

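```python
# Importing from the slugify.slugify module works even if the package
# namespace does not re-export these names.
from slugify.slugify import DEFAULT_SEPARATOR
from slugify.slugify import CHAR_ENTITY_PATTERN, DECIMAL_PATTERN, HEX_PATTERN
```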

## Basic Usage

```python
from slugify import slugify

# Basic text slugification
text = "This is a test ---"
result = slugify(text)
print(result)  # "this-is-a-test"

# Unicode text handling
text = '影師嗎'
result = slugify(text)
print(result)  # "ying-shi-ma"

# Preserve Unicode characters
text = '影師嗎'
result = slugify(text, allow_unicode=True)
print(result)  # "影師嗎"

# Custom separator and length limits
text = 'C\'est déjà l\'été.'
result = slugify(text, separator='_', max_length=15)
print(result)  # "c_est_deja_l_et"

# Using replacement rules (substituted verbatim, so "%" directly
# after "50" yields "50percent" with no separator in between)
text = "50% off | great deal"
result = slugify(text, replacements=[['%', 'percent'], ['|', 'or']])
print(result)  # "50percent-off-or-great-deal"
```
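
To make the conversion steps concrete, here is a minimal standard-library sketch of the general technique: normalize, drop non-ASCII characters, lowercase, then collapse disallowed characters into a separator. `simple_slug` is a hypothetical helper written for illustration and is not python-slugify's implementation. Unlike the library, it cannot transliterate scripts such as Chinese and simply drops those characters.

```python
import re
import unicodedata


def simple_slug(text: str, separator: str = "-") -> str:
    """Illustrative slug builder (hypothetical, not the library's code)."""
    # Decompose accented characters, then drop everything that is not ASCII.
    decomposed = unicodedata.normalize("NFKD", text)
    ascii_text = decomposed.encode("ascii", "ignore").decode("ascii")
    # Collapse each run of disallowed characters into one hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", ascii_text.lower()).strip("-")
    # Swap in the requested separator at the very end.
    return slug.replace("-", separator)


print(simple_slug("This is a test ---"))                # this-is-a-test
print(simple_slug("C'est déjà l'été.", separator="_"))  # c_est_deja_l_ete
```

Working with a fixed internal separator and swapping it at the end keeps the cleanup rules independent of the separator the caller chooses.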

## Capabilities

### Text Slugification

The main function for converting text to URL-friendly slugs with comprehensive Unicode support and customization options.

```python { .api }
def slugify(
    text: str,
    entities: bool = True,
    decimal: bool = True,
    hexadecimal: bool = True,
    max_length: int = 0,
    word_boundary: bool = False,
    separator: str = "-",
    save_order: bool = False,
    stopwords: Iterable[str] = (),
    regex_pattern: re.Pattern[str] | str | None = None,
    lowercase: bool = True,
    replacements: Iterable[Iterable[str]] = (),
    allow_unicode: bool = False,
) -> str:
    """
    Convert text into a URL-friendly slug.

    Parameters:
    - text (str): Input text to slugify
    - entities (bool): Convert HTML entities to unicode (default: True)
    - decimal (bool): Convert HTML decimal entities to unicode (default: True)
    - hexadecimal (bool): Convert HTML hexadecimal entities to unicode (default: True)
    - max_length (int): Maximum output length, 0 for no limit (default: 0)
    - word_boundary (bool): Truncate to complete words (default: False)
    - separator (str): Separator between words (default: "-")
    - save_order (bool): Preserve word order when truncating (default: False)
    - stopwords (Iterable[str]): Words to exclude from output (default: ())
    - regex_pattern (re.Pattern[str] | str | None): Custom regex for disallowed characters (default: None)
    - lowercase (bool): Convert to lowercase (default: True)
    - replacements (Iterable[Iterable[str]]): Custom replacement rules (default: ())
    - allow_unicode (bool): Allow Unicode characters in output (default: False)

    Returns:
    str: URL-friendly slug
    """
```

#### Usage Examples

```python
import re

from slugify import slugify

# HTML entity handling
text = "foo &amp; bar"
result = slugify(text)  # "foo-bar"

# Stopword filtering
text = "The quick brown fox"
result = slugify(text, stopwords=['the', 'a', 'an'])  # "quick-brown-fox"

# Custom regex pattern
text = "Hello World 123"
pattern = re.compile(r'[^a-z]+')
result = slugify(text, regex_pattern=pattern)  # "hello-world"

# Length limits with word boundaries
text = "This is a very long sentence"
result = slugify(text, max_length=15, word_boundary=True)  # "this-is-a-very"

# Multiple replacement rules (substituted verbatim, so "20%" becomes "20percent")
text = "Price: $50 | 20% off"
replacements = [['$', 'dollar'], ['%', 'percent'], ['|', 'and']]
result = slugify(text, replacements=replacements)  # "price-dollar50-and-20percent-off"
```
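
The `entities`, `decimal`, and `hexadecimal` options each cover one kind of HTML character reference. The sketch below decodes the three kinds with standard-library regular expressions so the distinction is easier to see. The names `NAMED`, `DECIMAL`, `HEX`, and `decode_entities` are hypothetical and chosen for this illustration; this is not the library's own code.

```python
import re
from html.entities import name2codepoint

# Hypothetical pattern names for this sketch.
NAMED = re.compile(r"&(%s);" % "|".join(name2codepoint))  # e.g. &amp;
DECIMAL = re.compile(r"&#(\d+);")                         # e.g. &#233;
HEX = re.compile(r"&#x([\da-fA-F]+);")                    # e.g. &#xe9;


def decode_entities(text: str) -> str:
    """Decode named, decimal, and hexadecimal character references."""
    text = NAMED.sub(lambda m: chr(name2codepoint[m.group(1)]), text)
    text = DECIMAL.sub(lambda m: chr(int(m.group(1))), text)
    text = HEX.sub(lambda m: chr(int(m.group(1), 16)), text)
    return text


print(decode_entities("foo &amp; bar"))     # foo & bar
print(decode_entities("caf&#233; &#xe9;"))  # café é
```

Text that only resembles a reference, such as `&#invalid;`, matches none of the three patterns and passes through unchanged.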

### Smart Text Truncation

Intelligent string truncation with word boundary preservation and order control.

```python { .api }
def smart_truncate(
    string: str,
    max_length: int = 0,
    word_boundary: bool = False,
    separator: str = " ",
    save_order: bool = False,
) -> str:
    """
    Intelligently truncate strings while preserving word boundaries.

    Parameters:
    - string (str): String to truncate
    - max_length (int): Maximum length, 0 for no truncation (default: 0)
    - word_boundary (bool): Respect word boundaries (default: False)
    - separator (str): Word separator (default: " ")
    - save_order (bool): Maintain original word order (default: False)

    Returns:
    str: Truncated string
    """
```

#### Usage Examples

```python
from slugify import smart_truncate

# Basic truncation (the separator is stripped from both ends of the result)
text = "This is a long sentence"
result = smart_truncate(text, max_length=10)  # "This is a"

# Word boundary preservation
text = "This is a long sentence"
result = smart_truncate(text, max_length=15, word_boundary=True)  # "This is a long"

# Custom separator
text = "word1-word2-word3-word4"
result = smart_truncate(text, max_length=15, word_boundary=True, separator="-")  # "word1-word2"
```
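
The idea behind word-boundary truncation can be sketched in a few lines of plain Python. `truncate_at_word` is a hypothetical helper, not the library's implementation: it keeps leading whole words while the joined result fits within `max_length` and stops at the first word that does not fit.

```python
def truncate_at_word(text: str, max_length: int, separator: str = " ") -> str:
    """Keep leading whole words whose joined length fits in max_length."""
    if max_length <= 0 or len(text) <= max_length:
        return text
    kept = []
    length = 0
    for word in text.split(separator):
        # Every word after the first also costs one separator.
        extra = len(word) + (len(separator) if kept else 0)
        if length + extra > max_length:
            break
        kept.append(word)
        length += extra
    return separator.join(kept)


print(truncate_at_word("This is a long sentence", 15))                     # This is a long
print(truncate_at_word("word1-word2-word3-word4", 15, separator="-"))      # word1-word2
```

A first word longer than `max_length` yields an empty string in this sketch; a caller that needs a non-empty result would have to fall back to a plain character cut.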

### Language-Specific Character Mappings

Pre-defined character translation mappings for various languages, useful for custom transliteration workflows.

```python { .api }
# Character mapping lists
CYRILLIC: list[tuple[str, str]]
GERMAN: list[tuple[str, str]]
GREEK: list[tuple[str, str]]
PRE_TRANSLATIONS: list[tuple[str, str]]

def add_uppercase_char(char_list: list[tuple[str, str]]) -> list[tuple[str, str]]:
    """
    Add uppercase variants to character replacement list.

    Parameters:
    - char_list (list[tuple[str, str]]): List of character replacement tuples

    Returns:
    list[tuple[str, str]]: Enhanced list with uppercase variants
    """
```

#### Available Character Mappings

```python
from slugify import CYRILLIC, GERMAN, GREEK, PRE_TRANSLATIONS

# Cyrillic mappings: ё->e, я->ya, х->h, у->y, щ->sch, ю->u (with uppercase variants)
print(CYRILLIC[:3])  # first three (character, replacement) tuples

# German umlaut mappings: ä->ae, ö->oe, ü->ue (with uppercase variants)
print(GERMAN[:3])  # first three (character, replacement) tuples

# Greek mappings: χ->ch, Ξ->X, ϒ->Y, υ->y, etc. (with uppercase variants)
print(GREEK[:3])  # first three (character, replacement) tuples

# Combined mappings from all languages
print(len(PRE_TRANSLATIONS))  # Total count of all mappings
```
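
A mapping in this `(character, replacement)` shape can be applied with ordinary string replacement. The sketch below is a standard-library illustration under stated assumptions: `GERMAN_LOWER`, `with_uppercase`, and `pre_translate` are hypothetical names, and `with_uppercase` only mimics the idea of adding capitalized variants without reproducing the library's `add_uppercase_char` or its ordering.

```python
# Hypothetical lowercase-only mapping in the same (character, replacement) shape.
GERMAN_LOWER = [("ä", "ae"), ("ö", "oe"), ("ü", "ue")]


def with_uppercase(pairs):
    """Return a new list containing the pairs plus capitalized variants."""
    result = list(pairs)
    for char, replacement in pairs:
        upper = (char.upper(), replacement.capitalize())
        if upper[0] != char and upper not in result:
            result.append(upper)
    return result


def pre_translate(text, pairs):
    """Apply each (character, replacement) pair in order."""
    for char, replacement in pairs:
        text = text.replace(char, replacement)
    return text


print(pre_translate("Über Öl", with_uppercase(GERMAN_LOWER)))  # Ueber Oel
```

Because `slugify()` accepts any iterable of two-item iterables as `replacements`, a list in this shape could also be passed there instead of being applied by hand.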

### Command Line Interface

Python Slugify provides a command-line interface for text slugification with full parameter support.

```python { .api }
def main(argv: list[str] | None = None):
    """
    Command-line entry point for slugification.

    Parameters:
    - argv (list[str] | None): Command line arguments (default: None uses sys.argv)
    """
```

#### Command Line Usage

```bash
# Basic usage
slugify "Hello World"  # Output: hello-world

# From stdin
echo "Hello World" | slugify --stdin

# With options
slugify "Hello World" --separator="_" --max-length=8  # Output: hello_wo

# Custom replacements (single quotes keep the shell from expanding "$")
slugify 'Price: $50' --replacements '$->dollar'  # Output: price-dollar50

# Custom regex pattern
slugify "Keep_underscores" --regex-pattern "[^-a-z0-9_]+"  # Output: keep_underscores

# Allow unicode
slugify "影師嗎" --allow-unicode  # Output: 影師嗎

# Complex combination (without --word-boundary the cut can land mid-word)
slugify "The ÜBER café costs 50%" --stopwords "the" --replacements "Ü->UE" "%->percent" --max-length=20
# Output: ueber-cafe-costs-50p

# Help
slugify --help
```

#### Command Line Parameters

All `slugify()` function parameters are available as command-line options:

- `--separator`: Custom separator (default: "-")
- `--max-length`: Maximum output length
- `--word-boundary`: Truncate to complete words
- `--save-order`: Preserve word order when truncating
- `--stopwords`: Space-separated list of words to exclude
- `--regex-pattern`: Custom regex for disallowed characters
- `--no-lowercase`: Disable lowercase conversion
- `--replacements`: Replacement rules in format "old->new"
- `--allow-unicode`: Allow Unicode characters
- `--no-entities`: Disable HTML entity conversion
- `--no-decimal`: Disable HTML decimal conversion
- `--no-hexadecimal`: Disable HTML hexadecimal conversion
- `--stdin`: Read input from stdin
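
The "old->new" rule format maps naturally onto the list-of-pairs shape that the `replacements` parameter expects. The following is a minimal `argparse` sketch of that conversion. The parser and the `parse_replacements` helper are hypothetical and written for illustration; they are not the package's own command-line code.

```python
import argparse


def parse_replacements(rules):
    """Turn ["old->new", ...] into [["old", "new"], ...]."""
    pairs = []
    for rule in rules:
        if "->" not in rule:
            raise ValueError(f"expected 'old->new', got {rule!r}")
        # Split on the first arrow only, so the replacement may contain "->".
        old, new = rule.split("->", 1)
        pairs.append([old, new])
    return pairs


parser = argparse.ArgumentParser(description="Illustrative slug CLI")
parser.add_argument("input_string", nargs="*")
parser.add_argument("--replacements", nargs="+", default=[])

# An explicit argument list stands in for sys.argv[1:] in this sketch.
args = parser.parse_args(["Price:", "$50", "--replacements", "$->dollar", "%->percent"])
print(" ".join(args.input_string))            # Price: $50
print(parse_replacements(args.replacements))  # [['$', 'dollar'], ['%', 'percent']]
```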

## Types and Constants

```python { .api }
# Default separator constant
DEFAULT_SEPARATOR: str = "-"

# Regex patterns for text processing
CHAR_ENTITY_PATTERN: re.Pattern[str]  # HTML character entities
DECIMAL_PATTERN: re.Pattern[str]  # HTML decimal references
HEX_PATTERN: re.Pattern[str]  # HTML hexadecimal references
QUOTE_PATTERN: re.Pattern[str]  # Quote characters
DISALLOWED_CHARS_PATTERN: re.Pattern[str]  # Disallowed ASCII characters
DISALLOWED_UNICODE_CHARS_PATTERN: re.Pattern[str]  # Disallowed Unicode characters
DUPLICATE_DASH_PATTERN: re.Pattern[str]  # Duplicate dashes
NUMBERS_PATTERN: re.Pattern[str]  # Comma-separated numbers
```

### Package Metadata

```python { .api }
# Version and package information
__version__: str  # Package version (e.g., "8.0.4")
__title__: str  # Package title ("python-slugify")
__author__: str  # Package author ("Val Neekman")
__author_email__: str  # Author email ("info@neekware.com")
__description__: str  # Package description
__url__: str  # Package URL ("https://github.com/un33k/python-slugify")
__license__: str  # License ("MIT")
__copyright__: str  # Copyright notice
```

#### Usage Examples

```python
from slugify import __version__, __title__, __author__

print(f"{__title__} version {__version__} by {__author__}")
# Example output (the version depends on the installed release):
# python-slugify version 8.0.4 by Val Neekman
```
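
The installed release can also be read from the distribution metadata without importing the package. A minimal sketch using the standard library's `importlib.metadata` (Python 3.8+); `installed_version` is a hypothetical helper name, and it returns `None` when the distribution is not installed.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name="python-slugify"):
    """Return the installed version string, or None if it is not installed."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


print(installed_version())
```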

## Error Handling

Python Slugify handles most edge cases in its input without raising; the exceptions are noted below:

- **Non-string input**: `bytes` input is decoded as UTF-8; other non-string values such as `int` or `None` are not converted automatically and raise `TypeError`
- **HTML entity errors**: A failure while converting decimal/hexadecimal references (for example, an out-of-range code point) is caught, and the text is left undecoded
- **Empty input**: Returns empty string for empty or whitespace-only input
- **Unicode normalization**: Input is normalized (NFKD by default, NFKC with `allow_unicode=True`) before processing
- **Regex pattern errors**: An invalid `regex_pattern` string raises `re.error`; the default pattern is used only when `regex_pattern` is `None`

```python
from slugify import slugify

# Empty and whitespace-only input
result = slugify("")     # ""
result = slugify("   ")  # ""

# bytes input is decoded as UTF-8
result = slugify(b"Hello World")  # "hello-world"

# Other non-string input must be converted by the caller
result = slugify(str(123))  # "123"

# Text that only resembles an entity is treated as ordinary characters
result = slugify("&#invalid;")  # "invalid"
```
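
When values of mixed types (for example `None` or numbers from a database row) are slugified, converting them to text explicitly keeps the behavior predictable. A minimal sketch, assuming a hypothetical helper `to_text` that is not part of python-slugify:

```python
def to_text(value) -> str:
    """Coerce an arbitrary value to str before slugifying it."""
    if value is None:
        return ""
    if isinstance(value, bytes):
        # Drop undecodable bytes rather than raising.
        return value.decode("utf-8", "ignore")
    return str(value)


print(repr(to_text(None)))  # ''
print(to_text(123))         # 123
print(to_text(b"caf\xc3\xa9"))  # café
```

The result can then be passed on, for example as `slugify(to_text(value))`.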

## Dependencies

- **Required**: `text-unidecode>=1.3` (GPL & Perl Artistic license)
- **Optional**: `Unidecode>=1.1.1` (install with `pip install python-slugify[unidecode]`)

The package automatically uses `Unidecode` if it is installed and otherwise falls back to `text-unidecode`. `Unidecode` is believed to be the more advanced of the two for Unicode transliteration, but it is available under the GPL only, whereas `text-unidecode` is dual-licensed under the GPL and the Perl Artistic license.
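
To check which transliteration backend an environment would pick up, one can probe for the two modules without importing them. The sketch below uses only the standard library. `transliteration_backend` is a hypothetical helper, and `unidecode` and `text_unidecode` are the import names of the two packages listed above.

```python
from importlib.util import find_spec
from typing import Optional


def transliteration_backend() -> Optional[str]:
    """Return the first available backend module name, preferring Unidecode."""
    for module_name in ("unidecode", "text_unidecode"):
        if find_spec(module_name) is not None:
            return module_name
    return None


print(transliteration_backend())
```

A result of `None` means neither package is importable, which would also prevent `slugify` itself from being imported.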