# Titlecase

A Python port of John Gruber's titlecase.pl that converts text to proper title case according to English style guides. The library applies intelligent capitalization rules, handling small words (a, an, the, etc.) based on the New York Times Manual of Style while preserving abbreviations, acronyms, and mixed-case words.

## Package Information

- **Package Name**: titlecase
- **Language**: Python
- **Installation**: `pip install titlecase`
- **Python Version**: >=3.7
- **Optional Dependencies**: `regex` (enhanced regex support, fallback to `re`)

## Core Imports

```python
from titlecase import titlecase
```

Alternative import pattern:

```python
import titlecase

# Then use: titlecase.titlecase()
```

## Basic Usage

```python
from titlecase import titlecase

# Basic title casing
result = titlecase('this is a simple test')
print(result)  # "This Is a Simple Test"

# Handles all-caps input
result = titlecase('THIS IS ALL CAPS')
print(result)  # "This Is All Caps"

# Preserves mixed case when appropriate
result = titlecase('this is a TEST')
print(result)  # "This Is a TEST"

# Handles punctuation and contractions
result = titlecase("Q&A with steve jobs: 'that's what happens'")
print(result)  # "Q&A With Steve Jobs: 'That's What Happens'"
```

## Capabilities

### Primary Text Processing

`titlecase()` is the main entry point. It converts text to title case and applies the rules for small words, abbreviations, and special cases.

```python { .api }
def titlecase(text, callback=None, small_first_last=True, preserve_blank_lines=False):
    """
    Convert text to title case with intelligent capitalization rules.

    Parameters:
    - text (str): Input text to convert to title case
    - callback (function, optional): Called for each word as callback(word, all_caps=...); return a replacement string, or None to fall back to the default rules
    - small_first_last (bool, default=True): Whether to capitalize small words at beginning/end
    - preserve_blank_lines (bool, default=False): Whether to preserve blank lines in input

    Returns:
    str: Title-cased version of input text
    """
```

### Configuration Functions

These functions customize titlecase behavior by changing the small-word patterns and by building callback filters.

```python { .api }
def set_small_word_list(small=SMALL):
    """
    Configure the list of small words that should not be capitalized.

    Parameters:
    - small (str, optional): Regex pattern of small words (defaults to built-in SMALL pattern)

    Returns:
    None (modifies global regex patterns)
    """

def create_wordlist_filter_from_file(file_path):
    """
    Create a callback function from a file containing abbreviations to preserve.

    Parameters:
    - file_path (str): Path to file containing abbreviations (one per line)

    Returns:
    function: Callback function for use with titlecase()
    """
```

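To make the `small` parameter more concrete, the sketch below tests words against an alternation pattern such as `SMALL` using only the standard `re` module. The helper `is_small_word` is hypothetical and not part of titlecase; the library compiles its own patterns internally, and its exact matching rules may differ.

```python
import re

# Same alternation as the documented default SMALL pattern.
SMALL = r'a|an|and|as|at|but|by|en|for|if|in|of|on|or|the|to|v\.?|via|vs\.?'


def is_small_word(word, pattern=SMALL):
    """Hypothetical helper: True if the whole word matches the pattern (case-insensitive)."""
    return re.fullmatch(r'(?:%s)' % pattern, word, flags=re.IGNORECASE) is not None


for candidate in ['and', 'The', 'vs.', 'with', 'android']:
    print(candidate, is_small_word(candidate))
# and True
# The True
# vs. True
# with False
# android False

# Dropping "and" from the alternation changes the classification.
without_and = r'a|an|as|at|but|by|en|for|if|in|of|on|or|the|to|v\.?|via|vs\.?'
print(is_small_word('and', without_and))  # False
```

Because the pattern is a plain regex alternation, entries such as `v\.?` can carry regex syntax, so literal dots in custom entries should be escaped.
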
### Command Line Interface

The command-line utility title-cases strings passed as arguments or text read from a file.

```python { .api }
def cmd():
    """
    Handler for command-line invocation of titlecase utility.

    Command-line options:
    - Positional: string arguments to titlecase
    - -f, --input-file: Input file path (or '-' for stdin)
    - -o, --output-file: Output file path (or '-' for stdout)
    - -w, --wordlist: Path to wordlist file for acronyms
    - --preserve-blank-lines: Flag to preserve blank lines

    Entry point: 'titlecase' console script
    """
```

## Advanced Usage

### Custom Callback Functions

```python
from titlecase import titlecase

def tech_acronyms(word, **kwargs):
    """Custom callback to preserve technical acronyms"""
    acronyms = {'TCP', 'UDP', 'HTTP', 'API', 'JSON', 'XML'}
    if word.upper() in acronyms:
        return word.upper()
    return None  # Let titlecase handle normally

result = titlecase('a simple tcp and json api', callback=tech_acronyms)
print(result)  # "A Simple TCP and JSON API"
```

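Since `titlecase()` accepts a single callback, several word-specific rules have to be combined into one function. The sketch below shows one possible way to do that. `chain_callbacks` and `brand_names` are hypothetical names introduced for illustration, not part of the library, and the sketch assumes only the callback contract shown above (return a string to override, `None` to defer).

```python
def chain_callbacks(*callbacks):
    """Hypothetical helper: try each callback in order and return the first override."""
    def combined(word, **kwargs):
        for callback in callbacks:
            result = callback(word, **kwargs)
            if result:  # a non-empty string means "use this spelling"
                return result
        return None  # no callback claimed the word
    return combined


def tech_acronyms(word, **kwargs):
    """Upper-case a few known acronyms."""
    return word.upper() if word.upper() in {'TCP', 'JSON'} else None


def brand_names(word, **kwargs):
    """Restore the canonical spelling of a few brand names."""
    brands = {'github': 'GitHub', 'iphone': 'iPhone'}
    return brands.get(word.lower())


combined = chain_callbacks(tech_acronyms, brand_names)
print(combined('tcp', all_caps=False))     # TCP
print(combined('GITHUB', all_caps=True))   # GitHub
print(combined('simple', all_caps=False))  # None
```

The combined function can then be passed as `callback=combined`. Order matters: the first callback that returns a value wins.
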
### Wordlist File Processing

```python
import os

from titlecase import titlecase, create_wordlist_filter_from_file

# Create callback from wordlist file
# File contains: TCP\nUDP\nAPI\nJSON (one per line)
# Expand "~" explicitly; a path that does not resolve to a file yields a no-op callback
wordlist_filter = create_wordlist_filter_from_file(os.path.expanduser('~/.titlecase.txt'))

# Entries match whole words only, so "api" is recognized but "apis" would not be
result = titlecase('working with tcp and json api', callback=wordlist_filter)
print(result)  # "Working With TCP and JSON API" (assuming the file above exists)
```

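The wordlist format (one entry per line) and the whole-word matching can be illustrated without the library. The sketch below writes a temporary wordlist and builds a comparable lookup callback. `load_wordlist_filter` is a hypothetical stand-in written for this example; it assumes case-insensitive, whole-word matching and is not the library's implementation.

```python
import os
import tempfile


def load_wordlist_filter(path):
    """Hypothetical stand-in: map each listed word, upper-cased, to its canonical spelling."""
    with open(path, encoding='utf-8') as f:
        entries = [line.strip() for line in f if line.strip()]
    canonical = {entry.upper(): entry for entry in entries}
    # Same shape as a titlecase callback: a string overrides, None defers.
    return lambda word, **kwargs: canonical.get(word.upper())


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'wordlist.txt')
    with open(path, 'w', encoding='utf-8') as f:
        f.write('TCP\nJSON\nGitHub\n')
    wordlist_filter = load_wordlist_filter(path)

print(wordlist_filter('tcp'))     # TCP
print(wordlist_filter('github'))  # GitHub
print(wordlist_filter('apis'))    # None
```

Under these assumptions a plural such as "apis" is not covered by an "API" entry, so plural forms would need their own lines in the file.
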
### Global Configuration

```python
from titlecase import titlecase, set_small_word_list

# Customize small words (example: remove "and" from small words)
custom_small = r'a|an|as|at|but|by|en|for|if|in|of|on|or|the|to|v\.?|via|vs\.?'
set_small_word_list(custom_small)

result = titlecase('jack and jill')
print(result)  # "Jack And Jill" (now "and" gets capitalized)

# The change is global; call with no argument to restore the built-in SMALL pattern
set_small_word_list()
```

## Types and Constants

```python { .api }
# Version information
__version__: str = '2.4.1'

# Base class for immutable types
class Immutable:
    """Base class for types that should remain unchanged during titlecase processing"""

# Immutable string classes for callback return values
class ImmutableString(str, Immutable):
    """String subclass that marks content as unchanged by titlecase processing"""

class ImmutableBytes(bytes, Immutable):
    """Bytes subclass that marks content as unchanged by titlecase processing"""

# Built-in constants for small words and patterns
SMALL: str = r'a|an|and|as|at|but|by|en|for|if|in|of|on|or|the|to|v\.?|via|vs\.?'  # Default small words pattern
PUNCT: str = r"""!"“#$%&'‘()*+,\-–‒—―./:;?@[\\\]_`{|}~"""  # Punctuation characters pattern
```

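The marker-class pattern behind `Immutable` can be seen in isolation. The sketch below defines local stand-ins with the same shape as the classes documented above (it does not import them from titlecase) and shows that such a value still behaves like an ordinary `str` while being detectable with `isinstance`.

```python
class Immutable:
    """Local stand-in for the marker base class."""


class ImmutableString(str, Immutable):
    """Local stand-in: a str that also carries the Immutable marker."""


word = ImmutableString('iPhone')

print(isinstance(word, str))        # True
print(isinstance(word, Immutable))  # True
print(word == 'iPhone')             # True

# Ordinary str methods return plain str objects, so the marker is not carried over.
print(type(word.upper()) is str)    # True
```

The last line is worth noting: any code that transforms a marked string and wants to keep the marker has to wrap the result again.
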
## Key Features

- **Intelligent Small Word Handling**: Automatically handles articles, prepositions, and conjunctions according to style guides
- **Abbreviation Detection**: Preserves existing capitalization for abbreviations and acronyms
- **Mac/Mc Name Support**: Proper handling of Scottish and Irish surnames (MacDonald, McPherson)
- **Hyphenated Word Processing**: Correctly handles compound words connected by hyphens
- **Slash-Separated Processing**: Handles alternatives and paths (word/word)
- **Unicode Support**: Works with international characters when regex module is available
- **Customizable via Callbacks**: Extensible through user-defined word processing functions
- **Command-Line Utility**: Full-featured CLI with file I/O and wordlist support
- **Contraction Handling**: Proper capitalization of contractions and possessives

## Error Handling

The library handles various edge cases gracefully:

- Empty strings return empty strings
- Invalid file paths in `create_wordlist_filter_from_file()` return a no-op callback
- Missing regex module falls back to standard `re` module with reduced Unicode support
- Callbacks that return `None` (or another empty value) are ignored, and the word falls through to the default rules

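Because an invalid wordlist path produces a no-op callback rather than an error, a typo in the path can go unnoticed. One way to catch it is to validate the path before handing it to the library. The helper `resolve_wordlist` below is hypothetical and uses only the standard library; it is a sketch of that idea, not something titlecase provides.

```python
import os
import tempfile


def resolve_wordlist(path):
    """Hypothetical helper: expand "~" and fail loudly if the file is missing."""
    expanded = os.path.expanduser(path)
    if not os.path.isfile(expanded):
        raise FileNotFoundError(f'wordlist not found: {expanded}')
    return expanded


# Demonstrate with a temporary file that is known to exist.
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as f:
    f.write('TCP\n')
    existing = f.name

print(resolve_wordlist(existing) == existing)  # True

try:
    resolve_wordlist(os.path.join(os.path.dirname(existing), 'no-such-wordlist.txt'))
except FileNotFoundError as exc:
    print('missing:', exc)
finally:
    os.remove(existing)  # clean up the temporary file
```

The returned path could then be passed to `create_wordlist_filter_from_file()`, so a missing file surfaces as an exception instead of silently disabling the wordlist.
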
## Installation Options

```bash
# Standard installation
pip install titlecase

# With enhanced regex support (quoted so shells such as zsh do not expand the brackets)
pip install 'titlecase[regex]'
```

The `regex` extra provides enhanced Unicode support and additional pattern matching capabilities compared to the standard `re` module.