# Text Processing and Utilities

Helper functions and utilities for text analysis, emoji detection, node parsing, and text measurement. These tools enable advanced text layout, custom processing workflows, and integration with existing text processing systems.

## Capabilities

### Text Parsing and Node System

Convert text strings into structured node representations, separating text, Unicode emojis, and Discord emojis for precise rendering control.

```python { .api }
def to_nodes(text: str, /) -> List[List[Node]]:
    """
    Parses text into structured Node objects organized by lines.

    Each line becomes a list of Node objects representing text segments,
    Unicode emojis, and Discord emojis. This enables precise control over
    rendering and text processing.

    Parameters:
    - text: str - Text to parse (supports multiline strings)

    Returns:
    - List[List[Node]] - Nested list where each inner list represents
      a line of Node objects

    Example:
    >>> to_nodes("Hello 👋\\nWorld! 🌎")
    [[Node(NodeType.text, 'Hello '), Node(NodeType.emoji, '👋')],
     [Node(NodeType.text, 'World! '), Node(NodeType.emoji, '🌎')]]
    """
```

#### Usage Example

```python
from pilmoji.helpers import to_nodes, NodeType

# Parse mixed content
# (Discord emoji IDs are long numeric snowflakes, so a realistic 18-digit ID is used)
text = """Hello! 👋 Welcome to our app
Support for Discord: <:custom:123456789012345678>
And regular emojis: 🎉 😊"""

nodes = to_nodes(text)
for line_num, line in enumerate(nodes, start=1):
    print(f"Line {line_num}:")
    for node in line:
        if node.type == NodeType.text:
            print(f"  Text: '{node.content}'")
        elif node.type == NodeType.emoji:
            print(f"  Emoji: {node.content}")
        elif node.type == NodeType.discord_emoji:
            print(f"  Discord: ID {node.content}")
```

### Node Data Structure

Structured representation of parsed text segments with type information for precise text processing.

```python { .api }
class Node(NamedTuple):
    """Represents a parsed segment of text with type information."""

    type: NodeType
    content: str

# Example usage:
# Node(NodeType.text, "Hello ")
# Node(NodeType.emoji, "👋")
# Node(NodeType.discord_emoji, "123456789")
```

### Node Type Enumeration

Enumeration defining the types of text segments that can be parsed from input strings.

```python { .api }
class NodeType(Enum):
    """
    Enumeration of node types for text parsing.

    Values:
    - text (0): Plain text segment
    - emoji (1): Unicode emoji character
    - discord_emoji (2): Discord custom emoji
    """

    text = 0           # Plain text content
    emoji = 1          # Unicode emoji character
    discord_emoji = 2  # Discord custom emoji ID
```
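
Because `Node` is a named tuple and `NodeType` is a plain enum, parsed output can be post-processed with ordinary tuple unpacking and identity checks. The sketch below is one possible illustration: it defines local stand-ins that mirror the two definitions documented above so that it runs without pilmoji installed, and `to_plain_text` is a hypothetical helper, not part of the library.

```python
from enum import Enum
from typing import List, NamedTuple


# Local stand-ins mirroring the documented definitions. In real code,
# import Node and NodeType from pilmoji.helpers instead.
class NodeType(Enum):
    text = 0
    emoji = 1
    discord_emoji = 2


class Node(NamedTuple):
    type: NodeType
    content: str


def to_plain_text(lines: List[List[Node]]) -> str:
    """Hypothetical helper: flatten nodes back into a plain string.

    Text and Unicode emoji nodes are kept verbatim. A Discord emoji node
    only carries an ID, so it is rendered as an "[emoji:<id>]" placeholder.
    """
    rendered = []
    for line in lines:
        parts = []
        for node_type, content in line:  # a Node unpacks like a 2-tuple
            if node_type is NodeType.discord_emoji:
                parts.append(f"[emoji:{content}]")
            else:
                parts.append(content)
        rendered.append("".join(parts))
    return "\n".join(rendered)


sample = [
    [Node(NodeType.text, "Hello "), Node(NodeType.emoji, "👋")],
    [Node(NodeType.text, "Custom: "), Node(NodeType.discord_emoji, "123456789")],
]
plain = to_plain_text(sample)
print(plain)
```

The same loop shape works on the nested list returned by `to_nodes`, since each inner list represents one line of the input.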

### Text Size Calculation

Standalone function for calculating text dimensions with emoji support, useful for layout planning without creating a Pilmoji instance.

```python { .api }
def getsize(
    text: str,
    font: FontT = None,
    *,
    spacing: int = 4,
    emoji_scale_factor: float = 1
) -> Tuple[int, int]:
    """
    Calculate text dimensions including emoji substitutions.

    Useful for text layout planning, centering, and UI calculations
    without requiring a Pilmoji instance.

    Parameters:
    - text: str - Text to measure
    - font: FontT - Font for measurement (defaults to Pillow's default)
    - spacing: int - Line spacing in pixels (default: 4)
    - emoji_scale_factor: float - Emoji scaling factor (default: 1.0)

    Returns:
    - Tuple[int, int] - (width, height) of rendered text
    """
```

#### Usage Examples

```python
from pilmoji.helpers import getsize
from PIL import Image, ImageFont

# Basic text measurement
# ('arial.ttf' must be resolvable on your system; substitute any TrueType font path)
font = ImageFont.truetype('arial.ttf', 16)
text = "Hello! 👋 How are you? 😊"

width, height = getsize(text, font)
print(f"Text size: {width}x{height}")

# Multiline text measurement
multiline = """Welcome! 🎉
Line 2 with emoji 🌟
Final line 😊"""

width, height = getsize(multiline, font, spacing=6, emoji_scale_factor=1.2)
print(f"Multiline size: {width}x{height}")

# Layout planning
def center_text_on_image(image, text, font):
    """Return the top-left (x, y) position that centers text on an image."""
    img_width, img_height = image.size
    text_width, text_height = getsize(text, font)

    x = (img_width - text_width) // 2
    y = (img_height - text_height) // 2

    return (x, y)

# Usage in layout
image = Image.new('RGB', (400, 200), 'white')
position = center_text_on_image(image, "Centered! 🎯", font)
```
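
Since `getsize` returns a plain `(width, height)` tuple, the centering helper generalizes to other alignments with a little arithmetic. The following is a minimal sketch under that assumption; `anchor_position`, its alignment names, and the `margin` parameter are hypothetical and not part of pilmoji. A fixed text size is used in place of a real `getsize` call so the numbers can be checked by hand.

```python
from typing import Tuple

# Fraction of the free space placed before the text on each axis.
_HORIZONTAL = {"left": 0.0, "center": 0.5, "right": 1.0}
_VERTICAL = {"top": 0.0, "middle": 0.5, "bottom": 1.0}


def anchor_position(
    container: Tuple[int, int],
    text_size: Tuple[int, int],
    horizontal: str = "center",
    vertical: str = "middle",
    margin: int = 0,
) -> Tuple[int, int]:
    """Return the top-left (x, y) at which to draw text of text_size."""
    # Space left over once the text and both margins are accounted for.
    free_w = container[0] - text_size[0] - 2 * margin
    free_h = container[1] - text_size[1] - 2 * margin
    x = margin + int(free_w * _HORIZONTAL[horizontal])
    y = margin + int(free_h * _VERTICAL[vertical])
    return (x, y)


# text_size would normally come from getsize(text, font).
centered = anchor_position((400, 200), (120, 16))
bottom_right = anchor_position((400, 200), (120, 16), "right", "bottom", margin=10)
print(centered, bottom_right)
```

Keeping the measurement outside the helper means the same function can be reused with any source of text dimensions.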

### Emoji Detection Regex

Pre-compiled regular expression for detecting emojis in text, supporting both Unicode emojis and Discord custom emoji format.

```python { .api }
EMOJI_REGEX: re.Pattern[str]
"""
Compiled regex pattern for detecting emojis in text.

Matches:
- Unicode emojis (from emoji library)
- Discord custom emojis (<:name:id> and <a:name:id>)

Useful for custom text processing and validation.
"""
```

#### Usage Example

```python
from pilmoji.helpers import EMOJI_REGEX

# Find all emojis in text
# (Discord emoji IDs are long numeric snowflakes, so realistic 18-digit IDs are used)
text = "Hello 👋 Discord: <:smile:123456789012345678> and <a:wave:876543210987654321>"
matches = EMOJI_REGEX.findall(text)
print("Found emojis:", matches)

# Split text at emojis
# (if the pattern contains a capturing group, the matched emojis are kept in the result)
parts = EMOJI_REGEX.split(text)
print("Text parts:", parts)

# Check if text contains emojis
has_emojis = bool(EMOJI_REGEX.search(text))
print("Contains emojis:", has_emojis)

# Replace emojis with placeholders
def replace_emojis(text, placeholder="[EMOJI]"):
    return EMOJI_REGEX.sub(placeholder, text)

clean_text = replace_emojis("Hello! 👋 Welcome 🎉")
print("Clean text:", clean_text)
```
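
How `split` behaves depends on whether the pattern has a capturing group, which is standard `re` behavior rather than anything specific to pilmoji. The sketch below demonstrates this with a deliberately simplified stand-in pattern that matches only Discord-style tags; it is not the pattern the library compiles, and `DISCORD_TAG` and `tokenize` are hypothetical names.

```python
import re
from typing import List, Tuple

# Simplified stand-in: matches only <:name:id> and <a:name:id> tags.
# The outer parentheses form a capturing group, so re.split keeps the tags.
DISCORD_TAG = re.compile(r"(<a?:\w+:\d+>)")


def tokenize(text: str) -> List[Tuple[str, str]]:
    """Split text into ("text", ...) and ("emoji", ...) tokens, in order."""
    tokens = []
    for index, chunk in enumerate(DISCORD_TAG.split(text)):
        if not chunk:
            continue  # split yields empty strings around adjacent matches
        # With exactly one capturing group, odd indexes are the matches.
        kind = "emoji" if index % 2 else "text"
        tokens.append((kind, chunk))
    return tokens


tokens = tokenize("Hi <:smile:123> and <a:wave:456>!")
print(tokens)
```

Alternating even and odd indexes is a compact way to classify chunks without running the pattern a second time on each piece.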

### Advanced Text Processing Workflows

Examples of combining pilmoji utilities for advanced text processing scenarios.

#### Text Analysis and Statistics

```python
from pilmoji.helpers import to_nodes, NodeType
from collections import Counter

def analyze_text(text: str) -> dict:
    """Analyze text for emoji usage and content statistics."""

    # Parse into nodes
    nodes = to_nodes(text)

    # Flatten nodes and count types
    all_nodes = [node for line in nodes for node in line]
    type_counts = Counter(node.type for node in all_nodes)

    # Extract emojis
    unicode_emojis = [node.content for node in all_nodes
                      if node.type == NodeType.emoji]
    discord_emojis = [node.content for node in all_nodes
                      if node.type == NodeType.discord_emoji]

    return {
        'total_lines': len(nodes),
        'total_nodes': len(all_nodes),
        'text_segments': type_counts[NodeType.text],
        'unicode_emojis': len(unicode_emojis),
        'discord_emojis': len(discord_emojis),
        'unique_unicode_emojis': len(set(unicode_emojis)),
        'emoji_list': unicode_emojis,
        'discord_ids': discord_emojis
    }

# Usage
# (Discord emoji IDs are long numeric snowflakes, so realistic 18-digit IDs are used)
text = """Welcome! 🎉 👋
We support Discord <:custom:123456789012345678> and <:other:876543210987654321>
More emojis: 🎉 🎨 🌟"""

stats = analyze_text(text)
print(f"Analysis: {stats}")
```
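
The `emoji_list` entry returned above is an ordinary list of strings, so it can be ranked by frequency with `collections.Counter`. This is a small sketch of that follow-up step; `top_emojis` is a hypothetical helper, and a literal list stands in for the result of `analyze_text` so the example runs on its own.

```python
from collections import Counter
from typing import List, Tuple


def top_emojis(emoji_list: List[str], n: int = 3) -> List[Tuple[str, int]]:
    """Return the n most frequent emojis together with their counts."""
    return Counter(emoji_list).most_common(n)


# In practice the input would be analyze_text(text)['emoji_list'].
ranking = top_emojis(["🎉", "👋", "🎉", "🎨", "🎉", "👋"], n=2)
print(ranking)
```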

#### Custom Text Renderer with Measurements

```python
from pilmoji.helpers import getsize, to_nodes
from PIL import ImageFont

class CustomTextRenderer:
    """Custom text renderer with advanced measurement capabilities."""

    def __init__(self, font):
        self.font = font

    def get_text_metrics(self, text: str) -> dict:
        """Get detailed text metrics."""
        nodes = to_nodes(text)
        width, height = getsize(text, self.font)

        # Measure each non-empty line in full so that emoji widths are included
        line_widths = [
            getsize(line, self.font)[0]
            for line in text.splitlines()
            if line
        ]

        return {
            'total_width': width,
            'total_height': height,
            'line_count': len(nodes),
            'max_line_width': max(line_widths) if line_widths else 0,
            'avg_line_width': sum(line_widths) / len(line_widths) if line_widths else 0
        }

    def fit_text_to_width(self, text: str, max_width: int) -> str:
        """Truncate text to fit within specified width."""
        if getsize(text, self.font)[0] <= max_width:
            return text

        # Binary search for maximum fitting length
        # (slicing by character can cut a Discord emoji tag in half)
        low, high = 0, len(text)
        best_fit = ""

        while low <= high:
            mid = (low + high) // 2
            test_text = text[:mid] + "..."

            if getsize(test_text, self.font)[0] <= max_width:
                best_fit = test_text
                low = mid + 1
            else:
                high = mid - 1

        return best_fit

# Usage
font = ImageFont.load_default()
renderer = CustomTextRenderer(font)

text = "This is a long text with emojis 🎉 that might need truncation 😊"
metrics = renderer.get_text_metrics(text)
fitted = renderer.fit_text_to_width(text, 200)

print(f"Metrics: {metrics}")
print(f"Fitted text: {fitted}")
```
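
Truncation is one way to respect a width limit; wrapping onto several lines is another. The sketch below shows a greedy word wrap that takes the measuring function as a parameter. `wrap_text` and `measure` are hypothetical names, not part of pilmoji. With the library, one could pass something like `lambda s: getsize(s, font)[0]`; here character count stands in for pixel width so the result is easy to verify.

```python
from typing import Callable, List


def wrap_text(text: str, max_width: int, measure: Callable[[str], int]) -> List[str]:
    """Greedily pack whitespace-separated words into lines of at most max_width.

    A single word wider than max_width is placed on its own line unchanged.
    """
    lines: List[str] = []
    current = ""
    for word in text.split():
        candidate = f"{current} {word}" if current else word
        if current and measure(candidate) > max_width:
            # The word does not fit: close the current line and start a new one.
            lines.append(current)
            current = word
        else:
            current = candidate
    if current:
        lines.append(current)
    return lines


# len() is used as the measure, so widths are counted in characters.
wrapped = wrap_text("one two three four five", 9, len)
print(wrapped)
```

Splitting on whitespace keeps emojis and Discord emoji tags intact as long as they are separated from neighboring words by spaces, which avoids the mid-tag cuts that character slicing can produce.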