# Text Processing and Utilities

Helper functions and utilities for text analysis, emoji detection, node parsing, and text measurement. These tools enable advanced text layout, custom processing workflows, and integration with existing text processing systems.

## Capabilities

### Text Parsing and Node System

Convert text strings into structured node representations, separating text, Unicode emojis, and Discord emojis for precise rendering control.

```python { .api }
def to_nodes(text: str, /) -> List[List[Node]]:
    """
    Parses text into structured Node objects organized by lines.

    Each line becomes a list of Node objects representing text segments,
    Unicode emojis, and Discord emojis. This enables precise control over
    rendering and text processing.

    Parameters:
    - text: str - Text to parse (supports multiline strings)

    Returns:
    - List[List[Node]] - Nested list where each inner list represents
      a line of Node objects

    Example:
    >>> to_nodes("Hello 👋\\nWorld! 🌍")
    [[Node(NodeType.text, 'Hello '), Node(NodeType.emoji, '👋')],
     [Node(NodeType.text, 'World! '), Node(NodeType.emoji, '🌍')]]
    """
```

#### Usage Example

```python
from pilmoji.helpers import to_nodes, NodeType

# Parse mixed content
text = """Hello! 👋 Welcome to our app
Support for Discord: <:custom:123456789>
And regular emojis: 🎉 🌟"""

nodes = to_nodes(text)
for line_num, line in enumerate(nodes):
    print(f"Line {line_num + 1}:")
    for node in line:
        if node.type == NodeType.text:
            print(f"  Text: '{node.content}'")
        elif node.type == NodeType.emoji:
            print(f"  Emoji: {node.content}")
        elif node.type == NodeType.discord_emoji:
            print(f"  Discord: ID {node.content}")
```
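
To make the node structure concrete, here is a minimal standard-library sketch of how a line-by-line parse of this kind could work. It is not pilmoji's implementation: `SketchNodeType`, `SketchNode`, `sketch_parse_line`, `sketch_to_nodes`, and the simplified `DISCORD_RE` pattern are hypothetical names introduced for illustration, and Unicode emoji detection is left out, so only plain text and Discord-style tags are recognised.

```python
import re
from enum import Enum
from typing import List, NamedTuple


class SketchNodeType(Enum):
    text = 0
    discord_emoji = 2


class SketchNode(NamedTuple):
    type: SketchNodeType
    content: str


# Hypothetical, simplified pattern: <:name:id> or <a:name:id>, capturing the ID.
DISCORD_RE = re.compile(r"<a?:\w+:(\d+)>")


def sketch_parse_line(line: str) -> List[SketchNode]:
    """Split one line into text nodes and Discord-emoji nodes."""
    nodes: List[SketchNode] = []
    pos = 0
    for match in DISCORD_RE.finditer(line):
        # Plain text between the previous match and this one.
        if match.start() > pos:
            nodes.append(SketchNode(SketchNodeType.text, line[pos:match.start()]))
        # Store only the numeric ID, as in the Node examples in this document.
        nodes.append(SketchNode(SketchNodeType.discord_emoji, match.group(1)))
        pos = match.end()
    # Trailing text after the last match, if any.
    if pos < len(line):
        nodes.append(SketchNode(SketchNodeType.text, line[pos:]))
    return nodes


def sketch_to_nodes(text: str) -> List[List[SketchNode]]:
    """One inner list of nodes per input line."""
    return [sketch_parse_line(line) for line in text.splitlines()]


parsed = sketch_to_nodes("Hi <:wave:123456789>\nbye")
print(parsed)
```

The nested return shape (one list per line) is what lets a renderer advance the vertical position once per outer element and the horizontal position once per node.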

### Node Data Structure

Structured representation of parsed text segments with type information for precise text processing.

```python { .api }
class Node(NamedTuple):
    """Represents a parsed segment of text with type information."""

    type: NodeType
    content: str

# Example usage:
# Node(NodeType.text, "Hello ")
# Node(NodeType.emoji, "👋")
# Node(NodeType.discord_emoji, "123456789")
```
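
Because `Node` is declared as a `NamedTuple`, instances should behave like ordinary tuples: they unpack positionally, compare by value, are hashable, and are immutable. The sketch below illustrates this with local stand-ins (`DemoNodeType` and `DemoNode` are hypothetical names that mirror the declarations above) rather than importing pilmoji.

```python
from enum import Enum
from typing import NamedTuple


class DemoNodeType(Enum):
    text = 0
    emoji = 1
    discord_emoji = 2


class DemoNode(NamedTuple):
    type: DemoNodeType
    content: str


node = DemoNode(DemoNodeType.emoji, "👋")

# Positional unpacking and named field access refer to the same values.
kind, content = node
assert kind is node.type and content == node.content

# Nodes compare by value and are hashable, so duplicates collapse in a set.
same = DemoNode(DemoNodeType.emoji, "👋")
unique = {node, same}

# Tuples are immutable; _replace returns a modified copy instead.
renamed = node._replace(content="🎉")
print(len(unique), renamed.content, node.content)
```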

### Node Type Enumeration

Enumeration defining the types of text segments that can be parsed from input strings.

```python { .api }
class NodeType(Enum):
    """
    Enumeration of node types for text parsing.

    Values:
    - text (0): Plain text segment
    - emoji (1): Unicode emoji character
    - discord_emoji (2): Discord custom emoji
    """

    text = 0           # Plain text content
    emoji = 1          # Unicode emoji character
    discord_emoji = 2  # Discord custom emoji ID
```

### Text Size Calculation

Standalone function for calculating text dimensions with emoji support, useful for layout planning without creating a Pilmoji instance.

```python { .api }
def getsize(
    text: str,
    font: FontT = None,
    *,
    spacing: int = 4,
    emoji_scale_factor: float = 1
) -> Tuple[int, int]:
    """
    Calculate text dimensions including emoji substitutions.

    Useful for text layout planning, centering, and UI calculations
    without requiring a Pilmoji instance.

    Parameters:
    - text: str - Text to measure
    - font: FontT - Font for measurement (defaults to Pillow's default)
    - spacing: int - Line spacing in pixels (default: 4)
    - emoji_scale_factor: float - Emoji scaling factor (default: 1.0)

    Returns:
    - Tuple[int, int] - (width, height) of rendered text
    """
```

#### Usage Examples

```python
from pilmoji.helpers import getsize
from PIL import ImageFont

# Basic text measurement
font = ImageFont.truetype('arial.ttf', 16)
text = "Hello! 👋 How are you? 🌟"

width, height = getsize(text, font)
print(f"Text size: {width}x{height}")

# Multiline text measurement
multiline = """Welcome! 🎉
Line 2 with emoji 🌈
Final line 🚀"""

width, height = getsize(multiline, font, spacing=6, emoji_scale_factor=1.2)
print(f"Multiline size: {width}x{height}")

# Layout planning
def center_text_on_image(image, text, font):
    """Helper to center text on an image."""
    img_width, img_height = image.size
    text_width, text_height = getsize(text, font)

    x = (img_width - text_width) // 2
    y = (img_height - text_height) // 2

    return (x, y)

# Usage in layout
from PIL import Image
image = Image.new('RGB', (400, 200), 'white')
position = center_text_on_image(image, "Centered! 🎯", font)
```
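
To see how a line-spacing parameter typically enters a multiline measurement, the following sketch models a text block as fixed-size character cells. This is an assumed model for illustration only, not pilmoji's actual arithmetic: `sketch_block_size`, `char_width`, and `line_height` are hypothetical, and a real font has variable glyph widths. In this model the block width is the widest line, and the spacing is added between lines but not after the last one.

```python
from typing import Tuple


def sketch_block_size(
    text: str,
    char_width: int = 8,    # hypothetical fixed glyph width in pixels
    line_height: int = 16,  # hypothetical fixed line height in pixels
    spacing: int = 4,       # extra pixels between consecutive lines
) -> Tuple[int, int]:
    """Return (width, height) of a text block under a fixed-cell model."""
    lines = text.splitlines()
    if not lines:
        return (0, 0)
    # The block is as wide as its widest line.
    width = max(len(line) * char_width for line in lines)
    # n lines have n - 1 gaps between them.
    height = len(lines) * line_height + (len(lines) - 1) * spacing
    return (width, height)


default_size = sketch_block_size("abc\nabcdef\nab")
wider_gap = sketch_block_size("abc\nabcdef\nab", spacing=6)
print(default_size, wider_gap)
```

Under this model, changing `spacing` affects only the height, and only for text with more than one line.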

### Emoji Detection Regex

Pre-compiled regular expression for detecting emojis in text, supporting both Unicode emojis and Discord custom emoji format.

```python { .api }
EMOJI_REGEX: re.Pattern[str]
"""
Compiled regex pattern for detecting emojis in text.

Matches:
- Unicode emojis (from emoji library)
- Discord custom emojis (<:name:id> and <a:name:id>)

Useful for custom text processing and validation.
"""
```

#### Usage Example

```python
from pilmoji.helpers import EMOJI_REGEX
import re

# Find all emojis in text
text = "Hello 👋 Discord: <:smile:123> and <a:wave:456>"
matches = EMOJI_REGEX.findall(text)
print("Found emojis:", matches)

# Split text by emojis
parts = EMOJI_REGEX.split(text)
print("Text parts:", parts)

# Check if text contains emojis
has_emojis = bool(EMOJI_REGEX.search(text))
print("Contains emojis:", has_emojis)

# Replace emojis with placeholders
def replace_emojis(text, placeholder="[EMOJI]"):
    return EMOJI_REGEX.sub(placeholder, text)

clean_text = replace_emojis("Hello! 👋 Welcome 🎉")
print("Clean text:", clean_text)
```
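
When splitting with a compiled pattern, the shape of the result depends on whether the pattern contains a capturing group: Python's `re.split` includes the captured separators in the output if it does, and drops them otherwise. The sketch below demonstrates the difference with two hypothetical stand-in patterns for Discord-style tags (`plain` and `grouped`), not with `EMOJI_REGEX` itself, so it is worth checking which behaviour your installed version gives.

```python
import re

text = "Hi <:smile:123> and <a:wave:456>!"

# Hypothetical simplified patterns for Discord-style tags.
plain = re.compile(r"<a?:\w+:\d+>")      # no capturing group
grouped = re.compile(r"(<a?:\w+:\d+>)")  # whole match captured

# Without a group, split() returns only the text between matches.
without_matches = plain.split(text)

# With a group, the matched tags are interleaved with the text.
with_matches = grouped.split(text)

print(without_matches)
print(with_matches)
```

If the interleaved form is returned, filtering out the matched parts (or empty strings at the edges) may be needed before treating the result as plain text.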

### Advanced Text Processing Workflows

Examples of combining pilmoji utilities for advanced text processing scenarios.

#### Text Analysis and Statistics

```python
from pilmoji.helpers import to_nodes, NodeType, EMOJI_REGEX
from collections import Counter

def analyze_text(text: str) -> dict:
    """Analyze text for emoji usage and content statistics."""

    # Parse into nodes
    nodes = to_nodes(text)

    # Flatten nodes and count types
    all_nodes = [node for line in nodes for node in line]
    type_counts = Counter(node.type for node in all_nodes)

    # Extract emojis
    unicode_emojis = [node.content for node in all_nodes
                      if node.type == NodeType.emoji]
    discord_emojis = [node.content for node in all_nodes
                      if node.type == NodeType.discord_emoji]

    return {
        'total_lines': len(nodes),
        'total_nodes': len(all_nodes),
        'text_segments': type_counts[NodeType.text],
        'unicode_emojis': len(unicode_emojis),
        'discord_emojis': len(discord_emojis),
        'unique_unicode_emojis': len(set(unicode_emojis)),
        'emoji_list': unicode_emojis,
        'discord_ids': discord_emojis
    }

# Usage
text = """Welcome! 👋 🎉
We support Discord <:custom:123> and <:other:456>
More emojis: 🌟 🎨 👋"""

stats = analyze_text(text)
print(f"Analysis: {stats}")
```

#### Custom Text Renderer with Measurements

```python
from pilmoji.helpers import getsize, to_nodes, NodeType
from PIL import Image, ImageDraw, ImageFont

class CustomTextRenderer:
    """Custom text renderer with advanced measurement capabilities."""

    def __init__(self, font):
        self.font = font

    def get_text_metrics(self, text: str) -> dict:
        """Get detailed text metrics."""
        nodes = to_nodes(text)
        width, height = getsize(text, self.font)

        # Per-line widths cover only the plain-text nodes;
        # emoji nodes are not included in these line measurements.
        line_widths = []
        for line in nodes:
            line_text = ''.join(node.content for node in line
                                if node.type == NodeType.text)
            if line_text:
                line_width, _ = getsize(line_text, self.font)
                line_widths.append(line_width)

        return {
            'total_width': width,
            'total_height': height,
            'line_count': len(nodes),
            'max_line_width': max(line_widths) if line_widths else 0,
            'avg_line_width': sum(line_widths) / len(line_widths) if line_widths else 0
        }

    def fit_text_to_width(self, text: str, max_width: int) -> str:
        """Truncate text to fit within specified width."""
        if getsize(text, self.font)[0] <= max_width:
            return text

        # Binary search for maximum fitting length
        low, high = 0, len(text)
        best_fit = ""

        while low <= high:
            mid = (low + high) // 2
            test_text = text[:mid] + "..."

            if getsize(test_text, self.font)[0] <= max_width:
                best_fit = test_text
                low = mid + 1
            else:
                high = mid - 1

        return best_fit

# Usage
font = ImageFont.load_default()
renderer = CustomTextRenderer(font)

text = "This is a long text with emojis 🎉 that might need truncation 🌟"
metrics = renderer.get_text_metrics(text)
fitted = renderer.fit_text_to_width(text, 200)

print(f"Metrics: {metrics}")
print(f"Fitted text: {fitted}")
```
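
One caveat with truncating by character index is that a slice can cut through a Discord tag such as `<:name:id>`, leaving a fragment that no longer parses as an emoji. A possible refinement, sketched below with the standard library only, is to binary-search over tokens so that each tag stays whole. Everything here is hypothetical: `TAG_RE` is a simplified tag pattern, `fake_measure` is a stand-in for a real width function such as `getsize`, and the search assumes width never decreases as tokens are added.

```python
import re
from typing import Callable, List

# Hypothetical simplified Discord tag pattern; the group keeps tags in split().
TAG_RE = re.compile(r"(<a?:\w+:\d+>)")


def tokenize(text: str) -> List[str]:
    """Split text into single characters, keeping each tag as one token."""
    tokens: List[str] = []
    for part in TAG_RE.split(text):
        if TAG_RE.fullmatch(part):
            tokens.append(part)   # whole tag is one indivisible token
        else:
            tokens.extend(part)   # plain text: one token per character
    return tokens


def fit_tokens(text: str, max_width: int,
               measure: Callable[[str], int], ellipsis: str = "...") -> str:
    """Longest token prefix plus ellipsis whose measured width fits."""
    if measure(text) <= max_width:
        return text
    tokens = tokenize(text)
    low, high = 0, len(tokens)
    best = ""
    while low <= high:
        mid = (low + high) // 2
        candidate = "".join(tokens[:mid]) + ellipsis
        if measure(candidate) <= max_width:
            best = candidate      # fits: try a longer prefix
            low = mid + 1
        else:
            high = mid - 1        # too wide: try a shorter prefix
    return best


def fake_measure(s: str) -> int:
    """Stand-in metric: 8 px per character, each tag as wide as two characters."""
    return 8 * len(TAG_RE.sub("xx", s))


sample = "Hello <:wave:123> world"
fitted_wide = fit_tokens(sample, 100, fake_measure)
fitted_narrow = fit_tokens(sample, 80, fake_measure)
print(fitted_wide)
print(fitted_narrow)
```

In the narrower case the tag is dropped entirely rather than being cut in half, which is the property the token-based search is meant to provide.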