
# String Substitution Functions

The substitution functions in `regex` extend their counterparts in the standard `re` module with extra parameters: `pos` and `endpos` restrict the search range, `concurrent` controls whether the GIL is released during matching, and `timeout` bounds the running time. Two additional functions, `subf` and `subfn`, take Python format strings as replacements.


## Capabilities


### Basic String Substitution

Replace occurrences of a pattern in a string with a replacement string or with the result of a callable. `count` limits the number of replacements, and `pos` and `endpos` restrict the part of the string that is searched.

```python { .api }
def sub(pattern, repl, string, count=0, flags=0, pos=None, endpos=None,
        concurrent=None, timeout=None, ignore_unused=False, **kwargs):
    """
    Return the string obtained by replacing non-overlapping occurrences of pattern with replacement.

    Args:
        pattern (str): Regular expression pattern to find
        repl (str or callable): Replacement string or function
        string (str): String to perform substitutions on
        count (int, optional): Maximum number of replacements (0 = all)
        flags (int, optional): Regex flags to modify matching behavior
        pos (int, optional): Start position for searching (default: 0)
        endpos (int, optional): End position for searching (default: len(string))
        concurrent (bool, optional): Release GIL during matching for multithreading
        timeout (float, optional): Timeout in seconds for matching operation
        ignore_unused (bool, optional): Ignore unused keyword arguments
        **kwargs: Additional pattern compilation arguments

    Returns:
        str: String with replacements made
    """
```


**Usage Examples:**

```python
import regex

# Basic substitution
result = regex.sub(r'\d+', 'X', 'Replace 123 and 456 with X')
print(result)  # 'Replace X and X with X'

# Limited number of replacements
result = regex.sub(r'\d+', 'NUM', 'Values: 1, 2, 3, 4', count=2)
print(result)  # 'Values: NUM, NUM, 3, 4'

# Using replacement function
def capitalize_match(match):
    return match.group().upper()

result = regex.sub(r'\b\w+\b', capitalize_match, 'hello world')
print(result)  # 'HELLO WORLD'

# Position-bounded substitution (only digits at indices 2 to 7 are replaced)
result = regex.sub(r'\d', 'X', '12abc34def56', pos=2, endpos=8)
print(result)  # '12abcXXdef56'

# Using backreferences
result = regex.sub(r'(\w+) (\w+)', r'\2, \1', 'John Doe')
print(result)  # 'Doe, John'

# Named group backreferences
result = regex.sub(r'(?P<first>\w+) (?P<last>\w+)', r'\g<last>, \g<first>', 'Jane Smith')
print(result)  # 'Smith, Jane'
```


### Format-Based Substitution

Replace pattern occurrences using Python's format string syntax instead of backslash escapes. In the format string, `{0}` is the entire match, `{1}`, `{2}`, and so on are the numbered groups, and `{name}` is a named group.

```python { .api }
def subf(pattern, format, string, count=0, flags=0, pos=None, endpos=None,
         concurrent=None, timeout=None, ignore_unused=False, **kwargs):
    """
    Return the string obtained by replacing pattern occurrences using format string.

    Args:
        pattern (str): Regular expression pattern to find
        format (str or callable): Format string or function using Python format syntax
        string (str): String to perform substitutions on
        count (int, optional): Maximum number of replacements (0 = all)
        flags (int, optional): Regex flags to modify matching behavior
        pos (int, optional): Start position for searching (default: 0)
        endpos (int, optional): End position for searching (default: len(string))
        concurrent (bool, optional): Release GIL during matching for multithreading
        timeout (float, optional): Timeout in seconds for matching operation
        ignore_unused (bool, optional): Ignore unused keyword arguments
        **kwargs: Additional pattern compilation arguments

    Returns:
        str: String with format-based replacements made
    """
```


**Usage Examples:**

```python
import regex

# Format string with positional fields ({0} is the whole match, {1} and {2} are the groups)
result = regex.subf(r'(\w+) (\w+)', '{2}, {1}', 'John Doe')
print(result)  # 'Doe, John'

# Format string with named groups
pattern = r'(?P<name>\w+): (?P<value>\d+)'
format_str = '{name} = {value}'
result = regex.subf(pattern, format_str, 'width: 100, height: 200')
print(result)  # 'width = 100, height = 200'

# Format function for complex transformations
def format_currency(match):
    amount = float(match.group('amount'))
    return f'${amount:.2f}'

pattern = r'(?P<amount>\d+\.\d+)'
result = regex.subf(pattern, format_currency, 'Price: 19.9, Tax: 2.5')
print(result)  # 'Price: $19.90, Tax: $2.50'
```
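
The field numbering is easiest to see when the expansion is written out by hand. The sketch below approximates what a format-string replacement does, using only the standard library (`re` plus `str.format`), so it runs without installing `regex`. `emulate_subf` is a hypothetical helper, not part of either module, and it does not cover every `subf` feature (for example, indexing into repeated captures).

```python
import re

def emulate_subf(pattern, fmt, string, count=0):
    """Approximate a format-string substitution with the standard library.

    Hypothetical helper for illustration only; it is not regex.subf.
    """
    def expand(match):
        # Field 0 is the whole match; fields 1..n are the capture groups.
        positional = (match.group(0),) + match.groups(default="")
        # Named groups are made available as keyword fields.
        return fmt.format(*positional, **match.groupdict(default=""))
    return re.sub(pattern, expand, string, count=count)

swapped = emulate_subf(r"(\w+) (\w+)", "{2}, {1}", "John Doe")
echoed = emulate_subf(r"(\w+) (\w+)", "{0} => {2} {1}", "foo bar")
named = emulate_subf(r"(?P<name>\w+): (?P<value>\d+)", "{name} = {value}",
                     "width: 100, height: 200")

print(swapped)  # Doe, John
print(echoed)   # foo bar => bar foo
print(named)    # width = 100, height = 200
```

Writing `{1}, {0}` instead of `{2}, {1}` would yield `John, John Doe`, because `{0}` expands to the full matched text rather than the first group.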


### Substitution with Count


Perform substitutions and return both the modified string and the number of substitutions made, useful for tracking replacement operations.

```python { .api }
def subn(pattern, repl, string, count=0, flags=0, pos=None, endpos=None,
         concurrent=None, timeout=None, ignore_unused=False, **kwargs):
    """
    Return a 2-tuple containing (new_string, number_of_substitutions_made).

    Args:
        pattern (str): Regular expression pattern to find
        repl (str or callable): Replacement string or function
        string (str): String to perform substitutions on
        count (int, optional): Maximum number of replacements (0 = all)
        flags (int, optional): Regex flags to modify matching behavior
        pos (int, optional): Start position for searching (default: 0)
        endpos (int, optional): End position for searching (default: len(string))
        concurrent (bool, optional): Release GIL during matching for multithreading
        timeout (float, optional): Timeout in seconds for matching operation
        ignore_unused (bool, optional): Ignore unused keyword arguments
        **kwargs: Additional pattern compilation arguments

    Returns:
        tuple: (modified_string, substitution_count)
    """
```


**Usage Examples:**

```python
import regex

# Basic substitution with count
result, count = regex.subn(r'\d+', 'NUM', 'Replace 123 and 456')
print(f"Result: '{result}', Replacements: {count}")
# Result: 'Replace NUM and NUM', Replacements: 2

# Limited replacements with count
result, count = regex.subn(r'\w+', 'WORD', 'one two three four', count=2)
print(f"Result: '{result}', Replacements: {count}")
# Result: 'WORD WORD three four', Replacements: 2

# Check if any replacements were made
original = 'No numbers here'
result, count = regex.subn(r'\d+', 'NUM', original)
if count == 0:
    print("No substitutions were made")
else:
    print(f"Made {count} substitutions: {result}")
```
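
The returned count also works as a stopping condition when a rewrite has to be repeated until nothing changes. The sketch below is a hypothetical helper (`rewrite_until_stable` is not part of `regex`). It takes the `subn` function as a parameter and defaults to the standard library's `re.subn`, which returns the same `(new_string, count)` tuple, so the example runs without extra installs. Passing `regex.subn` instead is assumed to work unchanged for patterns that both modules support.

```python
import re

def rewrite_until_stable(pattern, repl, text, subn=re.subn, max_passes=100):
    """Apply a substitution repeatedly until a pass makes no replacements.

    Hypothetical helper for illustration. `subn` is any function with the
    (pattern, repl, string) -> (new_string, count) contract, e.g. regex.subn.
    """
    total = 0
    for _ in range(max_passes):
        text, made = subn(pattern, repl, text)
        if made == 0:
            break  # Nothing changed in this pass, so the text is stable
        total += made
    return text, total

# Each pass can only remove the innermost empty pair, so several passes are needed.
stable, total = rewrite_until_stable(r'\(\)', '', 'f((()))')
print(stable, total)  # f 3
```

The `max_passes` cap is a safety net for replacements that could otherwise reintroduce the pattern on every pass.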


### Format-Based Substitution with Count

Perform a format-based substitution like `subf` and also return the number of substitutions made.

```python { .api }
def subfn(pattern, format, string, count=0, flags=0, pos=None, endpos=None,
          concurrent=None, timeout=None, ignore_unused=False, **kwargs):
    """
    Same as subf but also return the number of substitutions made.

    Args:
        pattern (str): Regular expression pattern to find
        format (str or callable): Format string or function using Python format syntax
        string (str): String to perform substitutions on
        count (int, optional): Maximum number of replacements (0 = all)
        flags (int, optional): Regex flags to modify matching behavior
        pos (int, optional): Start position for searching (default: 0)
        endpos (int, optional): End position for searching (default: len(string))
        concurrent (bool, optional): Release GIL during matching for multithreading
        timeout (float, optional): Timeout in seconds for matching operation
        ignore_unused (bool, optional): Ignore unused keyword arguments
        **kwargs: Additional pattern compilation arguments

    Returns:
        tuple: (formatted_string, substitution_count)
    """
```


**Usage Examples:**

```python
import regex

# Format-based substitution with count
pattern = r'(?P<name>\w+): (?P<value>\d+)'
format_str = '{name}={value}'
result, count = regex.subfn(pattern, format_str, 'width: 100, height: 200')
print(f"Result: '{result}', Replacements: {count}")
# Result: 'width=100, height=200', Replacements: 2

# Track format replacements
def format_phone(match):
    area = match.group(1)
    number = match.group(2)
    return f"({area}) {number[:3]}-{number[3:]}"

pattern = r'(\d{3})(\d{7})'
text = 'Call 5551234567 or 8009876543'
result, count = regex.subfn(pattern, format_phone, text)
print(f"Formatted {count} phone numbers: {result}")
# Formatted 2 phone numbers: Call (555) 123-4567 or (800) 987-6543
```


## Advanced Substitution Features


### Replacement Functions

A replacement function is called once per match with the Match object and returns the replacement string, so it can perform arbitrary transformations:

```python
import regex

def smart_replace(match):
    value = match.group()
    if value.isdigit():
        return str(int(value) * 2)  # Double numbers
    else:
        return value.upper()  # Uppercase text

result = regex.sub(r'\w+', smart_replace, 'test 123 hello 456')
print(result)  # 'TEST 246 HELLO 912'
```
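
Because the replacement is an ordinary callable, extra configuration can be bound to it with a closure or `functools.partial`. The sketch below uses a hypothetical lookup table and helper name (`expand_abbreviation`), and it calls the standard library's `re.sub` so it runs without extra installs. It relies only on the one-argument callable protocol used in the example above, so swapping in `regex.sub` should need no other change.

```python
import re
from functools import partial

# Hypothetical lookup table for illustration
ABBREVIATIONS = {'approx': 'approximately', 'misc': 'miscellaneous'}

def expand_abbreviation(table, match):
    """Return the expansion for a matched word, or the word itself if unknown."""
    word = match.group()
    return table.get(word.lower(), word)

# partial() fixes the first argument, leaving a callable that takes only the match
replacer = partial(expand_abbreviation, ABBREVIATIONS)
expanded = re.sub(r'\b\w+\b', replacer, 'Approx 5 misc items')
print(expanded)  # approximately 5 miscellaneous items
```

Keeping the table as an argument rather than a global makes the same function reusable with different tables.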


### Conditional Replacements


Use Match object properties for conditional replacements:

```python
import regex

def conditional_replace(match):
    word = match.group()
    if len(word) > 4:
        return word.upper()
    else:
        return word.lower()

result = regex.sub(r'\b\w+\b', conditional_replace, 'Hello World Test')
print(result)  # 'HELLO WORLD test'
```


### Position-Aware Replacements


Access match position information in replacement functions:

```python
import regex

def position_replace(match):
    start = match.start()
    text = match.group()
    return f"{text}@{start}"

result = regex.sub(r'\w+', position_replace, 'one two three')
print(result)  # 'one@0 two@4 three@8'
```


### Reverse Pattern Substitution

Use the REVERSE flag to search from right to left. Combined with `count`, this replaces the rightmost matches first:

```python
import regex

# Replace from right to left
result = regex.sub(r'\d+', 'X', '123abc456def789', flags=regex.REVERSE, count=2)
print(result)  # '123abcXdefX' (the two rightmost matches are replaced)
```


### Fuzzy Pattern Substitution


Combine fuzzy matching with substitutions:

```python
import regex

# Replace approximate matches
pattern = r'(?e)(hello){e<=1}'  # Allow at most 1 error
result = regex.sub(pattern, 'hi', 'helo world, hallo there')
print(result)  # 'hi world, hi there'
```


### Concurrent Substitution

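Passing `concurrent=True` asks the module to release the GIL while matching, so other Python threads can run during a long substitution; it does not split a single substitution across threads. The `timeout` parameter is independent of `concurrent`: it limits the whole operation to the given number of seconds and raises `TimeoutError` when that limit is exceeded: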

```python
import regex

# Release the GIL while matching so other threads can run
large_text = "..." * 10000  # Placeholder for a large text
result = regex.sub(r'\w+', 'WORD', large_text, concurrent=True)

# Set timeout for potentially slow operations
complex_pattern = r'(\w+)\s+\1'  # Placeholder pattern: a repeated word
replacement = r'\1'              # Placeholder replacement: keep one copy
text = 'the the quick brown fox'  # Placeholder input
try:
    result = regex.sub(complex_pattern, replacement, text, timeout=5.0)
except TimeoutError as e:
    print(f"Substitution timed out: {e}")
```
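
The benefit of releasing the GIL only shows up when several threads are doing work at once. The sketch below runs one substitution per document in a thread pool. The sample documents, the `mask_digits` helper, and the pool size are illustrative assumptions, and no speedup is claimed for inputs this small.

```python
from concurrent.futures import ThreadPoolExecutor

import regex

# Hypothetical sample inputs; real workloads would be much larger.
documents = [
    "alpha 1 beta 22",
    "gamma 333 delta",
    "no digits here",
]

def mask_digits(text):
    # concurrent=True asks regex to release the GIL while it matches,
    # so the other worker threads can run during the scan.
    return regex.sub(r"\d+", "#", text, concurrent=True)

with ThreadPoolExecutor(max_workers=3) as pool:
    # map() returns results in the same order as the inputs.
    masked = list(pool.map(mask_digits, documents))

print(masked)
# ['alpha # beta #', 'gamma # delta', 'no digits here']
```

The string being searched must not change while the call is in progress; Python `str` objects are immutable, so that holds here.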