# Advanced Matching Features

Specialized matching capabilities including recursive/balanced delimiters and chained matching for complex text processing scenarios.

## Capabilities

### Recursive Matching

Matches text between balanced, possibly nested delimiters. Options control escape characters, extended result details, and how unbalanced delimiters are handled.

```javascript { .api }
/**
 * Returns array of match strings between outermost left and right delimiters
 * @param str - String to search
 * @param left - Left delimiter as XRegExp pattern
 * @param right - Right delimiter as XRegExp pattern
 * @param flags - Any combination of XRegExp flags for delimiters
 * @param options - Configuration options for matching behavior
 * @returns Array of matches or detailed match objects
 */
function matchRecursive(str: string, left: string, right: string, flags?: string, options?: MatchRecursiveOptions): string[] | MatchRecursiveValueNameMatch[];

interface MatchRecursiveOptions {
  /** Single char used to escape delimiters within the subject string */
  escapeChar?: string;
  /** Array of 4 strings naming the parts to return in extended mode */
  valueNames?: [string | null, string | null, string | null, string | null];
  /** Handling mode for unbalanced delimiters */
  unbalanced?: 'error' | 'skip' | 'skip-lazy';
}

interface MatchRecursiveValueNameMatch {
  name: string;
  value: string;
  start: number;
  end: number;
}
```

**Usage Examples:**

```javascript
// Basic balanced delimiter matching
const str1 = '(t((e))s)t()(ing)';
XRegExp.matchRecursive(str1, '\\(', '\\)', 'g');
// Result: ['t((e))s', '', 'ing']

// Extended information mode with valueNames
const str2 = 'Here is <div> <div>an</div></div> example';
XRegExp.matchRecursive(str2, '<div\\s*>', '</div>', 'gi', {
  valueNames: ['between', 'left', 'match', 'right']
});
// Result: [
//   {name: 'between', value: 'Here is ', start: 0, end: 8},
//   {name: 'left', value: '<div>', start: 8, end: 13},
//   {name: 'match', value: ' <div>an</div>', start: 13, end: 27},
//   {name: 'right', value: '</div>', start: 27, end: 33},
//   {name: 'between', value: ' example', start: 33, end: 41}
// ]

// Using escape characters
const str3 = '...{1}.\\{{function(x,y){return {y:x}}}';
XRegExp.matchRecursive(str3, '{', '}', 'g', {
  valueNames: ['literal', null, 'value', null],
  escapeChar: '\\'
});
// Result: [
//   {name: 'literal', value: '...', start: 0, end: 3},
//   {name: 'value', value: '1', start: 4, end: 5},
//   {name: 'literal', value: '.\\{', start: 6, end: 9},
//   {name: 'value', value: 'function(x,y){return {y:x}}', start: 10, end: 37}
// ]
```
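
To make "outermost" concrete, the following is a minimal plain-JavaScript sketch of depth counting for single-character literal delimiters. `outermostBetween` is a hypothetical helper written for illustration, not part of XRegExp and not its implementation; it ignores regex patterns, flags, and escape characters.

```javascript
// Hypothetical helper (not an XRegExp API): collects the text between
// outermost delimiter pairs by tracking the current nesting depth.
function outermostBetween(str, left, right) {
  const results = [];
  let depth = 0;  // current nesting level
  let start = -1; // index just after the current outermost left delimiter

  for (let i = 0; i < str.length; i++) {
    const ch = str[i];
    if (ch === left) {
      if (depth === 0) start = i + 1; // entering an outermost pair
      depth++;
    } else if (ch === right) {
      if (depth === 0) throw new Error(`Unbalanced right delimiter at index ${i}`);
      depth--;
      if (depth === 0) results.push(str.slice(start, i)); // closed an outermost pair
    }
  }
  if (depth !== 0) throw new Error('Unbalanced left delimiter');
  return results;
}

console.log(outermostBetween('(t((e))s)t()(ing)', '(', ')'));
// -> [ 't((e))s', '', 'ing' ]
```

Only a return to depth 0 emits a result, which is why nested pairs appear inside the outer match rather than as separate entries.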

### Chain Matching

Runs a chain of regexes in sequence, with each regex searching only within the matches produced by the previous one.

```javascript { .api }
/**
 * Retrieves matches from searching using a chain of regexes
 * @param str - String to search
 * @param chain - Array of regexes or objects with regex and backref properties
 * @returns Matches by the last regex in the chain, or empty array
 */
function matchChain(str: string, chain: (RegExp | ChainArrayElement)[]): string[];

interface ChainArrayElement {
  /** The regex to use */
  regex: RegExp;
  /** The named or numbered backreference to pass forward */
  backref: number | string;
}
```

**Usage Examples:**

```javascript
// Basic usage - matches numbers within <b> tags
XRegExp.matchChain('1 <b>2</b> 3 <b>4 a 56</b>', [
  XRegExp('(?is)<b>.*?</b>'),
  /\d+/
]);
// Result: ['2', '4', '56']

// Passing forward and returning specific backreferences
const html = `<a href="http://xregexp.com/api/">XRegExp</a>
<a href="http://www.google.com/">Google</a>`;

XRegExp.matchChain(html, [
  {regex: /<a href="([^"]+)">/i, backref: 1},
  {regex: XRegExp('(?i)^https?://(?<domain>[^/?#]+)'), backref: 'domain'}
]);
// Result: ['xregexp.com', 'www.google.com']

// Multi-step extraction
// A bare regex passes its full match forward, so each stage uses a
// backref object to pass on only the captured group.
const data = 'user:john@example.com, user:jane@test.org';
XRegExp.matchChain(data, [
  {regex: /user:([^,]+)/, backref: 1}, // Extract the address from each user entry
  {regex: /@([^\s]+)/, backref: 1},    // Extract the domain from each address
  {regex: /([^.]+)\./, backref: 1}     // Extract the domain name without the TLD
]);
// Result: ['example', 'test']
```
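
The stage-by-stage narrowing can be sketched with native regexes alone. `chainMatch` below is a hypothetical helper written for illustration, not XRegExp's implementation; it assumes native `RegExp` objects (so named groups use native `(?<name>...)` syntax) and silently returns an empty string for a missing group instead of throwing.

```javascript
// Hypothetical helper (not an XRegExp API): each stage searches only
// inside the strings produced by the previous stage.
function chainMatch(str, chain) {
  let values = [str];
  for (const stage of chain) {
    const regex = stage instanceof RegExp ? stage : stage.regex;
    const backref = stage instanceof RegExp ? 0 : stage.backref; // 0 = full match
    // matchAll requires the g flag, so add it when it is missing.
    const flags = regex.flags.includes('g') ? regex.flags : regex.flags + 'g';
    const globalRegex = new RegExp(regex.source, flags);

    const next = [];
    for (const value of values) {
      for (const match of value.matchAll(globalRegex)) {
        const picked = typeof backref === 'number'
          ? match[backref]
          : match.groups?.[backref];
        next.push(picked ?? '');
      }
    }
    values = next;
    if (values.length === 0) break; // nothing left to search in later stages
  }
  return values;
}

console.log(chainMatch('1 <b>2</b> 3 <b>4 a 56</b>', [/<b>.*?<\/b>/is, /\d+/]));
// -> [ '2', '4', '56' ]
```

Each stage replaces the working list with its own matches, so the final list holds only what the last regex found inside everything that came before it.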

## Recursive Matching Features

### Value Names Configuration

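Control which parts of the result are returned. Providing `valueNames` switches to extended mode, which returns objects instead of strings; a `null` entry omits that part: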

```javascript
// valueNames array maps to:
// [0] - String segments outside matches (before, between, after)
// [1] - Matched left delimiters
// [2] - Content between outermost delimiters
// [3] - Matched right delimiters

const str = 'a {b} c {d {e}}'; // Sample input for the calls below

// Get only content between delimiters
XRegExp.matchRecursive(str, '{', '}', 'g', {
  valueNames: [null, null, 'content', null]
});

// Get everything except right delimiters
XRegExp.matchRecursive(str, '{', '}', 'g', {
  valueNames: ['between', 'left', 'content', null]
});
```
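
Extended-mode results arrive as a flat array, so a common follow-up is to group the values by `name`. The sketch below uses a hand-written `parts` array shaped like `MatchRecursiveValueNameMatch[]`, so it runs without XRegExp; `groupByName` is a hypothetical helper, not a library function.

```javascript
// Sample data shaped like an extended-mode result (hand-written here).
const parts = [
  {name: 'between', value: 'Here is ', start: 0, end: 8},
  {name: 'left', value: '<div>', start: 8, end: 13},
  {name: 'match', value: ' <div>an</div>', start: 13, end: 27},
  {name: 'right', value: '</div>', start: 27, end: 33},
  {name: 'between', value: ' example', start: 33, end: 41}
];

// Hypothetical helper: collects values under their part name,
// preserving the order in which they appeared.
function groupByName(items) {
  const grouped = {};
  for (const item of items) {
    if (!grouped[item.name]) grouped[item.name] = [];
    grouped[item.name].push(item.value);
  }
  return grouped;
}

console.log(groupByName(parts).between);
// -> [ 'Here is ', ' example' ]
```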

### Escape Character Handling

Handle escaped delimiters within content:

```javascript
const code = 'var obj = {key: "value\\}", nested: {inner: true}};';

XRegExp.matchRecursive(code, '{', '}', 'g', {
  escapeChar: '\\',
  valueNames: [null, null, 'content', null]
});
// The escaped } inside the string literal is not treated as a delimiter
```
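
As a rough illustration of what escaping means here, the sketch below finds the positions of a delimiter that are not preceded by the escape character. `unescapedIndexes` is a hypothetical helper, and the rule it applies (the escape character consumes the single character after it) is a simplifying assumption for illustration rather than a statement of XRegExp's internals.

```javascript
// Hypothetical helper (not an XRegExp API): returns the indexes at which
// `delimiter` occurs without being escaped by `escapeChar`.
function unescapedIndexes(str, delimiter, escapeChar) {
  const indexes = [];
  for (let i = 0; i < str.length; i++) {
    if (str[i] === escapeChar) {
      i++; // skip the escaped character, whatever it is
      continue;
    }
    if (str[i] === delimiter) indexes.push(i);
  }
  return indexes;
}

// The string below is: a\}b}
console.log(unescapedIndexes('a\\}b}', '}', '\\'));
// -> [ 4 ]
```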

### Unbalanced Delimiter Handling

Configure behavior for unbalanced delimiters:

```javascript
const unbalanced = 'Here is <div> <div>content</div> missing close';

// Error on unbalanced (default)
try {
  XRegExp.matchRecursive(unbalanced, '<div>', '</div>', 'gi');
} catch (e) {
  console.log('Unbalanced delimiter error');
}

// Skip unbalanced delimiters
XRegExp.matchRecursive(unbalanced, '<div>', '</div>', 'gi', {
  unbalanced: 'skip'
});
// Result: ['content'] (skips unbalanced opening div)

// Skip lazily (minimal advancement)
XRegExp.matchRecursive(unbalanced, '<div>', '</div>', 'gi', {
  unbalanced: 'skip-lazy'
});
```

### Sticky Mode Support

Use flag `y` for sticky matching:

```javascript
const str = '<1><<<2>>><3>4<5>';
XRegExp.matchRecursive(str, '<', '>', 'gy');
// Result: ['1', '<<2>>', '3']
// Stops at first non-match due to sticky mode
```
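
The same "stop at the first gap" behavior can be seen with the native `y` flag, independent of XRegExp. In this sketch `stickyMatches` is a hypothetical helper: it collects consecutive matches that each begin exactly where the previous one ended and stops as soon as one fails.

```javascript
// Hypothetical helper: gathers back-to-back matches using the native
// sticky flag, which only matches at regex.lastIndex.
function stickyMatches(str, pattern) {
  const regex = new RegExp(pattern, 'y');
  const found = [];
  let match;
  while ((match = regex.exec(str)) !== null) {
    found.push(match[0]);
    if (match[0] === '') regex.lastIndex++; // avoid looping forever on empty matches
  }
  return found;
}

console.log(stickyMatches('<1><2>x<3>', '<\\d>'));
// -> [ '<1>', '<2>' ]
```

The `x` breaks the run, so `<3>` is never reached even though it would match a non-sticky global search.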

## Chain Matching Features

### Backreference Forwarding

Pass specific capture groups between chain stages:

```javascript
// A backref applies to the regex in the same chain element: it selects
// which of that regex's groups is passed to the next stage (or returned).

// Forward numbered backreferences
const chain1 = [
  {regex: /(\w+):(\w+)/, backref: 2}, // Capture key:value pairs, forward only the value part
  /\w/                                // Match each character of the value
];

// Forward named backreferences
const chain2 = [
  {regex: XRegExp('(?<protocol>https?)://(?<domain>[^/]+)'), backref: 'domain'},
  {regex: /^([^.]+)/, backref: 1} // Extract the first label of the domain
];
```

### Multi-level Processing

Chain multiple processing stages:

```javascript
const logData = `
[INFO] 2021-01-15 Server started on port 8080
[ERROR] 2021-01-15 Database connection failed
[DEBUG] 2021-01-15 Cache initialized
`;

// Extract error messages, dropping the level tag and the date
XRegExp.matchChain(logData, [
  /\[ERROR\][^\n]+/,                             // Find error lines
  {regex: /\d{4}-\d{2}-\d{2}\s+(.+)/, backref: 1} // Keep only the message after the date
]);
// Result: ['Database connection failed']
```

### Error Handling

Chain matching throws a `ReferenceError` when a `backref` names a group that the regex in the same chain element does not define:

```javascript
try {
  XRegExp.matchChain('test', [
    /(\w+)/,
    {regex: /\w/, backref: 2} // Invalid - /\w/ has no group 2
  ]);
} catch (e) {
  console.log(e.message); // "Backreference to undefined group: 2"
}

try {
  XRegExp.matchChain('test', [
    XRegExp('(?<word>\\w+)'),
    {regex: /\w/, backref: 'missing'} // Invalid - /\w/ has no 'missing' group
  ]);
} catch (e) {
  console.log(e.message); // "Backreference to undefined group: missing"
}
```

## Advanced Use Cases

### Parsing Nested Structures

Extract content from nested markup or code:

```javascript
// Parse nested function calls
const code = 'outer(inner(a, b), middle(c), last)';
const args = XRegExp.matchRecursive(code, '\\(', '\\)', 'g');
// Result: ['inner(a, b), middle(c), last']

// Parse JSON-like structures
const json = '{name: "John", data: {age: 30, city: "NYC"}}';
const objects = XRegExp.matchRecursive(json, '{', '}', 'g');
// Result: ['name: "John", data: {age: 30, city: "NYC"}']
```

### Multi-stage Text Processing

Chain several regexes to narrow a match step by step:

```javascript
// Extract and process HTML attributes
const html = '<div class="highlight active" data-id="123">Content</div>';

// First extract tag attributes, then process each
XRegExp.matchChain(html, [
  /<[^>]+>/,             // Get each full tag
  /\s([^>]+)/,           // Extract attributes portion (the closing tag has none)
  /([\w-]+)="([^"]*)"/g  // Extract name="value" pairs
]);
// Result: ['class="highlight active"', 'data-id="123"']
```

### Template Processing

Parse template syntax with balanced delimiters:

```javascript
const template = 'Hello {{user.name}}, you have {{#if messages}}{{messages.length}}{{/if}} messages';

// Extract template expressions
const expressions = XRegExp.matchRecursive(template, '{{', '}}', 'g');
// Result: ['user.name', '#if messages', 'messages.length', '/if']

// Process expressions further
const conditionals = expressions.filter(expr => expr.startsWith('#if'));
```
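
One possible next step is to classify each extracted expression before evaluating it. The sketch below starts from a hand-written copy of the `expressions` array shown above, so it runs without XRegExp; `classify` and the `open`/`close`/`variable` labels are hypothetical names chosen for illustration, and the `#`/`/` prefixes are assumed to follow the Handlebars-like syntax of the sample template.

```javascript
// Hand-written copy of the extracted expressions from the example above.
const extracted = ['user.name', '#if messages', 'messages.length', '/if'];

// Hypothetical helper: tags an expression as a block opener, a block
// closer, or a plain variable path.
function classify(expr) {
  if (expr.startsWith('#')) {
    const [keyword, ...rest] = expr.slice(1).split(/\s+/);
    return {type: 'open', keyword, argument: rest.join(' ')};
  }
  if (expr.startsWith('/')) {
    return {type: 'close', keyword: expr.slice(1)};
  }
  return {type: 'variable', path: expr.split('.')};
}

console.log(extracted.map(classify).map(item => item.type));
// -> [ 'variable', 'open', 'variable', 'close' ]
```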