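# Advanced Matching Features

XRegExp provides two specialized matching utilities for complex text processing: `matchRecursive`, which matches balanced (nested) delimiters, and `matchChain`, which narrows a search through a chain of regexes.

## Capabilities

### Recursive Matching

Matches text between the outermost balanced delimiters. Options control escape characters, extended result objects, and the handling of unbalanced delimiters.
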
```javascript { .api }
/**
 * Returns array of match strings between outermost left and right delimiters
 * @param str - String to search
 * @param left - Left delimiter as XRegExp pattern
 * @param right - Right delimiter as XRegExp pattern
 * @param flags - Any combination of XRegExp flags for delimiters
 * @param options - Configuration options for matching behavior
 * @returns Array of matches or detailed match objects
 */
function matchRecursive(str: string, left: string, right: string, flags?: string, options?: MatchRecursiveOptions): string[] | MatchRecursiveValueNameMatch[];

interface MatchRecursiveOptions {
  /** Single char used to escape delimiters within the subject string */
  escapeChar?: string;
  /** Array of 4 strings naming the parts to return in extended mode */
  valueNames?: [string | null, string | null, string | null, string | null];
  /** Handling mode for unbalanced delimiters */
  unbalanced?: 'error' | 'skip' | 'skip-lazy';
}

interface MatchRecursiveValueNameMatch {
  name: string;
  value: string;
  start: number;
  end: number;
}
```

**Usage Examples:**

```javascript
// Basic balanced delimiter matching
const str1 = '(t((e))s)t()(ing)';
XRegExp.matchRecursive(str1, '\\(', '\\)', 'g');
// Result: ['t((e))s', '', 'ing']

// Extended information mode with valueNames
const str2 = 'Here is <div> <div>an</div></div> example';
XRegExp.matchRecursive(str2, '<div\\s*>', '</div>', 'gi', {
  valueNames: ['between', 'left', 'match', 'right']
});
// Result: [
//   {name: 'between', value: 'Here is ', start: 0, end: 8},
//   {name: 'left', value: '<div>', start: 8, end: 13},
//   {name: 'match', value: ' <div>an</div>', start: 13, end: 27},
//   {name: 'right', value: '</div>', start: 27, end: 33},
//   {name: 'between', value: ' example', start: 33, end: 41}
// ]

// Using escape characters
const str3 = '...{1}.\\{{function(x,y){return {y:x}}}';
XRegExp.matchRecursive(str3, '{', '}', 'g', {
  valueNames: ['literal', null, 'value', null],
  escapeChar: '\\'
});
// Result: [
//   {name: 'literal', value: '...', start: 0, end: 3},
//   {name: 'value', value: '1', start: 4, end: 5},
//   {name: 'literal', value: '.\\{', start: 6, end: 9},
//   {name: 'value', value: 'function(x,y){return {y:x}}', start: 10, end: 37}
// ]
```
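
To make "outermost balanced delimiters" concrete, the following is a minimal plain-JavaScript sketch of the idea using a depth counter. The helper `outermostBalanced` is hypothetical and is not how XRegExp implements `matchRecursive`; it assumes single-character literal delimiters and ignores flags, escape characters, and extended mode.

```javascript
// Hypothetical helper for illustration only; not XRegExp's implementation.
// Assumes single-character literal delimiters and throws on unbalanced input.
function outermostBalanced(str, left, right) {
  const results = [];
  let depth = 0;   // current nesting level
  let start = -1;  // index just after the current outermost left delimiter
  for (let i = 0; i < str.length; i++) {
    const ch = str[i];
    if (ch === left) {
      if (depth === 0) start = i + 1;
      depth++;
    } else if (ch === right) {
      if (depth === 0) throw new Error(`Unbalanced right delimiter at position ${i}`);
      depth--;
      // Returning to depth 0 closes an outermost pair
      if (depth === 0) results.push(str.slice(start, i));
    }
  }
  if (depth !== 0) throw new Error('Unbalanced left delimiter');
  return results;
}

console.log(outermostBalanced('(t((e))s)t()(ing)', '(', ')'));
// → ['t((e))s', '', 'ing']
```

Nested pairs only change the depth; content is emitted once, when the depth returns to zero, which is why inner delimiters appear verbatim inside the result.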

### Chain Matching

Retrieves matches by running a chain of regexes, where each regex searches within the matches of the previous one.

```javascript { .api }
/**
 * Retrieves matches from searching using a chain of regexes
 * @param str - String to search
 * @param chain - Array of regexes or objects with regex and backref properties
 * @returns Matches by the last regex in the chain, or empty array
 */
function matchChain(str: string, chain: (RegExp | ChainArrayElement)[]): string[];

interface ChainArrayElement {
  /** The regex to use */
  regex: RegExp;
  /** The named or numbered backreference to pass forward */
  backref: number | string;
}
```
94
95
**Usage Examples:**
96
97
```javascript
98
// Basic usage - matches numbers within <b> tags
99
XRegExp.matchChain('1 <b>2</b> 3 <b>4 a 56</b>', [
100
XRegExp('(?is)<b>.*?</b>'),
101
/\\d+/
102
]);
103
// Result: ['2', '4', '56']
104
105
// Passing forward and returning specific backreferences
106
const html = \`<a href="http://xregexp.com/api/">XRegExp</a>
107
<a href="http://www.google.com/">Google</a>\`;
108
109
XRegExp.matchChain(html, [
110
{regex: /<a href="([^"]+)">/i, backref: 1},
111
{regex: XRegExp('(?i)^https?://(?<domain>[^/?#]+)'), backref: 'domain'}
112
]);
113
// Result: ['xregexp.com', 'www.google.com']
114
115
// Multi-step extraction
116
const data = 'user:john@example.com, user:jane@test.org';
117
XRegExp.matchChain(data, [
118
/user:([^,]+)/g, // Extract user entries
119
/@([^\\s]+)/, // Extract domain from each entry
120
/([^.]+)\\./ // Extract domain name without TLD
121
]);
122
// Result: ['example', 'test']
123
```
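
The chaining idea itself can be sketched in plain JavaScript: every stage searches only within the values produced by the previous stage, and `backref` selects which group of each match is kept. The helper `chainMatch` below is hypothetical and is not XRegExp's implementation; it assumes native `RegExp` objects and native named groups.

```javascript
// Hypothetical helper for illustration only; not XRegExp's implementation.
// Each chain item is a RegExp or {regex, backref}; a bare RegExp keeps the full match.
function chainMatch(str, chain) {
  let values = [str];
  for (const item of chain) {
    const regex = item instanceof RegExp ? item : item.regex;
    const backref = item instanceof RegExp ? 0 : item.backref;
    // Force the g flag so that every match within each value is found
    const flags = regex.flags.includes('g') ? regex.flags : regex.flags + 'g';
    const globalRegex = new RegExp(regex.source, flags);
    const next = [];
    for (const value of values) {
      for (const match of value.matchAll(globalRegex)) {
        const picked = typeof backref === 'number' ? match[backref] : match.groups?.[backref];
        next.push(picked ?? '');
      }
    }
    values = next;
    if (values.length === 0) break; // nothing left to search in later stages
  }
  return values;
}

console.log(chainMatch('1 <b>2</b> 3 <b>4 a 56</b>', [/<b>.*?<\/b>/is, /\d+/]));
// → ['2', '4', '56']

console.log(chainMatch('user:john@example.com, user:jane@test.org', [
  {regex: /user:([^,]+)/, backref: 1},
  {regex: /@([^.]+)\./, backref: 1}
]));
// → ['example', 'test']
```

Unlike this sketch, which silently substitutes an empty string for a missing group, `matchChain` reports an undefined backreference as an error.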
124
125
## Recursive Matching Features
126
127
### Value Names Configuration
128
129
Control what parts of matches are returned:
130
131
```javascript
132
// valueNames array maps to:
133
// [0] - String segments outside matches (before, between, after)
134
// [1] - Matched left delimiters
135
// [2] - Content between outermost delimiters
136
// [3] - Matched right delimiters
137
138
// Get only content between delimiters
139
XRegExp.matchRecursive(str, '{', '}', 'g', {
140
valueNames: [null, null, 'content', null]
141
});
142
143
// Get everything except right delimiters
144
XRegExp.matchRecursive(str, '{', '}', 'g', {
145
valueNames: ['between', 'left', 'content', null]
146
});
147
```
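
Extended-mode results are plain objects, so ordinary array methods are enough to post-process them. The sketch below does not call XRegExp; it hard-codes the parts listed in the `<div>` example's documented result and checks two properties of that data: each `value` is the slice of the subject at `start`/`end`, and, because all four names were requested, the parts join back into the full subject string.

```javascript
// Sample data copied from the documented result of the <div> example
const subject = 'Here is <div> <div>an</div></div> example';
const parts = [
  {name: 'between', value: 'Here is ', start: 0, end: 8},
  {name: 'left', value: '<div>', start: 8, end: 13},
  {name: 'match', value: ' <div>an</div>', start: 13, end: 27},
  {name: 'right', value: '</div>', start: 27, end: 33},
  {name: 'between', value: ' example', start: 33, end: 41}
];

// Each part's value is the slice of the subject at its offsets
const consistent = parts.every(p => subject.slice(p.start, p.end) === p.value);

// With all four names requested, the parts tile the whole subject
const rebuilt = parts.map(p => p.value).join('');

// Select a single kind of part by name
const inner = parts.filter(p => p.name === 'match').map(p => p.value);

console.log(consistent, rebuilt === subject, inner);
// → true true [' <div>an</div>']
```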
148
149
### Escape Character Handling
150
151
Handle escaped delimiters within content:
152
153
```javascript
154
const code = 'var obj = {key: "value\\\\}", nested: {inner: true}};';
155
156
XRegExp.matchRecursive(code, '{', '}', 'g', {
157
escapeChar: '\\\\',
158
valueNames: [null, null, 'content', null]
159
});
160
// Properly handles escaped } in string literal
161
```

### Unbalanced Delimiter Handling

The `unbalanced` option controls what happens when a delimiter has no matching partner:

167
```javascript
168
const unbalanced = 'Here is <div> <div>content</div> missing close';
169
170
// Error on unbalanced (default)
171
try {
172
XRegExp.matchRecursive(unbalanced, '<div>', '</div>', 'gi');
173
} catch (e) {
174
console.log('Unbalanced delimiter error');
175
}
176
177
// Skip unbalanced delimiters
178
XRegExp.matchRecursive(unbalanced, '<div>', '</div>', 'gi', {
179
unbalanced: 'skip'
180
});
181
// Result: ['content'] (skips unbalanced opening div)
182
183
// Skip lazily (minimal advancement)
184
XRegExp.matchRecursive(unbalanced, '<div>', '</div>', 'gi', {
185
unbalanced: 'skip-lazy'
186
});
187
```
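
The skipping behavior can be approximated in plain JavaScript. The hypothetical helper `balancedSkippingUnbalanced` below is only a sketch of the general idea, not XRegExp's algorithm: it assumes single-character literal delimiters, drops any left delimiter that never closes, resumes scanning right after it, and ignores stray right delimiters.

```javascript
// Hypothetical helper for illustration only; not XRegExp's implementation.
// Assumes single-character literal delimiters.
function balancedSkippingUnbalanced(str, left, right) {
  const results = [];
  let pos = 0;
  while (pos < str.length) {
    const open = str.indexOf(left, pos);
    if (open === -1) break;
    // Scan forward from the candidate left delimiter, tracking nesting depth
    let depth = 0;
    let close = -1;
    for (let i = open; i < str.length; i++) {
      if (str[i] === left) {
        depth++;
      } else if (str[i] === right && --depth === 0) {
        close = i;
        break;
      }
    }
    if (close === -1) {
      pos = open + 1; // never closed: skip this delimiter and keep searching
      continue;
    }
    results.push(str.slice(open + 1, close));
    pos = close + 1;
  }
  return results;
}

console.log(balancedSkippingUnbalanced('(a (b) c', '(', ')'));
// → ['b']
```

The first `(` never closes, so it is treated as ordinary text and the search restarts after it, which is how the inner balanced pair is still found.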

### Sticky Mode Support

Add flag `y` for sticky matching, which requires each match to begin exactly where the previous one ended (or at the start of the string):

193
```javascript
194
const str = '<1><<<2>>><3>4<5>';
195
XRegExp.matchRecursive(str, '<', '>', 'gy');
196
// Result: ['1', '<<2>>', '3']
197
// Stops at first non-match due to sticky mode
198
```
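
The same "stop at the first gap" behavior can be seen with the native sticky flag on an ordinary regex. This sketch uses only built-in `RegExp` features; the helper name `consecutiveMatches` is hypothetical.

```javascript
// Collects group 1 of each match as long as matches are back to back.
// Expects a regex with the y (sticky) flag and one capture group.
function consecutiveMatches(str, stickyRegex) {
  const out = [];
  stickyRegex.lastIndex = 0;
  let m;
  // A sticky regex only matches exactly at lastIndex, so exec() returns null
  // at the first position where the pattern does not match
  while ((m = stickyRegex.exec(str)) !== null) {
    out.push(m[1]);
  }
  return out;
}

console.log(consecutiveMatches('<1><2>x<3>', /<(\d)>/y));
// → ['1', '2']
```

The `<3>` token is never reached because the `x` at index 6 breaks the run of adjacent matches.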
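
## Chain Matching Features

### Backreference Forwarding

Set `backref` on a chain element to pass a specific capture group of that element's own regex to the next stage (or to return it, for the last element). Elements without `backref` pass their full match:
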
```javascript
// Forward numbered backreferences
const chain1 = [
  {regex: /(\w+):(\w+)/, backref: 2}, // Capture key:value pairs, forward only the value
  /\w+/                               // Searched within each forwarded value
];

// Forward named backreferences
const chain2 = [
  {regex: XRegExp('(?<protocol>https?)://(?<domain>[^/]+)'), backref: 'domain'},
  {regex: /^([^.]+)/, backref: 1} // Extract the first label of the domain (e.g. 'www')
];
```

### Multi-level Processing

Chain as many stages as needed; each stage searches only within the values produced by the previous one:

```javascript
const logData = `
[INFO] 2021-01-15 Server started on port 8080
[ERROR] 2021-01-15 Database connection failed
[DEBUG] 2021-01-15 Cache initialized
`;

// Extract the message text of each error line
// backref: 1 forwards the captured group instead of the full match
XRegExp.matchChain(logData, [
  {regex: /\[ERROR\]([^\n]+)/, backref: 1},       // Find error lines
  {regex: /\d{4}-\d{2}-\d{2}(.+)/, backref: 1},   // Extract message after date
  {regex: /\s*(.+)/, backref: 1}                  // Trim leading whitespace
]);
// Result: ['Database connection failed']
```

### Error Handling

`matchChain` throws a `ReferenceError` when a chain element's `backref` refers to a group that its own regex does not define:

```javascript
try {
  XRegExp.matchChain('test', [
    /(\w+)/,
    {regex: /\w/, backref: 2} // Invalid - no group 2
  ]);
} catch (e) {
  console.log(e.message); // "Backreference to undefined group: 2"
}

try {
  XRegExp.matchChain('test', [
    XRegExp('(?<word>\\w+)'),
    {regex: /\w/, backref: 'missing'} // Invalid - no 'missing' group
  ]);
} catch (e) {
  console.log(e.message); // "Backreference to undefined group: missing"
}
```

## Advanced Use Cases

### Parsing Nested Structures

Extract content from nested markup or code:

```javascript
// Parse nested function calls
const code = 'outer(inner(a, b), middle(c), last)';
const args = XRegExp.matchRecursive(code, '\\(', '\\)', 'g');
// Result: ['inner(a, b), middle(c), last']

// Parse JSON-like structures
const json = '{name: "John", data: {age: 30, city: "NYC"}}';
const objects = XRegExp.matchRecursive(json, '{', '}', 'g');
// Result: ['name: "John", data: {age: 30, city: "NYC"}']
```

### Multi-stage Text Processing

Narrow a search step by step with a multi-stage chain:

```javascript
// Extract and process HTML attributes
const html = '<div class="highlight active" data-id="123">Content</div>';

// First extract tag attributes, then process each
XRegExp.matchChain(html, [
  /<[^>]+>/,              // Get each full tag
  /\s([^>]+)/,            // Extract attributes portion
  /([\w-]+)="([^"]*)"/g   // Extract name="value" pairs
]);
// Result: ['class="highlight active"', 'data-id="123"']
```

### Template Processing

Parse template syntax with balanced delimiters:

```javascript
const template = 'Hello {{user.name}}, you have {{#if messages}}{{messages.length}}{{/if}} messages';

// Extract template expressions
const expressions = XRegExp.matchRecursive(template, '{{', '}}', 'g');
// Result: ['user.name', '#if messages', 'messages.length', '/if']

// Process expressions further
const conditionals = expressions.filter(expr => expr.startsWith('#if'));
```
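
As one possible next step, the extracted expressions can be classified before rendering. The sketch below hard-codes the expression list from the result above rather than calling XRegExp, and the helper `classifyExpression` and its output shape are hypothetical choices for illustration, not part of any template library.

```javascript
// Expression list copied from the documented result above
const extracted = ['user.name', '#if messages', 'messages.length', '/if'];

// Hypothetical classifier: '#' opens a block, '/' closes one, anything else is a variable
function classifyExpression(expr) {
  if (expr.startsWith('#')) {
    const [helper, ...args] = expr.slice(1).split(/\s+/);
    return {type: 'blockOpen', helper, args};
  }
  if (expr.startsWith('/')) {
    return {type: 'blockClose', helper: expr.slice(1)};
  }
  return {type: 'variable', path: expr.split('.')};
}

const classified = extracted.map(classifyExpression);
console.log(classified.map(c => c.type));
// → ['variable', 'blockOpen', 'variable', 'blockClose']
```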