Regular expression parser for ECMAScript that generates Abstract Syntax Trees from regex patterns.
npx @tessl/cli install tessl/npm-regexpp@3.2.0
# RegexPP

RegexPP is a comprehensive regular expression parser and validator for ECMAScript that generates Abstract Syntax Trees (AST) from regex patterns. It provides three main components: RegExpParser for parsing regex literals and patterns into ASTs, RegExpValidator for validating regex syntax against ECMAScript standards, and RegExpVisitor for traversing and manipulating regex ASTs. The library supports all modern ECMAScript regex features including Unicode handling, named capture groups, lookbehind assertions, and other ES2015-ES2022 regex enhancements.

## Package Information

- **Package Name**: regexpp
- **Package Type**: npm
- **Language**: TypeScript
- **Installation**: `npm install regexpp`

## Core Imports

```typescript
// Main exports
import {
  AST,
  RegExpParser,
  RegExpValidator,
  parseRegExpLiteral,
  validateRegExpLiteral,
  visitRegExpAST
} from "regexpp";

// For direct RegExpVisitor usage (not available from main module)
import { RegExpVisitor } from "regexpp/visitor";
```

For CommonJS:

```javascript
// Main exports
const {
  AST,
  RegExpParser,
  RegExpValidator,
  parseRegExpLiteral,
  validateRegExpLiteral,
  visitRegExpAST
} = require("regexpp");

// For direct RegExpVisitor usage (not available from main module)
const { RegExpVisitor } = require("regexpp/visitor");
```

## Basic Usage

```typescript
import { parseRegExpLiteral, validateRegExpLiteral, visitRegExpAST } from "regexpp";

// Parse a regex literal to AST
const ast = parseRegExpLiteral("/[a-z]+/gi");
console.log(ast.type); // "RegExpLiteral"
console.log(ast.flags.global); // true
console.log(ast.flags.ignoreCase); // true

// Validate regex syntax
try {
  validateRegExpLiteral("/[a-z]+/gi");
  console.log("Valid regex");
} catch (error) {
  console.log("Invalid regex:", error.message);
}

// Visit AST nodes
visitRegExpAST(ast, {
  onCharacterClassEnter(node) {
    console.log("Found character class:", node.raw);
  },
  onQuantifierEnter(node) {
    console.log("Found quantifier:", node.raw);
  }
});
```

## Architecture

RegexPP is built around several key components:

- **Core API**: Three convenience functions (`parseRegExpLiteral`, `validateRegExpLiteral`, `visitRegExpAST`) for common use cases
- **Parser System**: `RegExpParser` class providing full parsing functionality with AST generation
- **Validation System**: `RegExpValidator` class for syntax validation with detailed callbacks
- **Visitor Pattern**: `RegExpVisitor` class for AST traversal and manipulation
- **Type System**: Comprehensive AST node interfaces covering all ECMAScript regex features
- **Unicode Support**: Full Unicode property support with ES2015-ES2022 compliance

## Capabilities

### Core Parsing Functions

Three main convenience functions that provide simple access to the most common regex processing operations.

```typescript { .api }
/**
 * Parse a given regular expression literal then make AST object
 * @param source - The source code to parse (string or RegExp)
 * @param options - The parsing options
 * @returns The AST of the regular expression
 */
function parseRegExpLiteral(
  source: string | RegExp,
  options?: RegExpParser.Options,
): AST.RegExpLiteral;

/**
 * Validate a given regular expression literal
 * @param source - The source code to validate
 * @param options - The validation options
 */
function validateRegExpLiteral(
  source: string,
  options?: RegExpValidator.Options,
): void;

/**
 * Visit each node of a given AST
 * @param node - The AST to visit
 * @param handlers - The visitor callbacks
 */
function visitRegExpAST(
  node: AST.Node,
  handlers: RegExpVisitor.Handlers,
): void;
```
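
The literal-oriented functions above take the whole literal, delimiters and flags included (for example `/[a-z]+/gi`), rather than just the pattern body. As a rough illustration of that input shape, the sketch below splits such a string into its two parts. `splitRegExpLiteral` and `LiteralParts` are hypothetical names, not regexpp exports, and the helper assumes the closing delimiter is the last `/` in the string, which holds because flags consist of letters only.

```typescript
// Hypothetical helper (not a regexpp export): split "/pattern/flags" into parts.
interface LiteralParts {
  pattern: string;
  flags: string;
}

function splitRegExpLiteral(source: string): LiteralParts | null {
  // Flags are plain letters, so the last "/" is taken as the closing delimiter.
  const closing = source.lastIndexOf("/");
  if (source.charAt(0) !== "/" || closing <= 0) {
    return null; // not in /pattern/flags form
  }
  return {
    pattern: source.slice(1, closing),
    flags: source.slice(closing + 1),
  };
}

const literalParts = splitRegExpLiteral("/[a-z]+/gi");
if (literalParts !== null) {
  console.log(literalParts.pattern); // [a-z]+
  console.log(literalParts.flags); // gi
}
```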

[Core Functions](./core-functions.md)

### Regular Expression Parsing

Advanced parsing functionality for converting regex strings into detailed Abstract Syntax Trees with full ECMAScript compliance.

```typescript { .api }
class RegExpParser {
  constructor(options?: RegExpParser.Options);

  parseLiteral(source: string, start?: number, end?: number): AST.RegExpLiteral;
  parsePattern(source: string, start?: number, end?: number, uFlag?: boolean): AST.Pattern;
  parseFlags(source: string, start?: number, end?: number): AST.Flags;
}

interface RegExpParser.Options {
  /** Disable Annex B syntax. Default is false */
  strict?: boolean;
  /** ECMAScript version. Default is 2022 */
  ecmaVersion?: EcmaVersion;
}
```
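
The `ecmaVersion` option selects which edition's syntax is accepted. To make that gating concrete, the sketch below maps each flag to the edition that introduced it (`u` and `y` in ES2015, `s` in ES2018, `d` in ES2022) and reports the oldest edition that understands a given flag string. `FLAG_INTRODUCED_IN` and `minimumEcmaVersionForFlags` are hypothetical names, not part of regexpp, and this is not how the parser itself performs its checks. The flags `g`, `i`, and `m` predate ES5, so they are assigned `5`, the lowest value in the `EcmaVersion` union.

```typescript
// Mirrors the EcmaVersion union documented under "Types".
type EcmaVersion = 5 | 2015 | 2016 | 2017 | 2018 | 2019 | 2020 | 2021 | 2022;

// Hypothetical lookup table: the edition in which each flag first appeared.
const FLAG_INTRODUCED_IN: { [flag: string]: EcmaVersion } = {
  g: 5,
  i: 5,
  m: 5,
  u: 2015,
  y: 2015,
  s: 2018,
  d: 2022,
};

// Returns the oldest edition that understands every flag, or null for an unknown flag.
function minimumEcmaVersionForFlags(flags: string): EcmaVersion | null {
  let required: EcmaVersion = 5;
  for (let index = 0; index < flags.length; index++) {
    const introduced = FLAG_INTRODUCED_IN[flags.charAt(index)];
    if (introduced === undefined) {
      return null;
    }
    if (introduced > required) {
      required = introduced;
    }
  }
  return required;
}

console.log(minimumEcmaVersionForFlags("gi")); // 5
console.log(minimumEcmaVersionForFlags("gu")); // 2015
console.log(minimumEcmaVersionForFlags("dgs")); // 2022
```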

[Parsing](./parsing.md)

### Regular Expression Validation

Syntax validation with optional detailed callbacks for each regex component during validation.

```typescript { .api }
class RegExpValidator {
  constructor(options?: RegExpValidator.Options);

  validateLiteral(source: string, start?: number, end?: number): void;
  validatePattern(source: string, start?: number, end?: number, uFlag?: boolean): void;
  validateFlags(source: string, start?: number, end?: number): void;
}

interface RegExpValidator.Options {
  /** Disable Annex B syntax. Default is false */
  strict?: boolean;
  /** ECMAScript version. Default is 2022 */
  ecmaVersion?: EcmaVersion;
  // Plus many optional callback functions for validation events
}
```
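
The validation methods return `void` and, as the Basic Usage example shows, signal a problem by throwing. One common way to consume a throwing validator is to wrap it so callers receive a boolean plus the error message. The sketch below uses the built-in `RegExp` constructor as a stand-in validator so that it runs without dependencies; `checkSyntax` and `SyntaxCheck` are hypothetical names, and passing a callback that calls `validateRegExpLiteral` instead is an assumption about how you might wire it up.

```typescript
// Result of running a throwing validator.
interface SyntaxCheck {
  valid: boolean;
  message: string | null;
}

// Hypothetical helper: run any validator that throws on bad input.
function checkSyntax(validate: () => void): SyntaxCheck {
  try {
    validate();
    return { valid: true, message: null };
  } catch (error) {
    const message = error instanceof Error ? error.message : String(error);
    return { valid: false, message: message };
  }
}

// Stand-in validator: the native constructor throws a SyntaxError on invalid patterns.
const accepted = checkSyntax(() => { new RegExp("[a-z]+"); });
const rejected = checkSyntax(() => { new RegExp("("); });

console.log(accepted.valid); // true
console.log(rejected.valid); // false
```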

[Validation](./validation.md)

### AST Traversal and Manipulation

Visitor pattern implementation for traversing and manipulating regex Abstract Syntax Trees.

```typescript { .api }
class RegExpVisitor {
  constructor(handlers: RegExpVisitor.Handlers);

  visit(node: AST.Node): void;
}

interface RegExpVisitor.Handlers {
  // Optional callback functions for entering/leaving each AST node type
  onRegExpLiteralEnter?(node: AST.RegExpLiteral): void;
  onRegExpLiteralLeave?(node: AST.RegExpLiteral): void;
  onPatternEnter?(node: AST.Pattern): void;
  onPatternLeave?(node: AST.Pattern): void;
  // ... many more callback options
}
```
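
The handler names follow an `on<Type>Enter` / `on<Type>Leave` convention. The sketch below is not regexpp's implementation; it only illustrates the depth-first enter/leave ordering on a simplified, hypothetical node shape (`SketchNode` with a generic `children` array) so that it runs without the library. The tree is hand-built and omits the `Alternative` level for brevity.

```typescript
// Simplified stand-in for an AST node; real regexpp nodes use type-specific child fields.
interface SketchNode {
  type: string;
  raw: string;
  children: SketchNode[];
}

// Handlers keyed by the on<Type>Enter / on<Type>Leave naming convention.
type SketchHandlers = { [callbackName: string]: ((node: SketchNode) => void) | undefined };

function visitSketch(node: SketchNode, handlers: SketchHandlers): void {
  const enter = handlers["on" + node.type + "Enter"];
  if (enter) enter(node);
  // Children are visited between the enter and leave callbacks of their parent.
  for (const child of node.children) {
    visitSketch(child, handlers);
  }
  const leave = handlers["on" + node.type + "Leave"];
  if (leave) leave(node);
}

// Hand-built tree for the pattern "a+".
const sketchTree: SketchNode = {
  type: "Pattern",
  raw: "a+",
  children: [
    { type: "Quantifier", raw: "a+", children: [{ type: "Character", raw: "a", children: [] }] },
  ],
};

const visitLog: string[] = [];
visitSketch(sketchTree, {
  onPatternEnter: (node) => { visitLog.push("enter " + node.type); },
  onQuantifierEnter: (node) => { visitLog.push("enter " + node.type); },
  onQuantifierLeave: (node) => { visitLog.push("leave " + node.type); },
  onPatternLeave: (node) => { visitLog.push("leave " + node.type); },
});

console.log(visitLog.join(", "));
// enter Pattern, enter Quantifier, leave Quantifier, leave Pattern
```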

[Visitor Pattern](./visitor.md)

### AST Node Types and Structures

Comprehensive type system covering all ECMAScript regular expression syntax elements.

```typescript { .api }
// Core node types
type AST.Node = AST.BranchNode | AST.LeafNode;
type AST.BranchNode = AST.RegExpLiteral | AST.Pattern | AST.Alternative | /* ... */;
type AST.LeafNode = AST.BoundaryAssertion | AST.CharacterSet | /* ... */;

// Key interfaces
interface AST.RegExpLiteral extends AST.NodeBase {
  type: "RegExpLiteral";
  pattern: AST.Pattern;
  flags: AST.Flags;
}

interface AST.Pattern extends AST.NodeBase {
  type: "Pattern";
  alternatives: AST.Alternative[];
}
```
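
Because every node carries a literal `type` tag, a union of node interfaces can be narrowed with a `switch` on that tag. The sketch below redeclares two trimmed-down interfaces locally (prefixed `Sketch`, so they are stand-ins rather than regexpp's own types, with child fields simplified to strings) to show the narrowing; `describeNode` is a hypothetical helper.

```typescript
// Trimmed-down local stand-ins for the documented interfaces.
interface SketchPattern {
  type: "Pattern";
  raw: string;
  alternatives: string[]; // simplified: the real field holds AST.Alternative nodes
}

interface SketchLiteral {
  type: "RegExpLiteral";
  raw: string;
  pattern: SketchPattern;
  flags: string; // simplified: the real field is an AST.Flags node
}

type SketchAstNode = SketchLiteral | SketchPattern;

function describeNode(node: SketchAstNode): string {
  switch (node.type) {
    case "RegExpLiteral":
      // Narrowed to SketchLiteral: pattern and flags are available here.
      return "literal with flags '" + node.flags + "' wrapping " + describeNode(node.pattern);
    case "Pattern":
      // Narrowed to SketchPattern: alternatives is available here.
      return "pattern with " + node.alternatives.length + " alternative(s)";
  }
}

const sketchPattern: SketchPattern = { type: "Pattern", raw: "a|b", alternatives: ["a", "b"] };
const sketchLiteral: SketchLiteral = {
  type: "RegExpLiteral",
  raw: "/a|b/g",
  pattern: sketchPattern,
  flags: "g",
};

console.log(describeNode(sketchLiteral));
// literal with flags 'g' wrapping pattern with 2 alternative(s)
```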

[AST Types](./ast-types.md)

## Types

```typescript { .api }
// ECMAScript version support
type EcmaVersion = 5 | 2015 | 2016 | 2017 | 2018 | 2019 | 2020 | 2021 | 2022;

// Base interface for all AST nodes
interface AST.NodeBase {
  type: string;
  parent: AST.Node | null;
  start: number;
  end: number;
  raw: string;
}
```
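
The `start`, `end`, and `raw` fields describe where a node sits in the source string. The sketch below assumes that `start` is inclusive, `end` is exclusive, and `raw` equals `source.slice(start, end)`; that relationship is an assumption worth verifying against the library, and `SketchSpan` and `isConsistentSpan` are hypothetical names used only for illustration.

```typescript
// Local stand-in carrying only the position fields of AST.NodeBase.
interface SketchSpan {
  start: number;
  end: number;
  raw: string;
}

// Checks the assumed invariant: raw is the slice of the source between start and end.
function isConsistentSpan(source: string, span: SketchSpan): boolean {
  return source.slice(span.start, span.end) === span.raw;
}

const spanSource = "/[a-z]+/gi";
// Hand-written span for the character class "[a-z]" inside the literal above.
const classSpan: SketchSpan = { start: 1, end: 6, raw: "[a-z]" };

console.log(isConsistentSpan(spanSource, classSpan)); // true
```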