# Ret

Ret is a TypeScript library that tokenizes regular expression strings into an AST-like token tree and can reconstruct a regex string from those tokens. It is aimed at tools that analyze, transform, or validate regular expressions.

## Package Information

- **Package Name**: ret
- **Package Type**: npm
- **Language**: TypeScript
- **Installation**: `npm install ret`

## Core Imports

```typescript
import ret, { tokenizer, reconstruct, types } from "ret";
// Character sets are not directly exported from main module
// Import them separately if needed:
// import { words, ints, whitespace } from "ret/dist/sets";
```

For CommonJS:

```javascript
const ret = require("ret");
const { types, reconstruct } = ret;
// ret is the tokenizer function
// ret.types and ret.reconstruct are also available

// For character set utilities:
const sets = require("ret/dist/sets");
const { words, ints, whitespace, notWords, notInts, notWhitespace, anyChar } = sets;
```

## Basic Usage

```typescript
import ret, { reconstruct, types } from "ret";
import { words, ints } from "ret/dist/sets";

// Tokenize a regular expression
const tokens = ret(/foo|bar/.source);
// or, with `tokenizer` imported from "ret": const tokens = tokenizer(/foo|bar/.source);

// Tokens structure:
// {
//   "type": types.ROOT,
//   "options": [
//     [ { "type": types.CHAR, "value": 102 },   // 'f'
//       { "type": types.CHAR, "value": 111 },   // 'o'
//       { "type": types.CHAR, "value": 111 } ], // 'o'
//     [ { "type": types.CHAR, "value": 98 },    // 'b'
//       { "type": types.CHAR, "value": 97 },    // 'a'
//       { "type": types.CHAR, "value": 114 } ]  // 'r'
//   ]
// }

// Reconstruct regex from tokens
const regexString = reconstruct(tokens); // "foo|bar"

// Working with character sets
const wordToken = words(); // Equivalent to \w
const digitToken = ints(); // Equivalent to \d

reconstruct(wordToken); // "\\w"
reconstruct(digitToken); // "\\d"
```

## Architecture

Ret has five main components:

- **Tokenizer**: Core parser that converts regex strings into structured token trees
- **Type System**: TypeScript types for every token variant (characters, groups, sets, repetitions, etc.)
- **Reconstruction**: Converts token structures back to valid regex strings
- **Character Sets**: Predefined character class utilities (digits, words, whitespace, etc.)
- **Utilities**: Helper functions for string processing and character class parsing

## Capabilities

### Regex Tokenization

Parses a regular expression string into a structured token tree. Handles groups, character classes, quantifiers, and lookarounds.

```typescript { .api }
function tokenizer(regexpStr: string): Root;
```
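As a minimal sketch of what consuming the tokenizer's output can look like, the snippet below reads the alternatives out of a `Root` token. It does not import the library: `CharToken`, `RootToken`, `toChars`, and `literalAlternatives` are hypothetical local names, the numeric `types` values are assumed from the declaration order under Core Types, and the tree is a hand-built copy of the `foo|bar` structure shown in Basic Usage.

```typescript
// Local stand-ins, not imported from the library; numeric values assume the
// declaration order of the `types` enum in Core Types.
const types = { ROOT: 0, CHAR: 7 } as const;

interface CharToken { type: typeof types.CHAR; value: number }
interface RootToken {
  type: typeof types.ROOT;
  stack?: CharToken[];
  options?: CharToken[][];
}

// Helper (hypothetical): turn a string into CHAR tokens holding char codes.
const toChars = (s: string): CharToken[] =>
  Array.from(s, (c) => ({ type: types.CHAR, value: c.charCodeAt(0) }));

// Hand-built copy of the tree shown in Basic Usage for "foo|bar".
const fooOrBar: RootToken = {
  type: types.ROOT,
  options: [toChars("foo"), toChars("bar")],
};

// Returns each alternative as a plain string. A tree without `options`
// (no `|` in the pattern) yields a single entry built from `stack`.
function literalAlternatives(root: RootToken): string[] {
  const branches = root.options ?? [root.stack ?? []];
  return branches.map((branch) =>
    branch.map((token) => String.fromCharCode(token.value)).join(""),
  );
}

console.log(literalAlternatives(fooOrBar)); // [ 'foo', 'bar' ]
```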

[Tokenization](./tokenization.md)

### Token Reconstruction

Converts a token tree back into a valid regular expression string, so a pattern can be tokenized, modified, and serialized again.

```typescript { .api }
function reconstruct(token: Tokens): string;
```
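The tokenize, modify, serialize workflow can be sketched without the library. In the snippet below, `toySerialize` is a hypothetical and deliberately limited stand-in for `reconstruct`: it handles only `ROOT` and `CHAR` tokens and performs no escaping, so it illustrates the idea rather than the library's behavior. `upperCaseLiterals`, `toChars`, and the local token types are likewise assumptions made for this example.

```typescript
// Local stand-ins, not imported from the library; numeric values assume the
// declaration order of the `types` enum in Core Types.
const types = { ROOT: 0, CHAR: 7 } as const;

interface CharToken { type: typeof types.CHAR; value: number }
interface RootToken {
  type: typeof types.ROOT;
  stack?: CharToken[];
  options?: CharToken[][];
}

const toChars = (s: string): CharToken[] =>
  Array.from(s, (c) => ({ type: types.CHAR, value: c.charCodeAt(0) }));

// Transformation step: build a new tree with every literal upper-cased
// (intended for ASCII letters only).
function upperCaseLiterals(root: RootToken): RootToken {
  const upper = (t: CharToken): CharToken => ({
    type: types.CHAR,
    value: String.fromCharCode(t.value).toUpperCase().charCodeAt(0),
  });
  const result: RootToken = { type: types.ROOT };
  if (root.stack) result.stack = root.stack.map(upper);
  if (root.options) result.options = root.options.map((branch) => branch.map(upper));
  return result;
}

// Toy serializer: joins literals and separates alternatives with "|".
// No escaping, and no support for groups, sets, or repetitions.
function toySerialize(root: RootToken): string {
  const branches = root.options ?? [root.stack ?? []];
  return branches
    .map((branch) => branch.map((t) => String.fromCharCode(t.value)).join(""))
    .join("|");
}

const original: RootToken = {
  type: types.ROOT,
  options: [toChars("foo"), toChars("bar")],
};
console.log(toySerialize(upperCaseLiterals(original))); // FOO|BAR
```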

[Reconstruction](./reconstruction.md)

### Character Set Utilities

Functions that build the `Set` token for common character classes, such as `\w` and `\d`.

```typescript { .api }
function words(): Set;
function notWords(): Set;
function ints(): Set;
function notInts(): Set;
function whitespace(): Set;
function notWhitespace(): Set;
function anyChar(): Set;
```
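To show how a `Set` token can be interpreted, here is a small sketch that tests whether a character code belongs to a set, following the `set` and `not` fields from Core Types. The library is not imported: `SetToken`, `RangeToken`, `CharToken`, and `setMatches` are hypothetical local names, and `digitLike` is a hand-built token whose shape is assumed to resemble a digit class rather than taken from the library's output.

```typescript
// Local stand-ins, not imported from the library; numeric values assume the
// declaration order of the `types` enum in Core Types.
const types = { SET: 3, RANGE: 4, CHAR: 7 } as const;

interface CharToken { type: typeof types.CHAR; value: number }
interface RangeToken { type: typeof types.RANGE; from: number; to: number }
interface SetToken {
  type: typeof types.SET;
  set: (RangeToken | CharToken | SetToken)[];
  not: boolean;
}

// Hand-built set covering the char codes of "0" through "9" (assumed shape).
const digitLike: SetToken = {
  type: types.SET,
  set: [{ type: types.RANGE, from: 48, to: 57 }],
  not: false,
};
const nonDigitLike: SetToken = { ...digitLike, not: true };

// True when `code` is matched by the set, honoring negation and nested sets.
function setMatches(token: SetToken, code: number): boolean {
  const hit = token.set.some((member) => {
    if (member.type === types.CHAR) return member.value === code;
    if (member.type === types.RANGE) return member.from <= code && code <= member.to;
    return setMatches(member, code); // nested set
  });
  return token.not ? !hit : hit;
}

console.log(setMatches(digitLike, "7".charCodeAt(0)));    // true
console.log(setMatches(digitLike, "a".charCodeAt(0)));    // false
console.log(setMatches(nonDigitLike, "a".charCodeAt(0))); // true
```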

[Character Sets](./character-sets.md)

## Core Types

```typescript { .api }
enum types {
  ROOT,
  GROUP,
  POSITION,
  SET,
  RANGE,
  REPETITION,
  REFERENCE,
  CHAR
}

interface Root {
  type: types.ROOT;
  stack?: Token[];
  options?: Token[][];
  flags?: string[];
}

interface Group {
  type: types.GROUP;
  stack?: Token[];
  options?: Token[][];
  remember: boolean;
  followedBy?: boolean;
  notFollowedBy?: boolean;
  lookBehind?: boolean;
  name?: string;
}

interface Set {
  type: types.SET;
  set: SetTokens;
  not: boolean;
}

interface Repetition {
  type: types.REPETITION;
  min: number;
  max: number;
  value: Token;
}

interface Position {
  type: types.POSITION;
  value: '$' | '^' | 'b' | 'B';
}

interface Reference {
  type: types.REFERENCE;
  value: number;
}

interface Char {
  type: types.CHAR;
  value: number;
}

interface Range {
  type: types.RANGE;
  from: number;
  to: number;
}

type Token = Group | Position | Set | Range | Repetition | Reference | Char;
type Tokens = Root | Token;
type SetTokens = (Range | Char | Set)[];
```
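Because every token carries a `type` discriminant, a tree can be walked with ordinary recursion. The sketch below counts capturing groups (those with `remember: true`) in a hand-built tree. It covers only a subset of the token variants, does not import the library, and uses hypothetical local names (`InnerToken`, `AnyToken`, `countCapturingGroups`, `ch`); the tree shape for `(a)(?:b)+` is an assumption based on the interfaces above.

```typescript
// Local stand-ins, not imported from the library; numeric values assume the
// declaration order of the `types` enum above.
const types = { ROOT: 0, GROUP: 1, REPETITION: 5, CHAR: 7 } as const;

interface CharToken { type: typeof types.CHAR; value: number }
interface RepetitionToken {
  type: typeof types.REPETITION;
  min: number;
  max: number;
  value: InnerToken;
}
interface GroupToken {
  type: typeof types.GROUP;
  stack?: InnerToken[];
  options?: InnerToken[][];
  remember: boolean;
}
interface RootToken {
  type: typeof types.ROOT;
  stack?: InnerToken[];
  options?: InnerToken[][];
}
type InnerToken = CharToken | RepetitionToken | GroupToken;
type AnyToken = RootToken | InnerToken;

// Recursively counts groups whose `remember` flag is set.
function countCapturingGroups(token: AnyToken): number {
  if (token.type === types.CHAR) return 0;
  if (token.type === types.REPETITION) return countCapturingGroups(token.value);
  // ROOT or GROUP: children live in `options` (alternation) or in `stack`.
  const children = token.options
    ? ([] as InnerToken[]).concat(...token.options)
    : (token.stack ?? []);
  const own = token.type === types.GROUP && token.remember ? 1 : 0;
  return children.reduce((sum, child) => sum + countCapturingGroups(child), own);
}

const ch = (c: string): CharToken => ({ type: types.CHAR, value: c.charCodeAt(0) });

// Hand-built tree with the shape assumed for "(a)(?:b)+".
const tree: RootToken = {
  type: types.ROOT,
  stack: [
    { type: types.GROUP, remember: true, stack: [ch("a")] },
    {
      type: types.REPETITION,
      min: 1,
      max: Infinity,
      value: { type: types.GROUP, remember: false, stack: [ch("b")] },
    },
  ],
};

console.log(countCapturingGroups(tree)); // 1
```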