# Parsing Expressions

Low-level parsing expression classes that form the building blocks of Ohm grammars. These classes represent the fundamental parsing operations defined in parsing expression grammars (PEGs).

## Imports

```javascript
import { pexprs, grammar } from "ohm-js";
```

For TypeScript:

```typescript
import {
  pexprs,
  grammar,
  Grammar,
  PExpr,
  Terminal,
  Range,
  Param,
  Alt,
  Extend,
  Splice,
  Seq,
  Iter,
  Star,
  Plus,
  Opt,
  Not,
  Lookahead,
  Lex,
  Apply,
  UnicodeChar,
  CaseInsensitiveTerminal
} from "ohm-js";
```

## Capabilities

### Parsing Expression Namespace

The `pexprs` namespace contains all parsing expression constructors and singleton instances.

```typescript { .api }
/**
 * Constructors for parsing expressions (aka pexprs). (Except for `any`
 * and `end`, which are not constructors but singleton instances.)
 */
const pexprs: {
  PExpr: typeof PExpr;
  Terminal: typeof Terminal;
  Range: typeof Range;
  Param: typeof Param;
  Alt: typeof Alt;
  Extend: typeof Extend;
  Splice: typeof Splice;
  Seq: typeof Seq;
  Iter: typeof Iter;
  Star: typeof Star;
  Plus: typeof Plus;
  Opt: typeof Opt;
  Not: typeof Not;
  Lookahead: typeof Lookahead;
  Lex: typeof Lex;
  Apply: typeof Apply;
  UnicodeChar: typeof UnicodeChar;
  CaseInsensitiveTerminal: typeof CaseInsensitiveTerminal;
  any: PExpr;
  end: PExpr;
};
```

### Base Parsing Expression

Abstract base class for all parsing expressions.

```typescript { .api }
/**
 * Abstract base class for parsing expressions
 */
class PExpr {
  /** Returns the arity (number of children) this expression expects */
  getArity(): number;
  /** Returns true if this expression can match zero characters */
  isNullable(): boolean;
  /** Returns a string representation of this expression */
  toString(): string;
  /** Returns a human-readable display string */
  toDisplayString(): string;
}
```

### Terminal Expressions

Expressions that match literal text and character ranges.

```typescript { .api }
/**
 * Matches literal terminal strings
 */
class Terminal extends PExpr {}

/**
 * Matches character ranges (e.g., "a".."z")
 */
class Range extends PExpr {}

/**
 * Case-insensitive terminal matching
 */
class CaseInsensitiveTerminal extends PExpr {}

/**
 * Matches a character in a given Unicode category
 */
class UnicodeChar extends PExpr {}
```

**Usage in Grammar:**

```javascript
// These expressions are typically created automatically by grammar compilation
// Terminal: "hello" in grammar source
// Range: "a".."z" in grammar source
// CaseInsensitiveTerminal: caseInsensitive<"Hello"> in grammar source
// UnicodeChar: Unicode category match, used by built-in rules such as lower and upper
```

### Composite Expressions

Expressions that combine other expressions.

```typescript { .api }
/**
 * Alternation (choice) - matches first successful alternative
 */
class Alt extends PExpr {
  /** Array of alternative expressions to try */
  terms: PExpr[];
}

/**
 * Sequence - matches all expressions in order
 */
class Seq extends PExpr {
  /** Array of expressions that must all match */
  factors: PExpr[];
}

/**
 * Grammar extension (used internally for rule extension)
 */
class Extend extends Alt {}

/**
 * Grammar splicing (used internally for rule override)
 */
class Splice extends Alt {}
```

### Iteration Expressions

Expressions for repetition patterns.

```typescript { .api }
/**
 * Abstract base for iteration expressions
 */
class Iter extends PExpr {}

/**
 * Zero or more repetitions (*)
 */
class Star extends Iter {}

/**
 * One or more repetitions (+)
 */
class Plus extends Iter {}

/**
 * Optional (zero or one) (?)
 */
class Opt extends Iter {}
```

**Grammar Examples:**

```javascript
// Star: rule* in grammar
// Plus: rule+ in grammar
// Opt: rule? in grammar
```

### Lookahead Expressions

Expressions that look ahead without consuming input.

```typescript { .api }
/**
 * Negative lookahead (~expr) - succeeds if expr fails
 */
class Not extends PExpr {}

/**
 * Positive lookahead (&expr) - succeeds if expr succeeds
 */
class Lookahead extends PExpr {}
```

**Grammar Examples:**

```javascript
// Not: ~"keyword" in grammar
// Lookahead: &"prefix" in grammar
```

### Rule Application

Expression for applying grammar rules.

```typescript { .api }
/**
 * Rule application - applies a named rule
 */
class Apply extends PExpr {}
```

**Grammar Examples:**

```javascript
// Apply: ruleName or ruleName<arg1, arg2> in grammar
```

### Special Expressions

Specialized parsing expressions for specific scenarios.

```typescript { .api }
/**
 * Parameter reference (used in parameterized rules)
 */
class Param extends PExpr {}

/**
 * Lexification (#expr) - matches expr with implicit whitespace skipping disabled
 */
class Lex extends PExpr {}
```

### Singleton Instances

Pre-created instances for common patterns.

```typescript { .api }
/**
 * Matches any single character (the built-in rule `any`)
 */
const any: PExpr;

/**
 * Matches end of input (the built-in rule `end`)
 */
const end: PExpr;
```

**Grammar Examples:**

```javascript
// any: the built-in rule any in grammar
// end: the built-in rule end; also checked implicitly at the end of top-level matches
```

## Expression Hierarchy

The parsing expression classes form a hierarchy:

```
PExpr (abstract base)
├── Terminal
├── Range
├── CaseInsensitiveTerminal
├── UnicodeChar
├── Param
├── Alt
│   ├── Extend
│   └── Splice
├── Seq
├── Iter (abstract)
│   ├── Star
│   ├── Plus
│   └── Opt
├── Not
├── Lookahead
├── Lex
└── Apply
```

## Working with Parsing Expressions

Parsing expressions are normally created automatically during grammar compilation, but working with them directly is useful for advanced use cases such as the following.

### Introspection

```javascript
import { grammar, pexprs } from "ohm-js";

const g = grammar(`
  Example {
    rule = "hello" | "world"
    list = rule+
  }
`);

// Access rule bodies (which are parsing expressions)
const ruleBody = g.rules.rule.body;
console.log(ruleBody instanceof pexprs.Alt); // true
console.log(ruleBody.terms.length); // 2 (for "hello" | "world")

const listBody = g.rules.list.body;
console.log(listBody instanceof pexprs.Plus); // true
```

### Custom Grammar Construction

Advanced users can construct parsing expressions programmatically:

```javascript
// This is advanced usage - most users should use grammar strings
import { pexprs } from "ohm-js";

// Create parsing expressions manually
const terminal = new pexprs.Terminal("hello");
const range = new pexprs.Range("a", "z");
const choice = new pexprs.Alt([terminal, range]);
```

### Expression Properties

All parsing expressions support introspection:

```javascript
// g is a Grammar created with grammar(...); someRule stands in for one of its rule names
const expr = g.rules.someRule.body;

console.log("Arity:", expr.getArity()); // Number of child nodes produced
console.log("Nullable:", expr.isNullable()); // Can match empty string
console.log("String form:", expr.toString()); // Grammar source representation
console.log("Display:", expr.toDisplayString()); // Human-readable form
```

## Built-in Rule Integration

Parsing expressions integrate with Ohm's built-in rules:

```javascript
import { grammar } from "ohm-js";

// Built-in rules are available in all grammars
const g = grammar(`
  Tokens {
    token = alnum+
  }
`);

// The second argument names the start rule; it is not an arbitrary expression
const match = g.match("abc123", "token");
if (match.succeeded()) {
  // Built-in rules like 'alnum', 'letter', 'digit' are implemented
  // using parsing expressions under the hood
}
```

## Error Handling

Parsing expressions contribute to error reporting:

```javascript
// g is a Grammar instance created with grammar(...)
const match = g.match("invalid input");
if (match.failed()) {
  // Error messages incorporate information from parsing expressions
  // to provide detailed failure information
  console.log(match.message);
}
```