# Grammar Parsing

Built-in parser for PEG grammar syntax that converts grammar strings into abstract syntax trees for compilation into executable parsers.

## Capabilities

### Grammar Parser

Parses PEG grammar strings into abstract syntax tree representations.

```javascript { .api }
/**
 * Built-in grammar parser module
 */
const parser = {
  /**
   * Parse PEG grammar string into AST
   * @param input - PEG grammar string to parse
   * @param options - Optional parsing configuration
   * @returns Grammar AST object
   * @throws {SyntaxError} If grammar syntax is invalid
   */
  parse: function(input, options),

  /**
   * Syntax error constructor for grammar parsing failures
   */
  SyntaxError: function(message, expected, found, location)
};
```

**Usage Examples:**

```javascript
const peg = require("pegjs");

// Parse a simple grammar
const grammarText = `
start = "hello" "world"
`;

const ast = peg.parser.parse(grammarText);
console.log(ast); // Grammar AST object

// Parse with options (a separate variable avoids redeclaring `ast`)
const astWithOptions = peg.parser.parse(grammarText, {
  startRule: "Grammar" // start rule of the built-in grammar in PEG.js 0.10
});
```

### Syntax Error Handling

When a grammar string is invalid, `parse` throws a `SyntaxError` that records what the parser expected, what it found, and where the problem occurred.

```javascript { .api }
/**
 * Grammar syntax error class
 */
class SyntaxError extends Error {
  name: "SyntaxError";
  message: string;
  expected: ExpectedItem[];
  found: string | null;
  location: LocationRange;

  /**
   * Build human-readable error message
   * @param expected - Array of expected items
   * @param found - What was actually found
   * @returns Formatted error message
   */
  static buildMessage(expected, found): string;
}

/**
 * Expected item in error reporting
 */
interface ExpectedItem {
  type: "literal" | "class" | "any" | "end" | "other";
  text?: string;
  description?: string;
  inverted?: boolean;
  parts?: (string | [string, string])[];
}

/**
 * Location range information
 */
interface LocationRange {
  start: { offset: number; line: number; column: number };
  end: { offset: number; line: number; column: number };
}
```

**Error Handling Example:**

```javascript
const peg = require("pegjs");

try {
  const ast = peg.parser.parse("invalid grammar syntax");
} catch (error) {
  if (error.name === "SyntaxError") {
    console.error("Grammar syntax error:", error.message);
    console.error("Expected:", error.expected);
    console.error("Found:", error.found);
    console.error(`At line ${error.location.start.line}, column ${error.location.start.column}`);

    // Build custom error message
    const customMessage = peg.parser.SyntaxError.buildMessage(
      error.expected,
      error.found
    );
    console.error("Custom message:", customMessage);
  } else {
    // Not a grammar syntax error; let it propagate
    throw error;
  }
}
```
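The fields read in the handler above are often condensed into a single compiler-style line. The following is a minimal sketch, not part of the library: `formatGrammarError` and the file name are hypothetical, and `sampleError` is a hand-written stand-in with the documented shape so the snippet runs without PEG.js installed.

```javascript
// Hypothetical helper: turn a caught grammar SyntaxError into a one-line report.
// It reads only the documented fields `message` and `location.start`.
function formatGrammarError(error, fileName) {
  const { line, column } = error.location.start;
  return `${fileName}:${line}:${column}: ${error.message}`;
}

// Stand-in object with the documented shape (illustrative values, not real output).
const sampleError = {
  name: "SyntaxError",
  message: 'Expected "=" but "g" found.',
  expected: [{ type: "literal", text: "=" }],
  found: "g",
  location: {
    start: { offset: 8, line: 1, column: 9 },
    end: { offset: 9, line: 1, column: 10 }
  }
};

console.log(formatGrammarError(sampleError, "grammar.pegjs"));
// prints: grammar.pegjs:1:9: Expected "=" but "g" found.
```

In real use, the object passed in would be the `error` caught from `peg.parser.parse`.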

## Grammar Syntax

### Grammar Structure

PEG grammars consist of rules that define parsing patterns using various expression types.

**Basic Grammar Format:**

```pegjs
// Optional initializer (JavaScript code executed before parsing)
{
  function helper() {
    return "helper function";
  }
}

// Rules define parsing patterns
ruleName "human readable name"
  = expression

anotherRule
  = expression / alternative
```

### Expression Types

The grammar parser recognizes these expression patterns:

#### Literals and Characters

```pegjs
// String literals (case-sensitive)
"exact text"
'single quotes'

// Case-insensitive literals
"text"i
'TEXT'i

// Any single character
.

// Character classes
[a-z]        // lowercase letters
[^0-9]       // anything except digits
[a-zA-Z0-9]  // alphanumeric
[abc]i       // case-insensitive class
```

#### Quantifiers and Grouping

```pegjs
// Optional (zero or one)
expression?

// Zero or more
expression*

// One or more
expression+

// Grouping
(expression)
```

#### Predicates

```pegjs
// Positive lookahead (and predicate)
&expression

// Negative lookahead (not predicate)
!expression

// Semantic predicates
&{ return condition; }
!{ return condition; }
```
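Lookahead predicates test the input without consuming it. The sketch below models that behavior with hand-written matcher functions; `literal`, `and`, and `not` are illustrative names and do not reflect how PEG.js generates parsers.

```javascript
// A matcher takes (input, pos) and returns the new position, or -1 on failure.
const literal = (text) => (input, pos) =>
  input.startsWith(text, pos) ? pos + text.length : -1;

// &expression: succeed only if the expression matches, but consume nothing.
const and = (matcher) => (input, pos) =>
  matcher(input, pos) !== -1 ? pos : -1;

// !expression: succeed only if the expression does NOT match; consume nothing.
const not = (matcher) => (input, pos) =>
  matcher(input, pos) === -1 ? pos : -1;

console.log(and(literal("if"))("if x", 0)); // 0  (matched, position unchanged)
console.log(not(literal("if"))("if x", 0)); // -1 (predicate fails)
console.log(not(literal("if"))("else", 0)); // 0  (succeeds, position unchanged)
```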

#### Sequences and Choices

```pegjs
// Sequence (all must match in order)
expression1 expression2 expression3

// Choice (first match wins)
expression1 / expression2 / expression3
```
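"First match wins" means alternative order matters: once an alternative succeeds, later ones are never tried, even if the rest of the sequence then fails. The following sketch illustrates this with hand-written matchers; `literal`, `sequence`, and `choice` are illustrative names, not PEG.js APIs.

```javascript
// A matcher takes (input, pos) and returns the new position, or -1 on failure.
const literal = (text) => (input, pos) =>
  input.startsWith(text, pos) ? pos + text.length : -1;

// Sequence: each matcher must succeed, starting where the previous one ended.
const sequence = (...matchers) => (input, pos) => {
  for (const matcher of matchers) {
    pos = matcher(input, pos);
    if (pos === -1) return -1;
  }
  return pos;
};

// Ordered choice: return the result of the first alternative that matches.
const choice = (...matchers) => (input, pos) => {
  for (const matcher of matchers) {
    const next = matcher(input, pos);
    if (next !== -1) return next;
  }
  return -1;
};

const shortFirst = choice(literal("a"), literal("ab")); // "a" / "ab"
const longFirst = choice(literal("ab"), literal("a"));  // "ab" / "a"

console.log(shortFirst("ab", 0)); // 1: "a" wins, so "ab" is never attempted
console.log(longFirst("ab", 0));  // 2
console.log(sequence(shortFirst, literal("c"))("abc", 0)); // -1: the choice is not retried
```

A practical consequence is that when one alternative is a prefix of another, the longer one should usually be listed first.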

#### Labels and Actions

```pegjs
// Labels for capturing results
label:expression

// Actions for transforming results
expression { return transformedValue; }

// Text capture
$expression
```

#### Rule References

```pegjs
// Reference to another rule
ruleName

// Rule with human-readable name
integer "integer"
  = digits:[0-9]+ { return parseInt(digits.join(""), 10); }
```

### Grammar AST Structure

The parser produces AST nodes with these types:

```javascript { .api }
/**
 * Root grammar node
 */
interface GrammarNode {
  type: "grammar";
  initializer?: InitializerNode;
  rules: RuleNode[];
}

/**
 * Rule definition node
 */
interface RuleNode {
  type: "rule";
  name: string;
  displayName?: string;
  expression: ExpressionNode;
}

/**
 * Expression node types
 */
type ExpressionNode =
  | ChoiceNode | SequenceNode | LabeledNode | ActionNode
  | OptionalNode | ZeroOrMoreNode | OneOrMoreNode
  | SimpleAndNode | SimpleNotNode | SemanticAndNode | SemanticNotNode
  | TextNode | RuleRefNode | LiteralNode | ClassNode | AnyNode;

/**
 * Choice expression (alternatives)
 */
interface ChoiceNode {
  type: "choice";
  alternatives: ExpressionNode[];
}

/**
 * Sequence expression
 */
interface SequenceNode {
  type: "sequence";
  elements: ExpressionNode[];
}

/**
 * Labeled expression
 */
interface LabeledNode {
  type: "labeled";
  label: string;
  expression: ExpressionNode;
}
```

**AST Usage Example:**

```javascript
const peg = require("pegjs");

const ast = peg.parser.parse(`
start = greeting name
greeting = "hello" / "hi"
name = [a-zA-Z]+
`);

console.log(ast.type); // "grammar"
console.log(ast.rules.length); // 3
console.log(ast.rules[0].name); // "start"
console.log(ast.rules[0].expression.type); // "sequence"
```
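Because every node carries a `type` and keeps its children in `rules`, `alternatives`, `elements`, or `expression`, the tree can be walked generically. The sketch below collects rule references from a hand-built AST for the three-rule grammar above, so it runs without PEG.js. `collectRuleRefs` is a hypothetical helper, and the `rule_ref`, `literal`, `one_or_more`, and `class` node shapes are assumptions, since only the choice, sequence, and labeled nodes are spelled out in the listing.

```javascript
// Hand-built stand-in for the AST of the three-rule grammar.
// Node shapes other than "grammar", "rule", "choice", and "sequence" are assumed.
const ast = {
  type: "grammar",
  rules: [
    {
      type: "rule",
      name: "start",
      expression: {
        type: "sequence",
        elements: [
          { type: "rule_ref", name: "greeting" },
          { type: "rule_ref", name: "name" }
        ]
      }
    },
    {
      type: "rule",
      name: "greeting",
      expression: {
        type: "choice",
        alternatives: [
          { type: "literal", value: "hello" },
          { type: "literal", value: "hi" }
        ]
      }
    },
    {
      type: "rule",
      name: "name",
      expression: { type: "one_or_more", expression: { type: "class" } }
    }
  ]
};

// Depth-first walk over whichever child fields a node happens to have.
function collectRuleRefs(node, found = []) {
  if (node.type === "rule_ref") found.push(node.name);
  const children = [
    ...(node.rules || []),
    ...(node.alternatives || []),
    ...(node.elements || []),
    ...(node.expression ? [node.expression] : [])
  ];
  for (const child of children) collectRuleRefs(child, found);
  return found;
}

console.log(collectRuleRefs(ast)); // [ 'greeting', 'name' ]
```

The same walk could serve as a starting point for checks such as finding references to undefined rules.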

## Advanced Features

### Initializer Code

Grammars can include JavaScript code executed before parsing begins:

```pegjs
{
  // Global variables and functions available in actions
  var utils = {
    capitalize: function(str) {
      return str.charAt(0).toUpperCase() + str.slice(1);
    }
  };
}

start = word { return utils.capitalize(text()); }
word = [a-z]+
```

### Semantic Actions

Actions transform parse results using JavaScript code:

```pegjs
number = digits:[0-9]+ {
  return parseInt(digits.join(""), 10);
}

array = "[" head:value tail:("," value)* "]" {
  return [head].concat(tail.map(function(t) { return t[1]; }));
}
```
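The shape of the labeled values explains why the actions above use `join` and `t[1]`. The following plain JavaScript sketch replays both action bodies on hand-written inputs; the sample values assume that `value` yields a number and are illustrative only.

```javascript
// What the `number` action receives: one matched character per array element.
const digits = ["4", "2"];
console.log(parseInt(digits.join(""), 10)); // 42

// What the `array` action receives for the input "[1,2,3]", assuming `value`
// returns a number: `tail` holds one [separator, value] pair per repetition.
const head = 1;
const tail = [[",", 2], [",", 3]];
const items = [head].concat(tail.map(function (t) { return t[1]; }));
console.log(items); // [ 1, 2, 3 ]
```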

### Location Information

Inside an action, `location()` returns the source range matched by the rule's expression. The matched value must be labeled before the action can refer to it:

```pegjs
rule = value:expression {
  var loc = location();
  return {
    value: value,
    line: loc.start.line,
    column: loc.start.column
  };
}
```
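Offsets in a location are 0-based, while lines and columns are 1-based. The sketch below shows how the three values relate by deriving a position from an offset; `positionAt` is a hypothetical helper written for illustration and is not PEG.js's own implementation.

```javascript
// Hypothetical helper: derive { offset, line, column } from a 0-based offset.
// Lines and columns start at 1; a newline starts a new line at column 1.
function positionAt(input, offset) {
  let line = 1;
  let column = 1;
  for (let i = 0; i < offset; i++) {
    if (input[i] === "\n") {
      line++;
      column = 1;
    } else {
      column++;
    }
  }
  return { offset, line, column };
}

const input = "start\n  = word";
console.log(positionAt(input, 0)); // { offset: 0, line: 1, column: 1 }
console.log(positionAt(input, 8)); // { offset: 8, line: 2, column: 3 }
```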

### Error Reporting

Actions can call `expected()` to fail the match with a custom description:

```pegjs
rule = value:expression {
  // isValid is assumed to be defined in the grammar's initializer
  if (!isValid(value)) {
    expected("valid expression");
  }
  return value;
}
```