0
# Algorithm Selection
1
2
Support for multiple parsing algorithms including LR(0), SLR(1), LALR(1), LR(1), and LL(1) with automatic selection based on grammar complexity. Each algorithm offers different trade-offs between parsing power, table size, and generation time.
3
4
## Capabilities
5
6
### Algorithm Overview
7
8
Jison supports five different parsing algorithms, each with specific characteristics:
9
10
- **LR(0)**: Simplest LR algorithm, works with very limited grammars
11
- **SLR(1)**: Simple LR with one token lookahead, more powerful than LR(0)
12
- **LALR(1)**: Look-Ahead LR, good balance of power and efficiency (default)
13
- **LR(1)**: Canonical LR with full lookahead, most powerful LR algorithm
14
- **LL(1)**: Top-down parsing, requires left-factored grammars
15
16
### LR(0) Generator
17
18
Basic LR parser generator for simple grammars without lookahead conflicts.
19
20
```javascript { .api }
21
/**
22
* LR(0) parser generator constructor
23
* @param grammar - Grammar definition
24
* @param options - Generation options
25
*/
26
function LR0Generator(grammar, options);
27
28
/**
29
* LR(0) generator instance methods
30
*/
31
class LR0Generator {
32
type: "LR(0)"; // Algorithm identifier
33
34
/**
35
* Build LR(0) parsing table
36
*/
37
buildTable(): void;
38
39
/**
40
* Create parser instance from generated table
41
* @returns Parser ready for input processing
42
*/
43
createParser(): GeneratedParser;
44
45
/**
46
* Generate parser source code
47
* @param options - Code generation options
48
* @returns Generated parser source
49
*/
50
generate(options?: GenerationOptions): string;
51
}
52
```
53
54
**Usage Examples:**
55
56
```javascript
57
const { LR0Generator } = require("jison");
58
59
// Simple grammar that works with LR(0)
60
const lr0Grammar = {
61
"bnf": {
62
"S": [["A", "return $1;"]],
63
"A": [["a", "$$ = 'matched_a';"]]
64
}
65
};
66
67
const lr0Gen = new LR0Generator(lr0Grammar);
68
const parser = lr0Gen.createParser();
69
70
try {
71
const result = parser.parse("a");
72
console.log(result); // "matched_a"
73
} catch (error) {
74
console.error("LR(0) parse failed:", error.message);
75
}
76
```
77
78
### SLR(1) Generator
79
80
Simple LR(1) parser generator using Follow sets for lookahead resolution.
81
82
```javascript { .api }
83
/**
84
* SLR(1) parser generator constructor
85
* @param grammar - Grammar definition
86
* @param options - Generation options
87
*/
88
function SLRGenerator(grammar, options);
89
90
/**
91
* SLR(1) generator instance methods
92
*/
93
class SLRGenerator extends LR0Generator {
94
type: "SLR(1)"; // Algorithm identifier
95
96
/**
97
* Compute lookahead tokens using Follow sets
98
* @param state - Parser state
99
* @param item - LR item
100
* @returns Array of lookahead tokens
101
*/
102
lookAheads(state: State, item: Item): string[];
103
}
104
```
105
106
**Usage Examples:**
107
108
```javascript
109
const { SLRGenerator } = require("jison");
110
111
// Grammar with simple conflicts resolvable by SLR(1)
112
const slrGrammar = {
113
"bnf": {
114
"E": [
115
["E + T", "$$ = $1 + $3;"],
116
["T", "$$ = $1;"]
117
],
118
"T": [
119
["( E )", "$$ = $2;"],
120
["id", "$$ = $1;"]
121
]
122
}
123
};
124
125
const slrGen = new SLRGenerator(slrGrammar);
126
const parser = slrGen.createParser();
127
```
128
129
### LALR(1) Generator
130
131
Look-Ahead LR(1) parser generator - the default algorithm providing good balance of power and efficiency.
132
133
```javascript { .api }
134
/**
135
* LALR(1) parser generator constructor (default algorithm)
136
* @param grammar - Grammar definition
137
* @param options - Generation options including onDemandLookahead
138
*/
139
function LALRGenerator(grammar, options);
140
141
/**
142
* LALR(1) generator instance methods
143
*/
144
class LALRGenerator extends LR0Generator {
145
type: "LALR(1)"; // Algorithm identifier
146
onDemandLookahead: boolean; // Compute lookaheads only for inadequate states
147
148
/**
149
* Build new grammar for lookahead computation
150
*/
151
buildNewGrammar(): void;
152
153
/**
154
* Union lookahead sets from auxiliary grammar
155
*/
156
unionLookaheads(): void;
157
158
/**
159
* Compute lookahead tokens for LALR(1)
160
* @param state - Parser state
161
* @param item - LR item
162
* @returns Array of lookahead tokens
163
*/
164
lookAheads(state: State, item: Item): string[];
165
166
/**
167
* Compute goto operation for state and symbol
168
* @param p - State number
169
* @param w - Symbol sequence
170
* @returns Target state number
171
*/
172
go(p: number, w: string[]): number;
173
}
174
```
175
176
**Usage Examples:**
177
178
```javascript
179
const { LALRGenerator } = require("jison");
180
181
// Complex grammar requiring LALR(1) lookahead
182
const lalrGrammar = {
183
"lex": {
184
"rules": [
185
["\\s+", "/* skip */"],
186
["[a-zA-Z]+", "return 'ID';"],
187
["=", "return '=';"],
188
[";", "return ';';"]
189
]
190
},
191
"bnf": {
192
"prog": [["stmts", "return $1;"]],
193
"stmts": [
194
["stmts stmt", "$$ = $1.concat([$2]);"],
195
["stmt", "$$ = [$1];"]
196
],
197
"stmt": [["ID = ID ;", "$$ = {type: 'assign', left: $1, right: $3};"]]
198
}
199
};
200
201
const lalrGen = new LALRGenerator(lalrGrammar, {
202
onDemandLookahead: true // Optimize lookahead computation
203
});
204
const parser = lalrGen.createParser();
205
```
206
207
### LR(1) Generator
208
209
Canonical LR(1) parser generator - most powerful LR algorithm with full lookahead context.
210
211
```javascript { .api }
212
/**
213
* Canonical LR(1) parser generator constructor
214
* @param grammar - Grammar definition
215
* @param options - Generation options
216
*/
217
function LR1Generator(grammar, options);
218
219
/**
220
* LR(1) generator instance methods
221
*/
222
class LR1Generator extends LR0Generator {
223
type: "Canonical LR(1)"; // Algorithm identifier
224
225
/**
226
* Enhanced LR(1) Item class with lookahead context
227
*/
228
Item: typeof LR1Item;
229
230
/**
231
* LR(1) closure operation with lookahead propagation
232
* @param itemSet - Set of LR(1) items
233
* @returns Closure of the item set
234
*/
235
closureOperation(itemSet: ItemSet): ItemSet;
236
237
/**
238
* Compute lookahead tokens for LR(1) items
239
* @param state - Parser state
240
* @param item - LR(1) item
241
* @returns Array of lookahead tokens
242
*/
243
lookAheads(state: State, item: LR1Item): string[];
244
}
245
246
/**
247
* LR(1) Item with lookahead context
248
*/
249
class LR1Item {
250
production: Production; // Associated production
251
dotPosition: number; // Position of dot in production
252
follows: string[]; // Lookahead tokens
253
predecessor?: number; // Predecessor item reference
254
255
/**
256
* Remaining handle after dot position
257
* @returns Array of symbols after the dot
258
*/
259
remainingHandle(): string[];
260
261
/**
262
* Check equality with another LR(1) item
263
* @param item - Item to compare with
264
* @returns True if items are equivalent
265
*/
266
eq(item: LR1Item): boolean;
267
}
268
```
269
270
**Usage Examples:**
271
272
```javascript
273
const { LR1Generator } = require("jison");
274
275
// Grammar requiring full LR(1) power (not LALR)
276
const lr1Grammar = {
277
"bnf": {
278
"S": [
279
["A a", "return 'S->Aa';"],
280
["B b", "return 'S->Bb';"]
281
],
282
"A": [["c", "$$ = 'A->c';"]],
283
"B": [["c", "$$ = 'B->c';"]]
284
}
285
};
286
287
const lr1Gen = new LR1Generator(lr1Grammar);
288
const parser = lr1Gen.createParser();
289
290
// This grammar creates LALR conflicts but LR(1) can handle it
291
console.log(parser.parse("c a")); // "S->Aa"
292
console.log(parser.parse("c b")); // "S->Bb"
293
```
294
295
### LL(1) Generator
296
297
Top-down LL(1) parser generator for left-factored grammars.
298
299
```javascript { .api }
300
/**
301
* LL(1) parser generator constructor
302
* @param grammar - Grammar definition (must be left-factored)
303
* @param options - Generation options
304
*/
305
function LLGenerator(grammar, options);
306
307
/**
308
* LL(1) generator instance methods
309
*/
310
class LLGenerator {
311
type: "LL(1)"; // Algorithm identifier
312
313
/**
314
* Build LL(1) parsing table
315
* @param productions - Grammar productions
316
* @returns LL(1) parsing table
317
*/
318
parseTable(productions: Production[]): ParseTable;
319
320
/**
321
* Create LL(1) parser instance
322
* @returns Generated LL(1) parser
323
*/
324
createParser(): GeneratedParser;
325
}
326
```
327
328
**Usage Examples:**
329
330
```javascript
331
const { LLGenerator } = require("jison");
332
333
// Left-factored grammar suitable for LL(1)
334
const llGrammar = {
335
"bnf": {
336
"E": [
337
["T E_prime", "$$ = $2($1);"]
338
],
339
"E_prime": [
340
["+ T E_prime", "$$ = function(left) { return $3(left + $2); };"],
341
["", "$$ = function(left) { return left; };"] // epsilon production
342
],
343
"T": [["id", "$$ = $1;"]]
344
}
345
};
346
347
const llGen = new LLGenerator(llGrammar);
348
const parser = llGen.createParser();
349
```
350
351
## Algorithm Selection Guidelines
352
353
### When to Use Each Algorithm
354
355
**LR(0)**: Use for very simple grammars without any conflicts:
356
```javascript
357
// Example: Simple assignment statements
358
const simpleGrammar = {
359
"bnf": {
360
"stmt": [["ID = NUM", "return {assign: $1, value: $3};"]]
361
}
362
};
363
```
364
365
**SLR(1)**: Use for grammars with simple reduce-reduce conflicts:
366
```javascript
367
// Example: Basic arithmetic expressions
368
const arithmeticGrammar = {
369
"bnf": {
370
"expr": [
371
["expr + term", "$$ = $1 + $3;"],
372
["term", "$$ = $1;"]
373
],
374
"term": [["NUM", "$$ = Number($1);"]]
375
}
376
};
377
```
378
379
**LALR(1)** (default): Use for most grammars - good balance of power and efficiency:
380
```javascript
381
// Example: Programming language constructs
382
const programGrammar = {
383
"bnf": {
384
"program": [["declarations statements", "return {decls: $1, stmts: $2};"]],
385
"declarations": [["var_decls func_decls", "$$ = $1.concat($2);"]],
386
// ... more complex grammar rules
387
}
388
};
389
```
390
391
**LR(1)**: Use for grammars with complex lookahead requirements:
392
```javascript
393
// Example: Context-sensitive constructs
394
const contextGrammar = {
395
"bnf": {
396
"stmt": [
397
["if_stmt", "$$ = $1;"],
398
["block_stmt", "$$ = $1;"]
399
],
400
"if_stmt": [["IF expr THEN stmt ELSE stmt", "return {if: $2, then: $4, else: $6};"]],
401
// Grammar that creates LALR conflicts
402
}
403
};
404
```
405
406
**LL(1)**: Use for grammars that need top-down parsing:
407
```javascript
408
// Example: Recursive descent friendly grammar
409
const recursiveGrammar = {
410
"bnf": {
411
"expr": [["term expr_rest", "$$ = $2($1);"]],
412
"expr_rest": [
413
["+ term expr_rest", "$$ = function(left) { return $3(left + $2); };"],
414
["", "$$ = function(left) { return left; };"]
415
]
416
}
417
};
418
```
419
420
### Automatic Selection
421
422
Use the Generator factory for automatic algorithm selection:
423
424
```javascript
425
const jison = require("jison");
426
427
// Automatic selection (defaults to LALR(1))
428
const autoGen = new jison.Generator(grammar);
429
430
// Explicit algorithm selection
431
const lr1Gen = new jison.Generator(grammar, { type: 'lr' });
432
const llGen = new jison.Generator(grammar, { type: 'll' });
433
```
434
435
### Performance Comparison
436
437
| Algorithm | Parse Table Size | Generation Time | Parse Speed | Grammar Support |
438
|-----------|------------------|-----------------|-------------|----------------|
439
| LR(0) | Smallest | Fastest | Fastest | Very Limited |
440
| SLR(1) | Small | Fast | Fast | Limited |
441
| LALR(1) | Medium | Medium | Fast | Good |
442
| LR(1) | Large | Slow | Fast | Excellent |
443
| LL(1) | Medium | Medium | Medium | Limited |
444
445
### Debugging Algorithm Selection
446
447
Enable debug mode to see algorithm-specific information:
448
449
```javascript
450
const generator = new jison.Generator(grammar, {
451
type: 'lalr',
452
debug: true
453
});
454
455
// Debug output shows:
456
// - State construction details
457
// - Conflict resolution
458
// - Lookahead computation
459
// - Table compression statistics
460
```
461
462
## Error Handling by Algorithm
463
464
Different algorithms provide different error recovery capabilities:
465
466
### LR Algorithms (LR0, SLR, LALR, LR1)
467
468
```javascript
469
// LR parsers support error recovery
470
const parser = generator.createParser();
471
472
try {
473
const result = parser.parse(invalidInput);
474
} catch (error) {
475
console.log("Parse error:", error.message);
476
console.log("Recoverable:", error.hash.recoverable);
477
console.log("Expected tokens:", error.hash.expected);
478
}
479
```
480
481
### LL(1) Algorithm
482
483
```javascript
484
// LL parsers have simpler error reporting
485
const llParser = llGenerator.createParser();
486
487
try {
488
const result = llParser.parse(invalidInput);
489
} catch (error) {
490
console.log("LL parse error:", error.message);
491
// Less detailed error information than LR parsers
492
}
493
```
494
495
## Types
496
497
```javascript { .api }
498
/**
499
* Parser generator state
500
*/
501
interface State {
502
edges: Record<string, number>; // Transitions to other states
503
reductions: Item[]; // Reduction items in this state
504
inadequate: boolean; // Whether state has conflicts
505
}
506
507
/**
508
* Production rule representation
509
*/
510
interface Production {
511
symbol: string; // Left-hand side nonterminal
512
handle: string[]; // Right-hand side symbols
513
id: number; // Production identifier
514
precedence: number; // Operator precedence
515
}
516
517
/**
518
* LR parsing item
519
*/
520
interface Item {
521
production: Production; // Associated production rule
522
dotPosition: number; // Position of dot in production
523
follows: string[]; // Lookahead tokens
524
525
remainingHandle(): string[]; // Symbols after dot position
526
eq(item: Item): boolean; // Check equality with another item
527
}
528
529
/**
530
* Set of LR items
531
*/
532
interface ItemSet {
533
contains(item: Item): boolean; // Check if item is in set
534
push(item: Item): void; // Add item to set
535
forEach(fn: (item: Item) => void): void; // Iterate over items
536
}
537
538
/**
539
* Code generation options
540
*/
541
interface GenerationOptions {
542
moduleName?: string; // Generated module name
543
moduleType?: 'commonjs' | 'amd' | 'js'; // Module format
544
debug?: boolean; // Include debug information
545
}
546
547
/**
548
* LL(1) parsing table
549
*/
550
interface ParseTable {
551
[nonterminal: string]: {
552
[terminal: string]: number[]; // Production numbers to apply
553
};
554
}
555
```