# Nearley

Nearley is a JavaScript parsing toolkit built on the Earley parsing algorithm. It can parse any context-free grammar, including left-recursive grammars that many other parser generators cannot handle. It provides a domain-specific language for defining grammars, a streaming parser with error reporting and support for ambiguous grammars, and tooling for railroad diagram generation, grammar testing, and random text generation for fuzzing.

## Package Information

- **Package Name**: nearley
- **Package Type**: npm
- **Language**: JavaScript
- **Installation**: `npm install nearley`

## Core Imports

```javascript
const nearley = require("nearley");
const { Parser, Grammar, Rule } = nearley;
```

For ES6 modules:

```javascript
import * as nearley from "nearley";
import { Parser, Grammar, Rule } from "nearley";
```

## Basic Usage

```javascript
const nearley = require("nearley");

// Use a pre-compiled grammar (generated with nearleyc)
const grammar = nearley.Grammar.fromCompiled(require("./my-grammar.js"));

// Create a parser instance
const parser = new nearley.Parser(grammar);

// Parse input text
try {
  parser.feed("your input text here");
  // parser.results is an array: one entry per possible parse of the input so far
  console.log("Parse results:", parser.results);
} catch (parseError) {
  // feed() throws when the input cannot be matched by the grammar
  console.error("Parse error:", parseError.message);
}
```

## Architecture

Nearley is built around several key components:

- **Parser Engine**: Core `Parser` class implementing the Earley parsing algorithm with streaming support
- **Grammar System**: `Grammar` and `Rule` classes for representing compiled grammars and production rules
- **Compilation Pipeline**: Tools to compile `.ne` grammar files into executable JavaScript
- **Lexer Integration**: Support for various lexers including the built-in `StreamLexer` and external lexers like `moo`
- **Tooling Ecosystem**: CLI tools for compilation, testing, diagram generation, and text generation
- **Error Handling**: Error reporting with state stack traces and position information

## Capabilities

### Core Parsing

Core parsing functionality, using the Earley algorithm to parse any context-free grammar. Supports streaming input, saving and restoring parser state, and ambiguous grammars.

```javascript { .api }
class Parser {
  constructor(rules, start, options);
  constructor(grammar, options);

  feed(chunk: string): Parser;
  save(): Column;
  restore(column: Column): void;
  finish(): any[];
}

static Parser.fail: object;
```

[Core Parsing](./core-parsing.md)
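
To make the Earley approach more concrete, here is a minimal, recognition-only sketch of the predict/scan/complete loop. It is not nearley's implementation: the `recognize` function and the rule format are assumptions made for illustration, and the sketch assumes a grammar with no empty rules. The state fields `rule`, `dot`, and `reference` borrow their names from the `State` interface listed under Types.

```javascript
// Minimal Earley recognizer (illustrative sketch, not nearley's code).
// A rule is { name, symbols }; a symbol is either a rule name (string)
// or { literal: "x" } for a single-character terminal.
// Assumption: the grammar contains no empty (nullable) rules.
function recognize(rules, start, input) {
  // One column of states per input position, plus one for the end.
  const columns = [];
  for (let i = 0; i <= input.length; i++) columns.push([]);

  // Add a state to a column unless an identical one is already there.
  const add = (col, state) => {
    const duplicate = columns[col].some(
      (s) => s.rule === state.rule && s.dot === state.dot && s.reference === state.reference
    );
    if (!duplicate) columns[col].push(state);
  };

  // Seed column 0 with every rule for the start symbol.
  rules
    .filter((rule) => rule.name === start)
    .forEach((rule) => add(0, { rule, dot: 0, reference: 0 }));

  for (let i = 0; i <= input.length; i++) {
    // The column grows while it is processed, so iterate by index.
    for (let j = 0; j < columns[i].length; j++) {
      const state = columns[i][j];
      const next = state.rule.symbols[state.dot];
      if (next === undefined) {
        // Complete: advance every state that was waiting for this rule.
        for (const parent of columns[state.reference]) {
          if (parent.rule.symbols[parent.dot] === state.rule.name) {
            add(i, { rule: parent.rule, dot: parent.dot + 1, reference: parent.reference });
          }
        }
      } else if (typeof next === "string") {
        // Predict: start every rule for the expected nonterminal here.
        rules
          .filter((rule) => rule.name === next)
          .forEach((rule) => add(i, { rule, dot: 0, reference: i }));
      } else if (i < input.length && next.literal === input[i]) {
        // Scan: the terminal matches, so move the dot into the next column.
        add(i + 1, { rule: state.rule, dot: state.dot + 1, reference: state.reference });
      }
    }
  }

  // Accept if a start rule spans the whole input.
  return columns[input.length].some(
    (s) => s.rule.name === start && s.dot === s.rule.symbols.length && s.reference === 0
  );
}

// Left-recursive example grammar: sum -> sum "+" num | num
const sumRules = [
  { name: "sum", symbols: ["sum", { literal: "+" }, "num"] },
  { name: "sum", symbols: ["num"] },
  { name: "num", symbols: [{ literal: "1" }] },
  { name: "num", symbols: [{ literal: "2" }] },
];

const accepted = recognize(sumRules, "sum", "1+2+1");
const rejected = recognize(sumRules, "sum", "1+");
```

The left-recursive `sum` rule causes no infinite loop here because prediction only adds a state to a column once; a recursive-descent parser would recurse forever on the same rule.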

### Grammar Management

Grammar representation and compilation system for working with context-free grammars and production rules.

```javascript { .api }
class Grammar {
  constructor(rules: Rule[], start?: string);
  static fromCompiled(rules: object, start?: string): Grammar;
}

class Rule {
  constructor(name: string, symbols: any[], postprocess?: Function);
  toString(withCursorAt?: number): string;
}
```

[Grammar Management](./grammar-management.md)

### Stream Processing

Node.js stream integration for parsing large inputs or continuous data streams.

```javascript { .api }
class StreamWrapper extends Writable {
  constructor(parser: Parser);
}
```

[Stream Processing](./stream-processing.md)
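
The idea behind a writable wrapper can be sketched with Node's `stream` module alone. The example below is an assumption-laden illustration rather than nearley's own class: `FeedingStream` and `RecordingParser` are hypothetical names, and the stand-in parser only records what it is fed. Any object with a `feed(chunk)` method, such as a `Parser`, could take its place.

```javascript
const { Writable } = require("stream");

// Hypothetical stand-in for a parser: anything with a feed(chunk) method.
class RecordingParser {
  constructor() {
    this.chunks = [];
  }
  feed(chunk) {
    this.chunks.push(chunk);
    return this;
  }
}

// Sketch of a Writable that forwards every chunk it receives to parser.feed().
class FeedingStream extends Writable {
  constructor(parser) {
    super();
    this.parser = parser;
  }

  _write(chunk, encoding, callback) {
    try {
      // Chunks arrive as Buffers by default; the parser expects strings.
      this.parser.feed(chunk.toString());
      callback();
    } catch (err) {
      // Surface parse failures as stream errors instead of throwing.
      callback(err);
    }
  }
}

const recordingParser = new RecordingParser();
const sink = new FeedingStream(recordingParser);
sink.write("first chunk, ");
sink.write("second chunk");
sink.end();
```

With a real readable source the same wrapper can be the target of `pipe()`, for example `fs.createReadStream(path).pipe(sink)`.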

### Text Generation

Random text generation from grammars for testing, fuzzing, and example generation.

```javascript { .api }
function Unparse(grammar: Grammar, start: string, depth?: number): string;
```

[Text Generation](./text-generation.md)
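
As a rough illustration of how depth-limited random generation from a grammar can work, the sketch below expands rules at random and falls back to literal-only alternatives once a depth budget is spent. It does not reproduce nearley's `Unparse`: the `generate` function, the grammar object format, and the injectable `random` parameter are all assumptions for this example. It also assumes that every recursive rule has at least one alternative made only of literals.

```javascript
// Hypothetical grammar format: rule name -> list of alternatives.
// Each alternative is a list of symbols: a rule name (string) or { literal }.
const cowGrammar = {
  cow: [[{ literal: "M" }, "os"]],
  os: [
    [{ literal: "OO" }],      // base case
    ["os", { literal: "O" }], // left-recursive case
  ],
};

// Expand `start` into random text, steering toward termination after `depth` levels.
function generate(grammar, start, depth = 8, random = Math.random) {
  const expand = (name, budget) => {
    const alternatives = grammar[name];
    // Alternatives that contain no rule references always terminate.
    const literalOnly = alternatives.filter((alt) =>
      alt.every((symbol) => typeof symbol !== "string")
    );
    // Once the budget is spent, prefer literal-only alternatives if any exist.
    const pool = budget <= 0 && literalOnly.length > 0 ? literalOnly : alternatives;
    const chosen = pool[Math.floor(random() * pool.length)];
    return chosen
      .map((symbol) =>
        typeof symbol === "string" ? expand(symbol, budget - 1) : symbol.literal
      )
      .join("");
  };
  return expand(start, depth);
}

const sample = generate(cowGrammar, "cow");
```

Passing a fixed `random` function makes the output reproducible, which is convenient when generated strings are used as test fixtures.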

### CLI Tools

Command-line tools for grammar compilation, testing, railroad diagram generation, and text generation.

- `nearleyc` - Compile `.ne` grammar files to JavaScript
- `nearley-test` - Test compiled grammars with input
- `nearley-railroad` - Generate railroad diagrams from grammars
- `nearley-unparse` - Generate random text from grammars

[CLI Tools](./cli-tools.md)

## Built-in Resources

### Default Lexer

```javascript { .api }
class StreamLexer {
  constructor();
  reset(data: string, state?: object): void;
  next(): {value: string} | undefined;
  save(): {line: number, col: number};
  formatError(token: object, message: string): string;
}
```
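
The four methods above form the contract a lexer is expected to satisfy. The following sketch shows one way a character-at-a-time lexer with that shape could be written. `CharLexer` is a hypothetical name, and the line and column bookkeeping is an assumption about reasonable behavior, not a copy of the library's code.

```javascript
// Sketch of a lexer that emits one token per character and exposes
// the reset / next / save / formatError methods listed above.
class CharLexer {
  constructor() {
    this.reset("");
  }

  // Load a new chunk; `state` (from save()) resumes line/column counting.
  reset(data, state) {
    this.buffer = data;
    this.index = 0;
    this.line = state ? state.line : 1;
    this.lastLineBreak = state ? -state.col : 0;
  }

  // Return the next token, or undefined when the chunk is exhausted.
  next() {
    if (this.index >= this.buffer.length) return undefined;
    const ch = this.buffer[this.index++];
    if (ch === "\n") {
      this.line += 1;
      this.lastLineBreak = this.index;
    }
    return { value: ch };
  }

  // Capture the position so a later reset() can continue from it.
  save() {
    return { line: this.line, col: this.index - this.lastLineBreak };
  }

  // Build a human-readable message that includes the current position.
  formatError(token, message) {
    const col = this.index - this.lastLineBreak;
    return `${message} at line ${this.line} col ${col}: unexpected ${JSON.stringify(token.value)}`;
  }
}

const lexer = new CharLexer();
lexer.reset("a\nb");
const tokens = [];
for (let t = lexer.next(); t !== undefined; t = lexer.next()) tokens.push(t.value);
const position = lexer.save();

// Feed a second chunk, continuing from the saved position.
lexer.reset("c", position);
lexer.next();
const resumed = lexer.save();
```

Passing the saved state back into `reset` is what lets positions stay correct when input arrives in several chunks.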

### Built-in Grammars

Nearley includes several built-in grammar modules that can be included in your own grammars:

- `builtin/number.ne` - Number parsing patterns (unsigned_int, int, decimal, percentage, jsonfloat)
- `builtin/string.ne` - String parsing patterns with escape sequences
- `builtin/whitespace.ne` - Whitespace handling patterns
- `builtin/postprocessors.ne` - Common postprocessing functions (id, nuller, joiner, etc.)
- `builtin/cow.ne` - Simple example grammar for matching "MOO", "MOOO", etc.

**Usage in Grammar Files:**

```nearley
@builtin "number.ne"
@builtin "whitespace.ne"

# Rules from the built-in grammars are now available, e.g. `decimal` and `_`.
# `_` already matches optional whitespace, so it needs no `:*` modifier.
expr -> decimal _ "+" _ decimal {%
    function(d) { return d[0] + d[4]; }
%}
```
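
A postprocessor receives an array with one entry per symbol on the right-hand side of its rule, which is why the example reads positions `0` and `4`. The sketch below calls the same function directly on hand-written match data. The `matched` array is hypothetical: it assumes the number rules produce JavaScript numbers and the whitespace rules produce `null`.

```javascript
// The postprocessor from the rule above, as a standalone function.
const addOperands = function (d) {
  return d[0] + d[4];
};

// Hypothetical match data for the input "1 + 2":
// [left operand, whitespace, "+", whitespace, right operand]
const matched = [1, null, "+", null, 2];
const sum = addOperands(matched);

// If the operand rules returned strings instead, `+` would concatenate.
const concatenated = addOperands(["1", null, "+", null, "2"]);
```

This is a reason to convert matched text to numbers in the operand rules themselves, so that downstream postprocessors can rely on numeric values.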

**Importing in JavaScript:**

```javascript
// Built-in grammars are typically included at compile time
// For runtime access to compiled built-ins, they must be pre-compiled:
const numberGrammar = require("nearley/builtin/number.js"); // if compiled
```

## Types

```javascript { .api }
interface ParserOptions {
  keepHistory?: boolean;
  lexer?: object;
}

interface State {
  rule: Rule;
  dot: number;
  reference: number;
  data: any[];
  wantedBy: State[];
  isComplete: boolean;
}

interface Column {
  grammar: Grammar;
  index: number;
  states: State[];
  wants: object;
  scannable: State[];
  completed: object;
}
```