# Parser Generation and Building

Core functionality for generating Tree-sitter parsers from grammar files and compiling them to native dynamic libraries or WebAssembly modules. This covers the complete build pipeline from grammar to executable parser.

## Capabilities

### Parser Generation

Generate a Tree-sitter parser from a grammar definition, transforming `grammar.js` into C source code.

```bash { .api }
tree-sitter generate [options] [grammar_path] # Aliases: gen, g
```

**Arguments:**
- `[grammar_path]`: Path to the grammar file (default: `grammar.js` in the current directory)

**Options:**
- `--log, -l`: Show debug log during generation
- `--no-bindings`: Deprecated, no-op (will be removed in v0.25.0)
- `--abi <VERSION>`: Select language ABI version (default: 15, use "latest" for newest)
- `--build, -b`: Compile all defined languages after generation
- `--debug-build, -0`: Compile parser in debug mode
- `--libdir <PATH>`: Path to parser library directory
- `--output, -o <DIRECTORY>`: Output directory for generated source files
- `--report-states-for-rule <RULE>`: Report states for a specific rule (use "-" for all rules)
- `--json`: Report conflicts in JSON format
- `--js-runtime <EXECUTABLE>`: JavaScript runtime for generation (default: "node")

**Generated Files:**
- `src/parser.c`: Main parser implementation
- `src/tree_sitter/parser.h`: Parser header
- `src/grammar.json`: Grammar metadata

**Example:**
```bash
# Generate parser from grammar.js
tree-sitter generate

# Generate with debug logging
tree-sitter generate --log

# Generate for specific ABI version
tree-sitter generate --abi 14

# Generate and build immediately
tree-sitter generate --build

# Use custom output directory
tree-sitter generate --output ./generated

# Report state information for debugging
tree-sitter generate --report-states-for-rule expression

# Generate with custom JavaScript runtime
tree-sitter generate --js-runtime bun
```
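To try generation end to end, the following sketch writes a minimal grammar into a scratch directory, runs `tree-sitter generate` if the CLI is on `PATH`, and checks for the three generated files listed under **Generated Files**. The grammar name `demo`, its two rules, and the helper `check_generated_files` are illustrative assumptions, not part of Tree-sitter.

```bash
#!/usr/bin/env bash
# Sketch: generate a parser from a tiny grammar and verify the outputs.
# The grammar name "demo" and the helper below are hypothetical.

# Return 0 only if every expected generated file exists under directory $1.
check_generated_files() {
  local dir="$1" file status=0
  for file in src/parser.c src/tree_sitter/parser.h src/grammar.json; do
    if [ ! -f "$dir/$file" ]; then
      echo "missing: $file" >&2
      status=1
    fi
  done
  return "$status"
}

workdir="$(mktemp -d)"

# A minimal grammar: a file is a sequence of lowercase words.
cat > "$workdir/grammar.js" <<'EOF'
module.exports = grammar({
  name: 'demo',
  rules: {
    source_file: $ => repeat($.word),
    word: $ => /[a-z]+/,
  },
});
EOF

if command -v tree-sitter >/dev/null 2>&1; then
  # Run generation inside the scratch directory, then verify the outputs.
  (cd "$workdir" && tree-sitter generate) && check_generated_files "$workdir"
else
  echo "tree-sitter not found on PATH; skipping generation" >&2
fi
```

Running generation in a subshell keeps the caller's working directory unchanged, and the file check is kept separate so it can be reused against any grammar directory.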

### Parser Compilation

Compile generated parser source code into native dynamic libraries or WebAssembly modules for runtime usage.

```bash { .api }
tree-sitter build [options] [path] # Alias: b
```

**Arguments:**
- `[path]`: Path to the grammar directory (default: current directory)

**Options:**
- `--wasm, -w`: Build a WebAssembly module instead of a native library
- `--docker, -d`: Force Docker usage for Emscripten (WASM builds only)
- `--output, -o <PATH>`: Output path for the compiled parser
- `--reuse-allocator`: Make the parser reuse the same allocator as the library
- `--debug, -0`: Compile parser in debug mode

**Output:**
- **Native builds**: Dynamic library (`.so`, `.dylib`, `.dll`)
- **WASM builds**: WebAssembly module (`.wasm`)

**Example:**
```bash
# Build native dynamic library
tree-sitter build

# Build WebAssembly module
tree-sitter build --wasm

# Build with custom output path
tree-sitter build --output ./parsers/my-lang.so

# Build in debug mode
tree-sitter build --debug

# Force Docker for WASM (useful in CI)
tree-sitter build --wasm --docker

# Build with allocator reuse
tree-sitter build --reuse-allocator
```

## Generation Process

### Grammar Processing Pipeline

1. **Grammar Parsing**: Evaluate `grammar.js` with a JavaScript runtime (Node.js by default)
2. **Rule Processing**: Transform grammar rules into parser states
3. **Table Generation**: Build parse tables and lexer tables
4. **Conflict Resolution**: Resolve shift/reduce and reduce/reduce conflicts
5. **C Code Generation**: Generate `parser.c` with embedded tables
6. **Header Generation**: Write `parser.h`, the header included by `parser.c`

### ABI Version Management

Tree-sitter supports multiple ABI versions for backwards compatibility:

```bash
# Use default ABI version (15)
tree-sitter generate

# Use latest ABI version
tree-sitter generate --abi latest

# Use specific ABI version
tree-sitter generate --abi 13
```

**ABI Version Impact:**
- Determines parser API surface
- Affects runtime compatibility
- Influences generated code structure
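
One way to confirm which ABI version a parser was generated for is to read it back from the generated source. The sketch below assumes `src/parser.c` contains a `#define LANGUAGE_VERSION <N>` line; that matches commonly seen generated parsers but is not guaranteed across Tree-sitter releases. The helper name `parser_abi_version` and the expected value are hypothetical.

```bash
# Print the ABI version recorded in a generated parser source file.
# Assumes a line of the form "#define LANGUAGE_VERSION <N>" (an assumption
# about the generated code, not a documented interface).
parser_abi_version() {
  local parser_c="${1:-src/parser.c}"
  sed -n 's/^#define LANGUAGE_VERSION \([0-9][0-9]*\).*/\1/p' "$parser_c" | head -n 1
}

# Example: warn if the parser on disk does not match the expected ABI.
expected_abi=14
if [ -f src/parser.c ]; then
  actual_abi="$(parser_abi_version src/parser.c)"
  if [ "$actual_abi" != "$expected_abi" ]; then
    echo "parser.c uses ABI ${actual_abi:-unknown}, expected $expected_abi" >&2
  fi
fi
```

A check like this can run in CI after `tree-sitter generate --abi <VERSION>` to catch a parser that was regenerated with a different ABI than the consuming runtime expects.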

### Build Targets

#### Native Libraries

Native compilation produces platform-specific dynamic libraries:

- **Linux**: `.so` (shared object)
- **macOS**: `.dylib` (dynamic library)
- **Windows**: `.dll` (dynamic link library)
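
When a script passes `--output` explicitly, it needs the right extension for the host platform. The following sketch maps the kernel name printed by `uname -s` to the three extensions listed above. The function name `lib_ext_for`, the `my-lang` placeholder, and the handling of only these three platforms are assumptions for illustration.

```bash
# Map a kernel name (as printed by `uname -s`) to a dynamic library extension.
# Only the three platforms listed above are handled; others return an error.
lib_ext_for() {
  case "$1" in
    Linux)                 echo "so" ;;
    Darwin)                echo "dylib" ;;
    MINGW*|MSYS*|CYGWIN*)  echo "dll" ;;
    *)                     echo "unknown platform: $1" >&2; return 1 ;;
  esac
}

# Build into a predictable path; "my-lang" is a placeholder name.
if ext="$(lib_ext_for "$(uname -s)")"; then
  output="./parsers/my-lang.$ext"
  echo "would run: tree-sitter build --output $output"
fi
```

The script only prints the command so it can be reviewed first; replace the `echo` with the real invocation once the path looks right.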

#### WebAssembly Modules

WASM compilation requires the Emscripten toolchain, either installed locally or run through Docker:

```bash
# Local Emscripten installation
tree-sitter build --wasm

# Force Docker usage (runs Emscripten in a container, no local install needed)
tree-sitter build --wasm --docker
```

**WASM Benefits:**
- Browser compatibility
- Sandboxed execution
- Consistent cross-platform behavior
- No native dependency requirements
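
A build script can choose between the two WASM invocations automatically. The sketch below adds `--docker` only when the Emscripten compiler driver `emcc` is not on `PATH`. The helper name `wasm_build_flags` is hypothetical, and the script assumes Docker is available whenever `emcc` is not.

```bash
# Print the flags for a WASM build: add --docker when emcc is not installed.
wasm_build_flags() {
  if command -v emcc >/dev/null 2>&1; then
    echo "--wasm"
  else
    echo "--wasm --docker"
  fi
}

# Word splitting of the flags is intentional here.
# shellcheck disable=SC2046
echo "would run: tree-sitter build" $(wasm_build_flags)
```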

## Advanced Generation Options

### State Reporting

Debug a grammar under development by inspecting parser states:

```bash
# Report states for specific rule
tree-sitter generate --report-states-for-rule function_definition

# Report states for all rules
tree-sitter generate --report-states-for-rule -

# Report states for a rule, with conflicts reported as JSON
tree-sitter generate --report-states-for-rule expression --json
```

### Custom JavaScript Runtime

Override the default Node.js runtime used for generation:

```bash
# Use Bun runtime
tree-sitter generate --js-runtime bun

# Use custom Node.js installation
tree-sitter generate --js-runtime /usr/local/bin/node

# Set the runtime via environment variable
TREE_SITTER_JS_RUNTIME=deno tree-sitter generate
```
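On machines where the available runtime varies, a wrapper can pick one before calling the generator. This sketch returns the first candidate found on `PATH`; the helper name `first_available_runtime` and the candidate order `node bun deno` are assumptions, not Tree-sitter behavior.

```bash
# Print the first JavaScript runtime found on PATH from the given candidates.
first_available_runtime() {
  local candidate
  for candidate in "$@"; do
    if command -v "$candidate" >/dev/null 2>&1; then
      echo "$candidate"
      return 0
    fi
  done
  return 1
}

if runtime="$(first_available_runtime node bun deno)"; then
  echo "would run: tree-sitter generate --js-runtime $runtime"
else
  echo "no JavaScript runtime found on PATH" >&2
fi
```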

### Debug Mode Compilation

Enable debug symbols and additional runtime checks:

```bash
# Generate and build with debug info
tree-sitter generate --debug-build

# Build existing parser with debug symbols
tree-sitter build --debug
```

**Debug Mode Features:**
- Enhanced error messages
- Additional runtime validation
- Debug symbol inclusion
- Slower execution with better diagnostics

## Error Handling

Common generation errors and solutions:

**Unresolved Conflicts:**
- Review grammar for ambiguities
- Use precedence declarations
- Add conflict resolution rules

**Missing Dependencies:**
- Ensure Node.js is installed and accessible
- Verify `grammar.js` syntax
- Check external grammar dependencies

**Build Failures:**
- Confirm C/C++ compiler availability
- Verify Emscripten installation (WASM builds)
- Check file permissions and disk space
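
Several of the checks above can be automated with a small preflight script that runs before generation or compilation. The sketch below lists required commands that are missing from `PATH`. The helper name `missing_tools` and the tool list `tree-sitter node cc` are assumptions; adjust them to the runtime and compiler actually in use.

```bash
# Print each required command that is not available on PATH.
missing_tools() {
  local tool
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}

# node evaluates grammar.js; cc compiles the generated parser.
missing="$(missing_tools tree-sitter node cc)"
if [ -n "$missing" ]; then
  # Unquoted on purpose so the names print on one line.
  echo "missing tools:" $missing >&2
fi
```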