# Code Generation
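
Generate static Python parser code and object model classes from EBNF grammars for deployment, distribution, and integration into applications. The generated modules do not need the grammar to be recompiled at startup, but they import TatSu's runtime support classes, so TatSu must be installed wherever they run.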

## Capabilities

### Python Parser Code Generation

Generate a complete Python parser class from an EBNF grammar. The result is ordinary source code that can be saved as a module, distributed, and imported like any other Python file.

```python { .api }
def to_python_sourcecode(grammar, name=None, filename=None, config=None, **settings):
    """
    Generate Python parser source code from grammar.

    Parameters:
    - grammar (str): EBNF grammar definition string
    - name (str, optional): Parser class name (defaults to grammar filename base)
    - filename (str, optional): Source filename for error reporting and class naming
    - config (ParserConfig, optional): Parser configuration object
    - **settings: Additional generation settings (trace, left_recursion, etc.)

    Returns:
    str: Complete Python source code for a parser class

    Raises:
    GrammarError: If grammar contains syntax or semantic errors
    CodegenError: If code generation fails
    """
```

Usage example:

```python
import tatsu

# Raw string so the regex escape \d reaches TatSu unchanged
grammar = r'''
    start = expr;
    expr = term ("+" term)*;
    term = factor ("*" factor)*;
    factor = "(" expr ")" | number;
    number = /\d+/;
'''

# Generate parser code
parser_code = tatsu.to_python_sourcecode(grammar, name="Calculator")

# Save to file
with open("calculator_parser.py", "w") as f:
    f.write(parser_code)

# The generated code can be imported and used:
# from calculator_parser import CalculatorParser
# parser = CalculatorParser()
# result = parser.parse("2 + 3 * 4")
```
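
Once the source has been written to disk, it can be loaded like any other module. The sketch below shows one possible way to load a generated file from an explicit path with `importlib`, which may be useful when the output directory is not on `sys.path`. The helper name `load_module_from_path` and the tiny stand-in module are assumptions made for illustration; the stand-in exists only so the sketch runs without TatSu, and a real generated parser would be loaded the same way.

```python
import importlib.util
import tempfile
from pathlib import Path


def load_module_from_path(module_name, path):
    """Import a Python source file from an explicit path and return the module."""
    spec = importlib.util.spec_from_file_location(module_name, str(path))
    if spec is None or spec.loader is None:
        raise ImportError(f"cannot load {module_name!r} from {path}")
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module


# Stand-in for generated source (hypothetical, NOT real TatSu output);
# in practice this string would come from tatsu.to_python_sourcecode(...).
stand_in_source = '''
class CalculatorParser:
    def parse(self, text):
        return text.split()
'''

with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "calculator_parser.py"
    target.write_text(stand_in_source, encoding="utf-8")
    module = load_module_from_path("calculator_parser", target)
    parser = module.CalculatorParser()
    tokens = parser.parse("2 + 3 * 4")
```

Loading by path keeps the generated file out of the package layout, which can help in build steps that regenerate the parser into a scratch directory.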

### Object Model Generation

Generate Python class definitions (dataclasses or subclasses of a custom base type) that correspond to the typed rules in a grammar, so that parse results are instances of named classes rather than generic AST nodes.

```python { .api }
def to_python_model(grammar, name=None, filename=None, base_type=None, config=None, **settings):
    """
    Generate Python object model classes from grammar.

    Parameters:
    - grammar (str): EBNF grammar definition string
    - name (str, optional): Model class prefix (defaults to grammar filename base)
    - filename (str, optional): Source filename for error reporting
    - base_type (type, optional): Base class for generated model classes (default: Node)
    - config (ParserConfig, optional): Parser configuration object
    - **settings: Additional generation settings

    Returns:
    str: Python source code for object model classes

    Raises:
    GrammarError: If grammar contains syntax or semantic errors
    CodegenError: If model generation fails
    """
```

Usage example:

```python
import tatsu
from tatsu.objectmodel import Node

# Raw string so the regex escape \d reaches TatSu unchanged
grammar = r'''
    start = expr;
    expr::Expr = term ("+" term)*;
    term::Term = factor ("*" factor)*;
    factor::Factor = "(" expr ")" | number;
    number::Number = /\d+/;
'''

# Generate object model with custom base type
class MyBaseNode(Node):
    def __repr__(self):
        return f"{self.__class__.__name__}({super().__repr__()})"

model_code = tatsu.to_python_model(
    grammar,
    name="Calculator",
    base_type=MyBaseNode,
)

# Save generated model classes
with open("calculator_model.py", "w") as f:
    f.write(model_code)

# Use with semantic actions (import only the classes that are needed)
from calculator_model import Expr, Number

class CalculatorSemantics:
    def number(self, ast):
        return Number(value=int(ast))

    def expr(self, ast):
        return Expr(terms=ast)

model = tatsu.compile(grammar)
result = model.parse("2 + 3", semantics=CalculatorSemantics())
```

### Code Generation Options

Advanced options for customizing the generated parser and model code:

```python { .api }
# Parser generation settings
trace: bool = False                     # Include tracing support in generated parser
left_recursion: bool = True             # Enable left-recursion in generated parser
nameguard: bool | None = None           # Include nameguard logic in generated parser
whitespace: str | None = None           # Default whitespace handling

# Model generation settings
base_type: type | None = None           # Base class for generated model classes
types: dict[str, type] | None = None    # Custom type mappings for specific rules
```
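
Since the parser settings travel as keyword arguments, one convenient pattern is to collect them in a dict, drop the ones left unset, and unpack the rest into the generator call. The helper `build_settings` below is hypothetical and not part of TatSu; it only demonstrates the dict handling, and it assumes (as the listing above suggests) that `None` means "use the default".

```python
def build_settings(trace=False, left_recursion=True, nameguard=None, whitespace=None):
    """Collect generation options, omitting those left unset (None)."""
    candidates = {
        "trace": trace,
        "left_recursion": left_recursion,
        "nameguard": nameguard,
        "whitespace": whitespace,
    }
    # Leaving out None values lets the callee apply its own defaults
    return {key: value for key, value in candidates.items() if value is not None}


settings = build_settings(trace=True, whitespace=" \t")

# Assumed usage with the function documented above:
# parser_code = tatsu.to_python_sourcecode(grammar, name="Calculator", **settings)
```

Keeping the options in one dict makes it straightforward to reuse the same configuration for both parser and model generation.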

### Generated Code Structure

The generated parser code follows a consistent structure:

```python
# Generated parser class structure
class GeneratedParser:
    """Generated parser class with all grammar rules as methods."""

    def __init__(self, **kwargs):
        """Initialize parser with optional configuration."""

    def parse(self, text, start=None, **kwargs):
        """Main parsing method."""

    def _rule_name_(self):
        """Generated method for each grammar rule."""

    # Error handling and utility methods
    def _error(self, item, pos):
        """Error reporting method."""

    def _call(self, rule):
        """Rule invocation method."""
```
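
To make the one-method-per-rule layout in the skeleton above more tangible, here is a small hand-written recursive-descent parser for the calculator grammar used earlier. It is an illustration only and is not what TatSu emits: the class name `TinyCalcParser`, the helpers `_match` and `_skip_whitespace`, and the choice to evaluate the expression directly instead of building an AST are all assumptions made to keep the sketch short.

```python
import re


class TinyCalcParser:
    """Hand-written illustration of the one-method-per-rule layout (not TatSu output)."""

    def parse(self, text, start="expr"):
        self._text = text
        self._pos = 0
        result = getattr(self, f"_{start}_")()
        self._skip_whitespace()
        if self._pos != len(self._text):
            self._error("end of input", self._pos)
        return result

    # expr = term ("+" term)*
    def _expr_(self):
        value = self._term_()
        while self._match("+"):
            value += self._term_()
        return value

    # term = factor ("*" factor)*
    def _term_(self):
        value = self._factor_()
        while self._match("*"):
            value *= self._factor_()
        return value

    # factor = "(" expr ")" | number
    def _factor_(self):
        if self._match("("):
            value = self._expr_()
            if not self._match(")"):
                self._error("')'", self._pos)
            return value
        return self._number_()

    # number = one or more digits
    def _number_(self):
        self._skip_whitespace()
        found = re.compile(r"\d+").match(self._text, self._pos)
        if found is None:
            self._error("number", self._pos)
        self._pos = found.end()
        return int(found.group())

    def _skip_whitespace(self):
        while self._pos < len(self._text) and self._text[self._pos].isspace():
            self._pos += 1

    def _match(self, token):
        """Consume a literal token if it is next in the input."""
        self._skip_whitespace()
        if self._text.startswith(token, self._pos):
            self._pos += len(token)
            return True
        return False

    def _error(self, item, pos):
        raise SyntaxError(f"expected {item} at position {pos}")


result = TinyCalcParser().parse("2 + 3 * 4")
```

Each grammar rule maps to exactly one method, and rule references become method calls, which is the same shape the generated class follows at a larger scale.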

The generated object model classes are dataclasses or Node subclasses:

```python
@dataclass
class RuleName(BaseType):
    """Generated class for grammar rule 'rule_name'."""
    field1: Any
    field2: List[Any]
    # Fields correspond to named elements in the rule
```
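
As a concrete, hand-written counterpart to the template above, the following sketch models the calculator grammar with plain dataclasses and walks the resulting tree. The field names `value`, `factors`, and `terms` are assumptions chosen for readability; in generated code the fields would follow the named elements of each rule, and the classes would typically derive from the configured base type rather than stand alone.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Number:
    value: int


@dataclass
class Term:
    # Hypothetical field: the operands joined by "*"
    factors: List[Union["Number", "Expr"]] = field(default_factory=list)


@dataclass
class Expr:
    # Hypothetical field: the operands joined by "+"
    terms: List[Term] = field(default_factory=list)


def evaluate(node):
    """Walk the typed tree and compute the arithmetic result."""
    if isinstance(node, Number):
        return node.value
    if isinstance(node, Term):
        product = 1
        for factor in node.factors:
            product *= evaluate(factor)
        return product
    if isinstance(node, Expr):
        return sum(evaluate(term) for term in node.terms)
    raise TypeError(f"unexpected node type: {type(node).__name__}")


# Tree for "2 + 3 * 4", built by hand for the example
tree = Expr(terms=[
    Term(factors=[Number(2)]),
    Term(factors=[Number(3), Number(4)]),
])
total = evaluate(tree)
```

Because every node has a named class, downstream code can dispatch on type with `isinstance` instead of inspecting nested lists.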

### Integration Examples

Using generated code in applications:

```python
# Example: Using generated parser in a web application
from flask import Flask, request, jsonify
from my_generated_parser import MyParser
from tatsu.exceptions import ParseException

app = Flask(__name__)
parser = MyParser()

@app.route('/parse', methods=['POST'])
def parse_input():
    try:
        text = request.json['input']
        result = parser.parse(text)
        return jsonify({'success': True, 'ast': result})
    except ParseException as e:
        # Position attributes may not exist on every exception type,
        # so read them defensively.
        return jsonify({
            'success': False,
            'error': str(e),
            'line': getattr(e, 'line', None),
            'column': getattr(e, 'col', None),
        }), 400
```

### Standalone Deployment

Generated parsers are self-contained modules that can be deployed without the original grammar file. They do import TatSu's runtime classes, so TatSu remains an installation dependency:

```python
# Requirements for a generated parser
# - Python 3.10+
# - TatSu installed (the generated module imports its runtime classes)

# Together with the TatSu runtime, the generated parser provides:
# - PEG parsing algorithm
# - Memoization (packrat parsing)
# - Left-recursion support
# - Error handling and reporting
# - AST construction
```

## Advanced Code Generation

### Custom Code Templates

For advanced use cases, TatSu's code generation can be customized using the underlying code generation infrastructure:

```python { .api }
from tatsu.ngcodegen import codegen
from tatsu.ngcodegen.objectmodel import modelgen

# Direct access to code generators
def custom_codegen(model, **kwargs):
    """Access to lower-level code generation."""
    return codegen(model, **kwargs)

def custom_modelgen(model, **kwargs):
    """Access to lower-level model generation."""
    return modelgen(model, **kwargs)
```

### Generated Code Optimization

Generated parsers include several optimizations:

- **Memoization**: Packrat parsing with automatic memoization of rule results
- **Left-recursion**: Support for left-recursive rules
- **Error reporting**: Failure messages that include position information
- **Minimized overhead**: The grammar is compiled ahead of time, so it is not re-analyzed at startup
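
The memoization item in the list above can be illustrated with a short sketch. Packrat parsing caches the outcome of each (rule, position) pair so that a rule is never evaluated twice at the same input position, even when several alternatives start with it. The code below is a minimal illustration of that idea and not TatSu's implementation; `memoize_rule`, `DigitsParser`, and the `calls` counter are hypothetical names.

```python
import functools


def memoize_rule(rule_method):
    """Cache a rule's result per input position (the core idea of packrat parsing)."""
    @functools.wraps(rule_method)
    def wrapper(self, pos):
        key = (rule_method.__name__, pos)
        if key not in self.memo:
            self.calls += 1
            self.memo[key] = rule_method(self, pos)
        return self.memo[key]
    return wrapper


class DigitsParser:
    """Toy parser whose rules take a position and return a result or None."""

    def __init__(self, text):
        self.text = text
        self.memo = {}
        self.calls = 0  # counts rule bodies actually executed

    @memoize_rule
    def number(self, pos):
        end = pos
        while end < len(self.text) and self.text[end].isdigit():
            end += 1
        return (int(self.text[pos:end]), end) if end > pos else None

    # pair = number "+" number | number "-" number
    def pair(self, pos):
        for operator in "+-":
            left = self.number(pos)  # the second alternative hits the cache
            if left is None:
                return None
            value, after = left
            if self.text.startswith(operator, after):
                right = self.number(after + 1)
                if right is not None:
                    return (value, operator, right[0])
        return None


parser = DigitsParser("12-30")
result = parser.pair(0)
```

Here the `number` rule body runs twice rather than three times, because the retry at position 0 for the second alternative is served from the cache.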

The generated code is suitable for:
- High-performance parsing applications
- Production web services
- Embedded parsing in larger applications
- Distribution as standalone parsing libraries