TatSu takes a grammar in a variation of EBNF as input and outputs a memoizing PEG/Packrat parser in Python.

`npx @tessl/cli install tessl/pypi-tatsu@5.13.00`

# TatSu

TatSu is a Python parser generator that compiles grammars written in a variation of Extended Backus-Naur Form (EBNF) into memoizing Parsing Expression Grammar (PEG) parsers. The generated Packrat parsers support left recursion. TatSu can parse at runtime from a compiled grammar model or generate static parser source code, and it provides AST building and semantic actions for writing compilers, interpreters, and domain-specific languages.

## Package Information

- **Package Name**: TatSu
- **Language**: Python
- **Installation**: `pip install TatSu`
- **Repository**: https://github.com/neogeny/TatSu
- **Documentation**: https://tatsu.readthedocs.io/en/stable/

## Core Imports

```python
import tatsu
```

For specific functionality:

```python
from tatsu import compile, parse, to_python_sourcecode, to_python_model
from tatsu.exceptions import ParseException
from tatsu.semantics import ModelBuilderSemantics
```

## Basic Usage

```python
import tatsu

# Define a simple grammar. A raw string keeps the regex backslash intact,
# and repetition uses TatSu's `{ ... }*` closure syntax.
grammar = r'''
    expr = term {("+" | "-") term}*;
    term = factor {("*" | "/") factor}*;
    factor = "(" expr ")" | number;
    number = /\d+/;
'''

# Compile the grammar into a parser model
model = tatsu.compile(grammar)

# Parse some input
result = model.parse("2 + 3 * 4")
print(result)  # Outputs the AST

# Or parse directly in one step
result = tatsu.parse(grammar, "2 + 3 * 4")

# Generate Python parser code
parser_code = tatsu.to_python_sourcecode(grammar, name="Calculator")

# Generate object model classes
model_code = tatsu.to_python_model(grammar, name="Calculator")
```

## Architecture

TatSu follows a multi-stage architecture for parser generation and execution:

- **Grammar Parsing**: EBNF grammars are parsed into internal model representations
- **Model Compilation**: Grammar models are compiled into executable parser objects with memoization
- **Runtime Parsing**: Compiled parsers execute against input text using PEG semantics
- **AST Construction**: Parse results build abstract syntax trees or custom object models
- **Code Generation**: Static Python parser code can be generated for distribution
- **Semantic Actions**: Custom semantic actions transform parse results during parsing

This design enables both interactive grammar development and production parser deployment, with support for advanced features like left-recursion, packrat memoization, and extensible semantic processing.

## Capabilities

### Core Parsing Functions

The primary interface for compiling grammars and parsing input text, providing both one-step parsing and separate compilation for reuse.

```python { .api }
def compile(grammar, name=None, semantics=None, asmodel=False, config=None, **settings):
    """
    Compile an EBNF grammar into a parser model.

    Parameters:
    - grammar: str, EBNF grammar definition
    - name: str, optional name for the parser
    - semantics: semantic actions object
    - asmodel: bool, use ModelBuilderSemantics if True
    - config: ParserConfig, parser configuration
    - **settings: additional parser settings

    Returns:
    Model object that can parse input text
    """

def parse(grammar, input, start=None, name=None, semantics=None, asmodel=False, config=None, **settings):
    """
    Parse input text using the provided grammar.

    Parameters:
    - grammar: str, EBNF grammar definition
    - input: str, text to parse
    - start: str, optional start rule name
    - name: str, optional parser name
    - semantics: semantic actions object
    - asmodel: bool, use ModelBuilderSemantics if True
    - config: ParserConfig, parser configuration
    - **settings: additional parser settings

    Returns:
    Parsed AST or semantic action result
    """
```

[Core Parsing](./core-parsing.md)
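
As a minimal sketch of the two entry points above, the following compiles a grammar once and reuses the resulting model for several inputs, then parses from a non-default rule with `start`. The grammar, rule names, and input strings are assumptions made up for illustration.

```python
import tatsu

# Illustrative grammar (an assumption for this sketch); `$` requires end of input.
GRAMMAR = r'''
    start = expr $ ;
    expr = term {("+" | "-") term}* ;
    term = /\d+/ ;
'''

# Compile once, then reuse the model for many inputs.
parser = tatsu.compile(GRAMMAR)
results = [parser.parse(text) for text in ("1 + 2", "3 - 4 + 5")]

# Parse from a specific rule instead of the default start rule.
term_only = parser.parse("42", start="term")
print(term_only)
```

Compiling separately is usually the better fit when the same grammar is applied to many inputs, since `tatsu.parse` has to process the grammar text on each call.
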

### Code Generation

Generate static Python parser code and object model classes from grammars for deployment and distribution.

```python { .api }
def to_python_sourcecode(grammar, name=None, filename=None, config=None, **settings):
    """
    Generate Python parser source code from grammar.

    Parameters:
    - grammar: str, EBNF grammar definition
    - name: str, optional parser class name
    - filename: str, optional source filename for error reporting
    - config: ParserConfig, parser configuration
    - **settings: additional generation settings

    Returns:
    str, Python source code for the parser
    """

def to_python_model(grammar, name=None, filename=None, base_type=None, config=None, **settings):
    """
    Generate Python object model classes from grammar.

    Parameters:
    - grammar: str, EBNF grammar definition
    - name: str, optional model class prefix
    - filename: str, optional source filename
    - base_type: type, base class for generated model classes
    - config: ParserConfig, parser configuration
    - **settings: additional generation settings

    Returns:
    str, Python source code for object model classes
    """
```

[Code Generation](./code-generation.md)
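
One possible way to use `to_python_sourcecode` is to generate the parser text, check that it is valid Python, and write it to a module file. The grammar, the `Calc` name, and the `calc_parser.py` file name below are hypothetical choices for this sketch.

```python
import tempfile
from pathlib import Path

import tatsu

# Illustrative grammar (an assumption for this sketch).
GRAMMAR = r'''
    start = @:number $ ;
    number = /\d+/ ;
'''

# Generate the parser source code as a string.
source = tatsu.to_python_sourcecode(GRAMMAR, name="Calc")

# Check that the generated text is syntactically valid Python
# (this uses the built-in compile(), not tatsu.compile()).
compile(source, "calc_parser.py", "exec")

# Write it to a module file; a temporary directory is used here for illustration.
out_path = Path(tempfile.mkdtemp()) / "calc_parser.py"
out_path.write_text(source, encoding="utf-8")
print(f"wrote {len(source)} characters to {out_path}")
```

The written module can then be shipped with an application so the grammar does not need to be compiled at startup.
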
151
152
### Exception Handling
153
154
Comprehensive exception hierarchy for handling grammar compilation errors, parse failures, and semantic processing issues.
155
156
```python { .api }
157
class ParseException(Exception):
158
"""Base exception for all TatSu parsing errors."""
159
160
class GrammarError(ParseException):
161
"""Grammar definition and compilation errors."""
162
163
class FailedParse(ParseException):
164
"""Base parse failure with position information."""
165
166
class FailedToken(FailedParse):
167
"""Expected token not found."""
168
169
class FailedPattern(FailedParse):
170
"""Regular expression pattern match failed."""
171
```
172
173
[Exception Handling](./exceptions.md)
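
A minimal sketch of catching parse failures follows. It relies on `FailedPattern` and the other failure classes deriving from `FailedParse`, as listed above. The grammar and the `try_parse` helper are hypothetical names introduced for illustration.

```python
import tatsu
from tatsu.exceptions import FailedParse

# Illustrative grammar (an assumption for this sketch): a single integer.
GRAMMAR = r'''
    start = @:number $ ;
    number = /\d+/ ;
'''

parser = tatsu.compile(GRAMMAR)


def try_parse(text):
    """Return the parse result, or None if the input does not match."""
    try:
        return parser.parse(text)
    except FailedParse as exc:
        # The exception message describes where and why parsing failed.
        print(f"parse failed: {exc}")
        return None


ok = try_parse("123")
bad = try_parse("abc")
```

Catching `FailedParse` rather than a specific subclass keeps the handler independent of whether a token, a pattern, or the end-of-input check was the part that failed.
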
174
175
### Semantic Actions
176
177
Build custom semantic actions to transform parse results, construct object models, and implement domain-specific processing during parsing.
178
179
```python { .api }
180
class ModelBuilderSemantics:
181
"""Object model building semantics with type registration."""
182
183
def __init__(self, context=None, base_type=None, types=None):
184
"""
185
Initialize model builder semantics.
186
187
Parameters:
188
- context: parsing context
189
- base_type: base class for generated nodes (default: Node)
190
- types: dict of rule name to type mappings
191
"""
192
193
class ASTSemantics:
194
"""Basic AST building semantics for parse tree construction."""
195
```
196
197
[Semantic Actions](./semantic-actions.md)
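
Besides the classes listed above, a plain object can be passed as `semantics`. In the sketch below, each method is named after a grammar rule and receives that rule's AST. The grammar and the `CalcSemantics` class are assumptions for illustration.

```python
import tatsu

# Illustrative grammar (an assumption for this sketch).
GRAMMAR = r'''
    start = @:addition $ ;
    addition = left:number "+" right:number ;
    number = /\d+/ ;
'''


class CalcSemantics:
    """Hypothetical semantics object: one method per rule to transform."""

    def number(self, ast):
        # Convert the matched digits into an int.
        return int(ast)

    def addition(self, ast):
        # `left` and `right` are already ints, produced by number().
        return ast.left + ast.right


result = tatsu.parse(GRAMMAR, "2 + 3", semantics=CalcSemantics())
print(result)
```

Rules without a matching method, such as `start` here, keep their default result.
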
198
199
### Configuration and Context
200
201
Configure parser behavior, manage parsing state, and access parse position and rule information.
202
203
```python { .api }
204
class ParserConfig:
205
"""Parser configuration with settings for parsing behavior."""
206
207
class ParseInfo:
208
"""Parse position and rule information with line tracking."""
209
210
class LineInfo:
211
"""Source line information with position data."""
212
```
213
214
[Configuration](./configuration.md)
215
216
### AST and Object Models
217
218
Work with abstract syntax trees and structured parse results, including node creation, traversal, and manipulation.
219
220
```python { .api }
221
class AST(dict):
222
"""Abstract syntax tree node, dictionary-based with parse info."""
223
224
class Node:
225
"""Base parse tree node with parent/child relationships."""
226
227
def children(self):
228
"""Get all child nodes."""
229
230
def text_lines(self):
231
"""Get source text lines for this node."""
232
```
233
234
[AST and Models](./ast-models.md)
235
236
### Tree Walking
237
238
Traverse and transform parse trees using visitor patterns with pre-order, depth-first, and context-aware walking strategies.
239
240
```python { .api }
241
class NodeWalker:
242
"""Base tree walker with method dispatch."""
243
244
def walk(self, node):
245
"""Walk a parse tree starting from the given node."""
246
247
class DepthFirstWalker(NodeWalker):
248
"""Depth-first tree traversal walker."""
249
250
class ContextWalker(NodeWalker):
251
"""Context-aware tree walking with stack management."""
252
```
253
254
[Tree Walking](./tree-walking.md)
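
A sketch of method dispatch with `NodeWalker` follows. It assumes the grammar is compiled with `asmodel=True`, that a rule annotated with `::Add` produces nodes of a class named `Add`, and that the walker looks up a `walk_<ClassName>` method for each node, falling back to base classes such as `object`. The grammar and the `SumWalker` class are hypothetical.

```python
import tatsu
from tatsu.walkers import NodeWalker

# Illustrative grammar (an assumption for this sketch): `::Add` asks the
# model builder for an Add node, and `::int` converts the match to an int.
GRAMMAR = r'''
    start = @:expression $ ;
    expression = addition | number ;
    addition::Add = left:number "+" right:expression ;
    number::int = /\d+/ ;
'''


class SumWalker(NodeWalker):
    """Hypothetical walker that evaluates a chain of additions."""

    def walk_object(self, node):
        # Fallback for plain values such as ints.
        return node

    def walk_Add(self, node):
        # Recurse into both operands and add the results.
        return self.walk(node.left) + self.walk(node.right)


parser = tatsu.compile(GRAMMAR, asmodel=True)
model = parser.parse("1 + 2 + 3")
total = SumWalker().walk(model)
print(total)
```

Keeping evaluation in a walker instead of in semantic actions lets the same object model be traversed by several independent passes.
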

## Types

```python { .api }
from typing import Any, Dict, List, Optional, Union, Callable

# Core types
GrammarType = str
InputType = str
ASTType = Union[Dict[str, Any], List[Any], str, None]
SemanticActionType = Callable[[Any], Any]

# Configuration types
ParserSettings = Dict[str, Any]
GenerationSettings = Dict[str, Any]

# Model types
class Model:
    """Compiled grammar model that can parse input."""

    def parse(self, input: str, **kwargs) -> ASTType:
        """Parse input text and return AST."""

    def pretty(self) -> str:
        """Return pretty-printed grammar."""
```