# Lark Parser

A modern general-purpose parsing library for Python that can parse any context-free grammar efficiently with minimal code. Lark provides multiple parsing algorithms, automatically builds annotated parse trees, supports EBNF grammar syntax, and offers full Unicode support with automatic line and column tracking.

## Package Information

- **Package Name**: lark-parser
- **Language**: Python
- **Installation**: `pip install lark-parser`

## Core Imports

```python
from lark import Lark, Tree, Token
```

Common imports for visitors and transformers:

```python
from lark import Transformer, Visitor, v_args
```

Exception handling:

```python
from lark import (ParseError, LexError, GrammarError, UnexpectedToken,
                  UnexpectedInput, UnexpectedCharacters, UnexpectedEOF, LarkError)
```

## Basic Usage

```python
from lark import Lark

# Define a simple grammar
grammar = """
start: sum
sum: product ("+" product)*
product: number ("*" number)*
number: NUMBER

%import common.NUMBER
%import common.WS
%ignore WS
"""

# Create parser
parser = Lark(grammar)

# Parse text
text = "2 + 3 * 4"
tree = parser.parse(text)
print(tree.pretty())

# Result:
# start
#   sum
#     product
#       number 2
#     product
#       number 3
#       number 4
```

## Architecture

Lark follows a modular design with clear separation of concerns:

- **Lark**: Main parser interface that coordinates grammar loading, lexing, and parsing
- **Grammar**: EBNF grammar definitions with rule declarations and terminal imports
- **Lexer**: Tokenizes input text according to terminal definitions
- **Parser**: Transforms the token stream into a parse tree using the selected algorithm (Earley, LALR, CYK)
- **Tree**: Parse tree nodes containing rule data and child elements
- **Visitors/Transformers**: Process parse trees for extraction, transformation, or interpretation

This architecture enables flexible parsing workflows: users can choose the parsing algorithm, customize lexing behavior, and process results using various tree traversal patterns.
76
77
## Capabilities
78
79
### Core Parsing
80
81
Main parsing functionality including the Lark class, configuration options, parsing algorithms, and grammar loading. This provides the primary interface for creating parsers and parsing text.
82
83
```python { .api }
84
class Lark:
85
def __init__(self, grammar: str, **options): ...
86
def parse(self, text: str, start: str = None) -> Tree: ...
87
def lex(self, text: str) -> Iterator[Token]: ...
88
```
89
90
```python { .api }
91
class LarkOptions:
92
parser: str # "earley", "lalr", "cyk"
93
lexer: str # "auto", "standard", "contextual", "dynamic"
94
start: Union[str, List[str]]
95
debug: bool
96
transformer: Optional[Transformer]
97
```
98
99
[Core Parsing](./core-parsing.md)

### Tree Processing

Parse tree representation and processing including the Tree class for AST nodes, visitor patterns for tree traversal, and transformer classes for tree modification and data extraction.

```python { .api }
class Tree:
    def __init__(self, data: str, children: list, meta=None): ...
    def pretty(self, indent_str: str = '  ') -> str: ...
    def find_data(self, data: str) -> Iterator[Tree]: ...
```

```python { .api }
class Transformer:
    def transform(self, tree: Tree) -> Any: ...
    def __default__(self, data: str, children: list, meta) -> Any: ...
```

```python { .api }
def v_args(inline: bool = False, meta: bool = False, tree: bool = False): ...
```

[Tree Processing](./tree-processing.md)

### Tokens and Lexing

Token representation and lexical analysis including the Token class for lexical units, lexer configuration, and indentation handling for Python-like languages.

```python { .api }
class Token(str):
    def __new__(cls, type_: str, value: str, start_pos=None, line=None, column=None): ...
    type: str
    line: int
    column: int
```

```python { .api }
class Indenter:
    def process(self, stream: Iterator[Token]) -> Iterator[Token]: ...
```

[Tokens and Lexing](./tokens-lexing.md)

### Exception Handling

Comprehensive error handling including parse errors, lexical errors, grammar errors, and unexpected input handling with context information and error recovery.

```python { .api }
class LarkError(Exception): ...
class ParseError(LarkError): ...
class LexError(LarkError): ...
class GrammarError(LarkError): ...
class UnexpectedInput(LarkError):
    def get_context(self, text: str, span: int = 40) -> str: ...
```

```python { .api }
class UnexpectedToken(ParseError, UnexpectedInput):
    token: Token
    accepts: Set[str]
```

[Exception Handling](./exceptions.md)
162
163
### Utilities and Tools
164
165
Additional utilities including AST generation helpers, tree reconstruction, standalone parser generation, serialization, and visualization tools.
166
167
```python { .api }
168
def create_transformer(ast_module, transformer=None): ...
169
```
170
171
```python { .api }
172
class Reconstructor:
173
def __init__(self, parser: Lark): ...
174
def reconstruct(self, tree: Tree) -> str: ...
175
```
176
177
```python { .api }
178
def gen_standalone(lark_instance: Lark, out=None, compress: bool = False): ...
179
```
180
181
[Utilities and Tools](./utilities.md)

## Types

```python { .api }
# Core types
Tree = Tree
Token = Token

# Parser configuration
LarkOptions = LarkOptions
PostLex = PostLex
LexerConf = LexerConf
ParserConf = ParserConf

# Interactive parsing
InteractiveParser = InteractiveParser
ImmutableInteractiveParser = ImmutableInteractiveParser

# Grammar building
Symbol = Symbol
Terminal = Terminal
NonTerminal = NonTerminal
Rule = Rule
RuleOptions = RuleOptions

# Visitor/Transformer types
Transformer = Transformer
Visitor = Visitor
Interpreter = Interpreter

# AST utilities
Ast = Ast
AsList = AsList
Reconstructor = Reconstructor

# Exception types
LarkError = LarkError
ParseError = ParseError
LexError = LexError
GrammarError = GrammarError
UnexpectedInput = UnexpectedInput
UnexpectedToken = UnexpectedToken
UnexpectedCharacters = UnexpectedCharacters
UnexpectedEOF = UnexpectedEOF
VisitError = VisitError
Discard = Discard
```