# Tree-sitter

The `tree-sitter` package provides Python bindings to the Tree-sitter parsing library for incremental parsing and syntax tree analysis. It parses source code into syntax trees that can drive language analysis, code navigation, syntax highlighting, and development tools such as language servers and code formatters.

## Package Information

- **Package Name**: tree-sitter
- **Language**: Python
- **Installation**: `pip install tree-sitter`
- **Minimum Python Version**: 3.10

## Core Imports

```python
from tree_sitter import Language, Parser, Tree, Node, TreeCursor
```

For queries and pattern matching:

```python
from tree_sitter import Query, QueryCursor, QueryError, QueryPredicate
```

For utilities and types:

```python
from tree_sitter import Point, Range, LogType, LookaheadIterator
```

Constants:

```python
from tree_sitter import LANGUAGE_VERSION, MIN_COMPATIBLE_LANGUAGE_VERSION
```

## Basic Usage

```python
import tree_sitter_python
from tree_sitter import Language, Parser

# Load a language grammar (requires a separate grammar package,
# e.g. `pip install tree-sitter-python`)
PY_LANGUAGE = Language(tree_sitter_python.language())

# Create a parser bound to that language
parser = Parser(PY_LANGUAGE)

# Parse source code (the parser expects bytes, not str)
source_code = '''
def hello_world():
    print("Hello, world!")
    return True
'''

tree = parser.parse(bytes(source_code, "utf8"))

# Inspect the syntax tree
root_node = tree.root_node
print(f"Root node type: {root_node.type}")
print(f"Tree structure:\n{root_node}")

# Navigate the tree
function_node = root_node.children[0]
function_name = function_node.child_by_field_name("name")
# `type` is the grammar node kind ("identifier"); `text` holds the source bytes
print(f"Function name: {function_name.text.decode('utf8')}")
```

## Architecture

Tree-sitter uses a layered architecture optimized for incremental parsing:

- **Language**: Represents the grammar for one specific programming language
- **Parser**: Stateful parser that converts source code into syntax trees using a language
- **Tree**: Syntax tree containing the parsed structure
- **Node**: An individual element of the tree, with position, type, and relationship information
- **TreeCursor**: Efficient navigation mechanism for traversing large trees
- **Query System**: Pattern matching for structural queries against syntax trees

Because the parser can reuse a previous tree when the source changes, updates are cheap, which makes the design well suited to real-time applications such as editors and other development tools.

## Capabilities

### Language and Parser Management

Core functionality for loading language grammars and creating parsers for converting source code into syntax trees.

```python { .api }
class Language:
    def __init__(self, ptr: object) -> None: ...
    @property
    def name(self) -> str | None: ...
    @property
    def abi_version(self) -> int: ...

class Parser:
    def __init__(
        self,
        language: Language | None = None,
        *,
        included_ranges: list[Range] | None = None,
        logger: Callable[[LogType, str], None] | None = None,
    ) -> None: ...
    def parse(
        self,
        source: bytes | Callable[[int, Point], bytes | None],
        old_tree: Tree | None = None,
        encoding: str = "utf8"
    ) -> Tree: ...
```

[Language and Parser Management](./language-parser.md)

### Syntax Tree Navigation

Navigate and inspect parsed syntax trees using nodes and cursors for efficient tree traversal.

```python { .api }
class Tree:
    @property
    def root_node(self) -> Node: ...
    def walk(self) -> TreeCursor: ...
    def changed_ranges(self, new_tree: Tree) -> list[Range]: ...

class Node:
    @property
    def type(self) -> str: ...
    @property
    def children(self) -> list[Node]: ...
    @property
    def start_point(self) -> Point: ...
    @property
    def end_point(self) -> Point: ...
    def child_by_field_name(self, name: str) -> Node | None: ...

class TreeCursor:
    @property
    def node(self) -> Node | None: ...
    def goto_first_child(self) -> bool: ...
    def goto_next_sibling(self) -> bool: ...
    def goto_parent(self) -> bool: ...
```

[Syntax Tree Navigation](./tree-navigation.md)
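
A cursor walks a tree without building a child list for every node. The sketch below performs a pre-order traversal using only the three `TreeCursor` methods listed above. `iter_nodes` is a hypothetical helper name, and the function is duck-typed, so it accepts any object that exposes the same cursor interface.

```python
from collections.abc import Iterator
from typing import Any


def iter_nodes(cursor: Any) -> Iterator[tuple[int, Any]]:
    """Yield (depth, node) pairs in pre-order, starting at the cursor's node.

    Hypothetical helper: `cursor` is expected to behave like a TreeCursor
    (a `node` property plus `goto_first_child`, `goto_next_sibling`, and
    `goto_parent`).
    """
    depth = 0
    while True:
        yield depth, cursor.node
        # Descend first, so children come right after their parent.
        if cursor.goto_first_child():
            depth += 1
            continue
        # Leaf: advance to the next sibling, climbing while there is none.
        while depth > 0 and not cursor.goto_next_sibling():
            cursor.goto_parent()
            depth -= 1
        # Back at the starting node: the traversal is complete.
        if depth == 0:
            return
```

With a parsed tree, this might be used as `for depth, node in iter_nodes(tree.walk()): print("  " * depth + node.type)`.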

### Pattern Matching and Queries

Find structural patterns in syntax trees using Tree-sitter's S-expression query language.

```python { .api }
class Query:
    def __init__(self, language: Language, source: str) -> None: ...
    def pattern_count(self) -> int: ...
    def capture_name(self, index: int) -> str: ...

class QueryCursor:
    def __init__(self, query: Query, *, match_limit: int = 0xFFFFFFFF) -> None: ...
    def captures(
        self,
        node: Node,
        predicate: QueryPredicate | None = None
    ) -> dict[str, list[Node]]: ...
    def matches(
        self,
        node: Node,
        predicate: QueryPredicate | None = None
    ) -> list[tuple[int, dict[str, list[Node]]]]: ...
```

[Pattern Matching and Queries](./queries.md)
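
The `predicate` argument of `captures` and `matches` takes a callable with the `QueryPredicate` signature listed under Types. Here is a hedged sketch of one such callable. The predicate name `min-length?` and the function `min_length_predicate` are invented for illustration. The sketch also assumes that each captured node exposes its source bytes through a `text` attribute, that `args` holds `(value, kind)` pairs, and that the predicate name may arrive with or without its leading `#`.

```python
from typing import Any


def min_length_predicate(
    predicate: str,
    args: list[tuple[str, str]],
    pattern_index: int,
    captures: dict[str, list[Any]],
) -> bool:
    """Accept a match only if every node of a capture has at least N bytes.

    Intended for a made-up query predicate such as
    `(#min-length? @name "3")`; this is not a built-in Tree-sitter predicate.
    """
    # Assumption: the name may arrive with or without the leading "#".
    if predicate.lstrip("#") != "min-length?":
        return True  # not ours: do not reject the match

    # Assumption: args are (value, kind) pairs, kind being "capture" or "string".
    (capture_name, capture_kind), (limit, limit_kind) = args
    if capture_kind != "capture" or limit_kind != "string":
        raise ValueError('expected (#min-length? @capture "N")')

    nodes = captures.get(capture_name, [])
    # Assumption: each node exposes its source bytes as `text`.
    return all(len(node.text or b"") >= int(limit) for node in nodes)
```

It would then be passed along with the node to search, for example `QueryCursor(query).captures(root_node, min_length_predicate)`.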

### Incremental Parsing

Edit a syntax tree to reflect a source change, then reparse with the edited tree as `old_tree` so that unchanged parts of the tree are reused.

```python { .api }
# Tree editing
class Tree:
    def edit(
        self,
        start_byte: int,
        old_end_byte: int,
        new_end_byte: int,
        start_point: Point | tuple[int, int],
        old_end_point: Point | tuple[int, int],
        new_end_point: Point | tuple[int, int],
    ) -> None: ...

# Incremental parsing
class Parser:
    def parse(
        self,
        source: bytes | Callable[[int, Point], bytes | None],
        old_tree: Tree | None = None,  # Enables incremental parsing
        encoding: str = "utf8"
    ) -> Tree: ...
```

[Incremental Parsing](./incremental-parsing.md)
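
`Tree.edit` needs the changed span both as byte offsets and as `(row, column)` points. As a rough sketch, these values can be derived by comparing the old and new source bytes. `point_at` and `compute_edit` are hypothetical helpers rather than library functions, and the sketch assumes that columns are counted in bytes and that the change is a single contiguous span.

```python
def point_at(source: bytes, byte_offset: int) -> tuple[int, int]:
    """Return the (row, column) of a byte offset; the column is counted in bytes."""
    row = source.count(b"\n", 0, byte_offset)
    line_start = source.rfind(b"\n", 0, byte_offset) + 1  # 0 on the first row
    return row, byte_offset - line_start


def compute_edit(old: bytes, new: bytes) -> dict[str, object]:
    """Describe the smallest single span that turns `old` into `new`.

    The keys of the returned dict match the parameters of `Tree.edit`.
    """
    # Length of the common prefix.
    start = 0
    limit = min(len(old), len(new))
    while start < limit and old[start] == new[start]:
        start += 1

    # Length of the common suffix, never overlapping the prefix.
    suffix = 0
    while suffix < limit - start and old[-1 - suffix] == new[-1 - suffix]:
        suffix += 1

    old_end = len(old) - suffix
    new_end = len(new) - suffix
    return {
        "start_byte": start,
        "old_end_byte": old_end,
        "new_end_byte": new_end,
        "start_point": point_at(old, start),
        "old_end_point": point_at(old, old_end),
        "new_end_point": point_at(new, new_end),
    }


old_source = b"def foo():\n    return 1\n"
new_source = b"def foo():\n    return 42\n"
edit = compute_edit(old_source, new_source)
```

With a real tree and parser, the result could be applied as `tree.edit(**edit)` followed by `new_tree = parser.parse(new_source, tree)`; `tree.changed_ranges(new_tree)` then lists the ranges that differ between the two trees.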

## Types

```python { .api }
from typing import NamedTuple, Protocol
from enum import IntEnum

class Point(NamedTuple):
    row: int
    column: int

class Range:
    def __init__(
        self,
        start_point: Point | tuple[int, int],
        end_point: Point | tuple[int, int],
        start_byte: int,
        end_byte: int,
    ) -> None: ...
    @property
    def start_point(self) -> Point: ...
    @property
    def end_point(self) -> Point: ...
    @property
    def start_byte(self) -> int: ...
    @property
    def end_byte(self) -> int: ...

class LogType(IntEnum):
    PARSE: int
    LEX: int

class QueryError(ValueError): ...

class LookaheadIterator:
    """Iterator for lookahead symbols in parse states."""
    @property
    def language(self) -> Language: ...
    @property
    def current_symbol(self) -> int: ...
    @property
    def current_symbol_name(self) -> str: ...
    def reset(self, state: int, language: Language | None = None) -> bool: ...
    def names(self) -> list[str]: ...
    def symbols(self) -> list[int]: ...
    def __next__(self) -> tuple[int, str]: ...

# Protocol for custom query predicates
class QueryPredicate(Protocol):
    def __call__(
        self,
        predicate: str,
        args: list[tuple[str, str]],  # second item is "capture" or "string"
        pattern_index: int,
        captures: dict[str, list[Node]],
    ) -> bool: ...

# Constants
LANGUAGE_VERSION: int
"""Tree-sitter language ABI version."""

MIN_COMPATIBLE_LANGUAGE_VERSION: int
"""Minimum compatible language ABI version."""
```
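
Because `Point` is a `NamedTuple`, points compare lexicographically, by row and then by column, and plain `(row, column)` tuples compare with them the same way. The sketch below uses that to test whether a position falls inside a span. The local `Point` class mirrors the definition above so that the example runs on its own, and `span_contains` is a hypothetical helper that assumes end points are exclusive.

```python
from typing import NamedTuple


class Point(NamedTuple):
    """Local stand-in mirroring the Point type above, so the sketch runs on its own."""
    row: int
    column: int


def span_contains(start: Point, end: Point, position: tuple[int, int]) -> bool:
    """Return True if `position` lies in the half-open span [start, end).

    Hypothetical helper; treating the end point as exclusive is an assumption.
    """
    # NamedTuples compare like tuples: by row first, then by column.
    return start <= position < end


start, end = Point(1, 4), Point(3, 0)
inside = span_contains(start, end, (2, 17))
at_end = span_contains(start, end, Point(3, 0))
```

The same comparison could be applied to a node's `start_point` and `end_point`, for instance to find the node under an editor cursor.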