# Tree Sitter Languages

Binary Python wheels for all tree-sitter languages, eliminating the need to download and compile support for individual languages. The package bundles 48 tree-sitter language parsers in a single pip-installable distribution and exposes a small API (`get_language` and `get_parser`) for accessing any of them without per-language setup.

## Package Information

- **Package Name**: tree-sitter-languages
- **Language**: Python
- **Installation**: `pip install tree_sitter_languages`

## Core Imports

```python
from tree_sitter_languages import get_language, get_parser
```

## Basic Usage

```python
from tree_sitter_languages import get_language, get_parser

# Get a language object for Python
language = get_language('python')

# Get a pre-configured parser for Python
parser = get_parser('python')

# Parse some Python code
code = b"""
def hello():
    print("Hello, world!")
"""

tree = parser.parse(code)
root_node = tree.root_node

# Query for function definitions
query = language.query('(function_definition name: (identifier) @func)')
captures = query.captures(root_node)

# Print function names
for node, capture_name in captures:
    if capture_name == "func":
        print(f"Found function: {node.text.decode()}")
```

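
Queries are not the only way to inspect a parse result. As a minimal sketch, assuming the `children` and `type` attributes that py-tree-sitter nodes expose, the tree can also be walked directly. The helper names `iter_nodes` and `count_node_types` are hypothetical and not part of this package.

```python
def iter_nodes(node):
    """Yield `node` and all of its descendants in depth-first pre-order."""
    yield node
    for child in node.children:
        yield from iter_nodes(child)


def count_node_types(root):
    """Return a dict mapping each node type to how often it occurs."""
    counts = {}
    for node in iter_nodes(root):
        counts[node.type] = counts.get(node.type, 0) + 1
    return counts
```

With the `root_node` from the Basic Usage example, `count_node_types(root_node)` gives a quick overview of which syntax constructs the parsed code contains. The recursion depth follows the depth of the tree, so very deeply nested input may call for an iterative variant.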
## Capabilities

### Language Object Creation

Creates a tree-sitter `Language` object for the specified language by loading the matching parser from the bundled language binaries.

```python { .api }
def get_language(language: str) -> Language
```

**Parameters:**
- `language` (str): Language name identifier (one of the 48 supported languages)

**Returns:**
- `Language`: A tree-sitter Language object configured for the specified language

**Raises:**
- An exception propagated from the underlying tree-sitter library if the name is not a supported language

### Parser Creation

Creates a tree-sitter `Parser` object pre-configured for the specified language, combining language loading and parser setup in one step.

```python { .api }
def get_parser(language: str) -> Parser
```

**Parameters:**
- `language` (str): Language name identifier (one of the 48 supported languages)

**Returns:**
- `Parser`: A tree-sitter Parser object pre-configured with the specified language

**Raises:**
- An exception propagated from the underlying tree-sitter library if the name is not a supported language

## Supported Languages

The package includes binary parsers for the following 48 programming languages:

- `bash` - Bash shell scripts
- `c` - C programming language
- `c_sharp` - C# programming language (use 'c_sharp', not 'c-sharp')
- `commonlisp` - Common Lisp
- `cpp` - C++ programming language
- `css` - Cascading Style Sheets
- `dockerfile` - Docker container files
- `dot` - Graphviz DOT language
- `elisp` - Emacs Lisp
- `elixir` - Elixir programming language
- `elm` - Elm programming language
- `embedded_template` - Embedded template languages (use 'embedded_template', not 'embedded-template')
- `erlang` - Erlang programming language
- `fixed_form_fortran` - Fixed-form Fortran
- `fortran` - Modern Fortran
- `go` - Go programming language
- `gomod` - Go module files (use 'gomod', not 'go-mod')
- `hack` - Hack programming language
- `haskell` - Haskell programming language
- `hcl` - HashiCorp Configuration Language
- `html` - HyperText Markup Language
- `java` - Java programming language
- `javascript` - JavaScript programming language
- `jsdoc` - JSDoc documentation comments
- `json` - JavaScript Object Notation
- `julia` - Julia programming language
- `kotlin` - Kotlin programming language
- `lua` - Lua programming language
- `make` - Makefile syntax
- `markdown` - Markdown markup language
- `objc` - Objective-C programming language
- `ocaml` - OCaml programming language
- `perl` - Perl programming language
- `php` - PHP programming language
- `python` - Python programming language
- `ql` - CodeQL query language
- `r` - R programming language
- `regex` - Regular expressions
- `rst` - reStructuredText markup
- `ruby` - Ruby programming language
- `rust` - Rust programming language
- `scala` - Scala programming language
- `sql` - SQL database language
- `sqlite` - SQLite-specific SQL
- `toml` - TOML configuration format
- `tsq` - Tree-sitter query language
- `typescript` - TypeScript programming language
- `yaml` - YAML configuration format

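
Several identifiers in the list use underscores, or no separator at all, where hyphenated spellings are common. A small helper can normalize user input before it reaches `get_language` or `get_parser`. This is only a sketch: `SUPPORTED_LANGUAGES` and `normalize_language_name` are hypothetical names that are not part of the package, and the set is copied by hand from the list above.

```python
# Hand-copied from the supported-languages list; not exported by the package.
SUPPORTED_LANGUAGES = frozenset("""
bash c c_sharp commonlisp cpp css dockerfile dot elisp elixir elm
embedded_template erlang fixed_form_fortran fortran go gomod hack haskell
hcl html java javascript jsdoc json julia kotlin lua make markdown objc
ocaml perl php python ql r regex rst ruby rust scala sql sqlite toml tsq
typescript yaml
""".split())


def normalize_language_name(name):
    """Map spellings such as 'c-sharp' or 'go-mod' to a supported identifier.

    Raises ValueError if no supported identifier matches.
    """
    candidate = name.strip().lower().replace("-", "_")
    if candidate in SUPPORTED_LANGUAGES:
        return candidate
    # Handle names like 'go-mod', whose identifier has no separator at all.
    compact = candidate.replace("_", "")
    if compact in SUPPORTED_LANGUAGES:
        return compact
    raise ValueError(f"unsupported language name: {name!r}")
```

Validating up front also produces a clearer error message than whatever the underlying library raises for an unknown name.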
## Package Constants

```python { .api }
__version__: str = '1.10.2'
__title__: str = 'tree_sitter_languages'
__author__: str = 'Grant Jenks'
__license__: str = 'Apache 2.0'
__copyright__: str = '2022-2023, Grant Jenks'
```

## Types

The functions return standard tree-sitter objects:

```python { .api }
# From tree_sitter package (dependency)
class Language:
    """Tree-sitter language parser object"""
    def query(self, source: str) -> Query: ...

class Parser:
    """Tree-sitter parser object"""
    def parse(self, source: bytes) -> Tree: ...
    def set_language(self, language: Language) -> None: ...

class Tree:
    """Parse tree result"""
    @property
    def root_node(self) -> Node: ...

class Node:
    """Tree node"""
    @property
    def text(self) -> bytes: ...
    @property
    def type(self) -> str: ...

class Query:
    """Tree-sitter query object"""
    def captures(self, node: Node) -> List[Tuple[Node, str]]: ...
```

## Advanced Usage Examples

### Multi-language Project Analysis

```python
from tree_sitter_languages import get_parser

# Parse different file types in a project
parsers = {
    'python': get_parser('python'),
    'javascript': get_parser('javascript'),
    'css': get_parser('css'),
    'html': get_parser('html')
}

def analyze_file(file_path, content):
    if file_path.endswith('.py'):
        tree = parsers['python'].parse(content.encode())
    elif file_path.endswith('.js'):
        tree = parsers['javascript'].parse(content.encode())
    elif file_path.endswith('.css'):
        tree = parsers['css'].parse(content.encode())
    elif file_path.endswith('.html'):
        tree = parsers['html'].parse(content.encode())
    else:
        return None

    return tree.root_node
```

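
The `if`/`elif` chain in `analyze_file` grows with every file type. One alternative, sketched here under the assumption that the file extension is a good enough signal, is a lookup table from extension to language name. `EXTENSION_TO_LANGUAGE`, `language_for_path`, and `parse_file` are hypothetical names, and the mapping covers only the four file types from the example.

```python
import os

# Assumed mapping for illustration; extend it with other supported languages.
EXTENSION_TO_LANGUAGE = {
    ".py": "python",
    ".js": "javascript",
    ".css": "css",
    ".html": "html",
}


def language_for_path(file_path):
    """Return the language name for a path, or None if it is not mapped."""
    _, extension = os.path.splitext(file_path)
    return EXTENSION_TO_LANGUAGE.get(extension.lower())


def parse_file(file_path, content):
    """Parse `content` (a str) with the parser matching the file extension."""
    language_name = language_for_path(file_path)
    if language_name is None:
        return None
    # Imported here so the lookup helper works without the package installed.
    from tree_sitter_languages import get_parser

    return get_parser(language_name).parse(content.encode()).root_node
```

Unlike the dictionary of parsers above, this sketch creates a parser on every call. For large projects it may be worth caching one parser per language.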
### Finding Code Patterns with Queries

```python
from tree_sitter_languages import get_language, get_parser

# Set up Python parser and language
language = get_language('python')
parser = get_parser('python')

# Parse Python code
python_code = b'''
class Calculator:
    def add(self, a, b):
        return a + b

    def multiply(self, a, b):
        return a * b

def standalone_function():
    calc = Calculator()
    return calc.add(1, 2)
'''

tree = parser.parse(python_code)

# Find all method definitions in classes
method_query = language.query('''
(class_definition
  body: (block
    (function_definition
      name: (identifier) @method_name)))
''')

methods = method_query.captures(tree.root_node)
for node, capture_name in methods:
    print(f"Method: {node.text.decode()}")

# Find calls to plain names. Method calls such as calc.add(1, 2) use an
# attribute node as the function, so this query does not match them.
call_query = language.query('(call function: (identifier) @func_name)')
calls = call_query.captures(tree.root_node)
for node, capture_name in calls:
    print(f"Function call: {node.text.decode()}")
```

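
When a query uses several capture names, it can help to group the results by name. The sketch below assumes `captures` returns a list of `(node, capture_name)` tuples, as in the examples in this document; other py-tree-sitter releases may return a different shape, so treat it as illustrative. `group_captures` is a hypothetical helper.

```python
from collections import defaultdict


def group_captures(captures):
    """Group (node, capture_name) pairs into {capture_name: [text, ...]}."""
    grouped = defaultdict(list)
    for node, capture_name in captures:
        # node.text is bytes, so decode it for display.
        grouped[capture_name].append(node.text.decode())
    return dict(grouped)
```

For instance, `group_captures(method_query.captures(tree.root_node))` would collect every method name under the `method_name` key.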
## Error Handling

An invalid language name raises an exception that propagates from the underlying tree-sitter library. This document does not pin down the exception type, so the example catches `Exception` broadly:

```python
from tree_sitter_languages import get_language

try:
    # This will raise an exception
    language = get_language('invalid_language')
except Exception as e:
    print(f"Error: {e}")
    # Handle the error appropriately
```

The package handles platform-specific binary loading automatically (`.so` files on Unix/Linux, `.dll` files on Windows), so no platform-specific code is needed in your application.

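
The platform handling described above amounts to choosing a file name by operating system. The following sketch illustrates the idea only: `bundled_binary_name` is a hypothetical helper, the file names are assumptions made for illustration, and the package's real loading code may differ.

```python
import platform


def bundled_binary_name(system=None):
    """Return the shared-library file name assumed for the given platform."""
    system = system or platform.system()
    # Windows uses DLLs; Linux and other Unix-like systems use .so files.
    return "languages.dll" if system == "Windows" else "languages.so"
```

Application code never needs such a check, because `get_language` and `get_parser` resolve the binary internally.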