tessl/pypi-tree-sitter-languages

Binary Python wheels for all tree sitter languages

Workspace: tessl
Visibility: Public
Describes: pypipkg:pypi/tree-sitter-languages@1.10.x

To install, run

npx @tessl/cli install tessl/pypi-tree-sitter-languages@1.10.0

Tree Sitter Languages

Binary Python wheels for all tree-sitter languages, so you do not need to download and compile support for each language yourself. The package bundles prebuilt tree-sitter parsers, installs with pip, and exposes a two-function API (get_language and get_parser) for accessing any of the included languages.

Package Information

  • Package Name: tree-sitter-languages
  • Language: Python
  • Installation: pip install tree_sitter_languages

Core Imports

from tree_sitter_languages import get_language, get_parser

Basic Usage

from tree_sitter_languages import get_language, get_parser

# Get a language object for Python
language = get_language('python')

# Get a pre-configured parser for Python  
parser = get_parser('python')

# Parse some Python code
code = b"""
def hello():
    print("Hello, world!")
"""

tree = parser.parse(code)
root_node = tree.root_node

# Query for function definitions
query = language.query('(function_definition name: (identifier) @func)')
captures = query.captures(root_node)

# Print function names
for node, capture_name in captures:
    if capture_name == "func":
        print(f"Found function: {node.text.decode()}")

Capabilities

Language Object Creation

Creates a tree-sitter Language object for the specified language, loading the appropriate binary parser from the bundled language binaries.

def get_language(language: str) -> Language

Parameters:

  • language (str): Language name identifier (one of the 48 supported languages)

Returns:

  • Language: A tree-sitter Language object configured for the specified language

Raises:

  • An exception propagated from the underlying tree-sitter library if the language name is not recognized

Parser Creation

Creates a pre-configured tree-sitter Parser object for the specified language, combining language loading and parser setup in one step.

def get_parser(language: str) -> Parser

Parameters:

  • language (str): Language name identifier (one of the 48 supported languages)

Returns:

  • Parser: A tree-sitter Parser object pre-configured with the specified language

Raises:

  • An exception propagated from the underlying tree-sitter library if the language name is not recognized

Supported Languages

The package includes binary parsers for the following 48 programming languages:

  • bash - Bash shell scripts
  • c - C programming language
  • c_sharp - C# programming language (use 'c_sharp', not 'c-sharp')
  • commonlisp - Common Lisp
  • cpp - C++ programming language
  • css - Cascading Style Sheets
  • dockerfile - Docker container files
  • dot - Graphviz DOT language
  • elisp - Emacs Lisp
  • elixir - Elixir programming language
  • elm - Elm programming language
  • embedded_template - Embedded template languages (use 'embedded_template', not 'embedded-template')
  • erlang - Erlang programming language
  • fixed_form_fortran - Fixed-form Fortran
  • fortran - Modern Fortran
  • go - Go programming language
  • gomod - Go module files (use 'gomod', not 'go-mod')
  • hack - Hack programming language
  • haskell - Haskell programming language
  • hcl - HashiCorp Configuration Language
  • html - HyperText Markup Language
  • java - Java programming language
  • javascript - JavaScript programming language
  • jsdoc - JSDoc documentation comments
  • json - JavaScript Object Notation
  • julia - Julia programming language
  • kotlin - Kotlin programming language
  • lua - Lua programming language
  • make - Makefile syntax
  • markdown - Markdown markup language
  • objc - Objective-C programming language
  • ocaml - OCaml programming language
  • perl - Perl programming language
  • php - PHP programming language
  • python - Python programming language
  • ql - CodeQL query language
  • r - R programming language
  • regex - Regular expressions
  • rst - reStructuredText markup
  • ruby - Ruby programming language
  • rust - Rust programming language
  • scala - Scala programming language
  • sql - SQL database language
  • sqlite - SQLite-specific SQL
  • toml - TOML configuration format
  • tsq - Tree-sitter query language
  • typescript - TypeScript programming language
  • yaml - YAML configuration format
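
Several identifiers above differ from the hyphenated grammar names (c_sharp, embedded_template, gomod). A minimal sketch of a helper that maps hyphenated spellings onto the identifiers in this list follows. Both normalize_language_name and the _ALIASES table are hypothetical, not part of tree_sitter_languages, and the only special case covered is the gomod entry called out above.

```python
# Hypothetical helper, not part of tree_sitter_languages.
# Special cases where replacing hyphens with underscores is not enough.
_ALIASES = {"go_mod": "gomod"}


def normalize_language_name(name: str) -> str:
    """Map spellings such as 'c-sharp' or 'go-mod' to the identifiers listed above."""
    key = name.strip().lower().replace("-", "_")
    return _ALIASES.get(key, key)
```

The result can then be passed to get_language or get_parser, for example get_parser(normalize_language_name('c-sharp')).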

Package Constants

__version__: str = '1.10.2'
__title__: str = 'tree_sitter_languages'
__author__: str = 'Grant Jenks'
__license__: str = 'Apache 2.0'
__copyright__: str = '2022-2023, Grant Jenks'

Types

The functions return standard tree-sitter objects:

# From the tree_sitter package (a dependency)
from typing import List, Tuple
class Language:
    """Tree-sitter language parser object"""
    def query(self, source: str) -> Query: ...

class Parser:
    """Tree-sitter parser object"""
    def parse(self, source: bytes) -> Tree: ...
    def set_language(self, language: Language) -> None: ...

class Tree:
    """Parse tree result"""
    @property
    def root_node(self) -> Node: ...

class Node:
    """Tree node"""
    @property
    def text(self) -> bytes: ...
    @property
    def type(self) -> str: ...

class Query:
    """Tree-sitter query object"""
    def captures(self, node: Node) -> List[Tuple[Node, str]]: ...
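
As an illustration of working with these objects without a query, here is a minimal sketch that walks a tree depth-first and counts node types. It relies on Node.children, an attribute of tree_sitter nodes that the stubs above do not list. The helper names walk and count_node_types are made up for this example.

```python
from typing import Dict, Iterator


def walk(node) -> Iterator:
    """Yield node and all of its descendants in depth-first pre-order."""
    stack = [node]
    while stack:
        current = stack.pop()
        yield current
        # Push children reversed so the leftmost child is visited first.
        stack.extend(reversed(current.children))


def count_node_types(root) -> Dict[str, int]:
    """Count how many times each node type occurs under root (inclusive)."""
    counts: Dict[str, int] = {}
    for node in walk(root):
        counts[node.type] = counts.get(node.type, 0) + 1
    return counts
```

With a parsed tree, count_node_types(tree.root_node) gives a quick overview of which grammar node types a file contains.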

Advanced Usage Examples

Multi-language Project Analysis

from tree_sitter_languages import get_parser

# Parse different file types in a project
parsers = {
    'python': get_parser('python'),
    'javascript': get_parser('javascript'),
    'css': get_parser('css'),
    'html': get_parser('html')
}

def analyze_file(file_path, content):
    if file_path.endswith('.py'):
        tree = parsers['python'].parse(content.encode())
    elif file_path.endswith('.js'):
        tree = parsers['javascript'].parse(content.encode())
    elif file_path.endswith('.css'):
        tree = parsers['css'].parse(content.encode())
    elif file_path.endswith('.html'):
        tree = parsers['html'].parse(content.encode())
    else:
        return None
    
    return tree.root_node
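
A table-driven variant of the same idea may be easier to extend. The sketch below assumes an illustrative extension map and creates each parser only on first use. The names EXTENSION_TO_LANGUAGE, language_for_path and cached_parser are hypothetical and not part of the package.

```python
import os
from functools import lru_cache

# Illustrative mapping; extend it with whichever supported languages you need.
EXTENSION_TO_LANGUAGE = {
    ".py": "python",
    ".js": "javascript",
    ".css": "css",
    ".html": "html",
}


def language_for_path(file_path):
    """Return the language name for a path, or None if the extension is unmapped."""
    _, ext = os.path.splitext(file_path)
    return EXTENSION_TO_LANGUAGE.get(ext.lower())


@lru_cache(maxsize=None)
def cached_parser(language_name):
    """Create each parser once and reuse it on later calls."""
    # Imported lazily so the mapping logic works without loading the binaries.
    from tree_sitter_languages import get_parser
    return get_parser(language_name)


def analyze_file(file_path, content):
    """Parse content with the parser matching file_path; None if unmapped."""
    language_name = language_for_path(file_path)
    if language_name is None:
        return None
    return cached_parser(language_name).parse(content.encode()).root_node
```

Unlike the endswith chain, this version matches extensions case-insensitively, so INDEX.HTML is treated as HTML.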

Finding Code Patterns with Queries

from tree_sitter_languages import get_language, get_parser

# Set up Python parser and language
language = get_language('python')
parser = get_parser('python')

# Parse Python code
python_code = b'''
class Calculator:
    def add(self, a, b):
        return a + b
    
    def multiply(self, a, b):
        return a * b

def standalone_function():
    calc = Calculator()
    return calc.add(1, 2)
'''

tree = parser.parse(python_code)

# Find all method definitions in classes
method_query = language.query('''
(class_definition
  body: (block
    (function_definition
      name: (identifier) @method_name)))
''')

methods = method_query.captures(tree.root_node)
for node, capture_name in methods:
    print(f"Method: {node.text.decode()}")

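# Find calls whose callee is a plain identifier, such as Calculator().
# Method calls like calc.add(1, 2) use an attribute node and are not matched.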
call_query = language.query('(call function: (identifier) @func_name)')
calls = call_query.captures(tree.root_node)
for node, capture_name in calls:
    print(f"Function call: {node.text.decode()}")

Error Handling

Invalid language names will raise exceptions from the underlying tree-sitter library:

from tree_sitter_languages import get_language

try:
    # This will raise an exception
    language = get_language('invalid_language')
except Exception as e:
    print(f"Error: {e}")
    # Handle the error appropriately

The package handles platform-specific binary loading automatically (.so files on Unix/Linux, .dll files on Windows), so no platform-specific code is needed in your application.
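
Since this document does not say which exception type is raised, a wrapper that treats any failure as "language unavailable" is one cautious option. The sketch below is illustrative only: try_get_parser and its loader parameter are hypothetical, and the parameter exists just so the function can be exercised without the real binaries.

```python
def try_get_parser(name, loader=None):
    """Return a parser for name, or None if loading fails for any reason."""
    if loader is None:
        # Default to the package's own factory; imported lazily.
        from tree_sitter_languages import get_parser as loader
    try:
        return loader(name)
    except Exception:
        # The exact exception type is not documented here, so catch broadly.
        return None
```

Callers can then branch on the result, for example skipping files whose language could not be loaded, instead of wrapping every call in try/except.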