tessl/pypi-tree-sitter-languages

Binary Python wheels for all tree sitter languages

Tree Sitter Languages

tree-sitter-languages ships binary Python wheels for all tree-sitter languages, so you do not need to download and compile support for each language individually. Install it with pip, then use its two-function API to obtain a language object or a ready-made parser for any bundled language.

Package Information

  • Package Name: tree-sitter-languages
  • Language: Python
  • Installation: pip install tree-sitter-languages

Core Imports

from tree_sitter_languages import get_language, get_parser

Basic Usage

from tree_sitter_languages import get_language, get_parser

# Get a language object for Python
language = get_language('python')

# Get a pre-configured parser for Python  
parser = get_parser('python')

# Parse some Python code
code = b"""
def hello():
    print("Hello, world!")
"""

tree = parser.parse(code)
root_node = tree.root_node

# Query for function definitions
query = language.query('(function_definition name: (identifier) @func)')
captures = query.captures(root_node)

# Print function names
for node, capture_name in captures:
    if capture_name == "func":
        print(f"Found function: {node.text.decode()}")

Capabilities

Language Object Creation

Creates a tree-sitter Language object for the specified language, loading the appropriate binary parser from the bundled language binaries.

def get_language(language: str) -> Language

Parameters:

  • language (str): Language name identifier (one of the 48 supported languages)

Returns:

  • Language: A tree-sitter Language object configured for the specified language

Raises:

  • Exceptions from the underlying tree-sitter library for invalid language names

Parser Creation

Creates a pre-configured tree-sitter Parser object for the specified language, combining language loading and parser setup in one step.

def get_parser(language: str) -> Parser

Parameters:

  • language (str): Language name identifier (one of the 48 supported languages)

Returns:

  • Parser: A tree-sitter Parser object pre-configured with the specified language

Raises:

  • Exceptions from the underlying tree-sitter library for invalid language names
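
As a rough illustration of what "combining language loading and parser setup" means, the sketch below builds a parser in two explicit steps. It assumes a tree_sitter release that still provides `Parser.set_language`, as listed in the Types section; `build_parser` is a hypothetical helper name, not part of the package.

```python
def build_parser(language_name):
    """Return a Parser configured for language_name using two explicit steps."""
    # Imported inside the function so that defining the helper does not
    # require the packages to be installed.
    from tree_sitter import Parser
    from tree_sitter_languages import get_language

    parser = Parser()
    # Assumption: set_language is available, as in the Types listing.
    parser.set_language(get_language(language_name))
    return parser
```

Under those assumptions, `build_parser('python')` should behave like `get_parser('python')`; the one-step function is simply the shorter spelling.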

Supported Languages

The package includes binary parsers for the following 48 programming languages:

  • bash - Bash shell scripts
  • c - C programming language
  • c_sharp - C# programming language (use 'c_sharp', not 'c-sharp')
  • commonlisp - Common Lisp
  • cpp - C++ programming language
  • css - Cascading Style Sheets
  • dockerfile - Docker container files
  • dot - Graphviz DOT language
  • elisp - Emacs Lisp
  • elixir - Elixir programming language
  • elm - Elm programming language
  • embedded_template - Embedded template languages (use 'embedded_template', not 'embedded-template')
  • erlang - Erlang programming language
  • fixed_form_fortran - Fixed-form Fortran
  • fortran - Modern Fortran
  • go - Go programming language
  • gomod - Go module files (use 'gomod', not 'go-mod')
  • hack - Hack programming language
  • haskell - Haskell programming language
  • hcl - HashiCorp Configuration Language
  • html - HyperText Markup Language
  • java - Java programming language
  • javascript - JavaScript programming language
  • jsdoc - JSDoc documentation comments
  • json - JavaScript Object Notation
  • julia - Julia programming language
  • kotlin - Kotlin programming language
  • lua - Lua programming language
  • make - Makefile syntax
  • markdown - Markdown markup language
  • objc - Objective-C programming language
  • ocaml - OCaml programming language
  • perl - Perl programming language
  • php - PHP programming language
  • python - Python programming language
  • ql - CodeQL query language
  • r - R programming language
  • regex - Regular expressions
  • rst - reStructuredText markup
  • ruby - Ruby programming language
  • rust - Rust programming language
  • scala - Scala programming language
  • sql - SQL database language
  • sqlite - SQLite-specific SQL
  • toml - TOML configuration format
  • tsq - Tree-sitter query language
  • typescript - TypeScript programming language
  • yaml - YAML configuration format
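
Because a few identifiers differ from their hyphenated spellings, callers that accept user input may want to normalize names first. The following is a minimal sketch; `ALIASES` and `normalize_language_name` are hypothetical names, and only the three spellings called out in the list above are covered.

```python
# Hypothetical alias table: hyphenated spellings mapped to the identifiers
# listed above. Only the aliases mentioned in the list are included.
ALIASES = {
    "c-sharp": "c_sharp",
    "embedded-template": "embedded_template",
    "go-mod": "gomod",
}


def normalize_language_name(name):
    """Return the identifier form of a user-supplied language name."""
    key = name.strip().lower()
    return ALIASES.get(key, key)


print(normalize_language_name("C-Sharp"))  # c_sharp
print(normalize_language_name("go-mod"))   # gomod
print(normalize_language_name("python"))   # python
```

The normalized string can then be passed to get_language or get_parser.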

Package Constants

__version__: str = '1.10.2'
__title__: str = 'tree_sitter_languages'
__author__: str = 'Grant Jenks'
__license__: str = 'Apache 2.0'
__copyright__: str = '2022-2023, Grant Jenks'
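
One way to confirm that the installed release matches the 1.10.x line described here is to read the distribution metadata with the standard library. This is a sketch only; `installed_version` is a hypothetical helper, and reading `tree_sitter_languages.__version__` directly is an alternative when the package is importable.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name="tree-sitter-languages"):
    """Return the installed version string, or None if it is not installed."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


found = installed_version()
if found is None:
    print("tree-sitter-languages is not installed")
elif not found.startswith("1.10."):
    print(f"Installed {found}; this page describes 1.10.x")
else:
    print(f"Installed {found}")
```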

Types

The functions return standard tree-sitter objects:

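# From tree_sitter package (dependency); abbreviated signatures for reference
from __future__ import annotations  # allow forward references in annotations

from typing import List, Tuple
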
class Language:
    """Tree-sitter language parser object"""
    def query(self, source: str) -> Query: ...

class Parser:
    """Tree-sitter parser object"""
    def parse(self, source: bytes) -> Tree: ...
    def set_language(self, language: Language) -> None: ...

class Tree:
    """Parse tree result"""
    @property
    def root_node(self) -> Node: ...

class Node:
    """Tree node"""
    @property
    def text(self) -> bytes: ...
    @property
    def type(self) -> str: ...

class Query:
    """Tree-sitter query object"""
    def captures(self, node: Node) -> List[Tuple[Node, str]]: ...
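
To show how these types fit together, here is a small traversal sketch that visits every node under a root and tallies node types. It relies on `Node.children`, which is not in the abbreviated listing above and is assumed here from the tree_sitter package; `iter_nodes` and `count_node_types` are hypothetical helper names.

```python
def iter_nodes(node):
    """Yield node and all of its descendants in depth-first, pre-order."""
    stack = [node]
    while stack:
        current = stack.pop()
        yield current
        # Push children in reverse so the leftmost child is visited first.
        stack.extend(reversed(current.children))


def count_node_types(root):
    """Return a dict mapping each node type to how often it occurs."""
    counts = {}
    for n in iter_nodes(root):
        counts[n.type] = counts.get(n.type, 0) + 1
    return counts
```

With a parsed tree, `count_node_types(tree.root_node)` gives a quick summary of which syntax constructs a file contains.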

Advanced Usage Examples

Multi-language Project Analysis

from tree_sitter_languages import get_parser

# Parse different file types in a project
parsers = {
    'python': get_parser('python'),
    'javascript': get_parser('javascript'),
    'css': get_parser('css'),
    'html': get_parser('html')
}

def analyze_file(file_path, content):
    if file_path.endswith('.py'):
        tree = parsers['python'].parse(content.encode())
    elif file_path.endswith('.js'):
        tree = parsers['javascript'].parse(content.encode())
    elif file_path.endswith('.css'):
        tree = parsers['css'].parse(content.encode())
    elif file_path.endswith('.html'):
        tree = parsers['html'].parse(content.encode())
    else:
        return None
    
    return tree.root_node
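
A possible variation on the example above is to look up the language by file extension and create each parser only when it is first needed. This is a sketch under assumptions: `EXTENSION_TO_LANGUAGE`, `language_for_path`, `cached_parser`, and `parse_file` are illustrative names, and the extension table covers only the four file types used above.

```python
import os
from functools import lru_cache

# Illustrative mapping from file extension to language identifier.
EXTENSION_TO_LANGUAGE = {
    ".py": "python",
    ".js": "javascript",
    ".css": "css",
    ".html": "html",
}


def language_for_path(file_path):
    """Return the language identifier for file_path, or None if unknown."""
    extension = os.path.splitext(file_path)[1].lower()
    return EXTENSION_TO_LANGUAGE.get(extension)


@lru_cache(maxsize=None)
def cached_parser(language_name):
    """Create the parser for language_name once and reuse it afterwards."""
    # Imported on first use so the lookup helpers work without the package.
    from tree_sitter_languages import get_parser
    return get_parser(language_name)


def parse_file(file_path, content):
    """Return the root node for a supported file, or None otherwise."""
    language_name = language_for_path(file_path)
    if language_name is None:
        return None
    return cached_parser(language_name).parse(content.encode()).root_node
```

Lazy creation avoids loading parsers for languages a project never uses; the eager dictionary in the original example is simpler when the set of languages is small and fixed.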

Finding Code Patterns with Queries

from tree_sitter_languages import get_language, get_parser

# Set up Python parser and language
language = get_language('python')
parser = get_parser('python')

# Parse Python code
python_code = b'''
class Calculator:
    def add(self, a, b):
        return a + b
    
    def multiply(self, a, b):
        return a * b

def standalone_function():
    calc = Calculator()
    return calc.add(1, 2)
'''

tree = parser.parse(python_code)

# Find all method definitions in classes
method_query = language.query('''
(class_definition
  body: (block
    (function_definition
      name: (identifier) @method_name)))
''')

methods = method_query.captures(tree.root_node)
for node, capture_name in methods:
    print(f"Method: {node.text.decode()}")

# Find all function calls
call_query = language.query('(call function: (identifier) @func_name)')
calls = call_query.captures(tree.root_node)
for node, capture_name in calls:
    print(f"Function call: {node.text.decode()}")
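
When a query uses several capture names, it can help to group the results by name before printing. The sketch below assumes that `captures` returns a list of (node, capture_name) pairs, as in the Types section; `group_captures` is a hypothetical helper.

```python
from collections import defaultdict


def group_captures(captures):
    """Group (node, capture_name) pairs into {capture_name: [text, ...]}."""
    grouped = defaultdict(list)
    for node, capture_name in captures:
        grouped[capture_name].append(node.text.decode())
    return dict(grouped)
```

For instance, `group_captures(method_query.captures(tree.root_node))` would be expected to map `method_name` to the captured method names from the example above.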

Error Handling

Invalid language names will raise exceptions from the underlying tree-sitter library:

from tree_sitter_languages import get_language

try:
    # This will raise an exception
    language = get_language('invalid_language')
except Exception as e:
    print(f"Error: {e}")
    # Handle the error appropriately

The package handles platform-specific binary loading automatically (.so files on Unix/Linux, .dll files on Windows), so no platform-specific code is needed in your application.
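
For the invalid-name case described in this section, one option is to validate names before calling the library so that the error message can suggest a likely correction. This is a sketch, not the package's behavior: `check_language_name` and `KNOWN` are hypothetical, and `KNOWN` holds only a subset of the supported identifiers.

```python
import difflib


def check_language_name(name, known_names):
    """Return name if it is known; otherwise raise ValueError with a hint."""
    if name in known_names:
        return name
    hints = difflib.get_close_matches(name, known_names, n=1)
    hint = f" Did you mean {hints[0]!r}?" if hints else ""
    raise ValueError(f"Unsupported language {name!r}.{hint}")


# A small subset of the identifiers listed under Supported Languages.
KNOWN = ["c", "c_sharp", "cpp", "gomod", "javascript", "python", "typescript"]

try:
    check_language_name("pyton", KNOWN)
except ValueError as error:
    print(error)  # Unsupported language 'pyton'. Did you mean 'python'?
```

A validated name can then be passed to get_language, leaving the try/except above as a fallback for names that pass validation but still fail to load.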

Install with Tessl CLI

npx tessl i tessl/pypi-tree-sitter-languages
Workspace: tessl
Visibility: Public
Describes: pypipkg:pypi/tree-sitter-languages@1.10.x