tessl/pypi-asttokens

Annotate AST trees with source code positions

ASTTokens

Name: tessl/pypi-asttokens
Author: tessl

A Python library that annotates Abstract Syntax Trees (ASTs) with the positions of the tokens and text in the source code that produced them. This lets tools that work with logical AST nodes find the exact source text behind each node, which is useful for automated refactoring, syntax highlighting, and code analysis.

Package Information

Package Name: asttokens
Language: Python
Installation: pip install asttokens

Core Imports

import asttokens

For direct access to main classes:

from asttokens import ASTTokens, ASTText, LineNumbers, supports_tokenless

For utility functions:

import asttokens.util
# or
from asttokens.util import walk, visit_tree, is_expr, match_token, token_repr

Basic Usage

import asttokens
import asttokens.util
import ast

# Basic usage - parse and annotate source code
source = "Robot('blue').walk(steps=10*n)"
atok = asttokens.ASTTokens(source, parse=True)

# Find a specific AST node and get its source text
attr_node = next(n for n in ast.walk(atok.tree) if isinstance(n, ast.Attribute))
print(atok.get_text(attr_node))  # Output: Robot('blue').walk

# Get position information
start, end = attr_node.last_token.startpos, attr_node.last_token.endpos
print(atok.text[:start] + 'RUN' + atok.text[end:])  # Output: Robot('blue').RUN(steps=10*n)

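# Faster alternative on Python versions that support tokenless mode
if asttokens.supports_tokenless():
    astext = asttokens.ASTText(source, tree=ast.parse(source))
    # Look the node up in the tree that this ASTText instance holds
    node = next(n for n in ast.walk(astext.tree) if isinstance(n, ast.Attribute))
    text = astext.get_text(node)  # Robot('blue').walk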

Architecture

ASTTokens provides a layered architecture for AST-to-source mapping:

ASTTokens: Full-featured class that tokenizes source code and marks AST nodes with .first_token and .last_token attributes
ASTText: Performance-optimized alternative that uses AST position information when available, falling back to tokenization
LineNumbers: Utility for converting between character offsets and line/column positions
Token: Enhanced token representation with both line/column and character offset positions

The library supports both standard Python ast module trees and astroid library trees, making it compatible with various Python static analysis tools.

Capabilities

Core AST Processing

Main classes for annotating AST trees with source code positions and extracting text from AST nodes. These provide the primary functionality for mapping between AST structures and their corresponding source code.

class ASTTokens:
    def __init__(self, source_text, parse=False, tree=None, filename='<unknown>', tokens=None): ...
    def get_text(self, node, padded=True) -> str: ...
    def get_text_range(self, node, padded=True) -> tuple[int, int]: ...
    def mark_tokens(self, root_node): ...

class ASTText:
    def __init__(self, source_text, tree=None, filename='<unknown>'): ...
    def get_text(self, node, padded=True) -> str: ...
    def get_text_range(self, node, padded=True) -> tuple[int, int]: ...

Token Navigation

Functions and methods for navigating and searching through tokenized source code, finding specific tokens by position, type, or content.

class ASTTokens:
    def get_token_from_offset(self, offset) -> Token: ...
    def get_token(self, lineno, col_offset) -> Token: ...
    def next_token(self, tok, include_extra=False) -> Token: ...
    def prev_token(self, tok, include_extra=False) -> Token: ...
    def find_token(self, start_token, tok_type, tok_str=None, reverse=False) -> Token: ...

Position Utilities

Utilities for converting between different position representations (line/column vs character offsets) and working with source code positions.

class LineNumbers:
    def __init__(self, text): ...
    def line_to_offset(self, line, column) -> int: ...
    def offset_to_line(self, offset) -> tuple[int, int]: ...
    def from_utf8_col(self, line, utf8_column) -> int: ...

def supports_tokenless(node=None) -> bool: ...

AST Node Utilities

Helper functions for working with AST nodes, including type checking, tree traversal, and node classification utilities.

def walk(node, include_joined_str=False): ...
def visit_tree(node, previsit, postvisit): ...
def is_expr(node) -> bool: ...
def is_stmt(node) -> bool: ...
def is_module(node) -> bool: ...

Utility Module Access

The asttokens.util module provides additional utility functions for advanced use cases including token manipulation, tree traversal, and node type checking. These functions offer fine-grained control over AST processing beyond the main classes.

import asttokens.util

# Module contains various utility functions accessible as:
# asttokens.util.walk()
# asttokens.util.match_token() 
# asttokens.util.is_expr()
# ... and many others documented in sub-docs

Types

from typing import Tuple, List, Iterator, Optional, Any

class Token:
    """Enhanced token representation with position information."""
    type: int          # Token type from token module
    string: str        # Token text content
    start: Tuple[int, int]    # Starting (row, column) position
    end: Tuple[int, int]      # Ending (row, column) position
    line: str          # Original line text
    index: int         # Token index in token list
    startpos: int      # Starting character offset
    endpos: int        # Ending character offset
    
    def __str__(self) -> str: ...

# Type aliases for AST nodes with token attributes
AstNode = Any  # Union of ast.AST and astroid nodes with .first_token/.last_token
EnhancedAST = Any  # AST with added token attributes
