
tessl/pypi-refurb

A tool for refurbishing and modernizing Python codebases


AST Analysis Utilities

Comprehensive utilities for AST node analysis, type checking, and code pattern matching used by built-in checks and available for custom plugin development.

Capabilities

Core AST Analysis Functions

Fundamental functions for comparing and analyzing AST nodes that form the foundation of check logic.

def is_equivalent(lhs: Node | None, rhs: Node | None) -> bool:
    """
    Compare two AST nodes for structural equivalence.
    
    Performs deep comparison of AST structure, ignoring location information
    but comparing all semantic content including names, literals, operators,
    and nested structures.
    
    Parameters:
    - lhs: First AST node to compare (None allowed)
    - rhs: Second AST node to compare (None allowed)  
    
    Returns:
    True if nodes are structurally equivalent, False otherwise
    
    Examples:
    - is_equivalent(NameExpr("x"), NameExpr("x")) -> True
    - is_equivalent(IntExpr(42), IntExpr(42)) -> True
    - is_equivalent(CallExpr("f", []), CallExpr("g", [])) -> False
    """

def stringify(node: Node) -> str:
    """
    Convert AST node to string representation of equivalent source code.
    
    Reconstructs readable Python source code from AST nodes, handling
    all expression types, operators, literals, and complex structures.
    
    Parameters:
    - node: AST node to convert to string
    
    Returns:
    String representation of the code that would generate this AST node
    
    Examples:
    - stringify(NameExpr("variable")) -> "variable"
    - stringify(CallExpr("func", [IntExpr(1)])) -> "func(1)"
    - stringify(OpExpr("+", NameExpr("x"), IntExpr(5))) -> "x + 5"
    """

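The two functions above operate on Mypy nodes. As a rough analogy only, the same two ideas (comparison that ignores source positions, and turning a node back into source text) can be sketched with Python's standard `ast` module. The helper names `is_equivalent_sketch` and `parse_expr` are hypothetical, and this is not how refurb implements its own functions.

```python
import ast
from typing import Optional


def is_equivalent_sketch(lhs: Optional[ast.AST], rhs: Optional[ast.AST]) -> bool:
    """Compare two stdlib AST nodes structurally, ignoring positions."""
    if lhs is None or rhs is None:
        # This sketch's own choice: None only matches None.
        return lhs is rhs
    # ast.dump() leaves out lineno/col_offset unless include_attributes=True,
    # so nodes parsed from differently formatted source still compare equal.
    return ast.dump(lhs) == ast.dump(rhs)


def parse_expr(source: str) -> ast.expr:
    """Parse a single expression and return its AST node."""
    return ast.parse(source, mode="eval").body


a = parse_expr("f(x,    1)")
b = parse_expr("f(x, 1)")
c = parse_expr("g(x, 1)")

same = is_equivalent_sketch(a, b)        # same structure, different spacing
different = is_equivalent_sketch(a, c)   # different callee name
rendered = ast.unparse(parse_expr("x+5"))  # node back to source text
```

Here `same` is `True`, `different` is `False`, and `rendered` is the normalized source `"x + 5"`.
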
Type Analysis Functions

Functions that leverage Mypy's type system for sophisticated type-based analysis.

def get_mypy_type(node: Node) -> Type | SymbolNode | None:
    """
    Extract Mypy type information from AST node.
    
    Retrieves the resolved type information that Mypy has computed for
    an AST node, enabling type-aware analysis and checks.
    
    Parameters:
    - node: AST node to get type information for
    
    Returns:
    Mypy Type object, SymbolNode, or None if type unavailable
    
    Examples:
    - get_mypy_type(IntExpr(42)) -> Instance(int)
    - get_mypy_type(NameExpr("x")) -> Type of variable x
    - get_mypy_type(CallExpr(...)) -> Return type of function call
    """

def is_same_type(ty: Type | SymbolNode | None, *expected: TypeLike) -> bool:
    """
    Check if a type matches any of the expected types.
    
    Compares resolved Mypy types against expected type patterns,
    supporting both exact matches and inheritance relationships.
    
    Parameters:
    - ty: Type to check (from get_mypy_type)
    - expected: Variable number of expected type patterns
    
    Returns:
    True if type matches any expected pattern, False otherwise
    
    Examples:
    - is_same_type(int_type, "builtins.int") -> True
    - is_same_type(list_type, "builtins.list") -> True
    - is_same_type(custom_type, "my.module.MyClass") -> True
    """

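The examples above identify types by dotted names such as `"builtins.int"`. The sketch below shows that naming convention on ordinary runtime classes rather than Mypy types. `full_name` and `matches_any` are hypothetical helpers, and the sketch performs exact name matching only; it makes no claim about how `is_same_type` itself treats subclasses.

```python
from typing import Optional, Union


def full_name(tp: type) -> str:
    """Return a dotted name such as 'builtins.int' for a runtime class."""
    return f"{tp.__module__}.{tp.__qualname__}"


def matches_any(actual: Optional[type], *expected: Union[type, str, None]) -> bool:
    """Return True if `actual` matches any expected class or dotted name."""
    for candidate in expected:
        if candidate is None or actual is None:
            # None only matches None in this sketch.
            if candidate is None and actual is None:
                return True
            continue
        name = candidate if isinstance(candidate, str) else full_name(candidate)
        if full_name(actual) == name:
            return True
    return False


int_match = matches_any(int, "builtins.int")
list_match = matches_any(list, "builtins.dict", list)
bool_match = matches_any(bool, "builtins.int")  # exact match only, so False
```
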
Pattern Matching Utilities

Specialized functions for extracting and analyzing common code patterns.

def extract_binary_oper(oper: str, node: OpExpr) -> tuple[Expression, Expression] | None:
    """
    Extract operands from binary operation if it matches expected operator.
    
    Parameters:
    - oper: Expected operator string ("+", "-", "*", "/", etc.)
    - node: Binary operation AST node
    
    Returns:
    Tuple of (left_operand, right_operand) if operator matches, None otherwise
    
    Examples:
    - extract_binary_oper("+", x_plus_y_node) -> (x_expr, y_expr)
    - extract_binary_oper("*", x_plus_y_node) -> None
    """

def get_common_expr_positions(*exprs: Expression) -> tuple[int, int] | None:
    """
    Find common source position range across multiple expressions.
    
    Useful for generating error messages that span multiple related expressions.
    
    Parameters:
    - exprs: Variable number of Expression nodes
    
    Returns:
    Tuple of (start_position, end_position) or None if no common range
    """

def get_fstring_parts(expr: Expression) -> list[tuple[bool, Expression, str]]:
    """
    Parse f-string expression into component parts.
    
    Extracts literal text and embedded expressions from f-string literals
    for detailed analysis of string formatting patterns.
    
    Parameters:
    - expr: F-string expression to parse
    
    Returns:
    List of tuples: (is_expression, ast_node, text_content)
    - is_expression: True for {expr} parts, False for literal text
    - ast_node: AST node for the part
    - text_content: String representation
    """

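To make the return shapes described above concrete, here is a minimal sketch of operand extraction and f-string splitting written against Python's standard `ast` module instead of Mypy nodes. The function names, the operator table, and the choice to take source text as input are assumptions made for illustration; format specifiers inside `{...}` are ignored.

```python
import ast
from typing import Optional

# Operator strings mapped to stdlib AST operator classes (illustrative subset).
_OPERATORS = {"+": ast.Add, "-": ast.Sub, "*": ast.Mult, "/": ast.Div}


def extract_binary_oper_sketch(
    oper: str, node: ast.expr
) -> Optional[tuple[ast.expr, ast.expr]]:
    """Return (left, right) if `node` is a binary operation using `oper`."""
    op_type = _OPERATORS.get(oper)
    if (
        op_type is not None
        and isinstance(node, ast.BinOp)
        and isinstance(node.op, op_type)
    ):
        return node.left, node.right
    return None


def fstring_parts_sketch(source: str) -> list[tuple[bool, ast.expr, str]]:
    """Split an f-string literal into (is_expression, node, text) parts."""
    node = ast.parse(source, mode="eval").body
    if not isinstance(node, ast.JoinedStr):
        raise ValueError("expected an f-string literal")
    parts: list[tuple[bool, ast.expr, str]] = []
    for value in node.values:
        if isinstance(value, ast.FormattedValue):
            # Embedded {expr}: keep the inner expression and its source text.
            parts.append((True, value.value, ast.unparse(value.value)))
        elif isinstance(value, ast.Constant):
            # Literal text between the braces.
            parts.append((False, value, str(value.value)))
    return parts


x_plus_y = ast.parse("x + y", mode="eval").body
pair = extract_binary_oper_sketch("+", x_plus_y)
no_match = extract_binary_oper_sketch("*", x_plus_y)

parts = fstring_parts_sketch('f"Hello {name}, you are {age + 1}!"')
summary = [(is_expr, text) for is_expr, _, text in parts]
```

`pair` holds the `x` and `y` nodes, `no_match` is `None`, and `summary` alternates literal and expression entries in source order.
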
Code Usage Analysis

Functions for analyzing variable usage patterns and code context.

class ReadCountVisitor:
    """
    AST visitor that counts variable usage within code contexts.
    
    Tracks how many times variables are read, written, or referenced
    within specified code blocks, enabling dead code detection and
    usage pattern analysis.
    
    Attributes:
    - read_count: dict[str, int] - Count of variable reads by name
    - contexts: list[Node] - Code contexts being analyzed
    """
    
    def visit_name_expr(self, node: NameExpr) -> None:
        """Count name expression reads."""
    
    def get_read_count(self, name: str) -> int:
        """Get total read count for variable name."""

def is_name_unused_in_contexts(name: NameExpr, contexts: list[Node]) -> bool:
    """
    Check if a variable name is unused within specified code contexts.
    
    Uses ReadCountVisitor to analyze variable usage patterns and identify
    potentially dead or redundant variable assignments.
    
    Parameters:
    - name: Variable name expression to check
    - contexts: List of AST nodes representing code contexts to search
    
    Returns:
    True if variable is never read in any context, False otherwise
    """

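The read-counting idea described above can be sketched with the standard library's `ast.NodeVisitor`. This is an analogy under stated assumptions, not refurb's visitor: `ReadCounter` and `is_name_unused` are hypothetical names, and names are passed as plain strings rather than `NameExpr` nodes.

```python
import ast
from collections import Counter


class ReadCounter(ast.NodeVisitor):
    """Count how often each name is read (loaded) in the visited nodes."""

    def __init__(self) -> None:
        self.read_count = Counter()

    def visit_Name(self, node: ast.Name) -> None:
        # Only Load contexts are reads; Store and Del are skipped.
        if isinstance(node.ctx, ast.Load):
            self.read_count[node.id] += 1


def is_name_unused(name: str, contexts: list[ast.AST]) -> bool:
    """Return True if `name` is never read in any of the given contexts."""
    counter = ReadCounter()
    for context in contexts:
        counter.visit(context)
    return counter.read_count[name] == 0


module = ast.parse("total = 0\nunused = 1\nprint(total + total)\n")
unused_flag = is_name_unused("unused", module.body)
total_flag = is_name_unused("total", module.body)
```

`unused` is assigned but never read, so `unused_flag` is `True`; `total` is read twice, so `total_flag` is `False`.
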
Type Checking Predicates

Functions that identify specific types and patterns in AST nodes.

def is_none_literal(node: Node) -> TypeGuard[NameExpr]:
    """
    Check if AST node represents the None literal.
    
    Parameters:
    - node: AST node to check
    
    Returns:
    True if node is None literal, False otherwise
    """

def is_bool_literal(node: Node) -> TypeGuard[NameExpr]:
    """
    Check if AST node represents a boolean literal (True or False).
    
    Parameters:
    - node: AST node to check
    
    Returns:
    True if node is True or False literal, False otherwise
    """

def is_mapping(expr: Expression) -> bool:
    """
    Check if expression has mapping (dict-like) type.
    
    Uses Mypy type information to determine if expression implements
    the mapping protocol (dict, defaultdict, etc.).
    
    Parameters:
    - expr: Expression to check
    
    Returns:
    True if expression is mapping type, False otherwise
    """

def is_sized(node: Expression) -> bool:
    """
    Check if expression has sized type (implements __len__).
    
    Parameters:
    - node: Expression to check
    
    Returns:
    True if expression implements sized protocol, False otherwise
    """

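As a rough illustration of what these predicates test for, the sketch below uses the standard library only. Note one difference: the signatures above narrow to `NameExpr`, whereas the stdlib `ast` module represents `None`, `True`, and `False` as `ast.Constant`. The function names are hypothetical, and the mapping and sized checks run on runtime classes via `collections.abc` rather than on Mypy type information.

```python
import ast
from collections import defaultdict
from collections.abc import Mapping, Sized


def is_none_literal_sketch(node: ast.AST) -> bool:
    """True if the node is the literal None."""
    return isinstance(node, ast.Constant) and node.value is None


def is_bool_literal_sketch(node: ast.AST) -> bool:
    """True if the node is the literal True or False."""
    return isinstance(node, ast.Constant) and isinstance(node.value, bool)


def parse_expr(source: str) -> ast.expr:
    """Parse a single expression and return its AST node."""
    return ast.parse(source, mode="eval").body


none_check = is_none_literal_sketch(parse_expr("None"))
bool_check = is_bool_literal_sketch(parse_expr("True"))
int_is_bool = is_bool_literal_sketch(parse_expr("1"))

# Protocol membership for (dict, defaultdict, list) and (list, str, int).
mapping_checks = [issubclass(t, Mapping) for t in (dict, defaultdict, list)]
sized_checks = [issubclass(t, Sized) for t in (list, str, int)]
```
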
Path and Module Utilities

Functions for handling filesystem paths and module names in cross-platform analysis.

def normalize_os_path(module: str | None) -> str:
    """
    Normalize module path for cross-platform compatibility.
    
    Converts module paths to standardized format, handling differences
    between Windows and Unix-style paths in import analysis.
    
    Parameters:
    - module: Module path string to normalize
    
    Returns:
    Normalized path string suitable for cross-platform comparison
    """

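One plausible reading of this normalization: `os.path` is an alias for `posixpath` or `ntpath` depending on the platform, and some of its functions are defined in `genericpath`, so a resolved name such as `posixpath.join` can be mapped back to `os.path.join` for comparison. The sketch below implements that reading. The function name, the module list, and the empty-string result for `None` are assumptions, not refurb's confirmed behavior.

```python
from typing import Optional

# Platform-specific modules that back os.path (assumed list for this sketch).
_PATH_MODULES = ("posixpath", "ntpath", "genericpath")


def normalize_os_path_sketch(module: Optional[str]) -> str:
    """Map platform-specific path module names back to 'os.path'."""
    if not module:
        return ""
    head, _, rest = module.partition(".")
    if head in _PATH_MODULES:
        return "os.path" + (f".{rest}" if rest else "")
    return module


posix_join = normalize_os_path_sketch("posixpath.join")
nt_module = normalize_os_path_sketch("ntpath")
untouched = normalize_os_path_sketch("os.path.join")
missing = normalize_os_path_sketch(None)
```
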
Usage Examples

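from dataclasses import dataclass

from refurb.checks.common import (
    is_equivalent, stringify, get_mypy_type, is_same_type,
    extract_binary_oper, is_none_literal, ReadCountVisitor
)
from refurb.error import Error
from mypy.nodes import CallExpr, IntExpr, NameExpr, Node, OpExpr


# Placeholder error classes used by the examples below; the class names
# and codes are illustrative only.
@dataclass
class MyError(Error):
    code = 900


@dataclass
class RedundantOpError(Error):
    code = 901


@dataclass
class UnusedVariableError(Error):
    code = 902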

def custom_check(node: CallExpr, errors: list[Error]) -> None:
    """Example custom check using AST utilities."""
    
    # Check if this is a call to the 'len' function with at least one argument
    if isinstance(node.callee, NameExpr) and node.callee.name == "len" and node.args:
        arg = node.args[0]
        
        # Get type information
        arg_type = get_mypy_type(arg)
        
        # Check for list type specifically
        if is_same_type(arg_type, "builtins.list"):
            # Suggest using collection directly in boolean context
            suggestion = f"Use `{stringify(arg)}` directly in boolean context"
            errors.append(MyError(node.line, node.column, suggestion))

def analyze_binary_ops(node: OpExpr, errors: list[Error]) -> None:
    """Example using pattern matching utilities."""
    
    # Check for x + 0 pattern
    if operands := extract_binary_oper("+", node):
        left, right = operands
        
        # Check if right operand is zero
        if isinstance(right, IntExpr) and right.value == 0:
            # Suggest removing redundant addition
            replacement = stringify(left)
            errors.append(RedundantOpError(
                node.line, node.column,
                f"Replace `{stringify(node)}` with `{replacement}`"
            ))

def check_unused_variables(context: Node, errors: list[Error]) -> None:
    """Example using usage analysis."""
    
    visitor = ReadCountVisitor()
    context.accept(visitor)
    
    # Find variables that are never read
    for var_name, count in visitor.read_count.items():
        if count == 0:
            errors.append(UnusedVariableError(
                context.line, context.column,
                f"Variable '{var_name}' is never used"
            ))

Integration with Built-in Checks

These utilities are extensively used throughout refurb's 94 built-in checks:

  • Type-based checks: Use get_mypy_type and is_same_type for sophisticated type analysis
  • Pattern matching: Use extract_binary_oper and equivalence checking for code patterns
  • Code generation: Use stringify for generating replacement suggestions
  • Usage analysis: Use ReadCountVisitor for dead code detection
  • Cross-platform support: Use path normalization for consistent behavior

The utilities provide a solid foundation for both built-in checks and custom plugin development, abstracting away the complexity of AST manipulation and type analysis.

Install with Tessl CLI

npx tessl i tessl/pypi-refurb
