tessl/pypi-refurb

A tool for refurbishing and modernizing Python codebases

Core Analysis Engine

Name: tessl/pypi-refurb
Author: tessl

The core analysis engine is the heart of refurb. It integrates with Mypy, runs the enabled checks, and collects errors through an AST-based analysis pipeline.

Capabilities

Main Analysis Function

The primary function that orchestrates the entire analysis process from Mypy integration through error collection and filtering.

def run_refurb(settings: Settings) -> Sequence[Error | str]:
    """
    Execute refurb analysis on specified files using Mypy for parsing and type analysis.
    
    This function:
    1. Configures and runs Mypy to parse source files and generate ASTs
    2. Loads and filters checks based on settings
    3. Visits each AST node with applicable checks
    4. Collects and filters errors based on ignore rules
    5. Sorts results according to specified criteria
    
    Parameters:
    - settings: Configuration object containing files to analyze, enabled/disabled checks,
               output formatting preferences, and all other analysis options
    
    Returns:
    Sequence containing Error objects for code issues found, or error strings for
    system-level problems (file access, parsing failures, etc.)
    
    Raises:
    - CompileError: When Mypy encounters fatal parsing or compilation errors

    Notes:
    - RecursionError from deeply nested code structures is caught internally
      and handled gracefully rather than propagated to the caller
    """
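
The numbered steps in the docstring can be illustrated with a small, self-contained sketch. This is not refurb's implementation: it uses the standard library `ast` module in place of Mypy, and `Finding`, `CHECKS`, `check_len_equals_zero`, `analyze`, and the code `900` are all hypothetical names chosen for illustration. Comment-based ignore filtering (step 4) is left out to keep the sketch short.

```python
import ast
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Finding:
    """Hypothetical stand-in for refurb's Error objects."""
    code: int
    line: int
    column: int
    msg: str


Check = Callable[[ast.AST, list], None]


def check_len_equals_zero(node: ast.AST, findings: list) -> None:
    """Toy check (hypothetical code 900): flag `len(x) == 0`."""
    assert isinstance(node, ast.Compare)
    left, ops, right = node.left, node.ops, node.comparators
    is_len_call = (
        isinstance(left, ast.Call)
        and isinstance(left.func, ast.Name)
        and left.func.id == "len"
    )
    is_eq_zero = (
        len(ops) == 1
        and isinstance(ops[0], ast.Eq)
        and isinstance(right[0], ast.Constant)
        and type(right[0].value) is int
        and right[0].value == 0
    )
    if is_len_call and is_eq_zero:
        findings.append(
            Finding(900, node.lineno, node.col_offset + 1,
                    "Replace `len(x) == 0` with `not x`")
        )


# Registry mapping a check code to the node type it applies to and the check.
CHECKS: dict[int, tuple[type, Check]] = {
    900: (ast.Compare, check_len_equals_zero),
}


def analyze(source: str, disabled: frozenset = frozenset()) -> list:
    # Step 1: parse the source into an AST (stdlib `ast` here, not Mypy).
    tree = ast.parse(source)
    # Step 2: keep only the checks that are not disabled.
    active = [pair for code, pair in CHECKS.items() if code not in disabled]
    # Step 3: visit every node and run the checks registered for its type.
    findings: list = []
    for node in ast.walk(tree):
        for node_type, check in active:
            if isinstance(node, node_type):
                check(node, findings)
    # Step 5: sort by location, then by code.
    return sorted(findings, key=lambda f: (f.line, f.column, f.code))
```

Calling `analyze("if len(items) == 0:\n    pass\n")` yields a single `Finding` on line 1, while passing `disabled=frozenset({900})` yields an empty list.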

Error Processing Functions

Functions that decide which errors to suppress and how the remaining errors are ordered.

def should_ignore_error(error: Error | str, settings: Settings) -> bool:
    """
    Determine if an error should be ignored based on comment directives and settings.
    
    Parameters:
    - error: Error object or error string to check
    - settings: Settings containing ignore rules and amendment configurations
    
    Returns:
    True if error should be ignored, False otherwise
    """

def is_ignored_via_comment(error: Error) -> bool:
    """
    Check if error is ignored via '# noqa' style comments in source code.
    
    Supports formats like:
    - # noqa (ignores all errors on line)
    - # noqa: FURB105 (ignores specific error)
    - # noqa: FURB105,FURB123 (ignores multiple errors)
    
    Parameters:
    - error: Error object with filename and line information
    
    Returns:
    True if error is suppressed by comment, False otherwise
    """

def is_ignored_via_amend(error: Error, settings: Settings) -> bool:
    """
    Check if error is ignored via path-specific amendment rules in configuration.
    
    Parameters:
    - error: Error object with filename and error code information
    - settings: Settings containing amendment rules mapping paths to ignored errors
    
    Returns:
    True if error is suppressed by amendment rule, False otherwise
    """

def sort_errors(error: Error | str, settings: Settings) -> tuple[str | int, ...]:
    """
    Generate sort key for error based on settings sort preference.
    
    Parameters:
    - error: Error object or error string to generate key for
    - settings: Settings containing sort_by preference ("filename" or "error")
    
    Returns:
    Tuple suitable for use as sort key, ordered by filename then location,
    or by error code then location, depending on settings
    """
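
The comment-based suppression and sort-key behavior described above can be approximated as follows. This is a sketch under stated assumptions, not refurb's code: `Finding`, `NOQA_PATTERN`, `is_suppressed`, and `sort_key` are hypothetical names, the regular expression is a guess at a reasonable `# noqa` grammar, and the ordering of fields inside each sort key is an assumption beyond "filename then location" and "error code then location".

```python
from __future__ import annotations

import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    """Hypothetical stand-in for refurb's Error objects."""
    filename: str
    line: int
    column: int
    code: str  # e.g. "FURB105"
    msg: str


# Matches "# noqa" optionally followed by ": CODE[,CODE...]".
NOQA_PATTERN = re.compile(r"# noqa(?::\s*(?P<codes>[A-Z0-9,\s]+))?")


def is_suppressed(finding: Finding, source_line: str) -> bool:
    """Return True if the source line carries a matching noqa comment."""
    match = NOQA_PATTERN.search(source_line)
    if match is None:
        return False
    codes = match.group("codes")
    if codes is None:
        return True  # bare "# noqa" suppresses everything on the line
    listed = {code for code in re.split(r"[,\s]+", codes) if code}
    return finding.code in listed


def sort_key(item: Finding | str, sort_by: str = "filename") -> tuple[str | int, ...]:
    """Build a sort key; plain strings (system errors) sort first."""
    if isinstance(item, str):
        return ("", item)
    if sort_by == "error":
        return (item.code, item.filename, item.line, item.column)
    return (item.filename, item.line, item.column, item.code)
```

A caller would typically drop suppressed findings first and then pass `sort_key` as the `key` argument to `sorted`.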

Source Code Analysis

Utility functions for analyzing source code and extracting contextual information.

def get_source_lines(filepath: str) -> list[str]:
    """
    Read and cache source file contents for error processing.
    
    Uses LRU caching to avoid repeated file reads during analysis.
    
    Parameters:
    - filepath: Path to source file to read
    
    Returns:
    List of source code lines (without line endings)
    """
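
A cached reader of this kind can be sketched with `functools.lru_cache`. The function name `read_source_lines` and the UTF-8 encoding are assumptions made for the example; the source only states that results are cached and that lines are returned without line endings.

```python
from functools import lru_cache
from pathlib import Path


@lru_cache(maxsize=None)
def read_source_lines(filepath: str) -> list[str]:
    """Read a file once and return its lines without line endings."""
    return Path(filepath).read_text(encoding="utf-8").splitlines()
```

Because the cache hands back the same list object on every call, callers should treat the result as read-only. Edits made to the file after the first read are not picked up unless `read_source_lines.cache_clear()` is called.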

Output Formatting

Functions that format analysis results for different output targets and integrations.

def format_errors(errors: Sequence[Error | str], settings: Settings) -> str:
    """
    Format error sequence for output based on settings preferences.
    
    Parameters:
    - errors: Sequence of Error objects and error strings to format
    - settings: Settings containing format preferences and display options
    
    Returns:
    Formatted string ready for display, including help text when appropriate
    """

def format_as_github_annotation(error: Error | str) -> str:
    """
    Format error as GitHub Actions annotation for CI integration.
    
    Parameters:
    - error: Error object or error string to format
    
    Returns:
    GitHub Actions annotation string with file, line, column, and message
    """

def format_with_color(error: Error | str) -> str:
    """
    Format error with ANSI color codes for terminal display.
    
    Highlights file paths, line numbers, error codes, and diff suggestions
    with appropriate colors for improved readability.
    
    Parameters:
    - error: Error object or error string to format
    
    Returns:
    ANSI color-formatted string for terminal display
    """
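
For the GitHub output target, an annotation line follows GitHub's workflow-command syntax, `::error file=...,line=...,col=...,title=...::message`. The sketch below shows one way to build such a line. `Finding`, `to_github_annotation`, and the `Refurb ...` titles are hypothetical; the exact properties refurb emits are not shown here.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    """Hypothetical stand-in for refurb's Error objects."""
    filename: str
    line: int
    column: int
    code: str
    msg: str


def escape_data(value: str) -> str:
    """Escape characters that would break a workflow-command message."""
    return value.replace("%", "%25").replace("\r", "%0D").replace("\n", "%0A")


def escape_property(value: str) -> str:
    """Property values additionally need ':' and ',' escaped."""
    return escape_data(value).replace(":", "%3A").replace(",", "%2C")


def to_github_annotation(item: Finding | str) -> str:
    """Format a finding (or a plain error string) as an ::error command."""
    if isinstance(item, str):
        return f"::error title=Refurb Error::{escape_data(item)}"
    props = ",".join([
        f"file={escape_property(item.filename)}",
        f"line={item.line}",
        f"col={item.column}",
        f"title={escape_property('Refurb ' + item.code)}",
    ])
    return f"::error {props}::{escape_data(item.msg)}"
```

Printing these lines to standard output inside a GitHub Actions job is enough for the runner to attach them to the corresponding file and line.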

Performance Analysis

Functions for analyzing and reporting performance metrics during analysis.

def output_timing_stats(
    settings: Settings,
    mypy_total_time_spent: float,
    mypy_timing_stats: Path | None,
    refurb_timing_stats_in_ms: dict[str, int]
) -> None:
    """
    Export detailed timing information to JSON file for performance analysis.
    
    Generates comprehensive timing data including:
    - Total Mypy build time
    - Per-module Mypy parsing time
    - Per-file refurb checking time
    
    Parameters:
    - settings: Settings containing timing_stats file path
    - mypy_total_time_spent: Total time spent in Mypy build phase
    - mypy_timing_stats: Path to Mypy's detailed timing data
    - refurb_timing_stats_in_ms: Per-file refurb checking times in milliseconds
    """
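
A timing export along these lines could be written as below. This is only a sketch: `write_timing_stats` and the JSON key names are invented for illustration, and the real output schema may differ. Per-module Mypy timings are passed in as an already-parsed dictionary rather than read from Mypy's timing file.

```python
import json
from pathlib import Path


def write_timing_stats(
    output: Path,
    mypy_total_seconds: float,
    mypy_ms_by_module: dict[str, int],
    refurb_ms_by_file: dict[str, int],
) -> None:
    """Write timing data as JSON, slowest entries first."""

    def slowest_first(timings: dict[str, int]) -> dict[str, int]:
        return dict(sorted(timings.items(), key=lambda kv: kv[1], reverse=True))

    data = {
        # Hypothetical key names; refurb's real schema may differ.
        "mypy_total_ms": round(mypy_total_seconds * 1000),
        "mypy_ms_by_module": slowest_first(mypy_ms_by_module),
        "refurb_ms_by_file": slowest_first(refurb_ms_by_file),
    }
    output.write_text(json.dumps(data, indent=2), encoding="utf-8")
```

Sorting slowest-first keeps the most expensive modules and files at the top of the file, which is usually what a reader of timing data looks for.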

Analysis Pipeline Integration

The analysis process integrates tightly with Mypy's type checker:

- Mypy Configuration: Sets incremental parsing, cache options, and Python version
- AST Generation: Uses Mypy to parse source files into typed AST nodes
- Type Information: Uses Mypy's type analysis so that checks can depend on inferred types
- Error Context: Maintains file paths and line information throughout analysis
- Performance Monitoring: Tracks timing for both Mypy and refurb phases

Usage Examples

from refurb.main import run_refurb
from refurb.settings import load_settings

# Basic analysis
settings = load_settings(["src/"])
errors = run_refurb(settings)

# Analysis with custom options
settings = load_settings([
    "src/", 
    "--ignore", "FURB105,FURB123",
    "--format", "github",
    "--timing-stats", "timing.json"
])
errors = run_refurb(settings)

# Process results
for error in errors:
    if isinstance(error, str):
        print(f"System error: {error}")
    else:
        print(f"Issue at {error.filename}:{error.line}: {error.msg}")
