A Python parser that supports error recovery and round-trip parsing for different Python versions
npx @tessl/cli install tessl/pypi-parso@0.8.0

Parso enables parsing and analysis of Python code with the ability to handle syntactically incorrect code and maintain precise position information for all tokens. Originally part of the Jedi project, it provides a small but comprehensive API for parsing Python code and analyzing syntax trees.
pip install parso

import parso

For working with specific components:
from parso import parse, load_grammar, Grammar, ParserSyntaxError
from parso import split_lines, python_bytes_to_unicode
from parso.tree import NodeOrLeaf, BaseNode, Leaf
from parso.python.tree import Module, Function, Class, Name

import parso
# Parse Python code - simplest approach
module = parso.parse('def hello(): return "world"')
print(module.children[0]) # Function definition node
# Parse with specific Python version
module = parso.parse('(x := 42)', version="3.8") # Walrus operator (must be parenthesized at statement level)
print(module.get_code()) # '(x := 42)'
# Load grammar for advanced usage
grammar = parso.load_grammar(version="3.9")
module = grammar.parse('hello + 1')
# Navigate the syntax tree
expr = module.children[0] # Expression statement
name = expr.children[0] # 'hello' name (first child of the arithmetic expression)
print(name.value) # 'hello'
print(name.start_pos) # (1, 0)
print(name.end_pos) # (1, 5)
# Handle syntax errors with error recovery
grammar = parso.load_grammar()
module = grammar.parse('def broken(: pass') # Invalid syntax
errors = list(grammar.iter_errors(module))
for error in errors:
    print(f"Error: {error.message}")

Parso uses a multi-layered architecture for robust Python parsing.
This design enables parso to serve as the foundation for IDEs, linters, code formatters, and other Python code analysis tools that need to work with both valid and invalid Python code.
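As a minimal sketch of what this means in practice (the sample source string is invented for illustration), the snippet below parses code containing a syntax error and checks that the tree still reproduces the input exactly:

```python
import parso

# Hypothetical input: the first statement is deliberately malformed.
source = "def broken(:\n    pass\n\nvalue = 1  # still parsed\n"

grammar = parso.load_grammar()
module = grammar.parse(source)

# Round-trip: regenerating code from the tree yields the original text,
# including whitespace and comments, even though the input is invalid.
round_trip_ok = module.get_code() == source

# Error recovery keeps parsing; problems are reported separately.
issues = list(grammar.iter_errors(module))
print(round_trip_ok, len(issues) > 0)
```

Because the tree always reproduces the source text, a tool can locate nodes by position and rewrite only the parts it cares about.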
Main entry points for parsing Python code, including version-specific grammar loading and high-level parsing utilities that handle the most common use cases.
def parse(code=None, **kwargs): ...
def load_grammar(*, version=None, path=None): ...

Grammar classes that handle parsing with different Python versions, error recovery, caching, and advanced parsing options for production use.
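One of those options can be sketched as follows, assuming the `error_recovery` keyword accepted by `Grammar.parse`. With recovery on (the default) a tree is always returned; with it off, the first syntax error raises `ParserSyntaxError`. The malformed input is made up for the example.

```python
import parso
from parso import ParserSyntaxError

grammar = parso.load_grammar()
bad_source = "def broken(:\n    pass\n"  # hypothetical malformed input

# Default behaviour: error recovery is on, so a tree is always returned.
tree = grammar.parse(bad_source)
recovered = tree.get_code() == bad_source

# Strict mode: the first syntax error raises instead of being recovered.
try:
    grammar.parse(bad_source, error_recovery=False)
    strict_error = None
except ParserSyntaxError as exc:
    strict_error = exc
    print(exc.message, exc.error_leaf.start_pos)
```

Strict mode suits pipelines that should reject invalid input outright, while the default suits editors and linters that must keep working on half-written code.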
class Grammar:
    def parse(self, code=None, **kwargs): ...
    def iter_errors(self, node): ...
    def refactor(self, base_node, node_to_str_map): ...
class PythonGrammar(Grammar): ...

Base classes and methods for navigating and manipulating the parsed syntax tree, including position tracking, sibling navigation, and code regeneration.
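A short, illustrative walk through these navigation methods is shown here. The input string is an arbitrary example, and `get_first_leaf()` is used only as a convenient way to reach the first token.

```python
import parso

source = "# greeting\nhello + 1\n"  # illustrative input
module = parso.parse(source)

name = module.get_first_leaf()  # the Name leaf for 'hello'
print(name.value, name.start_pos, name.end_pos)

# Comments and whitespace in front of a leaf are stored in its prefix.
with_prefix = name.get_code()                         # prefix + value
without_prefix = name.get_code(include_prefix=False)  # value only

operator = name.get_next_sibling()   # the '+' operator leaf
first = name.get_previous_sibling()  # None, 'hello' is the first child
root = name.get_root_node()          # back up to the module
```

Positions are `(line, column)` tuples with 1-based lines and 0-based columns, so the leaf on the second line starts at `(2, 0)`.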
class NodeOrLeaf:
    def get_root_node(self): ...
    def get_next_sibling(self): ...
    def get_previous_sibling(self): ...
    def get_code(self, include_prefix=True): ...
    def dump(self, *, indent=4): ...
class BaseNode(NodeOrLeaf): ...
class Leaf(NodeOrLeaf): ...

Python-specific node types representing functions, classes, imports, control flow, and all Python language constructs with specialized methods for analysis.
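The sketch below shows how these node types might be used together to inspect a small module. The sample source is invented, and `iter_classdefs()`, `iter_funcdefs()` and `get_defined_names()` are parso helpers not listed in this overview.

```python
import parso

source = (
    "import os\n"
    "from sys import path\n"
    "\n"
    "class Base:\n"
    "    pass\n"
    "\n"
    "class Child(Base):\n"
    "    def gen(self, n):\n"
    "        yield n\n"
)
module = parso.parse(source)

# Names bound by the import statements, in source order.
imported = [
    name.value
    for imp in module.iter_imports()
    for name in imp.get_defined_names()
]

# Top-level classes and their superclass arguments.
base, child = module.iter_classdefs()
superclass = child.get_super_arglist()  # None when a class has no bases

# Methods defined directly in the class body.
method = next(child.iter_funcdefs())
params = [param.name.value for param in method.get_params()]
print(imported, superclass.get_code(), method.name.value, params)
```

For a class with a single base, `get_super_arglist()` returns the one node between the parentheses, so calling `get_code()` on it is the simplest way to read it back as text.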
class Module(Scope):
    def get_used_names(self): ...
    def iter_imports(self): ...
class Function(ClassOrFunc):
    def get_params(self): ...
    def iter_yield_exprs(self): ...
    def is_generator(self): ...
class Class(ClassOrFunc):
    def get_super_arglist(self): ...
class Name(PythonLeaf):
    def is_definition(self, include_setitem=False): ...
    def get_definition(self, import_name_always=False, include_setitem=False): ...

Low-level tokenization functions and classes for converting Python source code into tokens, handling encoding, f-strings, and Python version differences.
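A minimal tokenizer sketch follows. It assumes the import paths `parso.python.tokenize` and `parso.python.token`, and passes `version_info` by keyword, which should work whether or not the parameter is keyword-only.

```python
from parso.python.token import PythonTokenTypes
from parso.python.tokenize import tokenize
from parso.utils import parse_version_string

# The tokenizer needs to know which Python version's rules to apply.
version_info = parse_version_string("3.8")

tokens = list(tokenize("x = 1\n", version_info=version_info))
for token in tokens:
    # Each token carries its type, text, start position and leading prefix.
    print(token.type, repr(token.string), token.start_pos, repr(token.prefix))
```

Whitespace and comments are not emitted as separate tokens. They travel in the `prefix` of the token that follows, which is what makes exact round-tripping possible.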
def tokenize(code, version_info, start_pos=(1, 0)): ...
def tokenize_lines(lines, version_info, start_pos=(1, 0)): ...
class PythonTokenTypes: ...

Error detection, syntax error reporting, and code quality analysis including PEP 8 normalization and custom rule systems.
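Error detection is not limited to code that fails to parse. As an illustrative sketch (the input is made up), the snippet below reports a `continue` outside a loop, which is grammatically well-formed but still a syntax error in Python. It also reads the `code`, `start_pos` and `message` attributes of each reported issue.

```python
import parso

grammar = parso.load_grammar()

# Parses cleanly, but 'continue' is not inside a loop.
source = "x = 1\ncontinue\n"
module = grammar.parse(source)

issues = list(grammar.iter_errors(module))
for issue in issues:
    print(issue.code, issue.start_pos, issue.message)
```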
class ParserSyntaxError(Exception): ...
class ErrorFinder: ...
class Normalizer: ...
class PEP8Normalizer(Normalizer): ...

Utility functions for text processing, version handling, encoding detection, and file I/O operations that support the parsing infrastructure.
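A few of these helpers can be tried in isolation. The sketch below uses arbitrary sample inputs and assumes `parse_version_string` is imported from `parso.utils`.

```python
from parso import python_bytes_to_unicode, split_lines
from parso.utils import parse_version_string

# split_lines breaks only on \n, \r\n and \r, unlike str.splitlines().
lines = split_lines("a\nb\r\nc")
lines_keep = split_lines("a\nb\r\nc", keepends=True)
form_feed = split_lines("a\x0cb")  # a form feed is not a line break

# The encoding is taken from the coding declaration when one is present.
raw = b"# -*- coding: latin-1 -*-\nname = '\xe9'\n"
text = python_bytes_to_unicode(raw)

# Version strings become a (major, minor) pair.
version = parse_version_string("3.8")
print(lines, lines_keep, form_feed, version.major, version.minor)
```

Using `split_lines` instead of `str.splitlines()` keeps line numbers consistent with the positions parso reports, since both follow the same definition of a line break.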
def split_lines(string, keepends=False): ...
def python_bytes_to_unicode(source, encoding='utf-8', errors='strict'): ...
def parse_version_string(version=None): ...
def version_info(): ...
class FileIO: ...
class PythonVersionInfo: ...

class PythonVersionInfo:
    major: int
    minor: int
class Version:
    major: int
    minor: int
    micro: int

class ParserSyntaxError(Exception):
    message: str
    error_leaf: ErrorLeaf
class InternalParseError(Exception):
    msg: str
    type: Any
    value: str
    start_pos: tuple[int, int]