tessl/pypi-lark-parser

Describes pkg:pypi/lark-parser@0.12.x

To install, run

npx @tessl/cli install tessl/pypi-lark-parser@0.12.0


Lark Parser

A modern general-purpose parsing library for Python that can parse any context-free grammar efficiently with minimal code. Lark provides multiple parsing algorithms, automatically builds annotated parse trees, supports EBNF grammar syntax, and offers full Unicode support with automatic line/column tracking.

Package Information

  • Package Name: lark-parser
  • Language: Python
  • Installation: pip install lark-parser

Core Imports

from lark import Lark, Tree, Token

Common imports for visitors and transformers:

from lark import Transformer, Visitor, v_args

Exception handling:

from lark import (ParseError, LexError, GrammarError, UnexpectedToken,
                  UnexpectedInput, UnexpectedCharacters, UnexpectedEOF, LarkError)

Basic Usage

from lark import Lark

# Define a simple grammar
grammar = """
    start: sum
    sum: product ("+" product)*
    product: number ("*" number)*
    number: NUMBER

    %import common.NUMBER
    %import common.WS
    %ignore WS
"""

# Create parser
parser = Lark(grammar)

# Parse text
text = "2 + 3 * 4"
tree = parser.parse(text)
print(tree.pretty())

# Result:
# start
#   sum
#     product
#       number    2
#     product
#       number    3
#       number    4

Architecture

Lark follows a modular design with clear separation of concerns:

  • Lark: Main parser interface that coordinates grammar loading, lexing, and parsing
  • Grammar: EBNF grammar definitions with rule declarations and terminal imports
  • Lexer: Tokenizes input text according to terminal definitions
  • Parser: Transforms the token stream into a parse tree using the selected algorithm (Earley, LALR, or CYK)
  • Tree: Parse tree nodes containing rule data and child elements
  • Visitors/Transformers: Process parse trees for extraction, transformation, or interpretation

This architecture enables flexible parsing workflows where users can choose parsing algorithms, customize lexing behavior, and process results using various tree traversal patterns.
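
For example, switching algorithms is a constructor option rather than a code change. A minimal sketch with a simplified grammar (assumed here to be LALR-compatible, which holds since it is unambiguous):

from lark import Lark

grammar = """
    start: NUMBER ("+" NUMBER)*

    %import common.NUMBER
    %import common.WS
    %ignore WS
"""

# Earley (the default) can handle any context-free grammar
earley_parser = Lark(grammar)

# LALR(1) is faster but only accepts LALR-compatible grammars
lalr_parser = Lark(grammar, parser="lalr")

# Both produce the same parse tree for this grammar
print(earley_parser.parse("1 + 2 + 3").pretty())
print(lalr_parser.parse("1 + 2 + 3").pretty())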

Capabilities

Core Parsing

Main parsing functionality including the Lark class, configuration options, parsing algorithms, and grammar loading. This provides the primary interface for creating parsers and parsing text.

class Lark:
    def __init__(self, grammar: str, **options): ...
    def parse(self, text: str, start: str = None) -> Tree: ...
    def lex(self, text: str) -> Iterator[Token]: ...

class LarkOptions:
    parser: str  # "earley", "lalr", "cyk"
    lexer: str   # "auto", "standard", "contextual", "dynamic"
    start: Union[str, List[str]]
    debug: bool
    transformer: Optional[Transformer]
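
A short usage sketch for this interface (the grammar and rule names below are invented for illustration): multiple start symbols can be declared at construction time and selected per parse() call.

from lark import Lark

# Hypothetical grammar with two entry points
parser = Lark("""
    greeting: "hello" NAME
    farewell: "goodbye" NAME

    NAME: /[a-z]+/

    %import common.WS
    %ignore WS
""", start=["greeting", "farewell"])

tree = parser.parse("hello world", start="greeting")
print(tree.data)  # greeting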

See docs/core-parsing.md for the full API.

Tree Processing

Parse tree representation and processing including the Tree class for AST nodes, visitor patterns for tree traversal, and transformer classes for tree modification and data extraction.

class Tree:
    def __init__(self, data: str, children: list, meta=None): ...
    def pretty(self, indent_str: str = '  ') -> str: ...
    def find_data(self, data: str) -> Iterator[Tree]: ...

class Transformer:
    def transform(self, tree: Tree) -> Any: ...
    def __default__(self, data: str, children: list, meta) -> Any: ...

def v_args(inline: bool = False, meta: bool = False, tree: bool = False): ...
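
A sketch that evaluates the arithmetic grammar from Basic Usage by transforming the tree bottom-up (CalcTransformer is a name invented for this example):

from lark import Lark, Transformer, v_args

grammar = """
    start: sum
    sum: product ("+" product)*
    product: number ("*" number)*
    number: NUMBER

    %import common.NUMBER
    %import common.WS
    %ignore WS
"""

@v_args(inline=True)  # pass each node's children as positional arguments
class CalcTransformer(Transformer):
    def number(self, token):
        return float(token)

    def product(self, *factors):
        result = 1.0
        for factor in factors:
            result *= factor
        return result

    def sum(self, *terms):
        return sum(terms)  # the builtin sum(), not this method

    def start(self, value):
        return value

parser = Lark(grammar)
print(CalcTransformer().transform(parser.parse("2 + 3 * 4")))  # 14.0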

See docs/tree-processing.md for the full API.

Tokens and Lexing

Token representation and lexical analysis including the Token class for lexical units, lexer configuration, and indentation handling for Python-like languages.

class Token(str):
    type: str
    line: int
    column: int
    def __new__(cls, type_: str, value: str, start_pos=None, line=None, column=None): ...

class Indenter:
    def process(self, stream: Iterator[Token]) -> Iterator[Token]: ...
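
A small sketch of standalone lexing; lexer="standard" is requested explicitly because (as a working assumption) the default Earley configuration uses a dynamic lexer that cannot run without the parser:

from lark import Lark

grammar = """
    start: NUMBER ("+" NUMBER)*

    %import common.NUMBER
    %import common.WS
    %ignore WS
"""

parser = Lark(grammar, lexer="standard")

# Prints each token's type, text, and position; ignored terminals
# such as whitespace are not emitted
for tok in parser.lex("2 + 30 + 400"):
    print(tok.type, repr(str(tok)), tok.line, tok.column)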

See docs/tokens-lexing.md for the full API.

Exception Handling

Comprehensive error handling including parse errors, lexical errors, grammar errors, and unexpected input handling with context information and error recovery.

class LarkError(Exception): ...
class ParseError(LarkError): ...
class LexError(LarkError): ...
class GrammarError(LarkError): ...

class UnexpectedInput(LarkError):
    def get_context(self, text: str, span: int = 40) -> str: ...

class UnexpectedToken(ParseError, UnexpectedInput):
    token: Token
    accepts: Set[str]
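
A minimal error-handling sketch (the two-token grammar is invented for illustration):

from lark import Lark, UnexpectedInput

parser = Lark("""
    start: "a" "b"

    %import common.WS
    %ignore WS
""")

text = "a c"
try:
    parser.parse(text)
except UnexpectedInput as exc:
    # get_context() excerpts the offending input with a position marker
    print(type(exc).__name__, "at line", exc.line, "column", exc.column)
    print(exc.get_context(text))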

See docs/exceptions.md for the full API.

Utilities and Tools

Additional utilities including AST generation helpers, tree reconstruction, standalone parser generation, serialization, and visualization tools.

def create_transformer(ast_module, transformer=None): ...

class Reconstructor:
    def __init__(self, parser: Lark): ...
    def reconstruct(self, tree: Tree) -> str: ...

def gen_standalone(lark_instance: Lark, out=None, compress: bool = False): ...
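
A sketch of round-tripping a parse tree back to source text. In the published package, Reconstructor is imported from lark.reconstruct; maybe_placeholders=False mirrors the library's own reconstruction examples. Ignored whitespace is not restored, so output may differ cosmetically from the input:

from lark import Lark
from lark.reconstruct import Reconstructor

grammar = """
    start: "(" NUMBER ("," NUMBER)* ")"

    %import common.NUMBER
    %import common.WS
    %ignore WS
"""

parser = Lark(grammar, maybe_placeholders=False)
tree = parser.parse("( 1, 2, 3 )")

print(Reconstructor(parser).reconstruct(tree))  # e.g. "(1,2,3)"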

See docs/utilities.md for the full API.

Types

# Core types
Tree = Tree
Token = Token

# Parser configuration
LarkOptions = LarkOptions
PostLex = PostLex
LexerConf = LexerConf
ParserConf = ParserConf

# Interactive parsing
InteractiveParser = InteractiveParser
ImmutableInteractiveParser = ImmutableInteractiveParser

# Grammar building
Symbol = Symbol
Terminal = Terminal
NonTerminal = NonTerminal
Rule = Rule
RuleOptions = RuleOptions

# Visitor/Transformer types  
Transformer = Transformer
Visitor = Visitor
Interpreter = Interpreter

# AST utilities
Ast = Ast
AsList = AsList
Reconstructor = Reconstructor

# Exception types
LarkError = LarkError
ParseError = ParseError
LexError = LexError
GrammarError = GrammarError
UnexpectedInput = UnexpectedInput
UnexpectedToken = UnexpectedToken
UnexpectedCharacters = UnexpectedCharacters
UnexpectedEOF = UnexpectedEOF
VisitError = VisitError
Discard = Discard