Pythonic Pandoc filters library for programmatic document manipulation and transformation
npx @tessl/cli install tessl/pypi-panflute@2.3.0

A Python library that provides a pythonic alternative to pandocfilters for creating Pandoc document filters. Panflute enables programmatic manipulation and transformation of documents in various formats (Markdown, LaTeX, HTML, etc.) by providing an intuitive object-oriented API for working with Pandoc's abstract syntax tree (AST).

pip install panflute

import panflute as pf

For specific components:

from panflute import Doc, Para, Str, run_filter, stringify

Check version:

import panflute as pf
print(pf.__version__)  # '2.3.1'

import panflute as pf

def capitalize_strings(elem, doc):
    """Filter function that uppercases the text of every Str element."""
    if isinstance(elem, pf.Str):
        return pf.Str(elem.text.upper())

if __name__ == '__main__':
    pf.run_filter(capitalize_strings)

Creating documents programmatically:

import panflute as pf
# Create a simple document
doc = pf.Doc(
    pf.Header(pf.Str('My Title'), level=1),
    pf.Para(pf.Str('Hello '), pf.Strong(pf.Str('world')), pf.Str('!')),
    pf.Para(pf.Link(pf.Str('Visit our site'), url='https://example.com'))
)
# Convert to JSON for Pandoc
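pf.dump(doc)

Panflute's architecture is built around Pandoc's AST hierarchy: every node derives from Element, with Block, Inline, and MetaValue as the base classes for block-level, inline, and metadata elements.
The library provides tree traversal through the walk() method, which applies an action to every element in the tree, so a filter can transform a document while the AST structure and parent-child relationships stay intact.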
Core functions for reading and writing Pandoc JSON documents, running filter functions, and managing document processing workflows.
def load(input_stream=None) -> Doc: ...
def dump(doc: Doc, output_stream=None): ...
def run_filter(action: callable, **kwargs): ...
def run_filters(actions: list, **kwargs): ...

Complete set of Pandoc AST element classes for building and manipulating documents, including blocks, inlines, metadata, and table components.

class Doc(Element): ...
class Para(Block): ...
class Header(Block): ...
class Str(Inline): ...
class Emph(Inline): ...
class Link(Inline): ...
class Table(Block): ...
class TableRow(Element): ...
class TableCell(Element): ...

Utility functions for text extraction, document conversion, YAML processing, and external tool integration.

def stringify(element, newlines=True) -> str: ...
def convert_text(text, input_format='markdown', output_format='panflute', **kwargs): ...
def yaml_filter(element, doc, **kwargs): ...
def shell(args, wait=True, msg=None): ...

CLI tools for running panflute as a Pandoc filter with automatic filter discovery and execution capabilities.

def main(): ...
def panfl(): ...
def stdio(filters=None, **kwargs): ...

class Element:
    """
    Base class for all Pandoc elements.

    Provides core functionality for element tree traversal, JSON serialization,
    and parent-child relationship tracking.

    Properties:
    - parent: parent element (None for root)
    - location: attribute name in parent where this element is stored
    - index: position in parent container (for list elements)
    - tag: element type name (read-only)

    Methods:
    - walk(action, doc=None, stop_if=None): traverse element tree applying action
    - to_json(): serialize element to Pandoc JSON format
    """
    parent: Element | None
    location: str | None
    index: int | None

    @property
    def tag(self) -> str: ...

    def walk(self, action: callable, doc: Doc = None, stop_if: callable = None) -> Element: ...
    def to_json(self) -> dict: ...

class Block(Element):
    """Base class for block-level elements (paragraphs, headers, lists, etc.)."""

class Inline(Element):
    """Base class for inline elements (text, emphasis, links, etc.)."""

class MetaValue(Element):
    """Base class for metadata elements (strings, lists, maps, etc.)."""

class ListContainer:
    """
    Wrapper around a list to track elements' parents.

    This class shouldn't be instantiated directly by users,
    but by the elements that contain it.

    Parameters:
    - *args: elements contained in the list
    - oktypes: type or tuple of types allowed as items (default: object)
    - parent: the parent element

    Methods:
    - walk(action, doc=None, stop_if=None): apply action to all contained elements
    - to_json(): convert to JSON representation
    """
    def __init__(self, *args, oktypes=object, parent=None): ...

class DictContainer:
    """
    Wrapper around a dict to track elements' parents.

    This class shouldn't be instantiated directly by users,
    but by the elements that contain it.

    Parameters:
    - *args: elements contained in the dict (sequence of tuples)
    - oktypes: type or tuple of types allowed as items (default: object)
    - parent: the parent element
    - **kwargs: additional key-value pairs

    Methods:
    - walk(action, doc=None, stop_if=None): apply action to all contained elements
    - to_json(): convert to JSON representation
    """
    def __init__(self, *args, oktypes=object, parent=None, **kwargs): ...

# Version information
__version__: str # Library version string (e.g., '2.3.1')