tessl/pypi-markdown-it-py

Python port of markdown-it providing CommonMark-compliant markdown parsing with configurable syntax and pluggable architecture

Core Parsing and Rendering

Name: tessl/pypi-markdown-it-py
Author: tessl

Main parsing functionality for converting markdown text to HTML or tokens, with support for all CommonMark features and configurable parsing behavior.

Capabilities

Main Parser Class

The MarkdownIt class is the primary interface for markdown processing, coordinating all parsing components and providing the main API for text conversion.

class MarkdownIt:
    def __init__(
        self,
        config: str | dict = "commonmark",
        options_update: dict | None = None,
        *,
        renderer_cls: Callable[[MarkdownIt], RendererProtocol] = RendererHTML,
    ):
        """
        Initialize markdown parser.
        
        Parameters:
        - config: preset name ('commonmark', 'default', 'js-default', 'zero', 'gfm-like') or config dict
        - options_update: additional options merged over the preset's options (optional)
        - renderer_cls: renderer class, called with the parser instance, used for output generation
        """
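
As a minimal sketch of how the constructor arguments combine, the example below builds one parser from a preset plus an options override and another from the "zero" preset. The option names `breaks` and `html` are not listed in this section; they are assumed here from the library's standard option set.

```python
from markdown_it import MarkdownIt

# Start from the CommonMark preset, then override two options.
# "breaks" turns single newlines into line breaks; "html" toggles raw HTML passthrough.
md = MarkdownIt("commonmark", {"breaks": True, "html": False})
with_breaks = md.render("line one\nline two")

# The "zero" preset starts with nearly every rule disabled, so emphasis
# markers should come through as literal text.
md_zero = MarkdownIt("zero")
plain = md_zero.render("**not bold**")

print(with_breaks)
print(plain)
```

Starting from "zero" and enabling only the rules you need is one way to build a deliberately restricted parser.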

Rendering Methods

Convert markdown text directly to output format (HTML by default).

def render(self, src: str, env: dict | None = None) -> str:
    """
    Render markdown string to HTML.
    
    Parameters:
    - src: markdown text to parse
    - env: environment sandbox for metadata (optional)
    
    Returns:
    - str: rendered HTML output
    """

def renderInline(self, src: str, env: dict | None = None) -> str:
    """
    Render single paragraph content without wrapping in <p> tags.
    
    Parameters:
    - src: inline markdown text
    - env: environment sandbox (optional)
    
    Returns:
    - str: rendered HTML without paragraph wrapper
    """

Usage Example:

from markdown_it import MarkdownIt

md = MarkdownIt()

# Full document rendering
html = md.render("# Title\n\nParagraph with **bold** text.")
# Returns: '<h1>Title</h1>\n<p>Paragraph with <strong>bold</strong> text.</p>\n'

# Inline rendering (no <p> wrapper)
inline_html = md.renderInline("Text with **bold** and _italic_.")
# Returns: 'Text with <strong>bold</strong> and <em>italic</em>.'

Parsing Methods

Convert markdown text to structured token representation for advanced processing.

def parse(self, src: str, env: dict | None = None) -> list[Token]:
    """
    Parse markdown to token stream.
    
    Parameters:
    - src: markdown text to parse
    - env: environment sandbox for metadata (optional)
    
    Returns:
    - list[Token]: list of parsed tokens representing document structure
    """

def parseInline(self, src: str, env: dict | None = None) -> list[Token]:
    """
    Parse inline content only, skipping block rules.
    
    Parameters:
    - src: inline markdown text
    - env: environment sandbox (optional)
    
    Returns:
    - list[Token]: tokens with single inline element containing parsed children
    """

Usage Example:

from markdown_it import MarkdownIt

md = MarkdownIt()

# Parse to tokens for processing
tokens = md.parse("# Header\n\n*Emphasis* and **strong**.")

for token in tokens:
    print(f"Type: {token.type}, Tag: {token.tag}, Content: {token.content}")

# Output:
# Type: heading_open, Tag: h1, Content: 
# Type: inline, Tag: , Content: Header
# Type: heading_close, Tag: h1, Content: 
# Type: paragraph_open, Tag: p, Content: 
# Type: inline, Tag: , Content: *Emphasis* and **strong**.
# Type: paragraph_close, Tag: p, Content:
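
The example above covers parse only. A comparable sketch for parseInline follows; it relies on the `children` attribute of `Token`, which this section does not document, so treat that attribute as an assumption.

```python
from markdown_it import MarkdownIt

md = MarkdownIt()

tokens = md.parseInline("*Emphasis* and **strong**.")

# The result is a one-element list holding a single "inline" token
inline_token = tokens[0]
print(len(tokens), inline_token.type)

# The parsed inline structure lives in that token's children
child_types = [child.type for child in inline_token.children]
print(child_types)
```

Because block rules are skipped, no paragraph_open or paragraph_close tokens surround the inline token.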

Parser Components Access

Access internal parser components for advanced customization.

# Access to internal parsers (exposed as plain instance attributes; the property-style stubs below document their types)
@property
def inline(self) -> ParserInline:
    """Inline parser instance for processing inline elements."""

@property  
def block(self) -> ParserBlock:
    """Block parser instance for processing block elements."""

@property
def core(self) -> ParserCore:
    """Core parser instance coordinating parsing pipeline."""

@property
def renderer(self) -> RendererProtocol:
    """Renderer instance for output generation."""

# Dictionary-style access
def __getitem__(self, name: str) -> Any:
    """Access parser components by name ('inline', 'block', 'core', 'renderer')."""

Usage Example:

from markdown_it import MarkdownIt

md = MarkdownIt()

# Access parser components
block_parser = md.block
inline_parser = md['inline']
renderer = md.renderer

# Check active rules
print("Active block rules:", md.block.ruler.get_active_rules())
print("Active inline rules:", md.inline.ruler.get_active_rules())
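
One possible use of these components is checking the effect of switching a rule off. The sketch below assumes the `disable()` method of `MarkdownIt` and the built-in rule name `emphasis`, neither of which is described in this section.

```python
from markdown_it import MarkdownIt

md = MarkdownIt()

before = md.inline.ruler.get_active_rules()

# disable() switches a rule off by name across the parser chains
md.disable("emphasis")
after = md.inline.ruler.get_active_rules()

print("emphasis" in before, "emphasis" in after)

# With the rule off, asterisks are no longer treated as emphasis markers
html = md.render("*plain*")
print(html)
```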

Environment and Metadata

The environment parameter provides a sandbox for passing data between parsing stages and retrieving metadata.

# Environment type definition (any mutable mapping is accepted)
EnvType = MutableMapping[str, Any]

Usage Example:

from markdown_it import MarkdownIt

md = MarkdownIt()
env = {}

# Parse with environment to collect metadata
html = md.render("[Link text][ref]\n\n[ref]: https://example.com", env)

# Environment now contains references (labels are normalized, so the key is uppercased)
print(env)
# Output resembles: {'references': {'REF': {'title': '', 'href': 'https://example.com', ...}}}
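
To illustrate passing data between stages, the following sketch splits rendering into an explicit parse step and a render step that share one environment. It assumes the renderer's `render(tokens, options, env)` method and the parser's `options` attribute, which this section does not describe.

```python
from markdown_it import MarkdownIt

md = MarkdownIt()
src = "[Link text][ref]\n\n[ref]: https://example.com"

# Stage 1: parsing fills env with the link reference definitions
env = {}
tokens = md.parse(src, env)

# Stage 2: hand the same tokens and env to the renderer
html = md.renderer.render(tokens, md.options, env)

print("references" in env)
print(html)
```

Splitting the stages this way leaves room to inspect or modify the token list before it is rendered.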

Error Handling

The parser raises exceptions for invalid input or configuration:

TypeError: Input of the wrong type, such as a non-string src
KeyError: Unknown preset name
ValueError: Invalid configuration or unknown rule names

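from markdown_it import MarkdownIt

try:
    md = MarkdownIt('invalid-preset')
except KeyError as e:
    print(f"Unknown preset: {e}")

# Use a valid parser here: `md` is never bound when the constructor above fails
md = MarkdownIt()

try:
    html = md.render(123)  # Invalid input type
except TypeError as e:
    print(f"Invalid input: {e}")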
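
The ValueError case can be sketched in the same way. This example assumes the `enable()` method and its `ignoreInvalid` parameter, which are not described in this section; `no-such-rule` is a made-up rule name.

```python
from markdown_it import MarkdownIt

md = MarkdownIt()
caught = None

try:
    md.enable("no-such-rule")  # hypothetical rule name that does not exist
except ValueError as e:
    caught = str(e)
    print(f"Unknown rule: {e}")

# ignoreInvalid=True skips unknown names instead of raising
md.enable("no-such-rule", ignoreInvalid=True)
```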
