
tessl/pypi-myst-parser

An extended CommonMark compliant parser, with bridges to docutils and Sphinx


Parsing System

MyST ships two parser implementations, one for Sphinx and one for plain docutils, together with a factory function that creates a configured markdown-it-py parser and helpers for parsing directives and their options.

Capabilities

Sphinx Parser

MyST parser implementation for Sphinx. Sphinx instantiates it for the registered source suffixes, and the parser takes its configuration from the Sphinx build environment.

class MystParser:
    """
    Main Sphinx parser for MyST Markdown files.
    
    Attributes:
        supported: Tuple of supported file extensions
    """
    supported: tuple[str, ...] = ("md", "markdown", "myst")
    
    def parse(self, inputstring: str, document) -> None:
        """
        Parse MyST markdown source to populate docutils document.
        
        Args:
            inputstring: MyST markdown source text
            document: Docutils document node to populate
        """

Usage example:

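from docutils.frontend import OptionParser
from docutils.parsers.rst import Parser as RstParser
from docutils.utils import new_document
from myst_parser.parsers.sphinx_ import MystParser

# Create Sphinx parser
parser = MystParser()

# Check supported extensions
print(parser.supported)  # ("md", "markdown", "myst")

# Create document for parsing
settings = OptionParser(components=(RstParser,)).get_default_values()
document = new_document('<rst-doc>', settings=settings)

# Parse MyST content
# The ::: fence below requires the "colon_fence" extension
# (myst_enable_extensions in conf.py).
source = """
# MyST Document

This is MyST markdown with {emphasis}`role syntax`.

:::{note}
This is a directive.
:::
"""

# NOTE: MystParser reads its configuration from the Sphinx build
# environment (document.settings.env), so parse() is normally invoked by
# Sphinx during a build. Calling it on a bare docutils document, as sketched
# here, is expected to fail unless such an environment is attached; use the
# docutils Parser for standalone parsing.
parser.parse(source, document)
print(document.pformat())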
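
Inside a Sphinx project the parser is usually enabled through `conf.py` rather than instantiated by hand. The fragment below is a minimal configuration sketch; which syntax extensions to enable is an assumption made for illustration.

```python
# conf.py (sketch): enable MyST in a Sphinx project
extensions = ["myst_parser"]

# Optional MyST syntax extensions; "colon_fence" enables ::: directive fences
# and "tasklist" enables "- [x]" style task lists (illustrative choice)
myst_enable_extensions = ["colon_fence", "tasklist"]

# Map file suffixes to parsers; ".md" files are handled by MyST
source_suffix = {
    ".rst": "restructuredtext",
    ".md": "markdown",
}
```

With this in place, Sphinx creates the parser itself and supplies the build environment it needs.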

Docutils Parser

Pure docutils parser implementation for MyST markdown, enabling use of MyST outside of Sphinx environments with full docutils compatibility.

class Parser:
    """
    Docutils parser for MyST markdown.
    
    Provides MyST parsing capabilities in pure docutils environments
    without Sphinx dependencies.
    """
    
    def parse(self, inputstring: str, document) -> None:
        """
        Parse MyST markdown source to populate docutils document.
        
        Args:
            inputstring: MyST markdown source text
            document: Docutils document node to populate
        """

class Unset:
    """Sentinel class for unset settings."""

# Module constant
DOCUTILS_UNSET: Unset

Usage example:

from docutils.core import publish_doctree, publish_parts
from myst_parser.parsers.docutils_ import Parser

# Create docutils parser
parser = Parser()

# MyST source to parse
source = """
# MyST Document

- [x] Completed task
- [ ] Pending task

:::{admonition} Note
This is an admonition.
:::
"""

# Task lists and ::: fences are optional syntax, so enable the
# corresponding MyST extensions through docutils settings
settings = {"myst_enable_extensions": ["colon_fence", "tasklist"]}

# Method 1: Parse to a docutils document tree
document = publish_doctree(source, parser=parser, settings_overrides=settings)
print(document.pformat())

# Method 2: Publish to a specific output format
html_output = publish_parts(
    source,
    parser=parser,
    writer_name='html5',
    settings_overrides=settings
)
print(html_output['html_body'])
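
The `Unset` class and `DOCUTILS_UNSET` constant listed above follow the common sentinel pattern: a dedicated object distinguishes "setting not provided" from legitimate values such as `None`, `False`, or `0`. The sketch below illustrates the pattern in isolation. The names `UNSET` and `resolve_setting` and the settings dictionary are hypothetical and are not part of myst-parser.

```python
class Unset:
    """Sentinel type marking a setting that was never provided."""

    def __repr__(self) -> str:
        return "UNSET"

    def __bool__(self) -> bool:
        # A sentinel should never be mistaken for a truthy value
        return False


# Hypothetical module-level sentinel instance
UNSET = Unset()


def resolve_setting(settings: dict, name: str, default):
    """Return settings[name], or default if the key was never set.

    Unlike `settings.get(name) or default`, explicit None/False/0 survive.
    """
    value = settings.get(name, UNSET)
    return default if value is UNSET else value


settings = {"heading_anchors": 0, "title": None}
print(resolve_setting(settings, "heading_anchors", 2))   # 0
print(resolve_setting(settings, "title", "Untitled"))    # None
print(resolve_setting(settings, "missing", "fallback"))  # fallback
```

Comparing with `is` rather than `==` keeps the check unambiguous even when a setting value defines its own equality.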

Markdown Parser Factory

Factory function for creating configured markdown-it-py parsers with MyST extensions and custom renderers, providing the core parsing engine.

def create_md_parser(config: MdParserConfig, renderer) -> MarkdownIt:
    """
    Create markdown parser with MyST configuration.
    
    Args:
        config: MdParserConfig instance with parsing options
        renderer: Renderer class (a callable that receives the MarkdownIt
            instance), for example DocutilsRenderer
        
    Returns:
        Configured MarkdownIt parser instance with MyST extensions
    """

Usage example:

from myst_parser.config.main import MdParserConfig
from myst_parser.mdit_to_docutils.base import DocutilsRenderer, make_document
from myst_parser.parsers.mdit import create_md_parser

# Create configuration
config = MdParserConfig(
    enable_extensions={"tasklist", "deflist", "substitution"},
    heading_anchors=2,
    footnote_sort=True
)

# Create the document that will receive the parsed nodes
document = make_document("example.md")

# Create configured parser; pass the renderer class, not an instance
md_parser = create_md_parser(config, DocutilsRenderer)

# Tell the renderer which document to populate
md_parser.options["document"] = document

# Parse content
source = """
# Document Title

Term
: Definition here

- [x] Completed
- [ ] Pending

{emphasis}`Role text`
"""

# Render to docutils; the resulting nodes are added to `document`
md_parser.render(source)
print(document.pformat())

Directive Parsing

Parsing of MyST directive text into arguments, options, and body content, with option validation and collection of non-fatal warnings.

@dataclass
class ParseWarnings:
    """
    Dataclass for a single non-fatal parsing warning.
    
    Attributes:
        msg: Warning message
        lineno: Line number the warning refers to, if known
        type: Warning category (MystWarnings from myst_parser.warnings_)
    """
    msg: str
    lineno: int | None = None
    type: MystWarnings = MystWarnings.DIRECTIVE_PARSING

@dataclass
class DirectiveParsingResult:
    """
    Dataclass for directive parsing results.
    
    Attributes:
        arguments: Arguments parsed from the first line
        options: Options parsed from the option block
        body: Lines of body content
        body_offset: Number of lines to the start of the body content
        warnings: Non-fatal problems encountered during parsing
    """
    arguments: list[str]
    options: dict[str, Any]
    body: list[str]
    body_offset: int
    warnings: list[ParseWarnings]

def parse_directive_text(
    directive_class,
    first_line: str,
    content: str,
    *,
    line: int | None = None,
    validate_options: bool = True,
    additional_options: dict[str, str] | None = None,
) -> DirectiveParsingResult:
    """
    Parse and validate directive text.
    
    Args:
        directive_class: Docutils directive class
        first_line: Text on the same line as the directive name; an
            argument or body text, depending on the directive
        content: All text after the first line; may start with an option block
        line: Line number of the directive in the source, if known
        validate_options: Whether to validate the values of options
        additional_options: Extra options to accept beyond those defined
            by the directive class
        
    Returns:
        DirectiveParsingResult with parsed components and warnings
        
    Raises:
        MarkupError: When there is a fatal parsing or validation error
    """

Usage example:

from docutils.parsers.rst.directives import admonitions
from myst_parser.parsers.directives import parse_directive_text

# Parse an admonition directive; its title is the directive argument
# (the plain Note directive accepts no arguments)
result = parse_directive_text(
    admonitions.Admonition,
    "Important Information",
    "This is important content.\n\nWith multiple paragraphs.",
    validate_options=True
)

print(result.arguments)  # ["Important Information"]
print(result.body)       # ["This is important content.", "", "With multiple paragraphs."]
print(result.warnings)   # Any non-fatal parsing warnings

# Parse directive with options; the option block opens the content
result = parse_directive_text(
    admonitions.Note,
    "",
    ":class: custom-note\n:name: my-note\n\nContent here",
    validate_options=True
)

print(result.options)  # {"class": ["custom-note"], "name": "my-note"}
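
The options in the example above come from `:key: value` lines that open the directive content. The sketch below shows one simplified way to separate such a leading option block from the body using only the standard library. `split_option_block` is a hypothetical helper written for illustration; it performs no validation and is not myst-parser's implementation.

```python
def split_option_block(content: str) -> tuple[dict[str, str], list[str]]:
    """Split leading ':key: value' lines from the remaining body lines."""
    lines = content.splitlines()
    options: dict[str, str] = {}
    index = 0
    # Consume lines for as long as they start with a colon
    while index < len(lines) and lines[index].lstrip().startswith(":"):
        # Drop the leading colon, then split "key: value" on the first colon
        key, _, value = lines[index].lstrip()[1:].partition(":")
        options[key.strip()] = value.strip()
        index += 1
    body = lines[index:]
    # A blank line separating the options from the body is not body content
    if body and not body[0].strip():
        body = body[1:]
    return options, body


options, body = split_option_block(
    ":class: custom-note\n:name: my-note\n\nContent here"
)
print(options)  # {'class': 'custom-note', 'name': 'my-note'}
print(body)     # ['Content here']
```

In this sketch every value stays a raw string; converting `class` into a list, as in the output shown earlier, is the job of the directive's option validators.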

Option String Parsing

Tokenizer-based parsing of directive option blocks into key-value pairs, with position tracking so that tokenization errors can report where they occurred.

class Position:
    """Line/column position tracking."""
    line: int
    column: int

class StreamBuffer:
    """Buffer for parsing option strings."""
    
    def __init__(self, data: str): ...
    def read(self, n: int = -1) -> str: ...
    def peek(self, n: int = 1) -> str: ...

class Token:
    """Base token class."""
    value: str
    position: Position

class KeyToken(Token):
    """Token representing option key."""

class ValueToken(Token):
    """Token representing option value."""

class ColonToken(Token):
    """Token representing colon separator."""

class TokenizeError(Exception):
    """Exception raised during tokenization."""

class State:
    """Parser state management."""
    
    def __init__(self): ...
    def push_state(self, new_state: str): ...
    def pop_state(self) -> str: ...

def options_to_items(
    text: str, line_length: int = 500, allow_eof: bool = True
) -> tuple[list[tuple[str, str]], State]:
    """
    Parse a directive option block into key-value items.
    
    Args:
        text: Directive option text, one "key: value" pair per line
        line_length: Line length used when formatting error messages
        allow_eof: Whether an unexpected end of the text is tolerated
        
    Returns:
        Tuple of the list of (key, value) tuples and the final parser State
        
    Raises:
        TokenizeError: When tokenization fails
    """

Usage example:

from myst_parser.parsers.options import options_to_items, TokenizeError

# Parse simple options (the option text uses "key: value" lines,
# without the leading colon used inside directive content)
options = "class: custom-class\nname: my-element"
items, state = options_to_items(options)
print(items)  # [("class", "custom-class"), ("name", "my-element")]

# Parse options with more complex values
options = """
class: class1 class2 class3
style: color: red; font-weight: bold
data-attr: complex-value
"""

try:
    items, state = options_to_items(options)
    for key, value in items:
        print(f"{key}: {value}")
except TokenizeError as e:
    print(f"Parsing error: {e}")
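
To make the roles of `Position` and `TokenizeError` more concrete, the following standalone sketch tokenizes a simplified `key: value` line format and reports the position of the first malformed line. The classes here are hypothetical stand-ins that only share names with the ones listed above, and `tokenize_options` is an illustrative helper, not the library's tokenizer. Line and column numbers are zero-based in this sketch.

```python
from dataclasses import dataclass


@dataclass
class Position:
    """Zero-based line/column of a character in the option text."""
    line: int
    column: int


class TokenizeError(Exception):
    """Raised when a line cannot be split into a key and a value."""

    def __init__(self, problem: str, position: Position):
        super().__init__(
            f"{problem} (line {position.line}, column {position.column})"
        )
        self.position = position


def tokenize_options(text: str) -> list[tuple[str, str]]:
    """Split 'key: value' lines into (key, value) tuples."""
    items = []
    for line_no, raw in enumerate(text.splitlines()):
        if not raw.strip():
            continue  # skip blank lines
        indent = len(raw) - len(raw.lstrip())
        # Split on the first colon only, so values may contain colons
        key, sep, value = raw.strip().partition(":")
        if not sep or not key.strip():
            raise TokenizeError(
                "expected 'key: value'", Position(line_no, indent)
            )
        items.append((key.strip(), value.strip()))
    return items


print(tokenize_options("class: a b\nstyle: color: red"))
# [('class', 'a b'), ('style', 'color: red')]

try:
    tokenize_options("class: ok\n  no-separator-here")
except TokenizeError as err:
    print(err)  # expected 'key: value' (line 1, column 2)
```

Carrying a `Position` on the exception lets the caller translate the failure into a warning that points at the offending line of the source document.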

Types

Parsing-related type definitions:

class ParseWarnings:
    """A single non-fatal warning raised while parsing a directive."""
    msg: str
    lineno: int | None
    type: MystWarnings

class DirectiveParsingResult:
    """Results from directive parsing operation."""
    arguments: list[str]
    options: dict[str, Any]
    body: list[str]
    body_offset: int
    warnings: list[ParseWarnings]

class Position:
    """Position tracking for parsing operations."""
    line: int
    column: int

class TokenizeError(Exception):
    """Exception raised during option string tokenization."""

class Unset:
    """Sentinel class for unset configuration values."""

# Defined in docutils
MarkupError = docutils.parsers.rst.states.MarkupError

Install with Tessl CLI

npx tessl i tessl/pypi-myst-parser
