# markdown-it-py

A Python port of the markdown-it JavaScript library. It provides CommonMark-compliant Markdown parsing with configurable syntax rules and a pluggable architecture, with an emphasis on performance, safe handling of untrusted input, and extensibility.

## Package Information

- **Package Name**: markdown-it-py
- **Language**: Python
- **Installation**: `pip install markdown-it-py`
- **Optional Dependencies**:
  - `pip install markdown-it-py[linkify]` for URL autoconversion
  - `pip install markdown-it-py[plugins]` for syntax extensions

## Core Imports

```python
from markdown_it import MarkdownIt
```

For advanced usage:

```python
from markdown_it import MarkdownIt
from markdown_it.token import Token
from markdown_it.tree import SyntaxTreeNode
from markdown_it.renderer import RendererHTML
```

## Basic Usage

```python
from markdown_it import MarkdownIt

# Basic markdown parsing
md = MarkdownIt()
html = md.render("# Hello World\n\nThis is **bold** text.")
print(html)
# Output: <h1>Hello World</h1>\n<p>This is <strong>bold</strong> text.</p>\n

# Using presets and enabling features
md = (
    MarkdownIt('commonmark', {'breaks': True, 'html': True})
    .enable(['table', 'strikethrough'])
)

markdown_text = """
# Table Example

| Header 1 | Header 2 |
|----------|----------|
| Cell 1   | Cell 2   |

~~Strikethrough text~~
"""

html = md.render(markdown_text)
print(html)

# Parse to tokens for advanced processing
tokens = md.parse(markdown_text)
for token in tokens:
    print(f"{token.type}: {token.tag}")
```

## Architecture

markdown-it-py parses in stages, with each stage driven by an ordered chain of rules:

- **MarkdownIt**: Main parser class that coordinates all components
- **ParserCore**: Runs the top-level parsing pipeline through its rule chain
- **ParserBlock**: Processes block-level elements (paragraphs, headings, lists)
- **ParserInline**: Handles inline elements (emphasis, links, code spans)
- **Renderer**: Converts parsed tokens to an output format (HTML by default)
- **Token**: Represents a parsed markdown element with its metadata
- **Ruler**: Manages parsing rules and their execution order

Parsing behavior can therefore be customized at each stage by enabling, disabling, or replacing rules, by swapping in a custom renderer, or by loading plugins.
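
As a rough illustration of the components listed above, the sketch below inspects a parser instance. It assumes the `core`, `block`, `inline`, and `renderer` attributes and the `get_active_rules()` method of `MarkdownIt`, which this overview does not otherwise describe, so treat it as an exploratory aid rather than a reference.

```python
from markdown_it import MarkdownIt

md = MarkdownIt("commonmark")

# get_active_rules() maps each rule chain to the names of its enabled rules.
active = md.get_active_rules()
for chain, rules in active.items():
    print(f"{chain}: {', '.join(rules)}")

# The parser stages and the renderer are plain attributes of the instance.
stage_names = [
    type(stage).__name__
    for stage in (md.core, md.block, md.inline, md.renderer)
]
print(stage_names)
```

Printing the active rules before and after a configuration change is a quick way to confirm which rules a preset or plugin actually switched on.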

## Capabilities

### Core Parsing and Rendering

Main parsing functionality for converting markdown text to HTML or tokens, with support for all CommonMark features and configurable presets.

```python { .api }
class MarkdownIt:
    def render(self, src: str, env: dict = None) -> str: ...
    def parse(self, src: str, env: dict = None) -> list[Token]: ...
    def renderInline(self, src: str, env: dict = None) -> str: ...
    def parseInline(self, src: str, env: dict = None) -> list[Token]: ...
```

[Core Parsing](./core-parsing.md)
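
The difference between the block-level and inline entry points is easiest to see side by side. This is a minimal sketch using the default configuration; the sample string is arbitrary.

```python
from markdown_it import MarkdownIt

md = MarkdownIt()
src = "Some *emphasized* text"

# render() parses block structure, so the text is wrapped in a paragraph.
block_html = md.render(src)
print(repr(block_html))

# renderInline() skips block parsing and emits no wrapper element.
inline_html = md.renderInline(src)
print(repr(inline_html))

# parse() stops before rendering and returns the token stream instead.
token_types = [token.type for token in md.parse(src)]
print(token_types)
```

`renderInline` suits short fragments such as titles or table cells, where a surrounding `<p>` would be unwanted.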

### Configuration and Presets

Parser configuration system with built-in presets (commonmark, default, zero, gfm-like) and rule management for customizing parsing behavior.

```python { .api }
class MarkdownIt:
    def configure(self, presets: str | dict, options_update: dict = None) -> MarkdownIt: ...
    def enable(self, names: str | list[str], ignoreInvalid: bool = False) -> MarkdownIt: ...
    def disable(self, names: str | list[str], ignoreInvalid: bool = False) -> MarkdownIt: ...
    def use(self, plugin: callable, *params, **options) -> MarkdownIt: ...
```

[Configuration](./configuration.md)
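
The following sketch shows how `enable` and `disable` change what a preset parses. The rule name `no_such_rule` is made up for the example, and the claim that unknown names raise `ValueError` reflects the library's behavior as I understand it rather than anything stated in this overview.

```python
from markdown_it import MarkdownIt

table_src = "| a | b |\n|---|---|\n| 1 | 2 |\n"

# The commonmark preset leaves the table rule off; enable() switches it on.
plain = MarkdownIt("commonmark")
with_tables = MarkdownIt("commonmark").enable("table")
plain_has_table = "<table>" in plain.render(table_src)
enabled_has_table = "<table>" in with_tables.render(table_src)
print(plain_has_table, enabled_has_table)

# disable() is the inverse: with emphasis off, asterisks stay literal.
no_emphasis = MarkdownIt("commonmark").disable("emphasis")
literal_html = no_emphasis.render("*not emphasized*")
print(literal_html)

# "no_such_rule" is a hypothetical name used only to trigger the error path.
try:
    MarkdownIt().enable("no_such_rule")
    rejected = False
except ValueError:
    rejected = True
MarkdownIt().enable("no_such_rule", ignoreInvalid=True)  # silently skipped
print(rejected)
```

Because both methods return the parser, calls can be chained directly on the constructor, as the Basic Usage example does.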

### Token System

Structured representation of parsed markdown elements with metadata, attributes, and hierarchical relationships for advanced processing and custom rendering.

```python { .api }
class Token:
    type: str
    tag: str
    attrs: dict[str, str | int | float]
    content: str
    children: list[Token] | None
    def as_dict(self) -> dict: ...
    def copy(self, **changes) -> Token: ...
```

[Token System](./token-system.md)
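
To make the token layout concrete, here is a small sketch that parses one heading and looks at the result. The `nesting` and `map` attributes are not part of the listing above; they are included on the assumption that they are available on `Token`.

```python
from markdown_it import MarkdownIt

tokens = MarkdownIt().parse("# Title with *emphasis*")

# Block tokens come as open/close pairs around a single "inline" token.
for token in tokens:
    print(token.type, repr(token.tag), token.nesting, token.map)

# Inline content lives in the children of the "inline" token.
inline = tokens[1]
child_types = [child.type for child in inline.children]
print(child_types)

# as_dict() gives a plain-dict view, handy for logging or serialization.
heading_dict = tokens[0].as_dict()
print(heading_dict["type"], heading_dict["tag"])
```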

### Rendering and Output

HTML rendering system with customizable render rules and support for custom renderers to generate different output formats.

```python { .api }
class RendererHTML:
    def render(self, tokens: list[Token], options: dict, env: dict) -> str: ...
    def renderInline(self, tokens: list[Token], options: dict, env: dict) -> str: ...

class MarkdownIt:
    # Render rules are registered on the parser, which binds them to its renderer
    def add_render_rule(self, name: str, function: callable, fmt: str = "html") -> None: ...
```

[Rendering](./rendering.md)
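
A custom render rule might look like the sketch below, which adds `target="_blank"` to every link. The function name `render_blank_link` is illustrative, and the example relies on `Token.attrSet` and the renderer's `renderToken` method, neither of which is listed in this overview.

```python
from markdown_it import MarkdownIt


def render_blank_link(self, tokens, idx, options, env):
    # `self` is the renderer: the rule is bound to it when registered.
    tokens[idx].attrSet("target", "_blank")
    # Delegate the actual tag output to the default token renderer.
    return self.renderToken(tokens, idx, options, env)


md = MarkdownIt("commonmark")
md.add_render_rule("link_open", render_blank_link)

html = md.render("[a link](https://example.com)")
print(html)
```

Mutating the token and then delegating keeps the rule short and leaves attribute escaping to the stock renderer.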

### Syntax Tree Processing

Tree representation utilities for converting linear token streams into hierarchical structures for advanced document analysis and manipulation.

```python { .api }
class SyntaxTreeNode:
    def __init__(self, tokens: Sequence[Token] = (), *, create_root: bool = True): ...
    def pretty(self, *, indent: int = 2, show_text: bool = False) -> str: ...
    def to_tokens(self) -> list[Token]: ...
```

[Syntax Tree](./syntax-tree.md)
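
The sketch below builds a tree from a short document and walks it. It assumes the `children` and `type` attributes and the `walk()` and `pretty()` methods of `SyntaxTreeNode`; the sample document is arbitrary.

```python
from markdown_it import MarkdownIt
from markdown_it.tree import SyntaxTreeNode

tokens = MarkdownIt().parse("# Title\n\n- one\n- two\n")
root = SyntaxTreeNode(tokens)

# Open/close token pairs collapse into single nodes under a synthetic root.
child_types = [child.type for child in root.children]
print(child_types)

# walk() visits every node depth-first, which makes structural queries short.
list_item_count = sum(1 for node in root.walk() if node.type == "list_item")
print(list_item_count)

# An XML-like dump of the tree, useful while debugging.
print(root.pretty(indent=2, show_text=True))

# to_tokens() flattens the tree back into a linear token stream.
round_trip_types = [token.type for token in root.to_tokens()]
```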

### Link Processing and Security

URL validation, normalization, and link processing utilities with built-in security features to prevent XSS attacks.

```python { .api }
class MarkdownIt:
    def validateLink(self, url: str) -> bool: ...
    def normalizeLink(self, url: str) -> str: ...
    def normalizeLinkText(self, link: str) -> str: ...
```

[Link Processing](./link-processing.md)
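
As a hedged illustration of the validation step, the sketch below checks two URLs and then renders a link with a `javascript:` target. The URLs are made up for the example, and the statement that rejected links are left as plain text is my reading of the default behavior.

```python
from markdown_it import MarkdownIt

md = MarkdownIt()

# Ordinary web URLs pass validation; script-bearing schemes do not.
https_ok = md.validateLink("https://example.com/page")
js_ok = md.validateLink("javascript:alert(1)")
print(https_ok, js_ok)

# normalizeLink() returns an encoded form suitable for an href attribute.
print(md.normalizeLink("https://example.com/a b"))

# A link whose target fails validation is not turned into an <a> element.
html = md.render("[click me](javascript:alert(1))")
print(html)
```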

### Command Line Interface

Command-line tools for converting markdown files to HTML with batch processing and interactive modes.

```python { .api }
def main(args: list[str] = None) -> int: ...
def convert(filenames: list[str]) -> None: ...
def interactive() -> None: ...
```

[CLI](./cli.md)
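
The entry point can also be called from Python, which is convenient in tests. This sketch assumes the functions live in the `markdown_it.cli.parse` module and that `main` prints the rendered HTML to standard output; the overview states neither, so verify both against your installed version.

```python
import contextlib
import io
import tempfile
from pathlib import Path

# Assumed module path for the CLI functions listed above.
from markdown_it.cli.parse import main

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "example.md"
    path.write_text("# Hello\n", encoding="utf8")

    # Capture stdout, since the CLI prints the result instead of returning it.
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        exit_code = main([str(path)])

html = buffer.getvalue()
print(exit_code, repr(html))
```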

## Types

```python { .api }
from typing import Any, Callable, Protocol, TypedDict

# Type aliases and protocols
EnvType = dict[str, Any]  # Environment sandbox for parsing

class OptionsType(TypedDict):
    maxNesting: int  # Internal protection, recursion limit
    html: bool  # Enable HTML tags in source
    linkify: bool  # Enable autoconversion of URL-like texts to links
    typographer: bool  # Enable smartquotes and replacements
    quotes: str  # Quote characters
    xhtmlOut: bool  # Use '/' to close single tags (<br />)
    breaks: bool  # Convert newlines in paragraphs into <br>
    langPrefix: str  # CSS language prefix for fenced blocks
    highlight: Callable[[str, str, str], str] | None  # Highlighter function: (content, lang, attrs) -> str
    store_labels: bool  # Store link labels in token metadata (optional)

class PresetType(TypedDict):
    options: OptionsType
    components: dict[str, dict[str, list[str]]]  # Component rules configuration

class OptionsDict(dict):
    """Options dictionary with attribute access to core configuration options."""

    # Properties for each option with getters/setters
    maxNesting: int
    html: bool
    linkify: bool
    typographer: bool
    quotes: str
    xhtmlOut: bool
    breaks: bool
    langPrefix: str
    highlight: Callable[[str, str, str], str] | None

class RendererProtocol(Protocol):
    __output__: str
    def render(self, tokens: list[Token], options: OptionsDict, env: EnvType) -> Any: ...
```
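
To show how these options are supplied and read back, the sketch below sets `breaks` and `highlight` through the constructor. The highlighter `wrap_bold` is a hypothetical stand-in for a real syntax highlighter, and the example assumes the parser exposes its settings as `md.options`.

```python
import html as html_lib

from markdown_it import MarkdownIt


def wrap_bold(content, lang, attrs):
    # Hypothetical highlighter: escape the code, then wrap it in <b>.
    return f"<b>{html_lib.escape(content)}</b>"


md = MarkdownIt("commonmark", {"breaks": True, "highlight": wrap_bold})

# Options can be read as dictionary keys or as attributes.
print(md.options["breaks"], md.options.breaks)

# With breaks enabled, a single newline inside a paragraph becomes <br>.
break_html = md.render("first line\nsecond line")
print(break_html)

# The highlighter receives the fenced block's content and language name.
fence_html = md.render("```python\nx = 1\n```\n")
print(fence_html)
```

A highlighter is responsible for escaping its own output, which is why the sketch calls `html.escape` before wrapping the content.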