# Processor Classes

Reusable markdown processor classes for efficient batch processing and advanced configuration, useful when you need to convert multiple documents with the same settings.

## Capabilities

### Markdown Processor Class

Main processor class that can be configured once and reused for multiple conversions, which avoids repeated setup in batch processing.

```python { .api }
class Markdown:
    def __init__(
        self,
        html4tags: bool = False,
        tab_width: int = 4,
        safe_mode: Optional[Literal['replace', 'escape']] = None,
        extras: Optional[Union[list[str], dict[str, Any]]] = None,
        link_patterns: Optional[Iterable[tuple[re.Pattern, Union[str, Callable[[re.Match], str]]]]] = None,
        footnote_title: Optional[str] = None,
        footnote_return_symbol: Optional[str] = None,
        use_file_vars: bool = False,
        cli: bool = False
    ):
        """
        Initialize a reusable markdown processor.

        Parameters: (same as markdown() function)
        - html4tags: Use HTML 4 style for empty element tags
        - tab_width: Number of spaces per tab for code block indentation
        - safe_mode: Sanitize literal HTML ('escape' or 'replace')
        - extras: List of extra names or dict of extra_name -> extra_arg
        - link_patterns: Auto-link regex patterns as (pattern, replacement) tuples
        - footnote_title: Title attribute for footnote links
        - footnote_return_symbol: Symbol for footnote return links
        - use_file_vars: Look for Emacs-style file variables to enable extras
        - cli: Enable CLI-specific behavior for command-line usage
        """

    def convert(self, text: str) -> UnicodeWithAttrs:
        """
        Convert markdown text to HTML using configured settings.

        Parameters:
        - text: Markdown text to convert

        Returns:
        UnicodeWithAttrs: HTML string with optional metadata and toc_html attributes
        """

    def reset(self) -> None:
        """
        Reset internal state for clean processing of next document.

        Called automatically by convert(), but can be called manually
        to clear cached state between conversions.
        """
```

**Usage Examples:**

```python
from markdown2 import Markdown

# Placeholder inputs for illustration
document1 = "# First\n\nSome *text*."
document2 = "# Second\n\n| a | b |\n|---|---|\n| 1 | 2 |"
document3 = "# Third\n\nA note.[^1]\n\n[^1]: The footnote."
markdown_text = "# Title\n\nFirst line\nsecond line"

# Create processor with specific configuration
markdowner = Markdown(
    extras=["tables", "footnotes", "header-ids", "toc"],
    safe_mode="escape",
    tab_width=2
)

# Convert multiple documents with same settings
html1 = markdowner.convert(document1)
html2 = markdowner.convert(document2)
html3 = markdowner.convert(document3)

# Advanced configuration with extras options
processor = Markdown(
    extras={
        "header-ids": {"prefix": "section-"},
        "toc": {"depth": 3},
        "breaks": {"on_newline": True},
        "html-classes": {"table": "table table-striped"}
    }
)

html = processor.convert(markdown_text)
```

### Pre-configured Processor

Convenience class with commonly used extras pre-enabled for typical use cases.

```python { .api }
class MarkdownWithExtras(Markdown):
    """
    Markdown processor with common extras pre-configured.

    Pre-enabled extras:
    - footnotes: Support footnotes as used on daringfireball.net
    - fenced-code-blocks: GitHub-style fenced code blocks with optional syntax highlighting
    """
```

**Usage Examples:**

```python
from markdown2 import MarkdownWithExtras

# Placeholder input for illustration
markdown_text = "# Title\n\nA note.[^1]\n\n[^1]: The footnote."

# Use pre-configured processor
markdowner = MarkdownWithExtras()
html = markdowner.convert(markdown_text)

# Add additional extras to the pre-configured set
processor = MarkdownWithExtras(
    extras=["tables", "header-ids"]  # Adds to the existing extras
)
html = processor.convert(markdown_text)
```

### Enhanced Return Type

Special string subclass that can carry additional attributes from processing.

```python { .api }
class UnicodeWithAttrs(str):
    """
    String subclass for markdown HTML output with optional attributes.

    Attributes:
    - metadata: Dict of document metadata (from 'metadata' extra)
    - toc_html: HTML string for table of contents (from 'toc' extra)
    """
    metadata: Optional[dict[str, str]]
    toc_html: Optional[str]
```

**Usage Examples:**

```python
import markdown2

# Convert with metadata and TOC extras
html = markdown2.markdown(
    """---
title: My Document
author: John Doe
---

# Chapter 1
Content here...

## Section 1.1
More content...
""",
    extras=["metadata", "toc", "header-ids"]
)

# Access metadata
if html.metadata:
    print(f"Title: {html.metadata['title']}")
    print(f"Author: {html.metadata['author']}")

# Access table of contents
if html.toc_html:
    print("Table of Contents HTML:", html.toc_html)

# Still works as regular string
print("HTML length:", len(html))
print("HTML content:", str(html))
```
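
Because the result is a `str` subclass, downstream code can treat the attributes as optional extras on an ordinary string. The sketch below shows one way to assemble a page from the body, `toc_html`, and `metadata`. The `render_page` helper, the page layout, and the `default_title` parameter are illustrative assumptions, not part of markdown2.

```python
from html import escape
from typing import Optional


def render_page(body: str, default_title: str = "Untitled") -> str:
    """Wrap converted HTML in a minimal page (hypothetical helper)."""
    # getattr() keeps this working for plain strings as well as for
    # str subclasses that carry `metadata` and `toc_html`.
    metadata: Optional[dict] = getattr(body, "metadata", None)
    toc_html: Optional[str] = getattr(body, "toc_html", None)

    # Fall back to a default title when no metadata was parsed.
    title = (metadata or {}).get("title", default_title)
    parts = [f"<title>{escape(str(title))}</title>"]
    if toc_html:
        parts.append(f'<nav class="toc">{toc_html}</nav>')
    parts.append(f"<main>{body}</main>")
    return "\n".join(parts)
```

Reading the attributes with `getattr` means the same helper accepts output produced without the `metadata` or `toc` extras, where the attributes may be unset.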

## Performance Considerations

### When to Use Classes vs Functions

**Use the `Markdown` class when:**
- Converting multiple documents with the same settings
- Reusing one processor configuration
- Processing documents in batches
- Avoiding repeated reconfiguration of extras

**Use the `markdown()` function when:**
- Converting a single document
- Converting documents that each need different settings
- Doing simple one-off conversions
- Prototyping or writing quick scripts
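
A middle ground between the two lists above is a workload where a handful of distinct configurations recur. One possible approach, sketched here, is to cache one processor per set of extras. The names `make_processor_cache`, `get_processor`, and `_make_markdown` are hypothetical helpers, and the sketch assumes extras are passed as plain names without per-extra options.

```python
from functools import lru_cache
from typing import Any, Callable


def _make_markdown(extras: tuple[str, ...]) -> Any:
    # Imported lazily so this module loads even without markdown2 installed.
    from markdown2 import Markdown
    return Markdown(extras=list(extras))


def make_processor_cache(
    factory: Callable[[tuple[str, ...]], Any] = _make_markdown,
    maxsize: int = 16,
) -> Callable[..., Any]:
    """Return a lookup function that shares one processor per set of extras."""

    @lru_cache(maxsize=maxsize)
    def _cached(key: tuple[str, ...]) -> Any:
        return factory(key)

    def get_processor(*extras: str) -> Any:
        # Sort and de-duplicate so ("toc", "tables") and ("tables", "toc")
        # map to the same cached processor.
        return _cached(tuple(sorted(set(extras))))

    return get_processor
```

With the default factory, `get_processor = make_processor_cache()` followed by `get_processor("tables", "footnotes").convert(text)` would reuse the same `Markdown` instance on later calls with the same extras.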

### Memory and State Management

```python
import markdown2
from markdown2 import Markdown

# Placeholder input for illustration
documents = ["# One", "# Two", "# Three"]

# Good: Reuse processor for batch processing
processor = Markdown(extras=["tables", "footnotes"])
results = [processor.convert(doc) for doc in documents]

# Less efficient: Recreating processor each time
results = [markdown2.markdown(doc, extras=["tables", "footnotes"]) for doc in documents]
```

A `Markdown` instance keeps its configuration between conversions, and `convert()` resets per-document state on each call, so creating one instance and reusing it avoids repeating setup for every document in a batch.
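
For large batches, it can help to convert lazily so that only one result is held in memory at a time. The following is a minimal sketch under that assumption. `convert_batch` and the `Converter` protocol are hypothetical names, and any object with a `convert(text)` method, such as a `Markdown` instance, would fit.

```python
from typing import Iterable, Iterator, Protocol


class Converter(Protocol):
    """Anything with a convert(text) method, e.g. a Markdown instance."""

    def convert(self, text: str) -> str: ...


def convert_batch(
    processor: Converter,
    documents: Iterable[tuple[str, str]],
) -> Iterator[tuple[str, str]]:
    """Yield (name, html) pairs, converting lazily with one shared processor."""
    for name, text in documents:
        # Each document is converted only when the caller asks for it.
        yield name, processor.convert(text)
```

A caller might write each result to disk inside a `for name, html in convert_batch(processor, pairs):` loop instead of building a full list first.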

## Error Handling

```python { .api }
class MarkdownError(Exception):
    """Base exception class for markdown processing errors."""
    pass
```

**Usage Examples:**

```python
from markdown2 import Markdown, MarkdownError

# Placeholder input for illustration
text = "# Hello"

try:
    # Note: an unrecognized extra name may be ignored rather than raise;
    # the handlers below cover whatever errors the library does report.
    processor = Markdown(extras=["invalid-extra"])
    html = processor.convert(text)
except MarkdownError as e:
    print(f"Markdown processing error: {e}")
except Exception as e:
    print(f"Unexpected error: {e}")
```
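
In a batch job, one failing document usually should not abort the rest. A possible pattern, sketched below, is to catch the error per document and fall back to escaped source text. `convert_or_fallback` and its `error_type` parameter are hypothetical. With markdown2 you would likely pass `MarkdownError` as `error_type`.

```python
import html
import logging
from typing import Any

logger = logging.getLogger(__name__)


def convert_or_fallback(
    processor: Any,
    text: str,
    error_type: type[BaseException] = Exception,
) -> str:
    """Convert text, or return escaped source in <pre> if conversion fails."""
    try:
        return processor.convert(text)
    except error_type as exc:
        # Only the requested error type is handled; anything else propagates.
        logger.warning("Markdown conversion failed: %s", exc)
        return f"<pre>{html.escape(text)}</pre>"
```

Escaping the fallback keeps raw input from being interpreted as HTML, which matters when the documents come from untrusted sources.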