# Document I/O

Core functions for reading and writing Pandoc JSON documents and for running filter functions over them. Together they cover the basic workflow of a filter: load a document from Pandoc, process it with one or more filters, and write the result back.

## Capabilities

### Document Loading

Load Pandoc JSON documents from input streams and convert them to panflute Doc objects.

```python { .api }
def load(input_stream=None) -> Doc:
    """
    Load JSON-encoded document and return a Doc element.

    Parameters:
    - input_stream: text stream used as input (default is sys.stdin)

    Returns:
    Doc: Parsed document with format and API version set

    Example:
        import panflute as pf

        # Load from stdin (typical filter usage)
        doc = pf.load()

        # Load from file
        with open('document.json', encoding='utf-8') as f:
            doc = pf.load(f)

        # Load from string
        import io
        json_str = '{"pandoc-api-version":[1,23],"meta":{},"blocks":[]}'
        doc = pf.load(io.StringIO(json_str))
    """
```

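To make the shape of the input concrete, the following sketch builds the kind of JSON envelope that `load` reads, using only the standard library. The three top-level keys match the string used in the docstring example; the `Para`, `Str`, and `Space` nodes follow Pandoc's `{"t": ..., "c": ...}` element encoding. The variable names and sample text are illustrative, not part of panflute.

```python
import io
import json

# A minimal Pandoc JSON document: API version, metadata, and a list of blocks.
# Each element is an object with a type tag "t" and, usually, contents "c".
ast = {
    "pandoc-api-version": [1, 23],
    "meta": {},
    "blocks": [
        {
            "t": "Para",
            "c": [
                {"t": "Str", "c": "Hello"},
                {"t": "Space"},
                {"t": "Str", "c": "world"},
            ],
        }
    ],
}

# Wrap the serialized text in a text stream, the kind of object load() accepts.
stream = io.StringIO(json.dumps(ast))

# Reading it back with the standard library shows the nesting of blocks and inlines.
parsed = json.load(stream)
block_types = [block["t"] for block in parsed["blocks"]]
inline_types = [inline["t"] for inline in parsed["blocks"][0]["c"]]
```

Passing a fresh `io.StringIO(json.dumps(ast))` to `pf.load` should yield a `Doc` whose only block is a paragraph.
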
### Document Output

Convert panflute Doc objects to Pandoc JSON format and write to output streams.

```python { .api }
def dump(doc: Doc, output_stream=None):
    """
    Serialize a Doc object to JSON-encoded text and write it to the output stream.

    Parameters:
    - doc: Document to serialize
    - output_stream: text stream used as output (default is sys.stdout)

    Example:
        import panflute as pf

        doc = pf.Doc(pf.Para(pf.Str('Hello world')))

        # Dump to stdout (typical filter usage)
        pf.dump(doc)

        # Dump to file
        with open('output.json', 'w', encoding='utf-8') as f:
            pf.dump(doc, f)

        # Dump to string
        import io
        with io.StringIO() as f:
            pf.dump(doc, f)
            json_output = f.getvalue()
    """
```

### Single Filter Execution

Run a single filter function on a document with optional preprocessing and postprocessing.

```python { .api }
def run_filter(action: callable,
               prepare: callable = None,
               finalize: callable = None,
               input_stream = None,
               output_stream = None,
               doc: Doc = None,
               stop_if: callable = None,
               **kwargs):
    """
    Apply a filter function to each element in a document.

    Parameters:
    - action: function taking (element, doc) that processes elements
    - prepare: function executed before filtering (receives doc)
    - finalize: function executed after filtering (receives doc)
    - input_stream: input source (default stdin)
    - output_stream: output destination (default stdout)
    - doc: existing Doc to process instead of loading from stream
    - stop_if: function taking (element); when it returns True, the element's children are not traversed
    - **kwargs: additional keyword arguments passed to the action function

    Returns:
    Doc: processed document if the doc parameter is provided; otherwise None
    (the result is written to output_stream)

    Example:
        import panflute as pf

        def emphasize_words(elem, doc):
            if isinstance(elem, pf.Str) and 'important' in elem.text:
                # Update the counter that finalize_doc reports
                doc.emphasis_count += 1
                return pf.Emph(elem)

        def prepare_doc(doc):
            doc.emphasis_count = 0

        def finalize_doc(doc):
            pf.debug(f"Added {doc.emphasis_count} emphasis elements")

        if __name__ == '__main__':
            pf.run_filter(emphasize_words, prepare=prepare_doc, finalize=finalize_doc)
    """
```

### Multiple Filter Execution

Run multiple filter functions sequentially on a document.

```python { .api }
def run_filters(actions: list,
                prepare: callable = None,
                finalize: callable = None,
                input_stream = None,
                output_stream = None,
                doc: Doc = None,
                stop_if: callable = None,
                **kwargs):
    """
    Apply multiple filter functions sequentially to a document.

    Parameters:
    - actions: list of functions, each taking (element, doc)
    - prepare: function executed before filtering (receives doc)
    - finalize: function executed after all filtering (receives doc)
    - input_stream: input source (default stdin)
    - output_stream: output destination (default stdout)
    - doc: existing Doc to process instead of loading from stream
    - stop_if: function taking (element); when it returns True, the element's children are not traversed
    - **kwargs: additional keyword arguments passed to all action functions

    Returns:
    Doc: processed document if the doc parameter is provided; otherwise None
    (the result is written to output_stream)

    Example:
        import panflute as pf

        def convert_quotes(elem, doc):
            if isinstance(elem, pf.Str):
                # Replace curly double quotes with straight ones
                return pf.Str(elem.text.replace('“', '"').replace('”', '"'))

        def add_emphasis(elem, doc):
            if isinstance(elem, pf.Str) and elem.text.isupper():
                return pf.Strong(elem)

        filters = [convert_quotes, add_emphasis]

        if __name__ == '__main__':
            pf.run_filters(filters)
    """
```

### Legacy Compatibility Functions

Wrapper functions providing backward compatibility with pandocfilters.

```python { .api }
def toJSONFilter(*args, **kwargs):
    """Wrapper for run_filter() - backward compatibility with pandocfilters."""

def toJSONFilters(*args, **kwargs):
    """Wrapper for run_filters() - backward compatibility with pandocfilters."""
```

## Usage Examples

### Basic Filter Pipeline

```python
import panflute as pf

def remove_emphasis(elem, doc):
    """Remove all emphasis elements, keeping their content."""
    if isinstance(elem, pf.Emph):
        return list(elem.content)

def count_words(elem, doc):
    """Count words in the document."""
    if isinstance(elem, pf.Str):
        doc.word_count = getattr(doc, 'word_count', 0) + len(elem.text.split())

def prepare(doc):
    """Initialize document processing."""
    doc.word_count = 0
    pf.debug("Starting document processing...")

def finalize(doc):
    """Complete document processing."""
    pf.debug(f"Processing complete. Word count: {doc.word_count}")

if __name__ == '__main__':
    pf.run_filters([remove_emphasis, count_words],
                   prepare=prepare, finalize=finalize)
```

### Programmatic Document Processing

```python
import panflute as pf
import io

# Create a document programmatically
doc = pf.Doc(
    pf.Header(pf.Str('Sample Document'), level=1),
    pf.Para(pf.Str('This is a '), pf.Emph(pf.Str('sample')), pf.Str(' document.')),
    metadata={'author': pf.MetaString('John Doe')}
)

def uppercase_filter(elem, doc):
    if isinstance(elem, pf.Str):
        return pf.Str(elem.text.upper())

# Process the document
processed_doc = pf.run_filters([uppercase_filter], doc=doc)

# Output as JSON
with io.StringIO() as output:
    pf.dump(processed_doc, output)
    json_result = output.getvalue()
    print(json_result)
```