# Document I/O

Core functions for reading and writing Pandoc JSON documents, running filter functions, and managing document processing workflows. These functions handle the fundamental operations of loading documents from Pandoc, processing them with filters, and outputting results.

## Capabilities

### Document Loading

Load Pandoc JSON documents from input streams and convert them to panflute Doc objects.

```python { .api }
def load(input_stream=None) -> Doc:
    """
    Load JSON-encoded document and return a Doc element.

    Parameters:
    - input_stream: text stream used as input (default is sys.stdin)

    Returns:
    Doc: Parsed document with format and API version set

    Example:
    import panflute as pf

    # Load from stdin (typical filter usage)
    doc = pf.load()

    # Load from file
    with open('document.json', encoding='utf-8') as f:
        doc = pf.load(f)

    # Load from string
    import io
    json_str = '{"pandoc-api-version":[1,23],"meta":{},"blocks":[]}'
    doc = pf.load(io.StringIO(json_str))
    """
```
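
For orientation, the JSON that `load` consumes is an object with `pandoc-api-version`, `meta`, and `blocks` keys, as in the string example above. The sketch below inspects such a payload using only the standard `json` module. The sample paragraph and the helper `count_str_nodes` are made up for illustration and are not part of panflute.

```python
import json

# A small Pandoc JSON payload; the single paragraph is made up for illustration.
json_str = (
    '{"pandoc-api-version":[1,23],"meta":{},'
    '"blocks":[{"t":"Para","c":['
    '{"t":"Str","c":"Hello"},{"t":"Space"},{"t":"Str","c":"world"}]}]}'
)


def count_str_nodes(node):
    """Recursively count {"t": "Str"} nodes (hypothetical helper, not panflute API)."""
    if isinstance(node, dict):
        own = 1 if node.get("t") == "Str" else 0
        return own + sum(count_str_nodes(value) for value in node.values())
    if isinstance(node, list):
        return sum(count_str_nodes(item) for item in node)
    return 0


payload = json.loads(json_str)
top_level_keys = sorted(payload)
api_version = tuple(payload["pandoc-api-version"])
str_count = count_str_nodes(payload["blocks"])
print(top_level_keys, api_version, str_count)
```

In a real filter, `load` performs this decoding and builds the `Doc` object, so working with the raw dictionaries is rarely necessary.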

### Document Output

Convert panflute Doc objects to Pandoc JSON format and write to output streams.

```python { .api }
def dump(doc: Doc, output_stream=None):
    """
    Dump a Doc object into a JSON-encoded text string.

    Parameters:
    - doc: Document to serialize
    - output_stream: text stream used as output (default is sys.stdout)

    Example:
    import panflute as pf

    doc = pf.Doc(pf.Para(pf.Str('Hello world')))

    # Dump to stdout (typical filter usage)
    pf.dump(doc)

    # Dump to file
    with open('output.json', 'w', encoding='utf-8') as f:
        pf.dump(doc, f)

    # Dump to string
    import io
    with io.StringIO() as f:
        pf.dump(doc, f)
        json_output = f.getvalue()
    """
```

### Single Filter Execution

Run a single filter function on a document with optional preprocessing and postprocessing.

```python { .api }
def run_filter(action: callable,
               prepare: callable = None,
               finalize: callable = None,
               input_stream=None,
               output_stream=None,
               doc: Doc = None,
               stop_if: callable = None,
               **kwargs):
    """
    Apply a filter function to each element in a document.

    Parameters:
    - action: function taking (element, doc) that processes elements
    - prepare: function executed before filtering (receives doc)
    - finalize: function executed after filtering (receives doc)
    - input_stream: input source (default stdin)
    - output_stream: output destination (default stdout)
    - doc: existing Doc to process instead of loading from stream
    - stop_if: function taking (element) to stop traversal early
    - **kwargs: additional arguments passed to action function

    Returns:
    Doc: processed document if doc parameter provided, otherwise None
    (the result is written to output_stream instead)

    Example:
    import panflute as pf

    def emphasize_words(elem, doc):
        if isinstance(elem, pf.Str) and 'important' in elem.text:
            # Update the counter that finalize_doc reports
            doc.emphasis_count += 1
            return pf.Emph(elem)

    def prepare_doc(doc):
        doc.emphasis_count = 0

    def finalize_doc(doc):
        pf.debug(f"Added {doc.emphasis_count} emphasis elements")

    if __name__ == '__main__':
        pf.run_filter(emphasize_words, prepare=prepare_doc, finalize=finalize_doc)
    """
```
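
The `**kwargs` entry is easiest to see in isolation. The sketch below assumes the extra keyword arguments are bound to the action before traversal, which `functools.partial` does here. The plain-string elements, `apply_action`, `tag_words`, and the `suffix` parameter are hypothetical: they model only the calling convention, not panflute's internals.

```python
from functools import partial


def tag_words(elem, doc, suffix="!"):
    """Action with one extra keyword parameter beyond (elem, doc)."""
    return elem + suffix


def apply_action(action, elems, doc=None, **kwargs):
    """Toy stand-in for a filter runner: forwards kwargs to every action call."""
    if kwargs:
        # Bind the extra arguments once, so the action is still called as (elem, doc).
        action = partial(action, **kwargs)
    return [action(elem, doc) for elem in elems]


default_result = apply_action(tag_words, ["a", "b"])
custom_result = apply_action(tag_words, ["a", "b"], suffix="?")
print(default_result, custom_result)
```

With this convention an action can expose tunable parameters while keeping the `(element, doc)` signature the traversal expects.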

### Multiple Filter Execution

Run multiple filter functions sequentially on a document.

```python { .api }
def run_filters(actions: list,
                prepare: callable = None,
                finalize: callable = None,
                input_stream=None,
                output_stream=None,
                doc: Doc = None,
                stop_if: callable = None,
                **kwargs):
    """
    Apply multiple filter functions sequentially to a document.

    Parameters:
    - actions: list of functions, each taking (element, doc)
    - prepare: function executed before filtering
    - finalize: function executed after all filtering
    - input_stream: input source (default stdin)
    - output_stream: output destination (default stdout)
    - doc: existing Doc to process instead of loading from stream
    - stop_if: function taking (element) to stop traversal early
    - **kwargs: additional arguments passed to all action functions

    Returns:
    Doc: processed document if doc parameter provided, otherwise None
    (the result is written to output_stream instead)

    Example:
    import panflute as pf

    def convert_quotes(elem, doc):
        # Normalize curly double quotes to straight quotes
        if isinstance(elem, pf.Str):
            return pf.Str(elem.text.replace('“', '"').replace('”', '"'))

    def add_emphasis(elem, doc):
        if isinstance(elem, pf.Str) and elem.text.isupper():
            return pf.Strong(elem)

    filters = [convert_quotes, add_emphasis]

    if __name__ == '__main__':
        pf.run_filters(filters)
    """
```
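
To make "sequentially" concrete: each action makes a complete pass over the document before the next one starts, with `prepare` run once before the first pass and `finalize` once after the last. The following is a simplified standard-library model of that control flow, not panflute's implementation. The dict-based nodes and the names `walk` and `run_toy_filters` are hypothetical. It also models the return convention for actions: `None` keeps the element, an element replaces it, and a list is spliced in place.

```python
def walk(nodes, action, doc):
    """Apply `action` bottom-up to a list of toy nodes; return the new list."""
    result = []
    for node in nodes:
        # Visit children first so the action sees already-processed content.
        if isinstance(node.get("c"), list):
            node = {"t": node["t"], "c": walk(node["c"], action, doc)}
        replacement = action(node, doc)
        if replacement is None:
            result.append(node)         # None: keep the element
        elif isinstance(replacement, list):
            result.extend(replacement)  # list: splice in place ([] deletes)
        else:
            result.append(replacement)  # single element: replace
    return result


def run_toy_filters(actions, doc, prepare=None, finalize=None):
    """Run prepare once, each action as a full pass in order, then finalize."""
    if prepare is not None:
        prepare(doc)
    for action in actions:
        doc["blocks"] = walk(doc["blocks"], action, doc)
    if finalize is not None:
        finalize(doc)
    return doc


def remove_emphasis(node, doc):
    if node["t"] == "Emph":
        return node["c"]  # splice the children in place of the Emph node


def count_words(node, doc):
    if node["t"] == "Str":
        doc["word_count"] += len(node["c"].split())


def prepare(doc):
    doc["word_count"] = 0


toy_doc = {"blocks": [{"t": "Para", "c": [
    {"t": "Str", "c": "plain"},
    {"t": "Space"},
    {"t": "Emph", "c": [{"t": "Str", "c": "two words"}]},
]}]}

result = run_toy_filters([remove_emphasis, count_words], toy_doc, prepare=prepare)
inline_types = [node["t"] for node in result["blocks"][0]["c"]]
print(inline_types, result["word_count"])
```

Because the second action only runs after the first has finished, `count_words` sees the text that used to sit inside the emphasis node as ordinary inline content.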

### Legacy Compatibility Functions

Wrapper functions providing backward compatibility with pandocfilters.

```python { .api }
def toJSONFilter(*args, **kwargs):
    """Wrapper for run_filter() - backward compatibility with pandocfilters."""

def toJSONFilters(*args, **kwargs):
    """Wrapper for run_filters() - backward compatibility with pandocfilters."""
```

## Usage Examples

### Basic Filter Pipeline

```python
import panflute as pf

def remove_emphasis(elem, doc):
    """Remove all emphasis elements, keeping their content."""
    if isinstance(elem, pf.Emph):
        return list(elem.content)

def count_words(elem, doc):
    """Count words in the document."""
    if isinstance(elem, pf.Str):
        doc.word_count = getattr(doc, 'word_count', 0) + len(elem.text.split())

def prepare(doc):
    """Initialize document processing."""
    doc.word_count = 0
    pf.debug("Starting document processing...")

def finalize(doc):
    """Complete document processing."""
    pf.debug(f"Processing complete. Word count: {doc.word_count}")

if __name__ == '__main__':
    pf.run_filters([remove_emphasis, count_words],
                   prepare=prepare, finalize=finalize)
```

### Programmatic Document Processing

```python
import io

import panflute as pf

# Create a document programmatically
doc = pf.Doc(
    pf.Header(pf.Str('Sample Document'), level=1),
    pf.Para(pf.Str('This is a '), pf.Emph(pf.Str('sample')), pf.Str(' document.')),
    metadata={'author': pf.MetaString('John Doe')}
)

def uppercase_filter(elem, doc):
    if isinstance(elem, pf.Str):
        return pf.Str(elem.text.upper())

# Process the document; passing doc= returns the processed Doc
processed_doc = pf.run_filters([uppercase_filter], doc=doc)

# Output as JSON
with io.StringIO() as output:
    pf.dump(processed_doc, output)
    json_result = output.getvalue()
    print(json_result)
```