# Parsing and AST Manipulation

The parsing API gives full control over the Abstract Syntax Tree (AST). The `Parser` class converts Markdown text into a tree of `Node` objects, which can then be inspected and manipulated programmatically.

## Capabilities

### Parser Class

Main parsing class that converts CommonMark Markdown text into an Abstract Syntax Tree representation.

```python { .api }
class Parser:
    def __init__(self, options={}):
        """
        Initialize a new Parser instance.

        Args:
            options (dict): Configuration options for parsing behavior
        """

    def parse(self, my_input):
        """
        Parse CommonMark text into an AST.

        Args:
            my_input (str): CommonMark Markdown text to parse

        Returns:
            Node: Root node of the parsed AST
        """
```

### Node Class

Represents individual nodes in the Abstract Syntax Tree with methods for tree manipulation and traversal.

```python { .api }
class Node:
    def __init__(self, node_type, sourcepos):
        """
        Create a new Node.

        Args:
            node_type (str): Type of the node (e.g., 'document', 'paragraph', 'text', 'heading')
            sourcepos (SourcePos): Source position information [[start_line, start_col], [end_line, end_col]]
        """

    def walker(self):
        """
        Create a NodeWalker for traversing this node and its descendants.

        Returns:
            NodeWalker: Iterator for tree traversal
        """

    def append_child(self, child):
        """
        Append a child node to this node.

        Args:
            child (Node): Node to append as the last child
        """

    def prepend_child(self, child):
        """
        Prepend a child node to this node.

        Args:
            child (Node): Node to prepend as the first child
        """

    def unlink(self):
        """
        Remove this node from its parent, unlinking it from the tree.
        """

    def insert_after(self, sibling):
        """
        Insert the given sibling node immediately after this node.

        Args:
            sibling (Node): Node to insert after this node
        """

    def insert_before(self, sibling):
        """
        Insert the given sibling node immediately before this node.

        Args:
            sibling (Node): Node to insert before this node
        """

    def pretty(self):
        """
        Print a pretty-printed representation of this node to stdout.
        Uses pprint to display the node's internal dictionary structure.

        Returns:
            None: Prints to stdout rather than returning a value
        """

    def normalize(self):
        """
        Normalize the node by combining adjacent text nodes.
        """

    def is_container(self):
        """
        Check if this node can contain other nodes.

        Returns:
            bool: True if node can contain children, False otherwise
        """
```

### NodeWalker Class

Iterator for traversing AST nodes in document order, providing fine-grained control over tree traversal.

```python { .api }
class NodeWalker:
    def __init__(self, root):
        """
        Create a NodeWalker starting at the specified root node.

        Args:
            root (Node): Root node to start traversal from
        """

    def nxt(self):
        """
        Get the next node in the traversal.

        Returns:
            WalkEvent or None: Dictionary with 'node' (Node) and 'entering' (bool) keys,
                or None if traversal is complete
        """

    def resume_at(self, node, entering):
        """
        Resume traversal at a specific node and entering state.

        Args:
            node (Node): Node to resume at
            entering (bool): Whether we're entering or exiting the node
        """
```

147

148

## Usage Examples

149

150

### Basic Parsing

151

152

```python
from commonmark import Parser

parser = Parser()
markdown = """
# Hello World

This is a paragraph with **bold** text.
"""

ast = parser.parse(markdown)
ast.pretty()  # Print AST structure to stdout
```

165

166

### AST Manipulation

167

168

```python
from commonmark import Parser
from commonmark.node import Node

parser = Parser()
ast = parser.parse("# Original Title")

# Create a new text node
new_text = Node('text', [[1, 1], [1, 9]])
new_text.literal = "New Title"

# Replace the title text
title_node = ast.first_child  # Heading node
old_text = title_node.first_child  # Original text
title_node.append_child(new_text)
old_text.unlink()
```

185

186

### Tree Traversal

187

188

```python
from commonmark import Parser

parser = Parser()
ast = parser.parse("""
# Title

Some text with **bold** and *italic*.
""")

walker = ast.walker()
event = walker.nxt()
while event:
    node, entering = event['node'], event['entering']
    if entering:
        print(f"Entering: {node.t}")
        if hasattr(node, 'literal') and node.literal:
            print(f"  Content: '{node.literal}'")
    event = walker.nxt()
```

208

209

## Types

210

211

```python { .api }
# Source position format: [[start_line, start_col], [end_line, end_col]]
SourcePos = list[list[int]]

# Node types
NodeType = str  # 'document', 'paragraph', 'text', 'strong', 'emph', 'heading', etc.

# Walking event structure
WalkEvent = dict[str, Node | bool]  # {'node': Node, 'entering': bool}; nxt() returns None when done
```

221

222

### Node Properties

223

224

Common node properties that can be accessed:

225

226

- `node.t`: Node type (e.g., 'document', 'paragraph', 'text', 'strong', 'emph', 'heading')

227

- `node.literal`: Text content for text nodes (str or None)

228

- `node.first_child`: First child node (Node or None)

229

- `node.last_child`: Last child node (Node or None)

230

- `node.parent`: Parent node (Node or None)

231

- `node.nxt`: Next sibling node (Node or None)

232

- `node.prv`: Previous sibling node (Node or None)

233

- `node.sourcepos`: Source position information [[start_line, start_col], [end_line, end_col]]

234

- `node.string_content`: String content for container nodes (str)

235

- `node.info`: Info string for code blocks (str or None)

236

- `node.destination`: URL for links and images (str or None)

237

- `node.title`: Title for links and images (str or None)

238

- `node.level`: Heading level 1-6 for heading nodes (int or None)

239

- `node.list_data`: List metadata dictionary for list and list item nodes (dict)