Tessl Tile for pypi/mistune@3.1.0

# Block and Inline Parsing

Low-level parsing components that handle the conversion of Markdown text into structured tokens. The parsing system is split into block-level elements (paragraphs, headings, lists) and inline elements (bold, italic, links), with state management for tracking parsing progress and context.

## Capabilities

### Block Parser

Handles block-level Markdown elements like headings, paragraphs, lists, code blocks, and blockquotes.

```python { .api }
class BlockParser(Parser[BlockState]):
    """
    Parser for block-level Markdown elements.

    Handles elements that form document structure: headings, paragraphs,
    lists, code blocks, blockquotes, tables, etc.
    """

    def __init__(self):
        """Initialize block parser with default rules."""

    def parse(self, state: BlockState, rules: Optional[List[str]] = None) -> None:
        """
        Parse state source and populate with block tokens.

        Parameters:
        - state: BlockState to parse and populate with tokens
        - rules: Optional list of rules to use for parsing
        """
```

### Inline Parser

Processes inline Markdown elements within block content like emphasis, links, code spans, and images.

```python { .api }
class InlineParser(Parser[InlineState]):
    """
    Parser for inline-level Markdown elements.

    Handles elements within block content: bold, italic, links,
    images, code spans, line breaks, etc.
    """

    def __init__(self, hard_wrap: bool = False):
        """
        Initialize inline parser.

        Parameters:
        - hard_wrap: Whether to convert line breaks to <br> tags
        """

    def __call__(self, text: str, env: MutableMapping[str, Any]) -> List[Dict[str, Any]]:
        """
        Process text and return inline tokens.

        Parameters:
        - text: Text to process
        - env: Environment mapping for parsing context

        Returns:
        List of inline tokens
        """
```

### Block State

State management for block-level parsing including cursor position, token accumulation, and parsing environment.

```python { .api }
class BlockState:
    """
    State management for block-level parsing.

    Tracks parsing progress, accumulated tokens, and contextual information
    during the block parsing process.

    Attributes:
    - src: str - Source text being parsed
    - tokens: List[Dict[str, Any]] - Accumulated parsed tokens
    - cursor: int - Current position in source text
    - cursor_max: int - Maximum position (length of source)
    - list_tight: bool - Whether the current list is tight
    - parent: Any - Parent parsing context
    - env: MutableMapping[str, Any] - Environment variables and data
    """

    def __init__(self, parent: Optional[Any] = None):
        """
        Initialize block parsing state.

        Parameters:
        - parent: Parent state context
        """

    def child_state(self, src: str) -> Self:
        """
        Create child state for nested parsing.

        Parameters:
        - src: Source text for child state

        Returns:
        New BlockState instance with this state as parent
        """

    def process(self, src: str) -> None:
        """
        Load source text into the state.

        Sets src and cursor_max; tokens are produced later by
        BlockParser.parse(state).

        Parameters:
        - src: Source text to load
        """
```

### Inline State

State management for inline-level parsing within block elements.

```python { .api }
class InlineState:
    """
    State management for inline-level parsing.

    Tracks parsing of inline elements within block content, including
    accumulated tokens, nesting flags, and environment data.

    Attributes:
    - src: str - Source text being parsed
    - tokens: List[Dict[str, Any]] - Accumulated inline tokens
    - in_image, in_link, in_emphasis, in_strong: bool - Flags for the construct currently being parsed
    - env: MutableMapping[str, Any] - Environment variables and data
    """

    def __init__(self, env: MutableMapping[str, Any]):
        """
        Initialize inline parsing state.

        Parameters:
        - env: Environment mapping for parsing context
        """

    def append_token(self, token: Dict[str, Any]) -> None:
        """
        Add token to the token list.

        Parameters:
        - token: Token to add
        """
```

### Base Parser

Abstract base class providing common parsing functionality.

```python { .api }
ST = TypeVar('ST', bound=Union[BlockState, InlineState])

class Parser(Generic[ST]):
    """
    Base parser class with common parsing functionality.

    Provides rule registration, method dispatch, and parsing utilities
    for both block and inline parsers.
    """

    def register(
        self,
        name: str,
        pattern: Union[str, None],
        func: Callable,
        before: Optional[str] = None
    ) -> None:
        """
        Register a new parsing rule.

        Parameters:
        - name: Rule name
        - pattern: Regex pattern string or None
        - func: Function to handle matches
        - before: Insert rule before this existing rule
        """
```
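
The dispatch idea behind rule registration can be sketched with the standard library alone. This is an illustrative toy, not mistune's implementation: `TinyParser`, its rule names, and its handlers are all hypothetical. It joins every registered pattern into one alternation of named groups and uses `match.lastgroup` to pick the handler, which also suggests why capture groups inside a rule pattern are better given unique names than read by number.

```python
import re
from typing import Callable, Dict, List, Optional, Tuple


class TinyParser:
    """Toy rule dispatcher; hypothetical, not part of mistune."""

    def __init__(self) -> None:
        self.rules: List[str] = []           # rule names in priority order
        self.patterns: Dict[str, str] = {}   # rule name -> regex source
        self.handlers: Dict[str, Callable[..., Tuple[str, str]]] = {}

    def register(
        self,
        name: str,
        pattern: str,
        func: Callable[..., Tuple[str, str]],
        before: Optional[str] = None,
    ) -> None:
        self.patterns[name] = pattern
        self.handlers[name] = func
        if before in self.rules:
            # Earlier alternatives win when two rules match at the same spot.
            self.rules.insert(self.rules.index(before), name)
        else:
            self.rules.append(name)

    def scan(self, text: str) -> List[Tuple[str, str]]:
        # Wrap each rule in a group named after the rule and join with "|".
        combined = re.compile(
            "|".join(f"(?P<{name}>{self.patterns[name]})" for name in self.rules)
        )
        # m.lastgroup is the name of the outer group that matched: the rule.
        return [self.handlers[m.lastgroup](m) for m in combined.finditer(text)]


parser = TinyParser()
parser.register("number", r"\d+", lambda m: ("number", m.group(0)))
# The inner group is named, so it can be read no matter how many groups
# the other rules contribute to the combined regex.
parser.register(
    "year",
    r"(?P<year_digits>(?:19|20)\d\d)\b",
    lambda m: ("year", m.group("year_digits")),
    before="number",
)

result = parser.scan("in 2024 and 7")
print(result)
```

Because `year` is inserted before `number`, the four-digit value is claimed by the more specific rule, while the lone `7` falls through to `number`.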

## Usage Examples

### Custom Block Rule

Adding a custom block-level element:

```python
from mistune import create_markdown

def custom_block_plugin(md):
    """Add support for custom block syntax: :::type content :::"""

    def parse_custom_block(block, m, state):
        # Read named groups: the rule pattern is combined with all other
        # rules into one regex, so positional group numbers are not stable.
        block_type = m.group('custom_type')
        content = m.group('custom_text').strip()

        # Parse content as nested blocks
        child = state.child_state(content)
        block.parse(child)

        # Block rules append their token to the state ...
        state.append_token({
            'type': 'custom_block',
            'attrs': {'block_type': block_type},
            'children': child.tokens
        })
        # ... and return the new cursor position (skipping the newline)
        return m.end() + 1

    # Register rule with block parser
    md.block.register(
        'custom_block',
        r'^:::(?P<custom_type>\w+)\n(?P<custom_text>[\s\S]*?)\n:::$',
        parse_custom_block
    )

    # Add renderer method; the renderer is passed as the first argument
    def render_custom_block(renderer, text, block_type):
        return f'<div class="custom-{block_type}">{text}</div>\n'

    md.renderer.register('custom_block', render_custom_block)

# Use custom plugin
md = create_markdown()
md.use(custom_block_plugin)

result = md("""
:::warning
This is a **warning** block.
:::
""")
```

### Custom Inline Rule

Adding a custom inline element:

```python
from mistune import create_markdown

def emoji_plugin(md):
    """Add support for emoji syntax: :emoji_name:"""

    def parse_emoji(inline, m, state):
        # Use a named group; positional group numbers shift once the
        # pattern is combined with the other inline rules.
        emoji_name = m.group('emoji_name')
        state.append_token({'type': 'emoji', 'raw': emoji_name})
        # Return the position where parsing should continue
        return m.end()

    # Register with inline parser
    md.inline.register('emoji', r':(?P<emoji_name>\w+):', parse_emoji)

    # Add renderer method; the renderer is passed as the first argument
    def render_emoji(renderer, emoji_name):
        emoji_map = {
            'smile': '😊',
            'heart': '❤️',
            'thumbsup': '👍'
        }
        return emoji_map.get(emoji_name, f':{emoji_name}:')

    md.renderer.register('emoji', render_emoji)

# Use emoji plugin
md = create_markdown()
md.use(emoji_plugin)

result = md('Hello :smile: world :heart:!')
# Output: <p>Hello 😊 world ❤️!</p>
```

### State Access and Analysis

Accessing parsing state for analysis:

```python
from mistune import create_markdown

md = create_markdown()

# Parse with state access
text = """
# Heading 1

This is a paragraph with **bold** text.

## Heading 2

- List item 1
- List item 2
"""

output, state = md.parse(text)

# Analyze tokens
def analyze_tokens(tokens, level=0):
    indent = "  " * level
    for token in tokens:
        print(f"{indent}Token: {token['type']}")
        if 'attrs' in token:
            print(f"{indent}  Attrs: {token['attrs']}")
        if 'children' in token:
            analyze_tokens(token['children'], level + 1)

analyze_tokens(state.tokens)

# Access environment data
print(f"Environment: {state.env}")
```

### Parser Customization

Customizing parser behavior:

```python
from mistune import BlockParser, InlineParser, Markdown, HTMLRenderer

# Create custom parsers
block = BlockParser()
inline = InlineParser(hard_wrap=True)  # Convert line breaks to <br>

# Remove specific rules by modifying rules list
block.rules.remove('block_quote')  # Disable top-level blockquotes
inline.rules.remove('emphasis')    # Disable *emphasis* (and **strong**, same rule)

# Create parser with custom components
renderer = HTMLRenderer(escape=False)
md = Markdown(renderer=renderer, block=block, inline=inline)

result = md('This is *not italic*\nThis is a line break.')
```

## Token Structure

Understanding the token format for custom processing:

```python
# Block token structure
block_token = {
    'type': 'heading',           # Token type
    'attrs': {'level': 1},       # Element attributes
    'children': [                # Child tokens (for container elements)
        {
            'type': 'text',
            'raw': 'Heading Text'
        }
    ]
}

# Inline token structure
inline_token = {
    'type': 'strong',            # Token type
    'children': [                # Child tokens
        {
            'type': 'text',
            'raw': 'Bold Text'
        }
    ]
}

# Leaf token structure
text_token = {
    'type': 'text',              # Token type
    'raw': 'Plain text content'  # Raw text content
}
```
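
Given this shape, custom processing is usually a small recursive walk. The helper below is a hypothetical example (`plain_text` is not part of mistune) that relies only on the `children` and `raw` keys shown above to flatten a token tree into plain text.

```python
from typing import Any, Dict, List


def plain_text(tokens: List[Dict[str, Any]]) -> str:
    """Concatenate the 'raw' text of leaf tokens, depth first."""
    parts = []
    for tok in tokens:
        if "children" in tok:
            # Container token: descend into its children.
            parts.append(plain_text(tok["children"]))
        elif "raw" in tok:
            # Leaf token: take its raw text.
            parts.append(tok["raw"])
    return "".join(parts)


# Hand-written sample in the token format described above.
sample = [
    {
        "type": "heading",
        "attrs": {"level": 1},
        "children": [
            {"type": "text", "raw": "Heading "},
            {"type": "strong", "children": [{"type": "text", "raw": "Text"}]},
        ],
    },
]

print(plain_text(sample))  # Heading Text
```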

This architecture keeps block and inline processing in separate stages, so mistune can be extended with custom syntax at either level, as a block rule or as an inline rule, without modifying the other stage.
