# Loading and Parsing

Load YAML into Python objects at several security levels, or work with the individual processing stages directly. This section covers the high-level loading functions and the low-level scanning, parsing, and composing operations for advanced use cases.

## Capabilities

### High-Level Loading

Load YAML documents into Python objects using loader classes that differ in security level and feature set.

```python { .api }
def load(stream, Loader):
    """
    Parse the first YAML document in a stream and produce the corresponding Python object.

    Args:
        stream (str | bytes | IO): YAML content as string, bytes, or file-like object
        Loader (type): Loader class to use (SafeLoader, FullLoader, Loader, UnsafeLoader)

    Returns:
        Any: Python object representing the YAML document

    Raises:
        YAMLError: If the document cannot be parsed
        MarkedYAMLError: If there's a syntax error with position information
    """

def load_all(stream, Loader):
    """
    Parse all YAML documents in a stream and produce corresponding Python objects.

    Args:
        stream (str | bytes | IO): YAML content containing multiple documents
        Loader (type): Loader class to use

    Yields:
        Any: Python objects representing each YAML document

    Raises:
        YAMLError: If any document cannot be parsed
    """

def full_load(stream):
    """
    Parse the first YAML document in a stream and produce the corresponding Python object.

    Resolve most tags but with security restrictions. Recommended for trusted input.

    Args:
        stream (str | bytes | IO): YAML content

    Returns:
        Any: Python object representing the YAML document
    """

def full_load_all(stream):
    """
    Parse all YAML documents in a stream and produce corresponding Python objects.

    Resolve most tags but with security restrictions.

    Args:
        stream (str | bytes | IO): YAML content containing multiple documents

    Yields:
        Any: Python objects representing each YAML document
    """

def unsafe_load(stream):
    """
    Parse the first YAML document in a stream and produce the corresponding Python object.

    Resolve all tags, even those known to be unsafe on untrusted input.
    WARNING: This can execute arbitrary Python code.

    Args:
        stream (str | bytes | IO): YAML content

    Returns:
        Any: Python object representing the YAML document
    """

def unsafe_load_all(stream):
    """
    Parse all YAML documents in a stream and produce corresponding Python objects.

    Resolve all tags, even those known to be unsafe on untrusted input.
    WARNING: This can execute arbitrary Python code.

    Args:
        stream (str | bytes | IO): YAML content containing multiple documents

    Yields:
        Any: Python objects representing each YAML document
    """
```

#### Usage Examples

```python
import yaml
from yaml import SafeLoader, FullLoader, Loader

yaml_content = """
person:
  name: John Doe
  age: 30
  birth_date: 1993-05-15
  skills:
    - Python
    - YAML
metadata:
  created: 2023-01-01T10:00:00Z
"""

# Safe loading (standard YAML tags only; timestamps are still resolved)
data_safe = yaml.load(yaml_content, SafeLoader)
print(type(data_safe['person']['birth_date']))  # <class 'datetime.date'>

# Full loading (standard tags plus a restricted set of Python-specific tags)
data_full = yaml.full_load(yaml_content)
print(type(data_full['person']['birth_date']))  # <class 'datetime.date'>

# Explicit loader specification
data_explicit = yaml.load(yaml_content, Loader=FullLoader)

# Loading multiple documents
multi_doc = """
---
doc: 1
name: First Document
---
doc: 2
name: Second Document
"""

for doc in yaml.load_all(multi_doc, SafeLoader):
    print(f"Document {doc['doc']}: {doc['name']}")
```
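
One detail that a short example may make clearer: `load_all` is a generator, so documents are parsed only as they are iterated. The sketch below uses an in-memory `io.StringIO` stream with made-up document contents as a stand-in for a real file, and materializes the documents with `list()` while the stream is still open.

```python
import io
import types

import yaml

# Hypothetical two-document stream; a real file object would behave the same way.
stream = io.StringIO("---\ndoc: 1\n---\ndoc: 2\n")

docs_iter = yaml.load_all(stream, Loader=yaml.SafeLoader)
# load_all returns a generator, so no document has been parsed at this point.
is_lazy = isinstance(docs_iter, types.GeneratorType)

# Consume the generator before the stream is closed.
docs = list(docs_iter)
stream.close()

print(is_lazy, docs)
```

When reading from a file opened in a `with` block, the same applies: iterate or call `list()` inside the block rather than after it.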

### Low-Level Parsing

Access the individual stages of the loading pipeline (scanning into tokens, parsing into events, composing into a node tree) for advanced processing and custom workflows.

```python { .api }
def scan(stream, Loader=Loader):
    """
    Scan a YAML stream and produce scanning tokens.

    Args:
        stream (str | bytes | IO): YAML content
        Loader (type, optional): Loader class to use for scanning

    Yields:
        Token: Scanning tokens (StreamStartToken, ScalarToken, etc.)
    """

def parse(stream, Loader=Loader):
    """
    Parse a YAML stream and produce parsing events.

    Args:
        stream (str | bytes | IO): YAML content
        Loader (type, optional): Loader class to use for parsing

    Yields:
        Event: Parsing events (StreamStartEvent, ScalarEvent, etc.)
    """

def compose(stream, Loader=Loader):
    """
    Parse the first YAML document in a stream and produce the corresponding representation tree.

    Args:
        stream (str | bytes | IO): YAML content
        Loader (type, optional): Loader class to use

    Returns:
        Node | None: Root node of the representation tree, or None if no document
    """

def compose_all(stream, Loader=Loader):
    """
    Parse all YAML documents in a stream and produce corresponding representation trees.

    Args:
        stream (str | bytes | IO): YAML content containing multiple documents
        Loader (type, optional): Loader class to use

    Yields:
        Node: Root nodes of representation trees
    """
```

#### Usage Examples

```python
import yaml
from yaml import SafeLoader

yaml_content = "name: John\nage: 30"

# Scanning - produces tokens
tokens = list(yaml.scan(yaml_content, SafeLoader))
for token in tokens:
    print(f"{type(token).__name__}: {token}")

# Parsing - produces events
events = list(yaml.parse(yaml_content, SafeLoader))
for event in events:
    print(f"{type(event).__name__}: {event}")

# Composing - produces representation tree
node = yaml.compose(yaml_content, SafeLoader)
print(f"Root node: {type(node).__name__}")
print(f"Tag: {node.tag}")
print(f"Value type: {type(node.value)}")

# Walking the tree (a mapping node's value is a list of (key, value) node pairs)
if isinstance(node, yaml.MappingNode):
    for key_node, value_node in node.value:
        print(f"  {key_node.value}: {value_node.value}")
```
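
A common reason to drop down to `compose` is that nodes keep their source positions, which the constructed Python objects do not. The following is a minimal sketch of that idea; `key_line_numbers` is a hypothetical helper name, and it only handles top-level scalar keys.

```python
import yaml


def key_line_numbers(text):
    """Map each top-level scalar key to its 1-based line number (hypothetical helper)."""
    root = yaml.compose(text, Loader=yaml.SafeLoader)
    # compose() returns None for an empty stream; non-mapping roots have no keys.
    if not isinstance(root, yaml.MappingNode):
        return {}
    return {
        key.value: key.start_mark.line + 1  # Mark.line is 0-based
        for key, _value in root.value
        if isinstance(key, yaml.ScalarNode)
    }


lines = key_line_numbers("name: John\nage: 30\n")
print(lines)
```

A helper along these lines can be useful for pointing users at the offending line when a later validation step rejects a value.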

## Security Levels

### SafeLoader (Safest)
- Standard YAML tags only (strings, numbers, booleans, null, lists, dicts, timestamps, binary data, sets)
- No arbitrary Python object instantiation
- Safe for completely untrusted input

### FullLoader (Recommended)
- All standard YAML tags
- Some Python-specific types (such as tuples), but with restrictions
- Prevents known dangerous operations
- Good balance of features and security for trusted input

### Loader/UnsafeLoader (Dangerous)
- All YAML types including arbitrary Python objects
- Can execute Python code during loading
- Only for completely trusted input
- Required for full PyYAML feature set
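
The difference between the loaders shows up as soon as a document uses a Python-specific tag. The sketch below is illustrative only: it feeds the same made-up one-line document, tagged `!!python/tuple`, to `FullLoader` and to `SafeLoader`, and records what each one does.

```python
import yaml

# Hypothetical input using a Python-specific tag.
text = "point: !!python/tuple [1, 2]"

# FullLoader knows this tag and builds a tuple.
full = yaml.load(text, Loader=yaml.FullLoader)

# SafeLoader has no constructor for the tag and raises instead.
try:
    yaml.load(text, Loader=yaml.SafeLoader)
    safe_error = None
except yaml.MarkedYAMLError as exc:
    safe_error = exc.problem

print(full)
print(safe_error)
```

Failing loudly on unknown tags is what makes `SafeLoader` suitable for untrusted input: nothing outside the standard YAML types is ever constructed.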

## Supported Input Types

All loading functions accept:

- **str**: YAML content as Unicode string
- **bytes**: YAML content encoded as UTF-8 or UTF-16 (UTF-16 is detected from the byte order mark; otherwise UTF-8 is assumed)
- **IO[str]**: Text file-like objects (open files, StringIO, etc.)
- **IO[bytes]**: Binary file-like objects with YAML content
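
To make the list above concrete, here is a small sketch that passes the same content to `yaml.safe_load` in each supported form. The sample text is made up; the UTF-16 case relies on Python's `utf-16` codec prepending a byte order mark.

```python
import io

import yaml

text = "key: value"

results = [
    yaml.safe_load(text),                              # str
    yaml.safe_load(text.encode("utf-8")),              # UTF-8 bytes
    yaml.safe_load(text.encode("utf-16")),             # UTF-16 bytes with BOM
    yaml.safe_load(io.StringIO(text)),                 # text stream
    yaml.safe_load(io.BytesIO(text.encode("utf-8"))),  # binary stream
]

print(results)
```

Each call should produce the same dictionary, so the choice of input form is mostly a matter of where the data comes from.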

### File Loading Examples

```python
import yaml

# From file path (manual)
with open('config.yaml', 'r') as f:
    config = yaml.safe_load(f)

# From binary file
with open('config.yaml', 'rb') as f:
    config = yaml.safe_load(f)  # Auto-detects encoding

# From StringIO
from io import StringIO
yaml_io = StringIO("key: value")
data = yaml.safe_load(yaml_io)

# From URL or network source
import urllib.request
with urllib.request.urlopen('https://example.com/config.yaml') as response:
    config = yaml.safe_load(response)
```
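
## Error Handling

All PyYAML loading errors derive from `yaml.YAMLError`. `yaml.MarkedYAMLError` is a subclass that carries position information, so it must be caught first:

```python
import yaml

yaml_content = "key: [unclosed"  # deliberately malformed sample input

try:
    data = yaml.safe_load(yaml_content)
except yaml.MarkedYAMLError as e:
    # Subclass first; a preceding YAMLError clause would make this one unreachable
    print(f"Syntax Error at line {e.problem_mark.line + 1}: {e.problem}")
except yaml.YAMLError as e:
    print(f"YAML Error: {e}")
```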
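
Common error scenarios:
- Malformed YAML syntax
- Invalid byte sequences or non-printable characters in the input
- Recursive aliases that the loader cannot construct
- Very large or deeply nested documents (these can surface as Python's own `MemoryError` or `RecursionError` rather than as `YAMLError`)
- Tags the chosen loader refuses to construct (for example, Python object tags under SafeLoader)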
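
The scenarios above can be handled in one place. The following is a minimal sketch, not a prescribed pattern: `load_or_report` is a hypothetical helper that returns either the parsed data or a readable message, and it guards against `problem_mark` being absent.

```python
import yaml


def load_or_report(text):
    """Return (data, None) on success or (None, message) on failure (hypothetical helper)."""
    try:
        return yaml.safe_load(text), None
    except yaml.MarkedYAMLError as exc:
        mark = exc.problem_mark
        if mark is not None:
            # Mark.line and Mark.column are 0-based.
            where = f"line {mark.line + 1}, column {mark.column + 1}"
        else:
            where = "unknown position"
        return None, f"{exc.problem} at {where}"
    except yaml.YAMLError as exc:
        return None, str(exc)


good, good_error = load_or_report("a: 1")
bad, bad_error = load_or_report("key: [1, 2")
print(good, good_error)
print(bad, bad_error)
```

Returning the message instead of printing it keeps the helper usable from both command-line tools and services that log errors elsewhere.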

## Node Classes

Representation tree nodes used by the compose functions and low-level processing:

```python { .api }
class Node:
    """
    Base class for all YAML representation nodes.

    Attributes:
        tag (str): YAML tag for the node
        value: Node-specific value (varies by subclass)
        start_mark (Mark): Position where node starts
        end_mark (Mark): Position where node ends
    """

class ScalarNode(Node):
    """
    Represents scalar values (strings, numbers, booleans, null).

    Attributes:
        tag (str): YAML tag
        value (str): Raw scalar value as string
        start_mark (Mark): Start position
        end_mark (Mark): End position
        style (str | None): Scalar style (None for plain, or '|', '>', '"', "'")
    """

class SequenceNode(CollectionNode):
    """
    Represents YAML sequences (lists/arrays).

    Attributes:
        tag (str): YAML tag
        value (list[Node]): List of child nodes
        start_mark (Mark): Start position
        end_mark (Mark): End position
        flow_style (bool): True for flow style [a, b], False for block style
    """

class MappingNode(CollectionNode):
    """
    Represents YAML mappings (dictionaries/objects).

    Attributes:
        tag (str): YAML tag
        value (list[tuple[Node, Node]]): List of (key_node, value_node) pairs
        start_mark (Mark): Start position
        end_mark (Mark): End position
        flow_style (bool): True for flow style {a: b}, False for block style
    """
```
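
Because every node carries its resolved tag, a composed tree can be inspected before any Python object is constructed. The sketch below walks a tree recursively and reports the tag of each scalar; `scalar_tags` and its path notation are illustrative choices, and mapping keys are assumed to be scalars.

```python
import yaml


def scalar_tags(node, path=""):
    """Yield (path, tag) for every scalar value in a composed tree (hypothetical helper)."""
    if isinstance(node, yaml.ScalarNode):
        yield path, node.tag
    elif isinstance(node, yaml.SequenceNode):
        for index, child in enumerate(node.value):
            yield from scalar_tags(child, f"{path}[{index}]")
    elif isinstance(node, yaml.MappingNode):
        for key, value in node.value:
            # Assumes scalar keys, so key.value is a plain string.
            child_path = f"{path}.{key.value}" if path else key.value
            yield from scalar_tags(value, child_path)


root = yaml.compose("name: John\nage: 30\ntags: [a, 1.5]\n", Loader=yaml.SafeLoader)
tags = dict(scalar_tags(root))
for item_path, tag in tags.items():
    print(f"{item_path}: {tag}")
```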

## Event Classes

Event objects produced by parsing functions for low-level YAML processing:

```python { .api }
class Event:
    """Base class for all YAML events."""

class StreamStartEvent(Event):
    """Start of YAML stream."""

class StreamEndEvent(Event):
    """End of YAML stream."""

class DocumentStartEvent(Event):
    """
    Start of YAML document.

    Attributes:
        explicit (bool): True if explicit document start (---)
        version (tuple): YAML version (e.g., (1, 1))
        tags (dict): Tag directive mappings
    """

class DocumentEndEvent(Event):
    """
    End of YAML document.

    Attributes:
        explicit (bool): True if explicit document end (...)
    """

class ScalarEvent(NodeEvent):
    """
    Scalar value event.

    Attributes:
        anchor (str): Anchor name if present
        tag (str): YAML tag
        implicit (tuple[bool, bool]): Implicit tag resolution flags
        value (str): Scalar value
        style (str): Scalar style
    """

class SequenceStartEvent(CollectionStartEvent):
    """Start of sequence/list."""

class SequenceEndEvent(CollectionEndEvent):
    """End of sequence/list."""

class MappingStartEvent(CollectionStartEvent):
    """Start of mapping/dictionary."""

class MappingEndEvent(CollectionEndEvent):
    """End of mapping/dictionary."""

class AliasEvent(NodeEvent):
    """
    Alias reference event.

    Attributes:
        anchor (str): Referenced anchor name
    """
```
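
Since start and end events come in matched pairs, an event stream can be processed with a simple counter and no tree in memory. As a rough illustration, the hypothetical `max_depth` function below measures how deeply collections are nested in a document.

```python
import yaml

START_EVENTS = (yaml.SequenceStartEvent, yaml.MappingStartEvent)
END_EVENTS = (yaml.SequenceEndEvent, yaml.MappingEndEvent)


def max_depth(text):
    """Return the deepest level of collection nesting in a YAML stream (hypothetical helper)."""
    depth = 0
    deepest = 0
    for event in yaml.parse(text, Loader=yaml.SafeLoader):
        if isinstance(event, START_EVENTS):
            depth += 1
            deepest = max(deepest, depth)
        elif isinstance(event, END_EVENTS):
            depth -= 1
    return deepest


nested_depth = max_depth("a:\n  b:\n    - 1\n")
print(nested_depth)
```

A check like this could run before a full load to reject unexpectedly deep input, though the threshold to enforce is left to the caller.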