# PyYAML

PyYAML is a YAML processing framework for Python. It provides a complete YAML 1.1 parser and emitter with Unicode support, pickle support, and an extensible API. The library ships both a pure-Python implementation and optional LibYAML-based C bindings, so developers can choose between portability and speed.

## Package Information

- **Package Name**: PyYAML
- **Language**: Python
- **Installation**: `pip install PyYAML`
- **C Extensions**: Optional LibYAML bindings for high performance

## Core Imports

```python
import yaml
```

Most commonly used functions are available directly from the main module:

```python
from yaml import safe_load, safe_dump, load, dump
```

For advanced usage with specific loaders/dumpers:

```python
from yaml import SafeLoader, SafeDumper, FullLoader, Loader, UnsafeLoader
```

## Basic Usage

```python
import yaml

# Safe loading (recommended for untrusted input)
data = yaml.safe_load("""
name: John Doe
age: 30
skills:
  - Python
  - YAML
  - JSON
""")

print(data['name'])  # John Doe
print(data['skills'])  # ['Python', 'YAML', 'JSON']

# Safe dumping
yaml_string = yaml.safe_dump(data, default_flow_style=False)
print(yaml_string)

# Loading multiple documents
documents = yaml.safe_load_all("""
---
document: 1
name: First
---
document: 2
name: Second
""")

for doc in documents:
    print(f"Document {doc['document']}: {doc['name']}")
```

## Architecture

PyYAML processes YAML through a multi-stage pipeline. Loading runs from Reader to Constructor; dumping runs from Representer to Emitter:

- **Reader**: Handles input stream encoding and character validation
- **Scanner**: Converts character stream to tokens (lexical analysis)
- **Parser**: Converts tokens to events (syntactic analysis)
- **Composer**: Converts events to representation tree (semantic analysis)
- **Constructor**: Converts representation tree to Python objects
- **Representer**: Converts Python objects to representation tree
- **Serializer**: Converts representation tree to events
- **Emitter**: Converts events to YAML text

This modular design allows each stage to be customized through inheritance and configuration.
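The stages listed above can be observed one at a time through the module-level helpers. The following is a minimal sketch, assuming a small inline document; the sample text and variable names are illustrative only.

```python
import yaml

text = "name: demo\ntags: [a, b]\n"

# Scanner: character stream -> tokens
tokens = [type(token).__name__ for token in yaml.scan(text)]

# Parser: tokens -> events
events = [type(event).__name__ for event in yaml.parse(text)]

# Composer: events -> representation tree (the root is a mapping node here)
root = yaml.compose(text)

# Constructor: representation tree -> Python objects
data = yaml.safe_load(text)

# Serializer + Emitter: representation tree -> YAML text
round_tripped = yaml.serialize(root)

print(tokens[0], events[0], type(root).__name__)
print(data)
```

Each helper re-runs the earlier stages internally, so the calls are independent of one another.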

## Capabilities

### Safe Operations

High-level functions for processing YAML safely, recommended for untrusted input. They use `SafeLoader` and `SafeDumper`, which handle only standard YAML types and never construct arbitrary Python objects.

```python { .api }
def safe_load(stream: str | bytes | IO) -> Any
def safe_load_all(stream: str | bytes | IO) -> Iterator[Any]
def safe_dump(data: Any, stream: IO = None, **kwds) -> str | None
def safe_dump_all(documents: Iterable[Any], stream: IO = None, **kwds) -> str | None
```

[Safe Operations](./safe-operations.md)
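As a rough illustration of what "safe" means in practice, the sketch below feeds `safe_load` a document carrying a Python-specific tag and checks that it is rejected. The payload string and the `rejected` flag are made up for the example.

```python
import yaml

# A document that asks for a Python call; SafeLoader has no constructor for it.
untrusted = "!!python/object/apply:os.system ['echo unsafe']"

try:
    yaml.safe_load(untrusted)
    rejected = False
except yaml.constructor.ConstructorError as exc:
    rejected = True
    print(f"Rejected: {exc.problem}")

# Standard YAML types still load normally.
config = yaml.safe_load("retries: 3\nhosts: [a.example, b.example]\n")
print(config)
```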

### Loading and Parsing

YAML loading with multiple security levels, selected by the `Loader` class. Also includes low-level functions that stop at an earlier processing stage (events or nodes) for advanced use cases.

```python { .api }
def load(stream: str | bytes | IO, Loader: type) -> Any
def load_all(stream: str | bytes | IO, Loader: type) -> Iterator[Any]
def full_load(stream: str | bytes | IO) -> Any
def full_load_all(stream: str | bytes | IO) -> Iterator[Any]
def parse(stream: str | bytes | IO, Loader: type = Loader) -> Iterator[Event]
def compose(stream: str | bytes | IO, Loader: type = Loader) -> Node | None
```

[Loading and Parsing](./loading-parsing.md)
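A short sketch of these functions in use, assuming `SafeLoader` as the explicit loader and in-memory input; the sample documents are illustrative.

```python
import io
import yaml

text = "a: 1\n---\nb: 2\n"

# load_all returns a lazy iterator, so materialize it while the stream is available.
docs = list(yaml.load_all(text, Loader=yaml.SafeLoader))

# A stream can also be a file-like object.
first = yaml.load(io.StringIO("x: [1, 2, 3]"), Loader=yaml.SafeLoader)

# compose returns the root node, or None when the stream holds no document.
empty = yaml.compose("")

print(docs, first, empty)
```

Passing `Loader` explicitly keeps the security level visible at the call site, which is why `load` requires it.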
109
110
### Dumping and Serialization
111
112
YAML output generation with extensive formatting options and multiple dumper classes for different security levels and feature sets.
113
114
```python { .api }
115
def dump(data: Any, stream: IO = None, Dumper: type = Dumper, **kwds) -> str | None
116
def dump_all(documents: Iterable[Any], stream: IO = None, Dumper: type = Dumper, **kwds) -> str | None
117
def emit(events: Iterable[Event], stream: IO = None, Dumper: type = Dumper, **kwds) -> str | None
118
def serialize(node: Node, stream: IO = None, Dumper: type = Dumper, **kwds) -> str | None
119
```
120
121
[Dumping and Serialization](./dumping-serialization.md)
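The `str | None` return type reflects two calling modes: without a stream the YAML text is returned, and with a stream it is written there and `None` comes back. A minimal sketch, assuming the common `sort_keys` and `default_flow_style` keyword options and a made-up data structure:

```python
import io
import yaml

data = {"name": "svc", "ports": [80, 443], "meta": {"env": "prod"}}

# No stream: the YAML text is returned. sort_keys=False keeps insertion order.
block = yaml.dump(data, sort_keys=False, default_flow_style=False)
flow = yaml.dump(data, sort_keys=False, default_flow_style=True)

# With a stream: output is written to it and the call returns None.
buffer = io.StringIO()
result = yaml.dump(data, buffer)

print(block)
print(flow)
```

By default keys are sorted, so the text written to `buffer` starts with the `meta` key rather than `name`.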

### Loaders and Dumpers

Loader and dumper classes offering different security levels and performance characteristics, including optional C-based implementations.

```python { .api }
class BaseLoader: ...
class SafeLoader: ...
class FullLoader: ...
class Loader: ...
class UnsafeLoader: ...
class BaseDumper: ...
class SafeDumper: ...
class Dumper: ...

# C Extensions (when LibYAML available)
class CBaseLoader: ...
class CSafeLoader: ...
class CFullLoader: ...
class CLoader: ...
class CUnsafeLoader: ...
class CBaseDumper: ...
class CSafeDumper: ...
class CDumper: ...
```

[Loaders and Dumpers](./loaders-dumpers.md)
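Because the C classes exist only when PyYAML was built against LibYAML, code that wants the faster path usually falls back to the pure-Python classes. One possible sketch, where `SafeLoaderImpl` and `SafeDumperImpl` are hypothetical names chosen for this example:

```python
import yaml

# Prefer the LibYAML-backed classes when available, else use pure Python.
if yaml.__with_libyaml__:
    SafeLoaderImpl = yaml.CSafeLoader
    SafeDumperImpl = yaml.CSafeDumper
else:
    SafeLoaderImpl = yaml.SafeLoader
    SafeDumperImpl = yaml.SafeDumper

data = yaml.load("items: [1, 2, 3]", Loader=SafeLoaderImpl)
text = yaml.dump(data, Dumper=SafeDumperImpl)

print(SafeLoaderImpl.__name__, data)
```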

### Customization and Extension

Advanced customization capabilities for extending YAML processing with custom constructors, representers, and resolvers.

```python { .api }
def add_constructor(tag: str, constructor: Callable, Loader: type = None) -> None
def add_representer(data_type: type, representer: Callable, Dumper: type = Dumper) -> None
def add_implicit_resolver(tag: str, regexp: Pattern, first: str = None, Loader: type = None, Dumper: type = Dumper) -> None
```

[Customization and Extension](./customization.md)
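To show how a constructor and a representer pair up, here is a minimal sketch that round-trips a custom class through a `!point` tag. The `Point` class, the tag name, and the `PointLoader`/`PointDumper` subclasses are all hypothetical; subclassing is one way to avoid registering the tag on the shared loader and dumper classes.

```python
import yaml


class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y


class PointLoader(yaml.SafeLoader):
    """SafeLoader subclass that additionally understands the !point tag."""


class PointDumper(yaml.SafeDumper):
    """SafeDumper subclass that additionally knows how to emit Point."""


def construct_point(loader, node):
    # Build a Point from the mapping stored under the !point tag.
    values = loader.construct_mapping(node)
    return Point(values["x"], values["y"])


def represent_point(dumper, point):
    # Emit a Point as a mapping tagged !point.
    return dumper.represent_mapping("!point", {"x": point.x, "y": point.y})


yaml.add_constructor("!point", construct_point, Loader=PointLoader)
yaml.add_representer(Point, represent_point, Dumper=PointDumper)

text = yaml.dump(Point(1, 2), Dumper=PointDumper)
restored = yaml.load(text, Loader=PointLoader)

print(text)
print(restored.x, restored.y)
```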

### Error Handling

Exception hierarchy for YAML processing errors. All exceptions derive from `YAMLError`; subclasses of `MarkedYAMLError` also carry position information.

```python { .api }
class YAMLError(Exception): ...
class MarkedYAMLError(YAMLError): ...
class ReaderError(YAMLError): ...
class ScannerError(MarkedYAMLError): ...
class ParserError(MarkedYAMLError): ...
```

[Error Handling](./error-handling.md)
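A small sketch of how the hierarchy might be used when reporting errors: catch `MarkedYAMLError` first to read the position, then fall back to `YAMLError` for errors that carry no mark. The malformed input and the `message` variable are illustrative.

```python
import yaml

bad = "a: b: c"  # a second mapping value on the same line is not allowed

try:
    yaml.safe_load(bad)
    message = None
except yaml.MarkedYAMLError as exc:
    # Marks are 0-based, so add 1 for human-readable positions.
    mark = exc.problem_mark
    message = f"{exc.problem} at line {mark.line + 1}, column {mark.column + 1}"
except yaml.YAMLError as exc:
    # Errors without position information, such as ReaderError.
    message = str(exc)

print(message)
```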

### Utility Functions

Legacy and utility functions for compatibility and special use cases.

```python { .api }
def warnings(settings: dict = None) -> dict:
    """
    Deprecated warnings control function.

    This function is deprecated and no longer functional. It returns an empty
    dictionary for backwards compatibility but does not affect YAML processing.

    Args:
        settings (dict, optional): Ignored settings parameter

    Returns:
        dict: Empty dictionary

    Note:
        This function is deprecated and will be removed in a future version.
    """
```

## Types

### Core Types

```python { .api }
# Type aliases for common input/output types
from typing import Union, Type, IO, Any, Iterator
from re import Pattern

Stream = Union[str, bytes, IO[str], IO[bytes]]
LoaderType = Union[Type[BaseLoader], Type[SafeLoader], Type[FullLoader], Type[Loader], Type[UnsafeLoader]]
DumperType = Union[Type[BaseDumper], Type[SafeDumper], Type[Dumper]]

# Position tracking for error reporting
class Mark:
    """Position marker for error reporting and debugging."""
    name: str      # Source name (filename, etc.)
    index: int     # Character index in stream
    line: int      # Line number (0-based)
    column: int    # Column number (0-based)
    buffer: str    # Buffer content around position
    pointer: int   # Pointer position in buffer

# Low-level processing components
class Token:
    """Base class for lexical tokens."""
    start_mark: Mark
    end_mark: Mark

class Event:
    """Base class for parsing events."""
    start_mark: Mark
    end_mark: Mark

class Node:
    """Base class for representation tree nodes."""
    tag: str
    value: Any
    start_mark: Mark
    end_mark: Mark
```
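
The `Node` and `Mark` attributes above can be inspected directly on a composed tree. The following sketch records the resolved tag and 1-based position of each value in a small mapping; the sample document and the `positions` dictionary are assumptions made for the example.

```python
import yaml

root = yaml.compose("name: demo\ncount: 3\n")

# A mapping node's value is a list of (key_node, value_node) pairs.
positions = {}
for key_node, value_node in root.value:
    positions[key_node.value] = (
        value_node.tag,
        value_node.start_mark.line + 1,    # marks are 0-based
        value_node.start_mark.column + 1,
    )

for key, info in positions.items():
    print(key, info)
```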

### Module Constants

```python { .api }
__version__: str  # '6.0.2'
__with_libyaml__: bool  # True if C extensions available
```