PyYAML-based module to produce a bit more pretty and readable YAML-serialized data
npx @tessl/cli install tessl/pypi-pyaml@25.7.00
# PyAML

PyAML is a PyYAML-based Python module that produces human-readable, pretty-printed YAML. It extends PyYAML with formatting options designed for readability, clean version-control diffs, and easy hand-editing rather than perfect serialization fidelity.

## Package Information

- **Package Name**: pyaml
- **Language**: Python
- **Installation**: `pip install pyaml`
- **Optional Dependencies**: `pip install unidecode` (for better anchor naming from non-ASCII keys)

## Core Imports

```python
import pyaml
```

For specific functions:

```python
from pyaml import dump, pprint, debug, PYAMLSort
```

## Basic Usage

```python
import pyaml

# Basic data serialization
data = {
    'name': 'John Doe',
    'age': 30,
    'skills': ['Python', 'YAML', 'Data'],
    'config': {
        'debug': True,
        'timeout': 60
    }
}

# Pretty-print to string
yaml_string = pyaml.dump(data)
print(yaml_string)

# Pretty-print directly to stdout
pyaml.pprint(data)

# Debug mode (shows repr of unknown types)
some_complex_object = object()  # placeholder for any type without a YAML representation
pyaml.debug(data, some_complex_object)

# Write to file
with open('output.yaml', 'w') as f:
    pyaml.dump(data, f)
```

## Capabilities

### Main Dump Functions

Core functions for converting Python data structures to pretty-printed YAML format.

```python { .api }
def dump(data, dst=None, safe=None, force_embed=True, vspacing=True,
         string_val_style=None, sort_dicts=None, multiple_docs=False,
         width=100, repr_unknown=False, **pyyaml_kws):
    """
    Serialize data as pretty-YAML to specified dst file-like object,
    or return as str with dst=str (default) or encoded to bytes with dst=bytes.

    Parameters:
    - data: Data to serialize
    - dst: Destination (None/str for string, bytes for bytes, file-like object)
    - safe: (deprecated) Safety flag, ignored in pyaml >= 23.x with warnings
    - force_embed: bool, default=True, avoid anchor/reference syntax
    - vspacing: bool/dict, default=True, add vertical spacing between sections
    - string_val_style: str, force string value style ('|', '>', "'", '"', 'plain')
    - sort_dicts: PYAMLSort enum, dictionary sorting behavior
    - multiple_docs: bool, default=False, multiple document mode
    - width: int, default=100, line width hint
    - repr_unknown: bool/int, default=False, represent unknown types as repr strings
    - **pyyaml_kws: Additional PyYAML dumper keywords

    Returns:
    str (default), bytes, or None (when writing to file)
    """

def dump_all(data, *args, **kwargs):
    """
    Alias to dump(data, multiple_docs=True) for API compatibility with PyYAML.

    Parameters:
    - data: List of documents to serialize
    - *args, **kwargs: Same as dump()

    Returns:
    str, bytes, or None (when writing to file)
    """

def dumps(data, **kwargs):
    """
    Alias to dump() for API compatibility with stdlib conventions.

    Parameters:
    - data: Data to serialize
    - **kwargs: Same as dump()

    Returns:
    str
    """
```

### Print and Debug Functions

Convenient functions for debugging and console output.

```python { .api }
def pprint(*data, **kwargs):
    """
    Similar to how print() works, with any number of arguments and stdout-default.

    Parameters:
    - *data: Any number of data objects to print
    - file: Output file (default: sys.stdout)
    - dst: Alias for file parameter
    - **kwargs: Same dump() parameters

    Returns:
    None
    """

def debug(*data, **kwargs):
    """
    Same as pprint, but also repr-printing any non-YAML types.

    Parameters:
    - *data: Any number of data objects to debug
    - **kwargs: Same as pprint(), with repr_unknown=True implied

    Returns:
    None
    """

def p(*data, **kwargs):
    """
    Alias for pprint() function.

    Parameters:
    - *data: Any number of data objects to print
    - **kwargs: Same as pprint()

    Returns:
    None
    """

def print(*data, **kwargs):
    """
    Alias for pprint() function (overrides built-in print when imported).

    Parameters:
    - *data: Any number of data objects to print
    - **kwargs: Same as pprint()

    Returns:
    None
    """
```

### Utility Functions

Helper functions for advanced YAML formatting.

```python { .api }
def dump_add_vspacing(yaml_str, split_lines=40, split_count=2,
                      oneline_group=False, oneline_split=False):
    """
    Add some newlines to separate overly long YAML lists/mappings.

    Parameters:
    - yaml_str: str, YAML string to process
    - split_lines: int, default=40, min number of lines to trigger splitting
    - split_count: int, default=2, min count of items to split
    - oneline_group: bool, default=False, don't split consecutive oneliner items
    - oneline_split: bool, default=False, split long lists of oneliner values

    Returns:
    str: YAML string with added vertical spacing
    """

def add_representer(data_type, representer):
    """
    Add custom representer for data types (alias to PYAMLDumper.add_representer).

    Parameters:
    - data_type: Type to add representer for
    - representer: Function to handle representation

    Returns:
    None
    """

def safe_replacement(path, *open_args, mode=None, xattrs=None, **open_kws):
    """
    Context manager to atomically create/replace file-path in-place unless errors are raised.

    Parameters:
    - path: str, file path to replace
    - *open_args: Arguments for tempfile.NamedTemporaryFile
    - mode: File mode (preserves original if None)
    - xattrs: Extended attributes (auto-detected if None)
    - **open_kws: Additional keywords for tempfile.NamedTemporaryFile

    Returns:
    Context manager yielding temporary file object
    """

def file_line_iter(src, sep='\0\n', bs=128*2**10):
    """
    Generator for src-file chunks, split by any of the separator chars.

    Parameters:
    - src: File-like object to read from
    - sep: str, separator characters (default: null and newline)
    - bs: int, buffer size in bytes (default: 128 KiB)

    Yields:
    str: File chunks split by separators
    """
```

### Command Line Interface

CLI functionality accessible via `python -m pyaml` or `pyaml` command.

```python { .api }
def main(argv=None, stdin=sys.stdin, stdout=sys.stdout, stderr=sys.stderr):
    """
    Command-line interface main function.

    Parameters:
    - argv: list, command line arguments (default: sys.argv[1:])
    - stdin: Input stream (default: sys.stdin)
    - stdout: Output stream (default: sys.stdout)
    - stderr: Error stream (default: sys.stderr)

    Returns:
    None

    Command-line options:
    - path: Path to YAML file to read (default: stdin)
    - -r, --replace: Replace file in-place with prettified version
    - -w, --width CHARS: Max line width hint
    - -v, --vspacing N[/M][g]: Custom vertical spacing thresholds
    - -l, --lines: Read input as null/newline-separated entries
    - -q, --quiet: Disable output validation and suppress warnings
    """
```
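
The `-r/--replace` option and the `safe_replacement()` helper both rely on replacing a file atomically. The general write-to-temporary-file-then-rename pattern behind that idea can be sketched with the standard library alone. This is not pyaml's implementation: `atomic_replace` is a hypothetical helper that ignores file modes and extended attributes, and the file names are made up.

```python
import contextlib
import os
import tempfile


@contextlib.contextmanager
def atomic_replace(path, mode="w"):
    """Write to a temp file next to `path`, then rename it over `path` on success."""
    directory = os.path.dirname(os.path.abspath(path))
    tmp = tempfile.NamedTemporaryFile(mode, dir=directory, delete=False)
    try:
        with tmp:
            yield tmp
        os.replace(tmp.name, path)  # atomic when both paths are on one filesystem
    except BaseException:
        # Any failure: discard the temp file and leave the original untouched.
        with contextlib.suppress(FileNotFoundError):
            os.unlink(tmp.name)
        raise


workdir = tempfile.mkdtemp()
target = os.path.join(workdir, "config.yaml")
with open(target, "w") as f:
    f.write("old: 1\n")

# Successful write: the target now holds the new content.
with atomic_replace(target) as f:
    f.write("new: 2\n")

# Failed write: the exception propagates and the target keeps its content.
try:
    with atomic_replace(target) as f:
        f.write("broken")
        raise RuntimeError("simulated failure")
except RuntimeError:
    pass

with open(target) as f:
    final = f.read()
```

Creating the temporary file in the same directory as the target matters, because a rename is only atomic within a single filesystem.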

### Advanced Configuration Classes

Classes for advanced YAML dumping configuration.

```python { .api }
class PYAMLDumper(yaml.dumper.SafeDumper):
    """
    Custom YAML dumper with pretty-printing enhancements.

    Constructor Parameters:
    - *args: Arguments passed to SafeDumper
    - sort_dicts: PYAMLSort enum or bool, dictionary sorting behavior
    - force_embed: bool, default=True, avoid anchor/reference syntax
    - string_val_style: str, force string value style
    - anchor_len_max: int, default=40, max anchor name length
    - repr_unknown: bool/int, default=False, represent unknown types
    - **kws: Additional PyYAML dumper keywords

    Key Methods:
    - represent_str(): Custom string representation with style selection
    - represent_mapping(): Custom mapping representation with sorting
    - represent_undefined(): Handle non-YAML types (namedtuples, enums, dataclasses)
    - anchor_node(): Generate meaningful anchor names from context
    - pyaml_transliterate(): Static method for anchor name generation
    """

class UnsafePYAMLDumper:
    """
    Compatibility alias for PYAMLDumper (legacy from pyaml < 23.x).
    In older versions this was a separate unsafe dumper class.
    """

class PYAMLSort:
    """
    Enum for dictionary sorting options.

    Values:
    - none: No sorting, sets PyYAML sort_keys=False (preserves insertion order)
    - keys: Sort by dictionary keys, sets PyYAML sort_keys=True
    - oneline_group: Custom sorting to group single-line values together
    """
```

## Types

```python { .api }
# Type aliases for clarity
from typing import Union, Dict, List, Any, Optional, TextIO, BinaryIO

YAMLData = Union[Dict[str, Any], List[Any], str, int, float, bool, None]
Destination = Union[None, str, bytes, TextIO, BinaryIO]
StringStyle = Union[str, None]  # '|', '>', "'", '"', 'plain', None
VSpacingConfig = Union[bool, Dict[str, Union[int, bool]]]
```

## Usage Examples

### Advanced Formatting

```python
import datetime

import pyaml
from pyaml import PYAMLSort

data = {
    'long_text': '''This is a very long string that contains
multiple lines and should be formatted nicely
for human readability.''',
    'config': {
        'enabled': True,
        'timeout': 30,
        'retries': 3
    },
    'items': ['apple', 'banana', 'cherry', 'date']
}

# Force literal block style for strings
yaml_str = pyaml.dump(data, string_val_style='|')

# Group single-line items together
yaml_str = pyaml.dump(data, sort_dicts=PYAMLSort.oneline_group)

# Custom vertical spacing
yaml_str = pyaml.dump(data, vspacing={
    'split_lines': 20,
    'split_count': 3,
    'oneline_group': True
})

# Allow references for duplicate data
yaml_str = pyaml.dump(data, force_embed=False)

# Debug mode for complex objects
complex_data = {
    'timestamp': datetime.datetime.now(),  # YAML-native type, dumped as a timestamp
    'handle': object(),  # no YAML representation, shown via repr()
    'data': data
}
pyaml.debug(complex_data)
```

### File Operations

```python
import pyaml

# Sample data (placeholders for your own objects)
data = {'name': 'example', 'enabled': True}
config1, config2, config3 = {'env': 'dev'}, {'env': 'staging'}, {'env': 'prod'}

# Write to file
with open('config.yaml', 'w') as f:
    pyaml.dump(data, f)

# Write multiple documents
documents = [config1, config2, config3]
with open('multi-doc.yaml', 'w') as f:
    pyaml.dump_all(documents, f)

# Get as bytes for network transmission
yaml_bytes = pyaml.dump(data, dst=bytes)
```

### Command Line Usage

```bash
# Pretty-print a YAML file
python -m pyaml config.yaml

# Process from stdin
cat data.json | python -m pyaml

# Replace file in-place
python -m pyaml -r config.yaml

# Custom width and spacing
python -m pyaml -w 120 -v 30/3g config.yaml

# Process line-separated JSON/YAML entries
python -m pyaml -l logfile.jsonl
```

## Built-in Type Representers

PyAML automatically provides enhanced representers for common Python types:

- **`bool`**: Uses 'yes'/'no' instead of PyYAML's 'true'/'false' for better readability
- **`NoneType`**: Represents `None` as empty string instead of 'null'
- **`str`**: Custom string representation with automatic style selection (literal, folded, etc.)
- **`collections.defaultdict`**: Represented as regular dict
- **`collections.OrderedDict`**: Represented as regular dict with key ordering preserved
- **`set`**: Represented as YAML list
- **`pathlib.Path`**: Converted to string representation
- **Unknown Types**: Handled by `represent_undefined()` method which supports:
  - Named tuples (via `_asdict()`)
  - Mapping-like objects (via `collections.abc.Mapping`)
  - Enum values (with comment showing enum name)
  - Dataclasses (via `dataclasses.asdict()`)
  - Objects with `tolist()` method (e.g., NumPy arrays)
  - Fallback to `repr()` when `repr_unknown=True`

## Error Handling

PyAML may raise these exceptions:

- **`yaml.representer.RepresenterError`**: When encountering unsupported data types (unless `repr_unknown=True`)
- **`TypeError`**: When using incompatible dst and pyyaml stream parameters
- **`yaml.constructor.ConstructorError`**: When input data cannot be safely loaded during CLI validation
- **Standard file I/O exceptions**: When writing to files or reading from stdin

## Integration Notes

- **PyYAML Compatibility**: All PyYAML dumper options can be passed as `**pyyaml_kws`
- **Custom Types**: Use `pyaml.add_representer()` to handle custom data types
- **Performance**: For simple serialization needs, consider using PyYAML directly to avoid additional dependencies
- **Output Stability**: Output format may change between versions as new formatting improvements are added