tessl/pypi-pyaml

PyYAML-based module to produce a bit more pretty and readable YAML-serialized data

PyAML

PyYAML-based Python module to produce human-readable, pretty-printed YAML-serialized data. It extends PyYAML with better formatting options specifically designed for readability, version control friendliness, and human editability rather than perfect serialization fidelity.

Package Information

  • Package Name: pyaml
  • Language: Python
  • Installation: pip install pyaml
  • Optional Dependencies: pip install unidecode (for better anchor naming from non-ASCII keys)

Core Imports

import pyaml

For specific functions:

from pyaml import dump, pprint, debug, PYAMLSort

Basic Usage

import pyaml

# Basic data serialization
data = {
    'name': 'John Doe',
    'age': 30,
    'skills': ['Python', 'YAML', 'Data'],
    'config': {
        'debug': True,
        'timeout': 60
    }
}

# Pretty-print to string
yaml_string = pyaml.dump(data)
print(yaml_string)

# Pretty-print directly to stdout
pyaml.pprint(data)

# Debug mode (shows repr of unknown types)
pyaml.debug(data, object())  # object() has no YAML representation, so its repr() is printed

# Write to file
with open('output.yaml', 'w') as f:
    pyaml.dump(data, f)

Capabilities

Main Dump Functions

Core functions for converting Python data structures to pretty-printed YAML format.

def dump(data, dst=None, safe=None, force_embed=True, vspacing=True, 
         string_val_style=None, sort_dicts=None, multiple_docs=False, 
         width=100, repr_unknown=False, **pyyaml_kws):
    """
    Serialize data as pretty-YAML to specified dst file-like object,
    or return as str with dst=str (default) or encoded to bytes with dst=bytes.
    
    Parameters:
    - data: Data to serialize
    - dst: Destination (None/str for string, bytes for bytes, file-like object)
    - safe: (deprecated) Safety flag, ignored in pyaml >= 23.x with warnings
    - force_embed: bool, default=True, avoid anchor/reference syntax
    - vspacing: bool/dict, default=True, add vertical spacing between sections
    - string_val_style: str, force string value style ('|', '>', "'", '"', 'plain')
    - sort_dicts: PYAMLSort enum, dictionary sorting behavior
    - multiple_docs: bool, default=False, multiple document mode
    - width: int, default=100, line width hint
    - repr_unknown: bool/int, default=False, represent unknown types as repr strings
    - **pyyaml_kws: Additional PyYAML dumper keywords
    
    Returns:
    str (default), bytes, or None (when writing to file)
    """

def dump_all(data, *args, **kwargs):
    """
    Alias to dump(data, multiple_docs=True) for API compatibility with PyYAML.
    
    Parameters:
    - data: List of documents to serialize
    - *args, **kwargs: Same as dump()
    
    Returns:
    str, bytes, or None (when writing to file)
    """

def dumps(data, **kwargs):
    """
    Alias to dump() for API compatibility with stdlib conventions.
    
    Parameters:
    - data: Data to serialize
    - **kwargs: Same as dump()
    
    Returns:
    str
    """

Print and Debug Functions

Convenient functions for debugging and console output.

def pprint(*data, **kwargs):
    """
    Similar to how print() works, with any number of arguments and stdout-default.
    
    Parameters:
    - *data: Any number of data objects to print
    - file: Output file (default: sys.stdout) 
    - dst: Alias for file parameter
    - **kwargs: Same dump() parameters
    
    Returns:
    None
    """

def debug(*data, **kwargs):
    """
    Same as pprint, but also repr-printing any non-YAML types.
    
    Parameters:
    - *data: Any number of data objects to debug
    - **kwargs: Same as pprint(), with repr_unknown=True implied
    
    Returns:
    None
    """

def p(*data, **kwargs):
    """
    Alias for pprint() function.
    
    Parameters:
    - *data: Any number of data objects to print
    - **kwargs: Same as pprint()
    
    Returns:
    None
    """

def print(*data, **kwargs):
    """
    Alias for pprint(). Shadows the built-in print only when imported by name (from pyaml import print).
    
    Parameters:
    - *data: Any number of data objects to print  
    - **kwargs: Same as pprint()
    
    Returns:
    None
    """

Utility Functions

Helper functions for advanced YAML formatting.

def dump_add_vspacing(yaml_str, split_lines=40, split_count=2, 
                      oneline_group=False, oneline_split=False):
    """
    Add some newlines to separate overly long YAML lists/mappings.
    
    Parameters:
    - yaml_str: str, YAML string to process
    - split_lines: int, default=40, min number of lines to trigger splitting
    - split_count: int, default=2, min count of items to split
    - oneline_group: bool, default=False, don't split consecutive oneliner items
    - oneline_split: bool, default=False, split long lists of oneliner values
    
    Returns:
    str: YAML string with added vertical spacing
    """

def add_representer(data_type, representer):
    """
    Add custom representer for data types (alias to PYAMLDumper.add_representer).
    
    Parameters:
    - data_type: Type to add representer for
    - representer: Function to handle representation
    
    Returns:
    None
    """

def safe_replacement(path, *open_args, mode=None, xattrs=None, **open_kws):
    """
    Context manager to atomically create/replace file-path in-place unless errors are raised.
    
    Parameters:
    - path: str, file path to replace
    - *open_args: Arguments for tempfile.NamedTemporaryFile
    - mode: File mode (preserves original if None)
    - xattrs: Extended attributes (auto-detected if None)
    - **open_kws: Additional keywords for tempfile.NamedTemporaryFile
    
    Returns:
    Context manager yielding temporary file object
    """

def file_line_iter(src, sep='\0\n', bs=128*2**10):
    """
    Generator for src-file chunks, split by any of the separator chars.
    
    Parameters:
    - src: File-like object to read from
    - sep: str, separator characters (default: null and newline)
    - bs: int, buffer size in bytes (default: 128 KiB)
    
    Yields:
    str: File chunks split by separators
    """

Command Line Interface

CLI functionality accessible via python -m pyaml or pyaml command.

def main(argv=None, stdin=sys.stdin, stdout=sys.stdout, stderr=sys.stderr):
    """
    Command-line interface main function.
    
    Parameters:
    - argv: list, command line arguments (default: sys.argv[1:])
    - stdin: Input stream (default: sys.stdin)
    - stdout: Output stream (default: sys.stdout) 
    - stderr: Error stream (default: sys.stderr)
    
    Returns:
    None
    
    Command-line options:
    - path: Path to YAML file to read (default: stdin)
    - -r, --replace: Replace file in-place with prettified version
    - -w, --width CHARS: Max line width hint
    - -v, --vspacing N[/M][g]: Custom vertical spacing thresholds
    - -l, --lines: Read input as null/newline-separated entries
    - -q, --quiet: Disable output validation and suppress warnings
    """

Advanced Configuration Classes

Classes for advanced YAML dumping configuration.

class PYAMLDumper(yaml.dumper.SafeDumper):
    """
    Custom YAML dumper with pretty-printing enhancements.
    
    Constructor Parameters:
    - *args: Arguments passed to SafeDumper
    - sort_dicts: PYAMLSort enum or bool, dictionary sorting behavior  
    - force_embed: bool, default=True, avoid anchor/reference syntax
    - string_val_style: str, force string value style
    - anchor_len_max: int, default=40, max anchor name length
    - repr_unknown: bool/int, default=False, represent unknown types
    - **kws: Additional PyYAML dumper keywords
    
    Key Methods:
    - represent_str(): Custom string representation with style selection
    - represent_mapping(): Custom mapping representation with sorting
    - represent_undefined(): Handle non-YAML types (namedtuples, enums, dataclasses)
    - anchor_node(): Generate meaningful anchor names from context
    - pyaml_transliterate(): Static method for anchor name generation
    """

class UnsafePYAMLDumper:
    """
    Compatibility alias for PYAMLDumper (legacy from pyaml < 23.x).
    In older versions this was a separate unsafe dumper class.
    """

class PYAMLSort:
    """
    Enum for dictionary sorting options.
    
    Values:
    - none: No sorting, sets PyYAML sort_keys=False (preserves insertion order)
    - keys: Sort by dictionary keys, sets PyYAML sort_keys=True
    - oneline_group: Custom sorting to group single-line values together
    """

Types

# Type aliases for clarity
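from typing import Union, Dict, List, Any, Optional, TextIO, BinaryIO, Type

YAMLData = Union[Dict[str, Any], List[Any], str, int, float, bool, None]
# dst takes the str or bytes type object itself (not an instance), or an open stream
Destination = Union[None, Type[str], Type[bytes], TextIO, BinaryIO]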
StringStyle = Union[str, None]  # '|', '>', "'", '"', 'plain', None
VSpacingConfig = Union[bool, Dict[str, Union[int, bool]]]

Usage Examples

Advanced Formatting

import pyaml
from pyaml import PYAMLSort

data = {
    'long_text': '''This is a very long string that contains
multiple lines and should be formatted nicely
for human readability.''',
    'config': {
        'enabled': True,
        'timeout': 30,
        'retries': 3
    },
    'items': ['apple', 'banana', 'cherry', 'date']
}

# Force literal block style for strings
yaml_str = pyaml.dump(data, string_val_style='|')

# Group single-line items together
yaml_str = pyaml.dump(data, sort_dicts=PYAMLSort.oneline_group)

# Custom vertical spacing
yaml_str = pyaml.dump(data, vspacing={
    'split_lines': 20,
    'split_count': 3,
    'oneline_group': True
})

# Allow references for duplicate data
yaml_str = pyaml.dump(data, force_embed=False)

# Debug mode for complex objects
class Connection:  # arbitrary class with no YAML representer
    pass

complex_data = {
    'connection': Connection(),
    'data': data
}
pyaml.debug(complex_data)  # Shows repr() of the Connection object

File Operations

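import pyaml

# Sample data for the examples below
data = {'name': 'example', 'enabled': True}

# Write to file
with open('config.yaml', 'w') as f:
    pyaml.dump(data, f)

# Write multiple documents
documents = [{'env': 'dev'}, {'env': 'staging'}, {'env': 'prod'}]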
with open('multi-doc.yaml', 'w') as f:
    pyaml.dump_all(documents, f)

# Get as bytes for network transmission
yaml_bytes = pyaml.dump(data, dst=bytes)

Command Line Usage

# Pretty-print a YAML file
python -m pyaml config.yaml

# Process from stdin
cat data.json | python -m pyaml

# Replace file in-place
python -m pyaml -r config.yaml

# Custom width and spacing
python -m pyaml -w 120 -v 30/3g config.yaml

# Process line-separated JSON/YAML entries
python -m pyaml -l logfile.jsonl

Built-in Type Representers

PyAML automatically provides enhanced representers for common Python types:

  • bool: Uses 'yes'/'no' instead of PyYAML's 'true'/'false' for better readability
  • NoneType: Represents None as an empty value (nothing after the colon) instead of 'null'
  • str: Custom string representation with automatic style selection (literal, folded, etc.)
  • collections.defaultdict: Represented as regular dict
  • collections.OrderedDict: Represented as regular dict with key ordering preserved
  • set: Represented as YAML list
  • pathlib.Path: Converted to string representation
  • Unknown Types: Handled by represent_undefined() method which supports:
    • Named tuples (via _asdict())
    • Mapping-like objects (via collections.abc.Mapping)
    • Enum values (with comment showing enum name)
    • Dataclasses (via dataclasses.asdict())
    • Objects with tolist() method (e.g., NumPy arrays)
    • Fallback to repr() when repr_unknown=True

Error Handling

PyAML may raise these exceptions:

  • yaml.representer.RepresenterError: When encountering unsupported data types (unless repr_unknown=True)
  • TypeError: When dst is passed together with PyYAML's stream keyword
  • yaml.constructor.ConstructorError: When input data cannot be safely loaded during CLI validation
  • Standard file I/O exceptions: When writing to files or reading from stdin

Integration Notes

  • PyYAML Compatibility: All PyYAML dumper options can be passed as **pyyaml_kws
  • Custom Types: Use pyaml.add_representer() to handle custom data types
  • Dependencies: If pretty-printed output is not needed, PyYAML can be used directly to avoid the extra dependency
  • Output Stability: Output format may change between versions as new formatting improvements are added
