tessl/pypi-pybase64

Fast Base64 encoding/decoding library with SIMD optimizations

Describes
pypipkg:pypi/pybase64@1.4.x

To install, run

npx @tessl/cli install tessl/pypi-pybase64@1.4.0


PyBase64

PyBase64 is a fast Base64 encoding/decoding library that wraps the optimized libbase64 C library. It offers the same API as Python's built-in base64 module, so it can be dropped into existing code, and it is significantly faster thanks to its native C implementation and SIMD optimizations (AVX2, AVX512-VBMI, Neon).

Package Information

  • Package Name: pybase64
  • Language: Python
  • Installation: pip install pybase64
  • Documentation: https://pybase64.readthedocs.io/en/stable
  • License: BSD-2-Clause
  • CLI Tool: Available as pybase64 command or python -m pybase64

Core Imports

import pybase64

For specific functions:

from pybase64 import b64encode, b64decode, standard_b64encode, urlsafe_b64decode

Basic Usage

import pybase64

# Basic encoding/decoding
data = b'Hello, World!'
encoded = pybase64.b64encode(data)
decoded = pybase64.b64decode(encoded)

print(encoded)  # b'SGVsbG8sIFdvcmxkIQ=='
print(decoded)  # b'Hello, World!'

# URL-safe encoding
url_encoded = pybase64.urlsafe_b64encode(data)
url_decoded = pybase64.urlsafe_b64decode(url_encoded)

# Custom alphabet
custom_encoded = pybase64.b64encode(data, altchars=b'_:')
custom_decoded = pybase64.b64decode(custom_encoded, altchars=b'_:')

# Validation for security-critical applications
secure_decoded = pybase64.b64decode(encoded, validate=True)

# Version and performance info
print(pybase64.get_version())  # Shows SIMD optimizations in use

Architecture

PyBase64 provides a dual-implementation architecture for optimal performance:

  • C Extension (_pybase64): High-performance implementation using libbase64 with SIMD optimizations
  • Python Fallback (_fallback): Pure Python implementation built on the built-in base64 module, used when the C extension is unavailable
  • Automatic Selection: Runtime detection automatically chooses the best available implementation
  • SIMD Detection: Runtime CPU feature detection selects the best supported instruction set (AVX2, AVX512-VBMI, Neon)

This design ensures maximum performance when possible while maintaining compatibility across all Python environments including PyPy and free-threaded builds.
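
The selection described above happens inside the package, but its outcome can be inspected from application code. The following is a minimal sketch that relies only on the documented get_version() function. The describe_backend helper and the fallback to the standard library's base64 module when pybase64 is not installed are illustrative assumptions, not part of pybase64.

```python
import base64

try:
    import pybase64
except ImportError:  # assumption: pybase64 may be absent, so use the stdlib instead
    pybase64 = None


def describe_backend() -> str:
    """Hypothetical helper: report which Base64 backend this process uses."""
    if pybase64 is None:
        return "stdlib base64 (pybase64 not installed)"
    # get_version() embeds the C extension / SIMD status in its return value
    return f"pybase64 {pybase64.get_version()}"


# Both modules expose the same b64encode API, so either can serve as the codec
codec = pybase64 if pybase64 is not None else base64

print(describe_backend())
print(codec.b64encode(b"Hello, World!"))  # b'SGVsbG8sIFdvcmxkIQ=='
```

Because the two modules share an API, the rest of the application does not need to know which one was selected.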

Capabilities

Core Encoding Functions

Primary Base64 encoding functions with full alphabet customization and optimal performance through C extensions.

def b64encode(s: Buffer, altchars: str | Buffer | None = None) -> bytes:
    """
    Encode bytes using Base64 alphabet.
    
    Parameters:
    - s: bytes-like object to encode
    - altchars: optional 2-character string/bytes for custom alphabet (replaces '+' and '/')
    
    Returns:
    bytes: Base64 encoded data
    
    Raises:
    BufferError: if buffer is not C-contiguous
    TypeError: for invalid input types
    ValueError: for non-ASCII strings in altchars
    """

def b64encode_as_string(s: Buffer, altchars: str | Buffer | None = None) -> str:
    """
    Encode bytes using Base64 alphabet, return as string.
    
    Parameters:
    - s: bytes-like object to encode
    - altchars: optional 2-character string/bytes for custom alphabet
    
    Returns:
    str: Base64 encoded data as ASCII string
    """

def encodebytes(s: Buffer) -> bytes:
    """
    Encode bytes with MIME-style line breaks every 76 characters.
    
    Parameters:
    - s: bytes-like object to encode
    
    Returns:
    bytes: Base64 encoded data with newlines per RFC 2045 (MIME)
    """

Core Decoding Functions

Base64 decoding functions with validation options and alternative alphabet support for maximum security and flexibility.

def b64decode(s: str | Buffer, altchars: str | Buffer | None = None, validate: bool = False) -> bytes:
    """
    Decode Base64 encoded data.
    
    Parameters:
    - s: string or bytes-like object to decode
    - altchars: optional 2-character alternative alphabet
    - validate: if True, strictly validate input (recommended for security)
    
    Returns:
    bytes: decoded data
    
    Raises:
    binascii.Error: if the input is incorrectly padded, or if it contains non-alphabet characters and validate=True
    """

def b64decode_as_bytearray(s: str | Buffer, altchars: str | Buffer | None = None, validate: bool = False) -> bytearray:
    """
    Decode Base64 encoded data, return as bytearray.
    
    Parameters:
    - s: string or bytes-like object to decode
    - altchars: optional 2-character alternative alphabet
    - validate: if True, strictly validate input
    
    Returns:
    bytearray: decoded data as mutable bytearray
    
    Raises:
    binascii.Error: if the input is incorrectly padded, or if it contains non-alphabet characters and validate=True
    """
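
The effect of the validate flag is easiest to see on input that contains a stray character. This is a small sketch under one assumption: it falls back to the API-compatible standard library module when pybase64 is not installed, and emulates b64decode_as_bytearray in that case.

```python
import binascii

try:
    import pybase64 as codec
except ImportError:  # assumption: fall back to the API-compatible stdlib module
    import base64 as codec

noisy = b"SGVs\nbG8="  # Base64 for b"Hello" with a newline in the middle

# Default (validate=False): non-alphabet characters are discarded before decoding
lenient = codec.b64decode(noisy)

# validate=True: the same input is rejected with binascii.Error
try:
    codec.b64decode(noisy, validate=True)
    strict_rejected = False
except binascii.Error:
    strict_rejected = True

# b64decode_as_bytearray exists only in pybase64; emulate it on the fallback
decode_mutable = getattr(codec, "b64decode_as_bytearray", None)
if decode_mutable is not None:
    buf = decode_mutable(b"SGVsbG8=")
else:
    buf = bytearray(codec.b64decode(b"SGVsbG8="))
buf[0] = ord("J")  # the result is mutable and can be edited in place

print(lenient)          # b'Hello'
print(strict_rejected)  # True
print(bytes(buf))       # b'Jello'
```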

Standard Base64 Functions

Convenience functions for standard Base64 alphabet encoding/decoding, compatible with Python's base64 module.

def standard_b64encode(s: Buffer) -> bytes:
    """
    Encode using standard Base64 alphabet (+/).
    
    Parameters:
    - s: bytes-like object to encode
    
    Returns:
    bytes: standard Base64 encoded data
    """

def standard_b64decode(s: str | Buffer) -> bytes:
    """
    Decode standard Base64 encoded data.
    
    Parameters:
    - s: string or bytes-like object to decode
    
    Returns:
    bytes: decoded data
    
    Raises:
    binascii.Error: for invalid input
    """

URL-Safe Base64 Functions

URL and filesystem safe Base64 encoding/decoding using modified alphabet (-_ instead of +/) for web applications and file names.

def urlsafe_b64encode(s: Buffer) -> bytes:
    """
    Encode using URL-safe Base64 alphabet (-_).
    
    Parameters:
    - s: bytes-like object to encode
    
    Returns:
    bytes: URL-safe Base64 encoded data
    """

def urlsafe_b64decode(s: str | Buffer) -> bytes:
    """
    Decode URL-safe Base64 encoded data.
    
    Parameters:
    - s: string or bytes-like object to decode
    
    Returns:
    bytes: decoded data
    
    Raises:
    binascii.Error: for invalid input
    """
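
The two alphabets differ only in the last two characters, which a short input chosen to produce both of them makes visible. This sketch assumes a fallback to the standard library's base64 module when pybase64 is not installed.

```python
try:
    import pybase64 as codec
except ImportError:  # assumption: fall back to the API-compatible stdlib module
    import base64 as codec

raw = b"\xfb\xff"  # chosen so the encoding uses both variant characters

standard = codec.standard_b64encode(raw)
urlsafe = codec.urlsafe_b64encode(raw)
print(standard)  # b'+/8='
print(urlsafe)   # b'-_8='

# Each decoder reverses its own alphabet
round_trip_standard = codec.standard_b64decode(standard)
round_trip_urlsafe = codec.urlsafe_b64decode(urlsafe)
```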

Utility Functions

Version and license information functions for runtime introspection and compliance reporting.

def get_version() -> str:
    """
    Get pybase64 version with optimization status.
    
    Returns:
    str: version string with C extension and SIMD status
         e.g., "1.4.2 (C extension active - AVX2)"
    """

def get_license_text() -> str:
    """
    Get complete license information.
    
    Returns:
    str: license text including libbase64 license information
    """

SIMD Detection Functions

Internal functions for SIMD optimization control and introspection (available when C extension is active).

def _get_simd_flags_compile() -> int:
    """
    Get compile-time SIMD flags used when building the C extension.
    
    Returns:
    int: bitmask of SIMD instruction sets available at compile time
    """

def _get_simd_flags_runtime() -> int:
    """
    Get runtime SIMD flags detected on current CPU.
    
    Returns:
    int: bitmask of SIMD instruction sets available at runtime
    """

def _get_simd_name(flags: int) -> str:
    """
    Get human-readable name for SIMD instruction set.
    
    Parameters:
    - flags: SIMD flags bitmask
    
    Returns:
    str: SIMD instruction set name (e.g., "AVX2", "fallback")
    """

def _get_simd_path() -> int:
    """
    Get currently active SIMD path flags.
    
    Returns:
    int: active SIMD flags for current execution path
    """

def _set_simd_path(flags: int) -> None:
    """
    Set SIMD path for optimization (advanced users only).
    
    Parameters:
    - flags: SIMD flags to activate
    
    Note: Only available when C extension is active
    """

Command-Line Interface

PyBase64 provides a comprehensive command-line tool for encoding, decoding, and benchmarking Base64 operations.

# Main command with version and help
pybase64 --version
pybase64 --license  
pybase64 -h

# Encoding subcommand (-u and -a are mutually exclusive)
pybase64 encode <input_file> [-o <output_file>] [-u|--url] [-a <altchars>]

# Decoding subcommand (input is validated unless --no-validation is given; -u and -a are mutually exclusive)
pybase64 decode <input_file> [-o <output_file>] [-u|--url] [-a <altchars>] [--no-validation]

# Benchmarking subcommand
pybase64 benchmark <input_file> [-d <duration>]

The CLI can also be invoked using Python module syntax:

python -m pybase64 <subcommand> [arguments...]

Module Attributes

Package version and exported symbols for version checking and introspection.

__version__: str  # Package version string
__all__: tuple[str, ...]  # Exported public API symbols

Type Definitions

import sys
from typing import Protocol

# Type alias for bytes-like objects (version-dependent import)
if sys.version_info < (3, 12):
    from typing_extensions import Buffer
else:
    from collections.abc import Buffer

# Protocol for decode functions
class Decode(Protocol):
    __name__: str
    __module__: str
    def __call__(self, s: str | Buffer, altchars: str | Buffer | None = None, validate: bool = False) -> bytes: ...

# Protocol for encode functions  
class Encode(Protocol):
    __name__: str
    __module__: str
    def __call__(self, s: Buffer, altchars: Buffer | None = None) -> bytes: ...

# Protocol for encodebytes-style functions
class EncodeBytes(Protocol):
    __name__: str
    __module__: str
    def __call__(self, s: Buffer) -> bytes: ...
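
Protocols like these let calling code accept any encoder with the right call signature. The sketch below defines EncodeLike, a hypothetical local stand-in that mirrors the shape of Encode (simplified to bytes and without the name attributes), and a hypothetical encode_token helper. It falls back to the standard library encoder when pybase64 is not installed.

```python
from __future__ import annotations

import base64
from typing import Protocol


class EncodeLike(Protocol):
    """Hypothetical stand-in mirroring the call signature of Encode."""

    def __call__(self, s: bytes, altchars: bytes | None = None) -> bytes: ...


def encode_token(payload: bytes, encoder: EncodeLike) -> str:
    """Encode payload with the supplied encoder and return ASCII text."""
    return encoder(payload).decode("ascii")


try:
    from pybase64 import b64encode as fast_encode
except ImportError:  # assumption: the stdlib encoder satisfies the same protocol
    fast_encode = base64.b64encode

token = encode_token(b"Hello, World!", fast_encode)
print(token)  # SGVsbG8sIFdvcmxkIQ==
```

Typing against the protocol rather than a concrete function keeps the helper usable with either implementation.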

Usage Examples

Performance-Optimized Decoding

import pybase64

# For maximum security and performance, use validate=True
# This enables optimized validation in the C extension
data = b'SGVsbG8sIFdvcmxkIQ=='
decoded = pybase64.b64decode(data, validate=True)

Custom Alphabet Usage

import pybase64

# Create data with custom alphabet for specific protocols
data = b'binary data here'
encoded = pybase64.b64encode(data, altchars=b'@&')
# Result uses @ and & instead of + and /

# Decode with same custom alphabet
decoded = pybase64.b64decode(encoded, altchars=b'@&')

MIME-Compatible Encoding

import pybase64

# Encode with line breaks for email/MIME compatibility
large_data = b'x' * 200  # Large binary data
mime_encoded = pybase64.encodebytes(large_data)
# Result has newlines every 76 characters per RFC 2045
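
The line structure can be checked directly. In this sketch, 200 input bytes produce 268 Base64 characters, which are split into three full lines and one shorter line. It assumes a fallback to the standard library's base64 module when pybase64 is not installed.

```python
try:
    import pybase64 as codec
except ImportError:  # assumption: fall back to the API-compatible stdlib module
    import base64 as codec

large_data = b"x" * 200
mime_encoded = codec.encodebytes(large_data)

line_lengths = [len(line) for line in mime_encoded.splitlines()]
print(line_lengths)  # [76, 76, 76, 40]

# With the default validate=False the newlines are discarded, so the data round-trips
restored = codec.b64decode(mime_encoded)
```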

Runtime Performance Information

import pybase64

# Check if C extension and SIMD optimizations are active
version_info = pybase64.get_version()
print(version_info)
# Output examples:
# "1.4.2 (C extension active - AVX2)"
# "1.4.2 (C extension inactive)"  # Fallback mode

Command-Line Usage Examples

# Encode a file using standard Base64
pybase64 encode input.txt -o encoded.txt

# Decode (input is validated by default; pass --no-validation to skip the check)
pybase64 decode encoded.txt -o decoded.txt

# URL-safe encoding for web applications
pybase64 encode data.bin -u -o urlsafe.txt

# Custom alphabet encoding
pybase64 encode data.bin -a '@&' -o custom.txt

# Benchmark performance on your system
pybase64 benchmark test_data.bin

# Pipe operations (using stdin/stdout)
echo "Hello World" | pybase64 encode -
cat encoded.txt | pybase64 decode - > decoded.txt

# Check version and license
pybase64 --version
pybase64 --license

# Using Python module syntax
python -m pybase64 encode input.txt

Error Handling

All decoding functions may raise binascii.Error for:

  • Incorrect Base64 padding
  • Invalid characters in input (when validate=True)
  • Malformed Base64 strings

Encoding functions may raise:

  • BufferError for non-contiguous memory buffers
  • TypeError for invalid input types
  • ValueError for non-ASCII characters in custom alphabets
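
The error cases listed above can be handled with ordinary try/except blocks. In this sketch, safe_decode is a hypothetical helper rather than part of pybase64, and the code falls back to the standard library's base64 module when pybase64 is not installed.

```python
import binascii

try:
    import pybase64 as codec
except ImportError:  # assumption: fall back to the API-compatible stdlib module
    import base64 as codec


def safe_decode(text):
    """Hypothetical helper: return decoded bytes, or None if the input is rejected."""
    try:
        return codec.b64decode(text, validate=True)
    except binascii.Error:
        return None


print(safe_decode(b"SGVsbG8="))  # b'Hello'
print(safe_decode(b"SGVsbG8"))   # None: the padding is missing
print(safe_decode(b"SGVs!G8="))  # None: '!' is outside the alphabet

# Passing a str instead of a bytes-like object to an encoder raises TypeError
try:
    codec.b64encode("not bytes")
    raised_type_error = False
except TypeError:
    raised_type_error = True
print(raised_type_error)  # True
```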

Performance Notes

  • Use validate=True for security-critical applications - it's optimized in the C extension
  • The C extension is roughly 5-20x faster than Python's built-in base64; the exact gain depends on the CPU and input size and can be measured with pybase64 benchmark
  • SIMD optimizations (AVX2, AVX512-VBMI, Neon) are automatically detected and used when available
  • For maximum performance, use b64decode and b64encode directly rather than wrapper functions
  • PyPy and free-threaded Python builds are fully supported with automatic fallback
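
Since the speedup depends on the machine, a quick measurement can be more informative than a fixed figure. This is a rough sketch, not a rigorous benchmark: throughput_mb_s is a hypothetical helper, the 1 MiB payload and repeat count are arbitrary choices, and the pybase64 row appears only if the package is installed.

```python
import base64
import time

try:
    import pybase64
except ImportError:  # assumption: pybase64 may be absent; only the stdlib is timed then
    pybase64 = None


def throughput_mb_s(encode, payload: bytes, repeats: int = 20) -> float:
    """Hypothetical helper: encoding throughput in MB/s over several repeats."""
    start = time.perf_counter()
    for _ in range(repeats):
        encode(payload)
    elapsed = time.perf_counter() - start
    return len(payload) * repeats / elapsed / 1e6


payload = bytes(range(256)) * 4096  # 1 MiB of varied bytes

results = {"base64": throughput_mb_s(base64.b64encode, payload)}
if pybase64 is not None:
    results["pybase64"] = throughput_mb_s(pybase64.b64encode, payload)

for name, rate in results.items():
    print(f"{name}: {rate:.0f} MB/s")
```

For more careful numbers, the pybase64 benchmark subcommand described earlier is the better tool.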