tessl/pypi-imagecodecs

Image transformation, compression, and decompression codecs for scientific computing


Utilities and Metadata

Name: tessl/pypi-imagecodecs
Author: tessl

Package information, version checking, codec availability testing, and checksum functions for data integrity verification. These utilities provide essential functionality for package management, debugging, and data validation workflows.

Capabilities

Version Information

Get comprehensive version information about the package and all available codecs.

def version(astype=None, /):
    """
    Return version information about all codecs and dependencies.
    
    All extension modules are imported into the process during this call.
    
    Parameters:
    - astype: type | None - Return type format (positional-only):
        None or str: String listing every component and its version
        tuple: Tuple of "name version" strings, one per component
        dict: Dictionary mapping component names to version strings
    
    Returns:
    str | tuple[str, ...] | dict[str, str]: Version information in requested format
    """

def cython_version():
    """
    Return Cython version string.
    
    Returns:
    str: Cython version used to compile extensions
    """

def numpy_abi_version():
    """
    Return NumPy ABI version string.
    
    Returns:
    str: NumPy ABI version for binary compatibility
    """

def imcd_version():
    """
    Return imcd library version string.
    
    Returns:
    str: Internal IMCD library version
    """
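
When a workflow depends on a minimum library version, the name-to-version mapping described above can be compared against a requirement. The following is a minimal sketch under stated assumptions: `parse_version` and `meets_minimum` are hypothetical helpers, not part of imagecodecs, and the `sample` mapping is placeholder data standing in for real `version(dict)` output, whose exact keys and version formats may differ.

```python
import re

def parse_version(text):
    """Return the leading numeric components of a version string as a tuple."""
    match = re.match(r"\d+(?:\.\d+)*", text.strip())
    if match is None:
        return ()
    return tuple(int(part) for part in match.group(0).split("."))

def meets_minimum(versions, name, minimum):
    """Return True if versions[name] is at least `minimum`.

    Missing or unparseable entries count as not meeting the requirement.
    """
    found = parse_version(versions.get(name, ""))
    required = parse_version(minimum)
    if not found:
        return False
    # Pad with zeros so that "1.3" compares equal to "1.3.0"
    width = max(len(found), len(required))
    found += (0,) * (width - len(found))
    required += (0,) * (width - len(required))
    return found >= required

# Placeholder mapping (assumption); real data would come from imagecodecs.version(dict)
sample = {"zlib": "1.3.1", "numpy": "2.1.0", "custom": "n/a"}
print(meets_minimum(sample, "zlib", "1.2.11"))
print(meets_minimum(sample, "custom", "1.0"))
print(meets_minimum(sample, "missing", "1.0"))
```

Only the leading numeric part of each version string is compared, so suffixes such as release-candidate tags are ignored in this sketch.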

Codec Availability

Check availability and get version information for specific codecs.

# Each codec has associated version and check functions
def {codec}_version():
    """
    Return {codec} library version string.
    
    Returns:
    str: Version of underlying {codec} library
    
    Raises:
    DelayedImportError: If {codec} codec is not available
    """

def {codec}_check(data):
    """
    Check if data is {codec} encoded.
    
    Parameters:
    - data: bytes | bytearray | mmap.mmap - Data to check
    
    Returns:
    bool | None: True if {codec} format detected, None if uncertain
    """

# Constants classes provide availability information
class CODEC_CONSTANTS:
    available: bool  # True if codec is available
    # ... codec-specific constants
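
Because a `{codec}_check` function may return None when it cannot decide, callers may want to keep "detected" and "uncertain" apart instead of testing truthiness alone. The sketch below assumes the check functions are passed in as a mapping; `classify` and the two stand-in checks are hypothetical and only mimic the documented return values.

```python
def classify(data, checks):
    """Split codec names into (detected, uncertain, rejected) by check result."""
    detected, uncertain, rejected = [], [], []
    for name, check in checks.items():
        try:
            result = check(data)
        except Exception:
            # A check that raises gives no information; treat it as uncertain
            result = None
        if result is None:
            uncertain.append(name)
        elif result:
            detected.append(name)
        else:
            rejected.append(name)
    return detected, uncertain, rejected

# Stand-in checks (assumptions) that mimic the documented True / False / None results
def png_like_check(data):
    return data[:8] == b"\x89PNG\r\n\x1a\n"

def undecided_check(data):
    return None

checks = {"png": png_like_check, "raw": undecided_check}
print(classify(b"\x89PNG\r\n\x1a\n" + b"\x00" * 4, checks))
print(classify(b"not an image", checks))
```

With imagecodecs installed, the mapping could instead be built from real functions, for example `{'png': imagecodecs.png_check, 'jpeg': imagecodecs.jpeg_check}`.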

Checksum Functions

HDF5-compatible checksum functions for data integrity verification.

def h5checksum_fletcher32(data, value=None):
    """
    Return Fletcher-32 checksum compatible with HDF5.
    
    Parameters:
    - data: bytes | bytearray | mmap.mmap - Data to checksum
    - value: int | None - Initial checksum value for incremental calculation
    
    Returns:
    int: Fletcher-32 checksum value
    """

def h5checksum_lookup3(data, value=None):
    """
    Return Jenkins lookup3 checksum compatible with HDF5.
    
    Parameters:
    - data: bytes | bytearray | mmap.mmap - Data to checksum
    - value: int | None - Initial hash value for incremental calculation
    
    Returns:
    int: Jenkins lookup3 hash value
    """

def h5checksum_crc(data, value=None):
    """
    Return CRC checksum compatible with HDF5.
    
    Parameters:
    - data: bytes | bytearray | mmap.mmap - Data to checksum
    - value: int | None - Initial CRC value for incremental calculation
    
    Returns:
    int: CRC checksum value
    """

def h5checksum_metadata(data, value=None):
    """
    Return checksum of metadata compatible with HDF5.
    
    Parameters:
    - data: bytes | bytearray | mmap.mmap - Metadata to checksum
    - value: int | None - Initial checksum value for incremental calculation
    
    Returns:
    int: Metadata checksum value
    """

def h5checksum_hash_string(data, value=None):
    """
    Return hash of bytes string compatible with HDF5.
    
    Parameters:
    - data: bytes | bytearray | mmap.mmap - String data to hash
    - value: int | None - Initial hash value for incremental calculation
    
    Returns:
    int: String hash value
    """

def h5checksum_version():
    """
    Return h5checksum library version string.
    
    Returns:
    str: Version of h5checksum library
    """

Package Introspection

Functions for exploring the package structure and available functionality.

def __dir__():
    """
    Return list of all accessible attributes in the package.
    
    This includes all codecs, functions, and constants that can be accessed
    through the lazy loading mechanism.
    
    Returns:
    list[str]: List of accessible attribute names
    """

def __getattr__(name):
    """
    Lazy loading mechanism for codec modules and functions.
    
    This function is called when accessing attributes not directly imported,
    enabling on-demand loading of codec modules.
    
    Parameters:
    - name: str - Attribute name to load
    
    Returns:
    Any: The requested attribute (function, class, or constant)
    
    Raises:
    DelayedImportError: If the requested codec is not available
    AttributeError: If the attribute does not exist
    """

# Special constants for codec management
_codecs: dict  # Dictionary of all available codecs
_extensions: dict  # Dictionary mapping file extensions to codecs
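
To make the lazy loading mechanism concrete, here is a minimal sketch of module-level `__getattr__` and `__dir__` hooks as defined by PEP 562. It is not the package's actual implementation: `make_lazy_module` and the `_LAZY_ATTRIBUTES` table are hypothetical, and standard-library modules stand in for codec extension modules.

```python
import importlib
import types

# Hypothetical attribute-to-module table; a real package would list its own submodules
_LAZY_ATTRIBUTES = {"sqrt": "math", "dumps": "json"}

def make_lazy_module(name, attributes):
    """Build a module whose attributes are imported on first access."""
    module = types.ModuleType(name)

    def __getattr__(attr):
        # Called only when normal attribute lookup on the module fails
        try:
            target = attributes[attr]
        except KeyError:
            raise AttributeError(
                f"module {name!r} has no attribute {attr!r}"
            ) from None
        value = getattr(importlib.import_module(target), attr)
        # Cache on the module so later accesses bypass __getattr__
        setattr(module, attr, value)
        return value

    def __dir__():
        # Advertise lazy names without importing anything
        return sorted(set(attributes) | set(vars(module)))

    module.__getattr__ = __getattr__
    module.__dir__ = __dir__
    return module

lazy = make_lazy_module("lazy_demo", _LAZY_ATTRIBUTES)
print("sqrt" in vars(lazy))  # nothing loaded yet
print(lazy.sqrt(9.0))        # first access triggers the import
print("sqrt" in vars(lazy))  # now cached on the module
```

Caching the resolved attribute on the module means the lookup cost is paid once per name, which matches the "slight delay on first access" behavior described for lazy loading.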

Exception Classes

Structured exception hierarchy for error handling.

class DelayedImportError(ImportError):
    """
    Delayed ImportError raised when optional codec dependencies are not available.
    
    This exception is raised during lazy loading when a codec's underlying
    library is not installed or cannot be imported.
    """
    
    def __init__(self, name: str) -> None:
        """
        Initialize DelayedImportError.
        
        Parameters:
        - name: str - Name of the missing codec or library
        """

class ImcdError(Exception):
    """
    Base exception class for IMCD codec errors.
    
    This is the base class for all codec-specific exceptions in the package.
    """

class NoneError(Exception):
    """
    Exception for NONE codec operations.
    
    Raised when operations on the NONE codec fail (should be rare).
    """

class NumpyError(Exception):
    """
    Exception for NumPy codec operations.
    
    Raised when NumPy-based codec operations fail.
    """

Usage Examples

Package Information and Debugging

import imagecodecs

# Get comprehensive version information
print("=== Imagecodecs Version Information ===")
version_info = imagecodecs.version()
print(version_info)

# Get specific version details
print(f"\nPackage version: {imagecodecs.__version__}")
print(f"Cython version: {imagecodecs.cython_version()}")
print(f"NumPy ABI version: {imagecodecs.numpy_abi_version()}")
print(f"IMCD version: {imagecodecs.imcd_version()}")

# Get structured version data (astype is positional-only, so pass it without a keyword)
version_dict = imagecodecs.version(dict)
print(f"\nComponents reported: {len(version_dict)}")

# Check specific codec availability
codecs_to_check = ['jpeg', 'png', 'webp', 'avif', 'jpegxl', 'heif']
print("\n=== Codec Availability ===")
for codec_name in codecs_to_check:
    try:
        codec_class = getattr(imagecodecs, codec_name.upper())
        available = codec_class.available
        if available:
            version_func = getattr(imagecodecs, f'{codec_name}_version')
            version = version_func()
            print(f"{codec_name.upper()}: ✓ available (v{version})")
        else:
            print(f"{codec_name.upper()}: ✗ not available")
    except (AttributeError, imagecodecs.DelayedImportError):
        print(f"{codec_name.upper()}: ✗ not available")

Codec Discovery and Introspection

import imagecodecs

# Discover all available attributes (dir() calls the package's __dir__ hook)
all_attributes = dir(imagecodecs)
print(f"Total attributes: {len(all_attributes)}")

# Filter for codec functions
encode_functions = [attr for attr in all_attributes if attr.endswith('_encode')]
decode_functions = [attr for attr in all_attributes if attr.endswith('_decode')]
check_functions = [attr for attr in all_attributes if attr.endswith('_check')]
version_functions = [attr for attr in all_attributes if attr.endswith('_version')]

print(f"Encode functions: {len(encode_functions)}")
print(f"Decode functions: {len(decode_functions)}")
print(f"Check functions: {len(check_functions)}")
print(f"Version functions: {len(version_functions)}")

# Find codec constants classes
codec_constants = [attr for attr in all_attributes 
                  if attr.isupper() and not attr.startswith('_')]
print(f"Codec constants: {len(codec_constants)}")

# Test lazy loading
print("\n=== Testing Lazy Loading ===")
try:
    # This will trigger lazy loading if not already loaded
    jpeg_encode = imagecodecs.jpeg_encode
    print("JPEG codec loaded successfully")
except imagecodecs.DelayedImportError as e:
    print(f"JPEG codec not available: {e}")

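# Get codec and extension mappings (private names; depending on the installed
# version they may be mappings or callables that return them)
for label, name in [("Internal codec registry", "_codecs"),
                    ("File extension mappings", "_extensions")]:
    registry = getattr(imagecodecs, name, None)
    if callable(registry):
        registry = registry()
    if registry is not None:
        print(f"{label}: {len(registry)} entries")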

Data Integrity Verification

import imagecodecs
import numpy as np

# Generate test data
test_data = np.random.randint(0, 256, 10000, dtype=np.uint8).tobytes()

print("=== Checksum Verification ===")

# Calculate various checksums
fletcher32 = imagecodecs.h5checksum_fletcher32(test_data)
lookup3 = imagecodecs.h5checksum_lookup3(test_data)
crc = imagecodecs.h5checksum_crc(test_data)

print(f"Fletcher-32: 0x{fletcher32:08x}")
print(f"Jenkins lookup3: 0x{lookup3:08x}")
print(f"CRC: 0x{crc:08x}")

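# Incremental checksum calculation
# (the check marks printed below report whether the chunked results match the
# one-shot values; confirm this for your installed version before relying on it)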
chunk_size = 1000
fletcher32_inc = 0
lookup3_inc = 0
crc_inc = 0

for i in range(0, len(test_data), chunk_size):
    chunk = test_data[i:i+chunk_size]
    fletcher32_inc = imagecodecs.h5checksum_fletcher32(chunk, fletcher32_inc)
    lookup3_inc = imagecodecs.h5checksum_lookup3(chunk, lookup3_inc)
    crc_inc = imagecodecs.h5checksum_crc(chunk, crc_inc)

print(f"\nIncremental checksums:")
print(f"Fletcher-32: 0x{fletcher32_inc:08x} {'✓' if fletcher32_inc == fletcher32 else '✗'}")
print(f"Jenkins lookup3: 0x{lookup3_inc:08x} {'✓' if lookup3_inc == lookup3 else '✗'}")
print(f"CRC: 0x{crc_inc:08x} {'✓' if crc_inc == crc else '✗'}")

# Verify data integrity after compression/decompression
compressed = imagecodecs.zlib_encode(test_data)
decompressed = imagecodecs.zlib_decode(compressed)

decompressed_fletcher32 = imagecodecs.h5checksum_fletcher32(decompressed)
integrity_check = decompressed_fletcher32 == fletcher32
print(f"\nData integrity after compression: {'✓ PASS' if integrity_check else '✗ FAIL'}")

Format Detection

import imagecodecs
import numpy as np

# Create sample data in different formats
test_image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)

# Encode in various formats
formats = {
    'JPEG': imagecodecs.jpeg_encode(test_image, level=85),
    'PNG': imagecodecs.png_encode(test_image),
    'WebP': imagecodecs.webp_encode(test_image) if imagecodecs.WEBP.available else None,
    'ZLIB': imagecodecs.zlib_encode(test_image.tobytes()),
    'GZIP': imagecodecs.gzip_encode(test_image.tobytes()),
}

print("=== Format Detection ===")
for format_name, data in formats.items():
    if data is None:
        print(f"{format_name}: Not available")
        continue
        
    # Test format detection
    check_results = {}
    check_functions = [
        ('jpeg_check', imagecodecs.jpeg_check),
        ('png_check', imagecodecs.png_check),
        ('webp_check', imagecodecs.webp_check if imagecodecs.WEBP.available else None),
        ('zlib_check', imagecodecs.zlib_check),
        ('gzip_check', imagecodecs.gzip_check),
    ]
    
    for check_name, check_func in check_functions:
        if check_func is None:
            continue
        try:
            result = check_func(data)
            if result:
                check_results[check_name] = result
        except Exception:
            # A check that raises gives no information; skip it
            continue

    detected_formats = [name.replace('_check', '').upper() for name in check_results]
    print(f"{format_name}: Detected as {detected_formats if detected_formats else 'Unknown'}")

Error Handling and Fallbacks

import imagecodecs
import numpy as np

def safe_encode(data, preferred_codecs=('avif', 'webp', 'jpeg'), **kwargs):
    """
    Encode image with fallback to available codecs.
    """
    for codec_name in preferred_codecs:
        try:
            # Check if codec is available
            codec_class = getattr(imagecodecs, codec_name.upper())
            if not codec_class.available:
                continue
                
            # Try to encode
            encode_func = getattr(imagecodecs, f'{codec_name}_encode')
            encoded = encode_func(data, **kwargs)
            print(f"Encoded with {codec_name.upper()}")
            return encoded, codec_name
            
        except imagecodecs.DelayedImportError:
            print(f"{codec_name.upper()} not available")
            continue
        except Exception as e:
            print(f"{codec_name.upper()} encoding failed: {e}")
            continue
    
    # Fallback to always-available codec
    encoded = imagecodecs.zlib_encode(data.tobytes())
    print("Fell back to ZLIB compression")
    return encoded, 'zlib'

# Test with sample image
test_image = np.random.randint(0, 256, (256, 256, 3), dtype=np.uint8)

try:
    encoded_data, used_codec = safe_encode(test_image, level=85)
    print(f"Successfully encoded with {used_codec}")
except Exception as e:
    print(f"All encoding methods failed: {e}")

# Test exception handling
print("\n=== Exception Handling ===")
try:
    # Try to use a non-existent codec (getattr triggers the lazy-loading hook)
    result = getattr(imagecodecs, 'nonexistent_codec')
except AttributeError as e:
    print(f"AttributeError: {e}")

try:
    # Try to access unavailable codec
    if not imagecodecs.AVIF.available:
        imagecodecs.avif_encode(test_image)
except imagecodecs.DelayedImportError as e:
    print(f"DelayedImportError: {e}")

Codec Performance Benchmarking

import imagecodecs
import numpy as np
import time

def benchmark_codecs(image, codecs_to_test=None):
    """
    Benchmark compression performance of available codecs.
    """
    if codecs_to_test is None:
        codecs_to_test = ['jpeg', 'png', 'webp', 'zlib', 'zstd', 'lz4']
    
    results = []
    original_size = image.nbytes
    
    for codec_name in codecs_to_test:
        try:
            # Check availability
            codec_class = getattr(imagecodecs, codec_name.upper())
            if not codec_class.available:
                continue
            
            encode_func = getattr(imagecodecs, f'{codec_name}_encode')
            decode_func = getattr(imagecodecs, f'{codec_name}_decode')
            
            # Byte-oriented codecs take a bytes payload; image codecs take the array
            payload = image.tobytes() if codec_name in ('zlib', 'zstd', 'lz4') else image

            # Benchmark encoding (perf_counter is a monotonic, high-resolution timer)
            start_time = time.perf_counter()
            encoded = encode_func(payload)
            encode_time = time.perf_counter() - start_time

            # Benchmark decoding
            start_time = time.perf_counter()
            decoded = decode_func(encoded)
            decode_time = time.perf_counter() - start_time
            
            compressed_size = len(encoded)
            compression_ratio = original_size / compressed_size
            
            results.append({
                'codec': codec_name.upper(),
                'compressed_size': compressed_size,
                'compression_ratio': compression_ratio,
                'encode_time': encode_time * 1000,  # ms
                'decode_time': decode_time * 1000,  # ms
            })
            
        except Exception as e:  # also covers AttributeError and DelayedImportError
            print(f"Skipping {codec_name}: {e}")
            continue
    
    return results

# Run benchmark
test_image = np.random.randint(0, 256, (512, 512, 3), dtype=np.uint8)
benchmark_results = benchmark_codecs(test_image)

print("=== Codec Performance Benchmark ===")
print(f"{'Codec':<8} {'Size (KB)':<10} {'Ratio':<8} {'Enc (ms)':<10} {'Dec (ms)':<10}")
print("-" * 60)

for result in sorted(benchmark_results, key=lambda x: x['compression_ratio'], reverse=True):
    print(f"{result['codec']:<8} "
          f"{result['compressed_size']/1024:<10.1f} "
          f"{result['compression_ratio']:<8.1f} "
          f"{result['encode_time']:<10.1f} "
          f"{result['decode_time']:<10.1f}")

Constants and Configuration

Package Constants

# Version information
__version__: str  # Package version string

# Internal registries (read-only)
_codecs: dict  # Codec function registry
_extensions: dict  # File extension to codec mappings
_MODULES: dict  # Module loading configuration
_ATTRIBUTES: dict  # Attribute to module mappings
_COMPATIBILITY: dict  # Backward compatibility aliases

# Always-available codecs
NONE: type  # NONE codec constants
NUMPY: type  # NumPy codec constants

Checksum Constants

class H5CHECKSUM:
    available: bool
    
    # Checksum algorithm identifiers
    FLETCHER32 = 'fletcher32'
    LOOKUP3 = 'lookup3'
    CRC = 'crc'

Performance Considerations

Version Information

- version() imports all codec modules, which may be slow on the first call.
- Cache version information if it is needed frequently.
- Use version(dict) for programmatic access to version data (astype is positional-only).
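
The caching advice above can be sketched with `functools.lru_cache`. This is an illustration under assumptions: `make_cached_version` is a hypothetical helper, and `fake_version` is a stand-in for `imagecodecs.version` so the example runs without the package installed.

```python
import functools

def make_cached_version(version_func):
    """Wrap a version() callable so each astype is computed only once."""
    @functools.lru_cache(maxsize=None)
    def cached(astype=None):
        # astype is passed positionally because version() declares it positional-only
        return version_func(astype)
    return cached

# Stand-in (assumption) for imagecodecs.version that records each call
calls = []
def fake_version(astype=None):
    calls.append(astype)
    return {"example": "0.0"} if astype is dict else "example-0.0"

cached_version = make_cached_version(fake_version)
cached_version(dict)
cached_version(dict)
print(len(calls))  # number of times the underlying function actually ran
```

With imagecodecs installed, `make_cached_version(imagecodecs.version)` would be the equivalent call. The cached dictionary is shared between callers, so it is best treated as read-only.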

Lazy Loading

- Attributes are loaded on first access, which adds a slight delay to that access.
- Pre-load frequently used codecs at startup if performance is critical.
- Use __dir__() to discover available functionality without loading it.
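
Pre-loading can be as simple as touching each attribute once at startup and recording which ones fail. The sketch below assumes a hypothetical `preload` helper that takes the package as an argument; the standard-library `math` module stands in for imagecodecs so the example runs anywhere. Catching ImportError also covers DelayedImportError, since it is documented as a subclass.

```python
import math  # stand-in for the package (assumption) so the sketch is runnable

def preload(module, names):
    """Access each attribute once so lazy loading happens up front.

    Returns (loaded, failed), where failed maps each name to the exception raised.
    """
    loaded, failed = [], {}
    for name in names:
        try:
            getattr(module, name)
        except (ImportError, AttributeError) as exc:
            failed[name] = exc
        else:
            loaded.append(name)
    return loaded, failed

loaded, failed = preload(math, ["sqrt", "no_such_function"])
print(loaded)
print(sorted(failed))
```

With imagecodecs, a call might look like `preload(imagecodecs, ['jpeg_encode', 'png_decode'])`; logging the `failed` mapping at startup surfaces missing codecs before they are needed.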

Checksum Performance

- The checksum functions accept an initial value for incremental calculation.
- Choose a chunk size large enough that per-call overhead does not dominate.
- Fletcher-32 is generally the fastest; CRC generally provides the strongest error detection.
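
Relative checksum speed depends on data size and build options, so it is worth measuring on representative data. The following sketch assumes a hypothetical `time_checksums` helper that accepts any mapping of checksum callables; `zlib.crc32` and `zlib.adler32` from the standard library stand in for the `h5checksum_*` functions here.

```python
import timeit
import zlib

def time_checksums(data, functions, repeat=3, number=20):
    """Return {name: best seconds per call} for each checksum function."""
    results = {}
    for name, func in functions.items():
        # Take the minimum over several runs to reduce noise from other processes
        timings = timeit.repeat(lambda: func(data), repeat=repeat, number=number)
        results[name] = min(timings) / number
    return results

# Standard-library checksums stand in for the h5checksum_* functions (assumption)
payload = bytes(range(256)) * 4096
timings = time_checksums(payload, {"crc32": zlib.crc32, "adler32": zlib.adler32})
for name, seconds in sorted(timings.items(), key=lambda item: item[1]):
    print(f"{name}: {seconds * 1e6:.1f} µs per call")
```

With imagecodecs, the mapping could be `{'fletcher32': imagecodecs.h5checksum_fletcher32, 'crc': imagecodecs.h5checksum_crc}`, and the same helper can be rerun with different payload sizes to pick a chunk size.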

Error Handling Patterns

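import imagecodecs
import numpy as np

# Sample RGB image used by the snippets below
image = np.random.randint(0, 256, (64, 64, 3), dtype=np.uint8)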

# Check codec availability before use
if imagecodecs.WEBP.available:
    encoded = imagecodecs.webp_encode(image)
else:
    encoded = imagecodecs.jpeg_encode(image)  # Fallback

# Handle delayed import errors
try:
    result = imagecodecs.avif_encode(image)
except imagecodecs.DelayedImportError:
    result = imagecodecs.jpeg_encode(image)  # Fallback

# Comprehensive error handling
def safe_decode(data, possible_formats=('jpeg', 'png', 'webp')):
    for fmt in possible_formats:
        try:
            check_func = getattr(imagecodecs, f'{fmt}_check')
            if check_func(data):
                decode_func = getattr(imagecodecs, f'{fmt}_decode')
                return decode_func(data)
        except Exception:  # missing codec or failed decode: try the next format
            continue
    raise ValueError("Unable to decode data with any available codec")
