# Utilities and Metadata

Package information, version reporting, codec availability checks, and checksum functions for data integrity verification. These utilities support package management, debugging, and data validation workflows.

## Capabilities

### Version Information

Get version information about the package, its build dependencies, and all available codecs.

```python { .api }
def version(astype=None, /):
    """
    Return version information about all codecs and dependencies.

    All extension modules are imported into the process during this call.

    Parameters:
    - astype: type | None - Return type format (positional-only):
        None or str: String with all version info
        tuple: Tuple of version strings
        dict: Dictionary mapping component names to version strings

    Returns:
    str | tuple[str, ...] | dict[str, str]: Version information in requested format
    """

def cython_version():
    """
    Return Cython version string.

    Returns:
    str: Cython version used to compile extensions
    """

def numpy_abi_version():
    """
    Return NumPy ABI version string.

    Returns:
    str: NumPy ABI version for binary compatibility
    """

def imcd_version():
    """
    Return imcd library version string.

    Returns:
    str: Internal IMCD library version
    """
```

### Codec Availability

Check availability and get version information for specific codecs.

```python { .api }
# Each codec has associated version and check functions
def {codec}_version():
    """
    Return {codec} library version string.

    Returns:
    str: Version of underlying {codec} library

    Raises:
    DelayedImportError: If {codec} codec is not available
    """

def {codec}_check(data):
    """
    Check if data is {codec} encoded.

    Parameters:
    - data: bytes | bytearray | mmap.mmap - Data to check

    Returns:
    bool | None: True if {codec} format detected, False if ruled out, None if uncertain
    """

# Constants classes provide availability information
class CODEC_CONSTANTS:
    available: bool  # True if codec is available
    # ... codec-specific constants
```
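Because a check function can return `None` as well as a boolean, callers may want to keep "uncertain" separate from "ruled out". The following is a minimal sketch of that pattern, not part of the package: `classify_checks`, `png_signature_check`, and `opaque_check` are hypothetical names, and the two stand-in checks only imitate the `bool | None` contract described above rather than calling the library.

```python
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_signature_check(data):
    """Stand-in check: True or False based on the 8-byte PNG signature."""
    return bytes(data[:8]) == PNG_SIGNATURE


def opaque_check(data):
    """Stand-in check for a format without a signature: always uncertain."""
    return None


def classify_checks(data, checks):
    """Sort check results into detected, ruled-out, and uncertain names."""
    outcome = {"detected": [], "ruled_out": [], "uncertain": []}
    for name, check in checks.items():
        result = check(data)
        if result is None:
            outcome["uncertain"].append(name)
        elif result:
            outcome["detected"].append(name)
        else:
            outcome["ruled_out"].append(name)
    return outcome


checks = {"png": png_signature_check, "raw": opaque_check}
print(classify_checks(PNG_SIGNATURE + b"rest of file", checks))
print(classify_checks(b"not a png", checks))
```

With real codecs, the `checks` mapping would presumably hold functions such as `imagecodecs.png_check` instead of the stand-ins.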

### Checksum Functions

HDF5-compatible checksum functions for data integrity verification.

```python { .api }
def h5checksum_fletcher32(data, value=None):
    """
    Return Fletcher-32 checksum compatible with HDF5.

    Parameters:
    - data: bytes | bytearray | mmap.mmap - Data to checksum
    - value: int | None - Initial checksum value for incremental calculation

    Returns:
    int: Fletcher-32 checksum value
    """

def h5checksum_lookup3(data, value=None):
    """
    Return Jenkins lookup3 checksum compatible with HDF5.

    Parameters:
    - data: bytes | bytearray | mmap.mmap - Data to checksum
    - value: int | None - Initial hash value for incremental calculation

    Returns:
    int: Jenkins lookup3 hash value
    """

def h5checksum_crc(data, value=None):
    """
    Return CRC checksum compatible with HDF5.

    Parameters:
    - data: bytes | bytearray | mmap.mmap - Data to checksum
    - value: int | None - Initial CRC value for incremental calculation

    Returns:
    int: CRC checksum value
    """

def h5checksum_metadata(data, value=None):
    """
    Return checksum of metadata compatible with HDF5.

    Parameters:
    - data: bytes | bytearray | mmap.mmap - Metadata to checksum
    - value: int | None - Initial checksum value for incremental calculation

    Returns:
    int: Metadata checksum value
    """

def h5checksum_hash_string(data, value=None):
    """
    Return hash of bytes string compatible with HDF5.

    Parameters:
    - data: bytes | bytearray | mmap.mmap - String data to hash
    - value: int | None - Initial hash value for incremental calculation

    Returns:
    int: String hash value
    """

def h5checksum_version():
    """
    Return h5checksum library version string.

    Returns:
    str: Version of h5checksum library
    """
```

### Package Introspection

Functions for exploring the package structure and available functionality.

```python { .api }
def __dir__():
    """
    Return list of all accessible attributes in the package.

    This includes all codecs, functions, and constants that can be accessed
    through the lazy loading mechanism.

    Returns:
    list[str]: List of accessible attribute names
    """

def __getattr__(name):
    """
    Lazy loading mechanism for codec modules and functions.

    This function is called when accessing attributes not directly imported,
    enabling on-demand loading of codec modules.

    Parameters:
    - name: str - Attribute name to load

    Returns:
    Any: The requested attribute (function, class, or constant)

    Raises:
    DelayedImportError: If the requested codec is not available
    AttributeError: If the attribute does not exist
    """

# Special constants for codec management
_codecs: dict  # Dictionary of all available codecs
_extensions: dict  # Dictionary mapping file extensions to codecs
```

### Exception Classes

Structured exception hierarchy for error handling.

```python { .api }
class DelayedImportError(ImportError):
    """
    Delayed ImportError raised when optional codec dependencies are not available.

    This exception is raised during lazy loading when a codec's underlying
    library is not installed or cannot be imported.
    """

    def __init__(self, name: str) -> None:
        """
        Initialize DelayedImportError.

        Parameters:
        - name: str - Name of the missing codec or library
        """

class ImcdError(Exception):
    """
    Base exception class for IMCD codec errors.

    This is the base class for all codec-specific exceptions in the package.
    """

class NoneError(Exception):
    """
    Exception for NONE codec operations.

    Raised when operations on the NONE codec fail (should be rare).
    """

class NumpyError(Exception):
    """
    Exception for NumPy codec operations.

    Raised when NumPy-based codec operations fail.
    """
```

## Usage Examples

### Package Information and Debugging

```python
import imagecodecs

# Get comprehensive version information
print("=== Imagecodecs Version Information ===")
version_info = imagecodecs.version()
print(version_info)

# Get specific version details
print(f"\nPackage version: {imagecodecs.__version__}")
print(f"Cython version: {imagecodecs.cython_version()}")
print(f"NumPy ABI version: {imagecodecs.numpy_abi_version()}")
print(f"IMCD version: {imagecodecs.imcd_version()}")

# Get structured version data (astype is positional-only, so no keyword)
version_dict = imagecodecs.version(dict)
print(f"\nVersion entries: {len(version_dict)}")

# Check specific codec availability
codecs_to_check = ['jpeg', 'png', 'webp', 'avif', 'jpegxl', 'heif']
print("\n=== Codec Availability ===")
for codec_name in codecs_to_check:
    try:
        codec_class = getattr(imagecodecs, codec_name.upper())
        if codec_class.available:
            version_func = getattr(imagecodecs, f'{codec_name}_version')
            codec_version = version_func()
            print(f"{codec_name.upper()}: ✓ available (v{codec_version})")
        else:
            print(f"{codec_name.upper()}: ✗ not available")
    except (AttributeError, imagecodecs.DelayedImportError):
        print(f"{codec_name.upper()}: ✗ not available")
```

### Codec Discovery and Introspection

```python
import imagecodecs

# Discover all available attributes (dir() calls the package's __dir__)
all_attributes = dir(imagecodecs)
print(f"Total attributes: {len(all_attributes)}")

# Filter for codec functions
encode_functions = [attr for attr in all_attributes if attr.endswith('_encode')]
decode_functions = [attr for attr in all_attributes if attr.endswith('_decode')]
check_functions = [attr for attr in all_attributes if attr.endswith('_check')]
version_functions = [attr for attr in all_attributes if attr.endswith('_version')]

print(f"Encode functions: {len(encode_functions)}")
print(f"Decode functions: {len(decode_functions)}")
print(f"Check functions: {len(check_functions)}")
print(f"Version functions: {len(version_functions)}")

# Find codec constants classes
codec_constants = [attr for attr in all_attributes
                   if attr.isupper() and not attr.startswith('_')]
print(f"Codec constants: {len(codec_constants)}")

# Test lazy loading
print("\n=== Testing Lazy Loading ===")
try:
    # This will trigger lazy loading if not already loaded
    jpeg_encode = imagecodecs.jpeg_encode
    print("JPEG codec loaded successfully")
except imagecodecs.DelayedImportError as e:
    print(f"JPEG codec not available: {e}")

# Get codec and extension mappings. These are private names; depending on
# the installed version they may be callables rather than plain containers,
# so call them when needed before taking the length.
for label, name in [("Internal codec registry", "_codecs"),
                    ("File extension mappings", "_extensions")]:
    registry = getattr(imagecodecs, name, None)
    if registry is None:
        continue
    entries = registry() if callable(registry) else registry
    print(f"{label}: {len(entries)} entries")
```

### Data Integrity Verification

```python
import imagecodecs
import numpy as np

# Generate test data
test_data = np.random.randint(0, 256, 10000, dtype=np.uint8).tobytes()

print("=== Checksum Verification ===")

# Calculate various checksums
fletcher32 = imagecodecs.h5checksum_fletcher32(test_data)
lookup3 = imagecodecs.h5checksum_lookup3(test_data)
crc = imagecodecs.h5checksum_crc(test_data)

print(f"Fletcher-32: 0x{fletcher32:08x}")
print(f"Jenkins lookup3: 0x{lookup3:08x}")
print(f"CRC: 0x{crc:08x}")

# Incremental checksum calculation: each chunk's result is passed in as the
# next call's initial value. Whether the chained result reproduces the
# one-shot checksum depends on the algorithm, so the comparison below
# reports the outcome instead of assuming a match.
chunk_size = 1000
fletcher32_inc = 0
lookup3_inc = 0
crc_inc = 0

for i in range(0, len(test_data), chunk_size):
    chunk = test_data[i:i+chunk_size]
    fletcher32_inc = imagecodecs.h5checksum_fletcher32(chunk, fletcher32_inc)
    lookup3_inc = imagecodecs.h5checksum_lookup3(chunk, lookup3_inc)
    crc_inc = imagecodecs.h5checksum_crc(chunk, crc_inc)

print("\nIncremental checksums:")
print(f"Fletcher-32: 0x{fletcher32_inc:08x} {'✓' if fletcher32_inc == fletcher32 else '✗'}")
print(f"Jenkins lookup3: 0x{lookup3_inc:08x} {'✓' if lookup3_inc == lookup3 else '✗'}")
print(f"CRC: 0x{crc_inc:08x} {'✓' if crc_inc == crc else '✗'}")

# Verify data integrity after compression/decompression
compressed = imagecodecs.zlib_encode(test_data)
decompressed = imagecodecs.zlib_decode(compressed)

decompressed_fletcher32 = imagecodecs.h5checksum_fletcher32(decompressed)
integrity_check = decompressed_fletcher32 == fletcher32
print(f"\nData integrity after compression: {'✓ PASS' if integrity_check else '✗ FAIL'}")
```

### Format Detection

```python
import imagecodecs
import numpy as np

# Create sample data in different formats
test_image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)

# Encode in various formats
formats = {
    'JPEG': imagecodecs.jpeg_encode(test_image, level=85),
    'PNG': imagecodecs.png_encode(test_image),
    'WebP': imagecodecs.webp_encode(test_image) if imagecodecs.WEBP.available else None,
    'ZLIB': imagecodecs.zlib_encode(test_image.tobytes()),
    'GZIP': imagecodecs.gzip_encode(test_image.tobytes()),
}

# Check functions to try on every sample (built once, outside the loop)
check_functions = [
    ('jpeg_check', imagecodecs.jpeg_check),
    ('png_check', imagecodecs.png_check),
    ('webp_check', imagecodecs.webp_check if imagecodecs.WEBP.available else None),
    ('zlib_check', imagecodecs.zlib_check),
    ('gzip_check', imagecodecs.gzip_check),
]

print("=== Format Detection ===")
for format_name, data in formats.items():
    if data is None:
        print(f"{format_name}: Not available")
        continue

    # Test format detection; only a truthy result counts as a match
    # (None means the codec cannot tell)
    detected_formats = []
    for check_name, check_func in check_functions:
        if check_func is None:
            continue
        try:
            if check_func(data):
                detected_formats.append(check_name.replace('_check', '').upper())
        except Exception:
            continue  # treat a failing check as "not detected"

    print(f"{format_name}: Detected as {detected_formats if detected_formats else 'Unknown'}")
```

### Error Handling and Fallbacks

```python
import imagecodecs
import numpy as np

def safe_encode(data, preferred_codecs=('avif', 'webp', 'jpeg'), **kwargs):
    """
    Encode image with fallback to available codecs.

    Returns a tuple of (encoded bytes, name of the codec used).
    """
    for codec_name in preferred_codecs:
        try:
            # Check if codec is available
            codec_class = getattr(imagecodecs, codec_name.upper())
            if not codec_class.available:
                continue

            # Try to encode
            encode_func = getattr(imagecodecs, f'{codec_name}_encode')
            encoded = encode_func(data, **kwargs)
            print(f"Encoded with {codec_name.upper()}")
            return encoded, codec_name

        except imagecodecs.DelayedImportError:
            print(f"{codec_name.upper()} not available")
            continue
        except Exception as e:
            print(f"{codec_name.upper()} encoding failed: {e}")
            continue

    # Fallback to always-available codec (raw bytes only; the array's
    # shape and dtype are not stored and must be tracked separately)
    encoded = imagecodecs.zlib_encode(data.tobytes())
    print("Fell back to ZLIB compression")
    return encoded, 'zlib'

# Test with sample image
test_image = np.random.randint(0, 256, (256, 256, 3), dtype=np.uint8)

try:
    encoded_data, used_codec = safe_encode(test_image, level=85)
    print(f"Successfully encoded with {used_codec}")
except Exception as e:
    print(f"All encoding methods failed: {e}")

# Test exception handling
print("\n=== Exception Handling ===")
try:
    # Try to use a non-existent codec
    result = getattr(imagecodecs, 'nonexistent_codec')
except AttributeError as e:
    print(f"AttributeError: {e}")

try:
    # Try to access unavailable codec
    if not imagecodecs.AVIF.available:
        imagecodecs.avif_encode(test_image)
except imagecodecs.DelayedImportError as e:
    print(f"DelayedImportError: {e}")
```

### Codec Performance Benchmarking

```python
import imagecodecs
import numpy as np
import time

# Codecs that operate on raw bytes rather than image arrays
BYTE_CODECS = ('zlib', 'zstd', 'lz4')

def benchmark_codecs(image, codecs_to_test=None):
    """
    Benchmark compression performance of available codecs.
    """
    if codecs_to_test is None:
        codecs_to_test = ['jpeg', 'png', 'webp', 'zlib', 'zstd', 'lz4']

    results = []
    original_size = image.nbytes

    for codec_name in codecs_to_test:
        try:
            # Check availability
            codec_class = getattr(imagecodecs, codec_name.upper())
            if not codec_class.available:
                continue

            encode_func = getattr(imagecodecs, f'{codec_name}_encode')
            decode_func = getattr(imagecodecs, f'{codec_name}_decode')

            # Benchmark encoding (perf_counter is a monotonic, high-resolution timer)
            payload = image.tobytes() if codec_name in BYTE_CODECS else image
            start_time = time.perf_counter()
            encoded = encode_func(payload)
            encode_time = time.perf_counter() - start_time

            # Benchmark decoding
            start_time = time.perf_counter()
            decoded = decode_func(encoded)
            decode_time = time.perf_counter() - start_time

            compressed_size = len(encoded)
            compression_ratio = original_size / compressed_size

            results.append({
                'codec': codec_name.upper(),
                'compressed_size': compressed_size,
                'compression_ratio': compression_ratio,
                'encode_time': encode_time * 1000,  # ms
                'decode_time': decode_time * 1000,  # ms
            })

        except Exception as e:  # includes AttributeError and DelayedImportError
            print(f"Skipping {codec_name}: {e}")
            continue

    return results

# Run benchmark
test_image = np.random.randint(0, 256, (512, 512, 3), dtype=np.uint8)
benchmark_results = benchmark_codecs(test_image)

print("=== Codec Performance Benchmark ===")
print(f"{'Codec':<8} {'Size (KB)':<10} {'Ratio':<8} {'Enc (ms)':<10} {'Dec (ms)':<10}")
print("-" * 60)

for result in sorted(benchmark_results, key=lambda x: x['compression_ratio'], reverse=True):
    print(f"{result['codec']:<8} "
          f"{result['compressed_size']/1024:<10.1f} "
          f"{result['compression_ratio']:<8.1f} "
          f"{result['encode_time']:<10.1f} "
          f"{result['decode_time']:<10.1f}")
```

## Constants and Configuration

### Package Constants

```python { .api }
# Version information
__version__: str  # Package version string

# Internal registries (read-only)
_codecs: dict  # Codec function registry
_extensions: dict  # File extension to codec mappings
_MODULES: dict  # Module loading configuration
_ATTRIBUTES: dict  # Attribute to module mappings
_COMPATIBILITY: dict  # Backward compatibility aliases

# Always-available codecs
NONE: type  # NONE codec constants
NUMPY: type  # NumPy codec constants
```

### Checksum Constants

```python { .api }
class H5CHECKSUM:
    available: bool

    # Checksum algorithm identifiers
    FLETCHER32 = 'fletcher32'
    LOOKUP3 = 'lookup3'
    CRC = 'crc'
```

## Performance Considerations

### Version Information
- `version()` imports every extension module, so the first call can be slow
- Cache the result if version information is needed repeatedly
- Use `version(dict)` for programmatic access to version data (`astype` is positional-only)
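As one way to apply the caching advice, the sketch below memoizes the version mapping with `functools.lru_cache`. `cached_versions` is a hypothetical helper name, and the fallback to an empty dict when the package cannot be imported is an assumption made so the snippet runs anywhere.

```python
import functools
import importlib


@functools.lru_cache(maxsize=None)
def cached_versions(package="imagecodecs"):
    """Return the package's version mapping, computing it at most once.

    Returns an empty dict when the package cannot be imported.
    """
    try:
        module = importlib.import_module(package)
    except ImportError:
        return {}
    # astype is positional-only, so dict is passed without a keyword
    return module.version(dict)


first = cached_versions()
second = cached_versions()  # served from the cache, no repeated imports
print(first is second)
```

The cached dictionary is shared between callers, so code that needs to modify it should work on a copy.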

### Lazy Loading
- Attributes are loaded on first access, which adds a small one-time delay
- Pre-load frequently used codecs at startup if first-call latency matters
- Use `dir(imagecodecs)`, which calls `__dir__()`, to discover available functionality without loading the codec modules
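Pre-loading can be as simple as touching each attribute once during startup. The following is a rough sketch under that assumption; `preload` is a hypothetical helper, and it relies on `DelayedImportError` being a subclass of `ImportError`, as declared in the exception classes above.

```python
import importlib


def preload(names, package="imagecodecs"):
    """Access each named attribute once so lazy imports happen up front.

    Returns a dict mapping each name to True (loaded) or False (unavailable).
    """
    try:
        module = importlib.import_module(package)
    except ImportError:
        # Package itself is missing: nothing can be loaded
        return {name: False for name in names}

    status = {}
    for name in names:
        try:
            getattr(module, name)  # triggers the lazy import, if any
        except (AttributeError, ImportError):
            status[name] = False
        else:
            status[name] = True
    return status


# Warm up the codecs an application expects to use
print(preload(["png_decode", "zlib_encode", "webp_encode"]))
```

Logging the returned mapping at startup also gives an early signal when an expected codec is missing from the environment.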

### Checksum Performance
- The checksum functions accept an initial `value` for incremental calculation
- For incremental checksums, choose chunk sizes large enough that per-call overhead does not dominate
- Fletcher-32 is generally the fastest; CRC provides the best error detection
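To illustrate chunked processing, here is a small sketch that folds an update function over a stream in fixed-size chunks. `checksum_stream` is a hypothetical helper, and `zlib.crc32` from the standard library is used as a stand-in because it is known to chain through its second argument; it is not HDF5-compatible. An `h5checksum_*` function could be passed as `update` instead, but whether it chains the same way should be confirmed against a one-shot call first.

```python
import io
import zlib


def checksum_stream(stream, update, chunk_size=1 << 20, initial=0):
    """Fold update(chunk, value) over a binary stream read in fixed-size chunks."""
    value = initial
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        value = update(chunk, value)
    return value


payload = bytes(range(256)) * 4096  # 1 MiB of sample data
chunked = checksum_stream(io.BytesIO(payload), zlib.crc32, chunk_size=64 * 1024)

# Compare the chunked result with a single call over the whole buffer
print(chunked == zlib.crc32(payload))
```

Reading from a stream keeps memory use bounded by `chunk_size`, which is the main reason to chunk in the first place; the chunk size can then be tuned by timing a few candidates on representative data.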

## Error Handling Patterns

```python
import imagecodecs
import numpy as np

# Sample image used by the snippets below
image = np.zeros((64, 64, 3), dtype=np.uint8)

# Check codec availability before use
if imagecodecs.WEBP.available:
    encoded = imagecodecs.webp_encode(image)
else:
    encoded = imagecodecs.jpeg_encode(image)  # Fallback

# Handle delayed import errors
try:
    result = imagecodecs.avif_encode(image)
except imagecodecs.DelayedImportError:
    result = imagecodecs.jpeg_encode(image)  # Fallback

# Comprehensive error handling
def safe_decode(data, possible_formats=('jpeg', 'png', 'webp')):
    for fmt in possible_formats:
        try:
            check_func = getattr(imagecodecs, f'{fmt}_check')
            if check_func(data):
                decode_func = getattr(imagecodecs, f'{fmt}_decode')
                return decode_func(data)
        except Exception:  # also covers AttributeError and DelayedImportError
            continue
    raise ValueError("Unable to decode data with any available codec")
```