Tessl Tile for pypi/imagecodecs@2025.8.0

# Array Processing

Utilities for array transformation, bit manipulation, byte shuffling, and data preparation for compression algorithms. These functions optimize data layout and remove redundancy to improve compression efficiency or prepare data for specific processing requirements.

## Capabilities

### Delta Encoding

Compute differences between adjacent elements to remove trends and improve compressibility.

```python { .api }
def delta_encode(data, *, axis=-1, dist=1, out=None):
    """
    Return delta encoded data.

    Parameters:
    - data: NDArray - Input array to encode (any numeric dtype)
    - axis: int - Axis along which to compute differences (default -1, last axis)
    - dist: int - Distance for delta computation (default 1, adjacent elements)
    - out: NDArray | None - Pre-allocated output buffer (same shape as input)

    Returns:
    NDArray: Delta encoded array (first element unchanged, rest are differences)
    """

def delta_decode(data, *, axis=-1, dist=1, out=None):
    """
    Return delta decoded data.

    Parameters:
    - data: NDArray - Delta encoded array
    - axis: int - Axis along which delta was computed (default -1)
    - dist: int - Distance used for delta computation (default 1)
    - out: NDArray | None - Pre-allocated output buffer

    Returns:
    NDArray: Decoded array (reconstructed from differences)
    """

def delta_check(data):
    """
    Check if data is delta encoded.

    Parameters:
    - data: bytes | bytearray | mmap.mmap | NDArray - Data to check

    Returns:
    None: Always returns None (delta is a transform, not a format)
    """
```
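
To make the transform concrete, here is a minimal pure-Python sketch of delta encoding on a 1-D list. It only illustrates the idea and is not how imagecodecs implements it; `delta_encode_sketch` and `delta_decode_sketch` are hypothetical names chosen for this example, and the handling of `dist` is an assumption based on the parameter description above.

```python
def delta_encode_sketch(values, dist=1):
    """Keep the first `dist` values, then store each value minus the one `dist` back."""
    return [v if i < dist else v - values[i - dist] for i, v in enumerate(values)]


def delta_decode_sketch(deltas, dist=1):
    """Invert delta_encode_sketch by adding back the value decoded `dist` steps earlier."""
    out = []
    for i, d in enumerate(deltas):
        out.append(d if i < dist else d + out[i - dist])
    return out


ramp = [100, 102, 104, 107, 109]
encoded = delta_encode_sketch(ramp)
print(encoded)  # [100, 2, 2, 3, 2]
assert delta_decode_sketch(encoded) == ramp
```

On slowly varying data the encoded values cluster near zero, which is what makes the output easier for a general-purpose compressor to shrink.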

### Bit Shuffling

Reorganize bits to group similar bit positions together, improving compression of typed data.

```python { .api }
def bitshuffle_encode(data, *, itemsize=1, blocksize=0, out=None):
    """
    Return bit-shuffled data.

    Parameters:
    - data: bytes | bytearray | mmap.mmap | NDArray - Input data
    - itemsize: int - Size of data items in bytes (default 1)
        Common values: 1 (uint8), 2 (uint16), 4 (uint32/float32), 8 (uint64/float64)
    - blocksize: int - Block size for shuffling in bytes (default 0 = auto-determine)
    - out: bytes | bytearray | NDArray | None - Pre-allocated output buffer

    Returns:
    bytes | bytearray | NDArray: Bit-shuffled data
    """

def bitshuffle_decode(data, *, itemsize=1, blocksize=0, out=None):
    """
    Return un-bit-shuffled data.

    Parameters:
    - data: bytes | bytearray | mmap.mmap | NDArray - Bit-shuffled data
    - itemsize: int - Size of data items in bytes (must match encoding)
    - blocksize: int - Block size used for shuffling (must match encoding)
    - out: bytes | bytearray | NDArray | None - Pre-allocated output buffer

    Returns:
    bytes | bytearray | NDArray: Reconstructed data
    """

def bitshuffle_check(data):
    """
    Check if data is bit-shuffled.

    Parameters:
    - data: bytes | bytearray | mmap.mmap - Data to check

    Returns:
    bool | None: True if bitshuffle signature detected
    """
```
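
The grouping of bit positions can be pictured as a bit-level transpose. The sketch below is a simplified pure-Python illustration that assumes little-endian items and an input length that is a multiple of `itemsize`; it does not reproduce the block layout of the real bitshuffle codec, and `bit_transpose_sketch` and `bit_untranspose_sketch` are hypothetical names.

```python
import struct


def bit_transpose_sketch(data: bytes, itemsize: int) -> bytes:
    """Emit bit 0 of every element, then bit 1 of every element, and so on."""
    count = len(data) // itemsize
    items = [
        int.from_bytes(data[i * itemsize:(i + 1) * itemsize], "little")
        for i in range(count)
    ]
    out = bytearray(count * itemsize)
    pos = 0  # index of the next output bit
    for bit in range(itemsize * 8):
        for item in items:
            out[pos // 8] |= ((item >> bit) & 1) << (pos % 8)
            pos += 1
    return bytes(out)


def bit_untranspose_sketch(data: bytes, itemsize: int) -> bytes:
    """Invert bit_transpose_sketch."""
    count = len(data) // itemsize
    items = [0] * count
    for pos in range(count * itemsize * 8):
        bit, index = divmod(pos, count)
        items[index] |= ((data[pos // 8] >> (pos % 8)) & 1) << bit
    return b"".join(item.to_bytes(itemsize, "little") for item in items)


# Eight uint16 values that only use their two lowest bits
raw = struct.pack("<8H", 0, 1, 2, 3, 0, 1, 2, 3)
shuffled = bit_transpose_sketch(raw, itemsize=2)
print(shuffled.hex())  # two non-zero bytes (aa cc), then 14 zero bytes
assert bit_untranspose_sketch(shuffled, itemsize=2) == raw
```

Because the unused high-order bits of every element end up next to each other, the output contains long runs of zero bytes that a downstream compressor can remove cheaply.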

### Byte Shuffling

Reorder bytes to group similar byte positions together, useful for multi-byte data types.

```python { .api }
def byteshuffle_encode(data, *, axis=-1, dist=1, delta=False, reorder=False, out=None):
    """
    Return byte-shuffled data.

    Parameters:
    - data: NDArray - Input array to shuffle
    - axis: int - Axis along which to shuffle (default -1)
    - dist: int - Distance for shuffling pattern (default 1)
    - delta: bool - Apply delta encoding before shuffling (default False)
    - reorder: bool - Reorder dimensions for better locality (default False)
    - out: NDArray | None - Pre-allocated output buffer

    Returns:
    NDArray: Byte-shuffled array
    """

def byteshuffle_decode(data, *, axis=-1, dist=1, delta=False, reorder=False, out=None):
    """
    Return un-byte-shuffled data.

    Parameters:
    - data: NDArray - Byte-shuffled array
    - axis: int - Axis along which shuffling was applied (default -1)
    - dist: int - Distance used for shuffling (default 1)
    - delta: bool - Reverse delta encoding after unshuffling (default False)
    - reorder: bool - Reverse dimension reordering (default False)
    - out: NDArray | None - Pre-allocated output buffer

    Returns:
    NDArray: Reconstructed array
    """

def byteshuffle_check(data):
    """
    Check if data is byte-shuffled.

    Parameters:
    - data: bytes | bytearray | mmap.mmap - Data to check

    Returns:
    None: Always returns None (byte shuffle is a transform, not a format)
    """
```
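
As an illustration of what grouping byte positions means, the following pure-Python sketch shuffles a flat byte string of fixed-size items. It is a simplified model, not the imagecodecs implementation (which operates on arrays along an axis); `byte_shuffle_sketch` and `byte_unshuffle_sketch` are hypothetical names.

```python
import struct


def byte_shuffle_sketch(data: bytes, itemsize: int) -> bytes:
    """Emit byte 0 of every element, then byte 1 of every element, and so on."""
    return b"".join(data[k::itemsize] for k in range(itemsize))


def byte_unshuffle_sketch(data: bytes, itemsize: int) -> bytes:
    """Invert byte_shuffle_sketch by interleaving the byte planes again."""
    count = len(data) // itemsize
    out = bytearray(len(data))
    for k in range(itemsize):
        out[k::itemsize] = data[k * count:(k + 1) * count]
    return bytes(out)


# Four little-endian uint16 values that share the same high byte
raw = struct.pack("<4H", 0x0101, 0x0102, 0x0103, 0x0104)
shuffled = byte_shuffle_sketch(raw, itemsize=2)
print(shuffled.hex())  # 0102030401010101
assert byte_unshuffle_sketch(shuffled, itemsize=2) == raw
```

The high bytes, which change slowly in typical numeric data, end up contiguous and compress well.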

### Integer Packing

Pack integer arrays by removing unused high-order bits to reduce storage requirements.

```python { .api }
def packints_encode(data, *, out=None):
    """
    Return packed integer array.

    Parameters:
    - data: NDArray - Integer array to pack (uint8, uint16, uint32, uint64)
    - out: NDArray | None - Pre-allocated output buffer

    Returns:
    NDArray: Packed integer data with reduced bit width
    """

def packints_decode(data, dtype=None, *, out=None):
    """
    Return unpacked integer array.

    Parameters:
    - data: NDArray - Packed integer data
    - dtype: numpy.dtype | None - Target dtype for unpacking (must be provided, despite the None default)
    - out: NDArray | None - Pre-allocated output buffer

    Returns:
    NDArray: Unpacked integer array
    """

def packints_check(data):
    """
    Check if data is packed integers.

    Parameters:
    - data: bytes | bytearray | mmap.mmap - Data to check

    Returns:
    None: Always returns None (packints is a transform, not a format)
    """
```
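
The idea of dropping unused high-order bits can be sketched in pure Python as follows. This is an illustration under stated assumptions, not the imagecodecs code path: values are packed most-significant-bit first, the last byte is zero-padded, and the bit width is passed explicitly. `pack_ints_sketch` and `unpack_ints_sketch` are hypothetical names.

```python
def pack_ints_sketch(values, bits):
    """Pack non-negative integers that each fit in `bits` bits into bytes."""
    acc = 0
    for v in values:
        if v < 0 or v >> bits:
            raise ValueError(f"{v} does not fit in {bits} bits")
        acc = (acc << bits) | v
    total = len(values) * bits
    pad = -total % 8  # zero bits appended to fill the last byte
    return (acc << pad).to_bytes((total + pad) // 8, "big")


def unpack_ints_sketch(data, bits, count):
    """Recover `count` integers of `bits` bits each from packed bytes."""
    acc = int.from_bytes(data, "big")
    total = len(data) * 8
    mask = (1 << bits) - 1
    return [(acc >> (total - (i + 1) * bits)) & mask for i in range(count)]


values = [1, 2, 3, 4, 5, 6, 7, 0]      # every value fits in 3 bits
packed = pack_ints_sketch(values, bits=3)
print(len(values), "->", len(packed))  # 8 -> 3
assert unpack_ints_sketch(packed, bits=3, count=len(values)) == values
```

Stored as uint8 the eight values need 8 bytes; packed at 3 bits each they need 3.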

### PackBits Compression

Simple run-length encoding compression used in TIFF and other formats.

```python { .api }
def packbits_encode(data, *, out=None):
    """
    Return PackBits encoded data.

    Parameters:
    - data: bytes | bytearray | mmap.mmap - Input data to encode
    - out: bytes | bytearray | None - Pre-allocated output buffer

    Returns:
    bytes | bytearray: PackBits encoded data
    """

def packbits_decode(data, *, out=None):
    """
    Return PackBits decoded data.

    Parameters:
    - data: bytes | bytearray | mmap.mmap - PackBits encoded data
    - out: bytes | bytearray | None - Pre-allocated output buffer

    Returns:
    bytes | bytearray: Decoded data
    """

def packbits_check(data):
    """
    Check if data is PackBits encoded.

    Parameters:
    - data: bytes | bytearray | mmap.mmap - Data to check

    Returns:
    None: Always returns None (no reliable magic number)
    """
```
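
For readers unfamiliar with the scheme, here is a small pure-Python sketch of PackBits. A header byte of 0 to 127 means "copy the next header + 1 bytes literally", a header of 129 to 255 means "repeat the next byte 257 - header times", and 128 is a no-op. The greedy encoder below is one valid encoding strategy, not necessarily the one imagecodecs uses; both function names are hypothetical.

```python
def packbits_encode_sketch(data: bytes) -> bytes:
    """Greedy PackBits encoder: runs of 2..128 bytes, literals of 1..128 bytes."""
    out = bytearray()
    i, n = 0, len(data)
    while i < n:
        run = 1
        while i + run < n and run < 128 and data[i + run] == data[i]:
            run += 1
        if run >= 2:
            out += bytes([257 - run, data[i]])  # repeat header, then the byte
            i += run
        else:
            start = i
            i += 1
            # Extend the literal until a run starts or 128 bytes are collected
            while i < n and i - start < 128 and not (i + 1 < n and data[i] == data[i + 1]):
                i += 1
            out.append(i - start - 1)           # literal header
            out += data[start:i]
    return bytes(out)


def packbits_decode_sketch(data: bytes) -> bytes:
    """Decode PackBits data produced by any conforming encoder."""
    out = bytearray()
    i = 0
    while i < len(data):
        header = data[i]
        i += 1
        if header < 128:                        # literal block
            out += data[i:i + header + 1]
            i += header + 1
        elif header > 128:                      # repeated byte
            out += data[i:i + 1] * (257 - header)
            i += 1
        # header == 128 is a no-op
    return bytes(out)


sample = b"AAAAAABCDDDD"
encoded = packbits_encode_sketch(sample)
print(encoded.hex())  # fb4101424344fd44 without the stray 44: see assert below
assert encoded == bytes([0xFB, 0x41, 0x01, 0x42, 0x43, 0xFD, 0x44])
assert packbits_decode_sketch(encoded) == sample
```

The 12 input bytes become 7: a run of six `A`, a two-byte literal `BC`, and a run of four `D`.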

### XOR Encoding

Apply XOR transformation to remove correlation between adjacent values.

```python { .api }
def xor_encode(data, *, out=None):
    """
    Return XOR encoded data.

    Parameters:
    - data: NDArray - Input array to encode (integer types)
    - out: NDArray | None - Pre-allocated output buffer

    Returns:
    NDArray: XOR encoded array
    """

def xor_decode(data, *, out=None):
    """
    Return XOR decoded data.

    Parameters:
    - data: NDArray - XOR encoded array
    - out: NDArray | None - Pre-allocated output buffer

    Returns:
    NDArray: Decoded array
    """

def xor_check(data):
    """
    Check if data is XOR encoded.

    Parameters:
    - data: bytes | bytearray | mmap.mmap - Data to check

    Returns:
    None: Always returns None (XOR is a transform, not a format)
    """
```
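
A minimal pure-Python sketch of the idea, assuming the transform XORs each value with its predecessor (the XOR analogue of delta encoding). The function names are hypothetical and the code is illustrative only.

```python
def xor_encode_sketch(values):
    """Keep the first value, then store each value XOR its predecessor."""
    return [v if i == 0 else v ^ values[i - 1] for i, v in enumerate(values)]


def xor_decode_sketch(codes):
    """Invert xor_encode_sketch by XOR-ing with the previously decoded value."""
    out = []
    for i, c in enumerate(codes):
        out.append(c if i == 0 else c ^ out[i - 1])
    return out


values = [0b1010_0000, 0b1010_0001, 0b1010_0011, 0b1010_0010]
encoded = xor_encode_sketch(values)
print(encoded)  # [160, 1, 2, 1]
assert xor_decode_sketch(encoded) == values
```

Neighbouring values that share most of their bits produce small results with many zero bits, and unlike subtraction the operation cannot overflow or go negative.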

### Bit Order Reversal

Reverse the bit order within bytes for compatibility with different endianness or protocols.

```python { .api }
def bitorder_encode(data, *, out=None):
    """
    Return data with reversed bit-order.

    Parameters:
    - data: bytes | bytearray | mmap.mmap | NDArray - Input data
    - out: bytes | bytearray | NDArray | None - Pre-allocated output buffer

    Returns:
    bytes | bytearray | NDArray: Data with bits reversed in each byte
    """

def bitorder_decode(data, *, out=None):
    """
    Return data with restored bit-order (same as encode).

    Parameters:
    - data: bytes | bytearray | mmap.mmap | NDArray - Bit-reversed data
    - out: bytes | bytearray | NDArray | None - Pre-allocated output buffer

    Returns:
    bytes | bytearray | NDArray: Data with original bit order
    """

def bitorder_check(data):
    """
    Check if data has reversed bit-order.

    Parameters:
    - data: bytes | bytearray | mmap.mmap - Data to check

    Returns:
    None: Always returns None (bit order reversal is a transform)
    """
```
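
Reversing the bits in each byte is its own inverse, which is why decoding is the same operation as encoding. A pure-Python sketch using a 256-entry lookup table shows this; `REVERSE_BITS` and `reverse_bits_sketch` are hypothetical names, and the table approach is one common technique rather than a statement about the library's internals.

```python
# Lookup table: entry b holds b with its eight bits in reverse order
REVERSE_BITS = bytes(int(f"{b:08b}"[::-1], 2) for b in range(256))


def reverse_bits_sketch(data: bytes) -> bytes:
    """Reverse the bit order inside every byte of `data`."""
    return data.translate(REVERSE_BITS)


sample = bytes([0x01, 0x80, 0x0F])
flipped = reverse_bits_sketch(sample)
print(flipped.hex())  # 8001f0
assert reverse_bits_sketch(flipped) == sample  # applying it twice is a no-op
```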

### Quantization

Reduce the precision of floating-point data by quantizing to fewer levels.

```python { .api }
def quantize_encode(data, *, levels=None, out=None):
    """
    Return quantized data.

    Parameters:
    - data: NDArray - Floating-point data to quantize
    - levels: int | None - Number of quantization levels (default 256)
    - out: NDArray | None - Pre-allocated output buffer

    Returns:
    NDArray: Quantized data (typically integer type)
    """

def quantize_decode(data, *, levels=None, out=None):
    """
    Return dequantized data.

    Parameters:
    - data: NDArray - Quantized data
    - levels: int | None - Number of quantization levels used (default 256)
    - out: NDArray | None - Pre-allocated output buffer

    Returns:
    NDArray: Dequantized floating-point data
    """

def quantize_check(data):
    """
    Check if data is quantized.

    Parameters:
    - data: bytes | bytearray | mmap.mmap - Data to check

    Returns:
    None: Always returns None (quantization is a transform)
    """
```
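
One common way to "quantize to fewer levels" is uniform scalar quantization, sketched below in pure Python. This is an assumption made for illustration: the scheme imagecodecs actually applies may differ, and the explicit `lo` and `hi` range arguments and both function names are hypothetical.

```python
def quantize_sketch(values, levels, lo, hi):
    """Map floats in [lo, hi] to integer codes 0..levels-1 (uniform steps)."""
    step = (hi - lo) / (levels - 1)
    return [min(levels - 1, max(0, round((v - lo) / step))) for v in values]


def dequantize_sketch(codes, levels, lo, hi):
    """Map integer codes back to the centre value of each level."""
    step = (hi - lo) / (levels - 1)
    return [lo + c * step for c in codes]


values = [-1.0, -0.3, 0.1, 0.9, 1.0]
codes = quantize_sketch(values, levels=5, lo=-1.0, hi=1.0)
restored = dequantize_sketch(codes, levels=5, lo=-1.0, hi=1.0)
print(codes)     # [0, 1, 2, 4, 4]
print(restored)  # [-1.0, -0.5, 0.0, 1.0, 1.0]

# For in-range inputs the error is bounded by half a step (0.25 here)
assert max(abs(a - b) for a, b in zip(values, restored)) <= 0.25 + 1e-12
```

The bound on the error is what makes the trade-off predictable: doubling `levels` roughly halves the worst-case error at the cost of one extra bit per value.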

## Usage Examples

### Image Data Preprocessing

```python
import imagecodecs
import numpy as np

# Simulate 16-bit sensor data: a smooth horizontal gradient plus small noise
# (purely random data has no trend for delta encoding to remove)
rng = np.random.default_rng(0)
gradient = np.linspace(0, 60000, 1024)
sensor_data = (gradient + rng.integers(0, 16, (1024, 1024))).astype(np.uint16)

# Apply delta encoding to remove gradients
delta_encoded = imagecodecs.delta_encode(sensor_data, axis=1)  # Row-wise differences

# Apply bit shuffling optimized for 16-bit data
bit_shuffled = imagecodecs.bitshuffle_encode(
    delta_encoded,
    itemsize=2,  # 16-bit = 2 bytes
    blocksize=8192  # 8KB blocks
)

# Compress the preprocessed data
compressed = imagecodecs.zlib_encode(bit_shuffled.tobytes(), level=9)

# Compare with direct compression
direct_compressed = imagecodecs.zlib_encode(sensor_data.tobytes(), level=9)

print(f"Original size: {sensor_data.nbytes} bytes")
print(f"Direct compression: {len(direct_compressed)} bytes ({len(direct_compressed)/sensor_data.nbytes:.2%})")
print(f"Preprocessed compression: {len(compressed)} bytes ({len(compressed)/sensor_data.nbytes:.2%})")
print(f"Improvement: {len(direct_compressed) / len(compressed):.1f}x")

# Decompress and decode (reverse the steps in the opposite order)
decompressed_bytes = imagecodecs.zlib_decode(compressed)
decompressed_array = np.frombuffer(decompressed_bytes, dtype=np.uint16).reshape(sensor_data.shape)
bit_unshuffled = imagecodecs.bitshuffle_decode(decompressed_array, itemsize=2, blocksize=8192)
reconstructed = imagecodecs.delta_decode(bit_unshuffled, axis=1)

assert np.array_equal(sensor_data, reconstructed)
```

### Scientific Data Optimization

```python
import imagecodecs
import numpy as np

# Simulate time-series scientific measurements
time_points, sensors = 10000, 128
measurements = np.cumsum(np.random.normal(0, 0.1, (time_points, sensors)), axis=0).astype(np.float32)

# Apply floating-point predictor along time axis
predicted = imagecodecs.floatpred_encode(measurements, axis=0)

# Apply byte shuffling for better compression
shuffled = imagecodecs.byteshuffle_encode(predicted, axis=1, delta=False)

# Compress with high-performance algorithm
compressed = imagecodecs.blosc_encode(
    shuffled.tobytes(),
    level=5,
    compressor='zstd',
    shuffle=1,  # Additional byte shuffle at BLOSC level
    typesize=4,  # float32 = 4 bytes
    numthreads=4
)

print(f"Original: {measurements.nbytes} bytes")
print(f"Compressed: {len(compressed)} bytes ({len(compressed)/measurements.nbytes:.2%})")

# Decompress and reconstruct
decompressed_bytes = imagecodecs.blosc_decode(compressed, numthreads=4)
decompressed_array = np.frombuffer(decompressed_bytes, dtype=np.float32).reshape(measurements.shape)
unshuffled = imagecodecs.byteshuffle_decode(decompressed_array, axis=1, delta=False)
reconstructed = imagecodecs.floatpred_decode(unshuffled, axis=0)

# Verify reconstruction (within float32 tolerance)
assert np.allclose(measurements, reconstructed, rtol=1e-7, atol=1e-7)
```

### Integer Data Optimization

```python
import imagecodecs
import numpy as np

# Simulate sparse integer data (many small values)
data = np.random.choice([0, 1, 2, 3, 4, 255, 65535], size=(1000, 1000),
                        p=[0.4, 0.2, 0.15, 0.1, 0.1, 0.04, 0.01]).astype(np.uint16)

# Pack integers to remove unused high bits
packed = imagecodecs.packints_encode(data)
print(f"Original dtype: {data.dtype}, packed dtype: {packed.dtype}")

# Apply XOR encoding to remove correlation
xor_encoded = imagecodecs.xor_encode(packed)

# Apply run-length encoding for sparse data
packbits_compressed = imagecodecs.packbits_encode(xor_encoded.tobytes())

print(f"Original: {data.nbytes} bytes")
print(f"After packing: {packed.nbytes} bytes")
print(f"After PackBits: {len(packbits_compressed)} bytes")
print(f"Total compression: {data.nbytes / len(packbits_compressed):.1f}x")

# Reconstruct
packbits_decompressed = imagecodecs.packbits_decode(packbits_compressed)
packed_array = np.frombuffer(packbits_decompressed, dtype=packed.dtype).reshape(packed.shape)
xor_decoded = imagecodecs.xor_decode(packed_array)
unpacked = imagecodecs.packints_decode(xor_decoded, dtype=data.dtype)

assert np.array_equal(data, unpacked)
```

### Multi-dimensional Data Processing

```python
import imagecodecs
import numpy as np

# 3D medical or scientific dataset
depth, height, width = 64, 512, 512
volume = np.random.randint(0, 4096, (depth, height, width), dtype=np.uint16)

# Apply delta encoding along different axes
z_delta = imagecodecs.delta_encode(volume, axis=0)  # Slice-to-slice differences
xy_delta = imagecodecs.delta_encode(z_delta, axis=2)  # Column differences

# Byte shuffle optimized for 3D data
shuffled = imagecodecs.byteshuffle_encode(xy_delta, axis=1, reorder=True)

# Compress with LZMA (slower, but usually a high compression ratio)
compressed = imagecodecs.lzma_encode(shuffled.tobytes(), level=6)

print(f"3D volume: {volume.shape}")
print(f"Original: {volume.nbytes} bytes")
print(f"Compressed: {len(compressed)} bytes ({len(compressed)/volume.nbytes:.2%})")

# Reconstruct (undo each transform in reverse order)
decompressed_bytes = imagecodecs.lzma_decode(compressed)
decompressed_array = np.frombuffer(decompressed_bytes, dtype=volume.dtype).reshape(volume.shape)
unshuffled = imagecodecs.byteshuffle_decode(decompressed_array, axis=1, reorder=True)
xy_reconstructed = imagecodecs.delta_decode(unshuffled, axis=2)
z_reconstructed = imagecodecs.delta_decode(xy_reconstructed, axis=0)

assert np.array_equal(volume, z_reconstructed)
```

### Quantization for Lossy Compression

```python
import imagecodecs
import numpy as np

# High-precision floating-point data
data = np.random.normal(0, 1, (256, 256)).astype(np.float64)

# Quantize to reduce precision
quantized = imagecodecs.quantize_encode(data, levels=1024)  # 10-bit quantization
print(f"Original dtype: {data.dtype}, quantized dtype: {quantized.dtype}")

# Compress quantized data (integers compress better)
compressed = imagecodecs.zlib_encode(quantized.tobytes(), level=9)

# Compare with direct float compression
direct_compressed = imagecodecs.zlib_encode(data.tobytes(), level=9)

print(f"Original: {data.nbytes} bytes")
print(f"Direct compression: {len(direct_compressed)} bytes")
print(f"Quantized compression: {len(compressed)} bytes")
print(f"Improvement: {len(direct_compressed) / len(compressed):.1f}x")

# Reconstruct (lossy)
decompressed_bytes = imagecodecs.zlib_decode(compressed)
quantized_restored = np.frombuffer(decompressed_bytes, dtype=quantized.dtype).reshape(data.shape)
dequantized = imagecodecs.quantize_decode(quantized_restored, levels=1024)

# Measure quantization error
max_error = np.max(np.abs(data - dequantized))
mse = np.mean((data - dequantized) ** 2)
print(f"Max quantization error: {max_error:.6f}")
print(f"MSE: {mse:.6f}")
```

## Performance Considerations

### Transform Selection
- **Delta encoding**: Best for data with trends or gradients
- **Bit shuffling**: Optimal for typed numerical data before compression
- **Byte shuffling**: Good for multi-byte data types and multi-dimensional arrays
- **PackBits**: Effective for sparse data with runs of identical values
- **XOR encoding**: Removes correlation between adjacent integer values
- **Quantization**: Trade precision for compression ratio

### Optimization Guidelines
- Chain transforms for maximum benefit (e.g., delta → shuffle → compress), and undo them in reverse order when decoding
- Match the `itemsize` parameter to your data type for bit shuffling (e.g., 2 for uint16, 4 for float32)
- Choose the delta encoding axis along which neighbouring values are most similar
- Consider data distribution when choosing quantization levels
- Pre-allocate output buffers (the `out` parameter) for large datasets
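
The chaining guideline can be demonstrated with the standard library alone. The sketch below applies a hand-written delta step and byte shuffle before `zlib`, then undoes them in reverse order. It stands in for the imagecodecs calls rather than reproducing them, and the ramp input is deliberately the ideal case for delta encoding.

```python
import itertools
import struct
import zlib

values = list(range(10_000))  # a pure ramp of uint16 values
raw = struct.pack(f"<{len(values)}H", *values)

# Delta: keep the first value, then store successive differences (all 1 here)
deltas = [values[0]] + [b - a for a, b in zip(values, values[1:])]
delta_bytes = struct.pack(f"<{len(deltas)}H", *deltas)

# Byte shuffle: all low bytes first, then all high bytes
shuffled = delta_bytes[0::2] + delta_bytes[1::2]

direct = zlib.compress(raw, 9)
chained = zlib.compress(shuffled, 9)
print(f"raw={len(raw)} direct={len(direct)} chained={len(chained)}")

# Decode in reverse order: decompress, unshuffle, then undo the delta
restored = zlib.decompress(chained)
half = len(restored) // 2
unshuffled = bytearray(len(restored))
unshuffled[0::2] = restored[:half]
unshuffled[1::2] = restored[half:]
decoded_deltas = struct.unpack(f"<{len(values)}H", bytes(unshuffled))
decoded = list(itertools.accumulate(decoded_deltas))

assert decoded == values
assert len(chained) < len(direct)
```

Real data is rarely this regular, so measure the chained and direct sizes on a representative sample before committing to a pipeline.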
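
### Memory Management
- Each transform returns a new buffer unless a pre-allocated `out` buffer is supplied, so pass `out` to avoid repeated allocations
- Use appropriate block sizes for bit shuffling with large datasets
- Every stage in a chain of transforms holds its own copy of the data; release intermediate results you no longer need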

## Constants and Configuration

### Bit Shuffle Constants

```python { .api }
class BITSHUFFLE:
    available: bool

    # Common item sizes
    ITEMSIZE_UINT8 = 1
    ITEMSIZE_UINT16 = 2
    ITEMSIZE_UINT32 = 4
    ITEMSIZE_UINT64 = 8
    ITEMSIZE_FLOAT32 = 4
    ITEMSIZE_FLOAT64 = 8
```

### Delta Encoding Constants

```python { .api }
class DELTA:
    available: bool = True  # Pure Python implementation always available

    # Common distance values
    DISTANCE_ADJACENT = 1    # Adjacent elements
    DISTANCE_ROW = None      # Width of 2D array (context-dependent)
    DISTANCE_PLANE = None    # Area of 2D slice in 3D array
```

## Error Handling

Most array processing functions raise the base `ImcdError` exception class; bit shuffling is the exception and uses its own error type:

```python { .api }
class ImcdError(Exception):
    """Base IMCD codec exception."""

# Specific aliases for array processing
DeltaError = ImcdError
BitshuffleError = Exception  # Uses standard bitshuffle exceptions
ByteshuffleError = ImcdError
PackintsError = ImcdError
PackbitsError = ImcdError
XorError = ImcdError
BitorderError = ImcdError
QuantizeError = ImcdError
```
