Tessl Tile for pypi/imagecodecs@2025.8.0

# Scientific Data Compression

Specialized codecs for scientific computing, covering floating-point compression, error-bounded lossy compression, and a floating-point predictor filter. They are designed around numerical accuracy, throughput, and the characteristics of scientific data such as smooth fields, rasters, and detector images.

## Capabilities

### ZFP Floating-Point Compression

Compresses floating-point arrays using a configurable precision, bit rate, or error tolerance.

```python { .api }
def zfp_encode(data, *, rate=None, precision=None, tolerance=None, out=None):
    """
    Return ZFP encoded floating-point array.

    Parameters:
    - data: NDArray - Floating-point array to compress (1D-4D, float32/float64)
    - rate: float | None - Target compression rate in bits per value
    - precision: int | None - Number of bit planes to encode (lossless if sufficient)
    - tolerance: float | None - Absolute error tolerance (error-bounded mode)
    - out: bytes | bytearray | None - Pre-allocated output buffer

    Returns:
    bytes | bytearray: ZFP compressed data

    Note: Exactly one of rate, precision, or tolerance must be specified
    """

def zfp_decode(data, shape=None, dtype=None, *, out=None):
    """
    Return decoded ZFP floating-point array.

    Parameters:
    - data: bytes | bytearray | mmap.mmap - ZFP compressed data
    - shape: tuple | None - Output array shape (required)
    - dtype: numpy.dtype | None - Output data type (float32 or float64, required)
    - out: NDArray | None - Pre-allocated output buffer

    Returns:
    NDArray: Decoded floating-point array
    """

def zfp_check(data):
    """
    Check if data is ZFP encoded.

    Parameters:
    - data: bytes | bytearray | mmap.mmap - Data to check

    Returns:
    bool | None: True if ZFP header detected
    """
```
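
As a minimal sketch of the parameter rule in the `zfp_encode` note, the helper below checks that exactly one of `rate`, `precision`, or `tolerance` is set and estimates the payload size in fixed-rate mode. Both `pick_zfp_mode` and `fixed_rate_payload_bytes` are hypothetical names used for illustration, not part of imagecodecs, and the size estimate ignores any header and block padding.

```python
import math


def pick_zfp_mode(rate=None, precision=None, tolerance=None):
    """Return (name, value) for the single parameter that was given.

    Hypothetical helper for illustration; not part of imagecodecs.
    """
    given = {
        name: value
        for name, value in (
            ("rate", rate),
            ("precision", precision),
            ("tolerance", tolerance),
        )
        if value is not None
    }
    if len(given) != 1:
        raise ValueError(
            "specify exactly one of rate, precision, or tolerance; "
            f"got {sorted(given) or 'none'}"
        )
    return next(iter(given.items()))


def fixed_rate_payload_bytes(n_values, rate):
    """Approximate payload size at `rate` bits per value (header ignored)."""
    return math.ceil(n_values * rate / 8)


mode, value = pick_zfp_mode(rate=8.0)
print(mode, value, fixed_rate_payload_bytes(1000, value))
```

Under these assumptions, 1,000 float32 values (4,000 bytes) at 8 bits per value come to roughly 1,000 bytes, a ratio of about 4x.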

### SPERR Scientific Compression

Error-bounded lossy compressor optimized for scientific floating-point data with multiple quality modes.

```python { .api }
def sperr_encode(data, *, mode=None, quality=None, tolerance=None, out=None):
    """
    Return SPERR encoded floating-point data.

    Parameters:
    - data: NDArray - Floating-point data to compress (2D/3D, float32/float64)
    - mode: str | None - Compression mode:
        'rate' = fixed bit rate, 'psnr' = peak signal-to-noise ratio, 'pwe' = point-wise error
    - quality: float | None - Quality parameter for chosen mode:
        For 'rate': bits per pixel (e.g., 1.0-16.0)
        For 'psnr': target PSNR in dB (e.g., 40.0-80.0)
        For 'pwe': maximum point-wise error
    - tolerance: float | None - Alternative way to specify error tolerance
    - out: bytes | bytearray | None - Pre-allocated output buffer

    Returns:
    bytes | bytearray: SPERR compressed data
    """

def sperr_decode(data, *, out=None):
    """
    Return decoded SPERR floating-point data.

    Parameters:
    - data: bytes | bytearray | mmap.mmap - SPERR compressed data
    - out: NDArray | None - Pre-allocated output buffer

    Returns:
    NDArray: Decoded floating-point array
    """

def sperr_check(data):
    """
    Check if data is SPERR encoded.

    Parameters:
    - data: bytes | bytearray | mmap.mmap - Data to check

    Returns:
    bool | None: True if SPERR signature detected
    """
```
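
To make the `'psnr'` and `'pwe'` quality targets concrete, the sketch below computes both metrics for a pair of plain Python sequences. It uses one common PSNR definition in which the peak is the value range of the original data. Whether SPERR normalizes in exactly this way is an assumption, and the function names are illustrative only.

```python
import math


def max_pointwise_error(original, decoded):
    """Largest absolute difference between corresponding values."""
    return max(abs(a - b) for a, b in zip(original, decoded))


def psnr_db(original, decoded):
    """PSNR in dB, using the value range of `original` as the peak."""
    mse = sum((a - b) ** 2 for a, b in zip(original, decoded)) / len(original)
    if mse == 0.0:
        return math.inf  # identical data: no noise
    peak = max(original) - min(original)
    if peak == 0.0:
        raise ValueError("constant input has no value range")
    return 20.0 * math.log10(peak) - 10.0 * math.log10(mse)


original = [0.0, 1.0, 2.0, 3.0]
decoded = [0.1, 0.9, 2.1, 2.9]
print(f"PWE:  {max_pointwise_error(original, decoded):.3f}")
print(f"PSNR: {psnr_db(original, decoded):.2f} dB")
```

A point-wise error target bounds the worst single value, while a PSNR target bounds only the average error, so the same data can satisfy one and not the other.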

### SZ3 Error-Bounded Compression

Prediction-based, error-bounded lossy compressor for scientific floating-point datasets.

```python { .api }
def sz3_encode(data, *, tolerance=None, out=None):
    """
    Return SZ3 encoded floating-point data.

    Parameters:
    - data: NDArray - Floating-point data to compress (float32/float64)
    - tolerance: float | None - Absolute error bound (required)
    - out: bytes | bytearray | None - Pre-allocated output buffer

    Returns:
    bytes | bytearray: SZ3 compressed data
    """

def sz3_decode(data, shape=None, dtype=None, *, out=None):
    """
    Return decoded SZ3 floating-point data.

    Parameters:
    - data: bytes | bytearray | mmap.mmap - SZ3 compressed data
    - shape: tuple | None - Output array shape (required)
    - dtype: numpy.dtype | None - Output data type (required)
    - out: NDArray | None - Pre-allocated output buffer

    Returns:
    NDArray: Decoded floating-point array
    """

def sz3_check(data):
    """
    Check if data is SZ3 encoded.

    Parameters:
    - data: bytes | bytearray | mmap.mmap - Data to check

    Returns:
    bool | None: True if SZ3 signature detected
    """
```
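
The idea behind an absolute error bound can be sketched with uniform quantization: values are mapped to integer bins of width `2 * tolerance`, so each reconstructed value lies within `tolerance` of the original. This is a simplified illustration applied directly to raw values. SZ-family compressors quantize prediction residuals and entropy-code the result, which is not shown here, and the function names are hypothetical.

```python
def quantize(values, tolerance):
    """Map each value to an integer bin of width 2 * tolerance."""
    if tolerance <= 0:
        raise ValueError("tolerance must be positive")
    step = 2.0 * tolerance
    return [round(v / step) for v in values]


def dequantize(codes, tolerance):
    """Reconstruct each value as the centre of its bin."""
    step = 2.0 * tolerance
    return [c * step for c in codes]


values = [0.0, 0.26, -1.234, 3.14159, 15.0]
tolerance = 0.1
codes = quantize(values, tolerance)
restored = dequantize(codes, tolerance)
worst = max(abs(a - b) for a, b in zip(values, restored))
print(codes, f"worst error {worst:.4f}")
```

The integer codes are small and repetitive for smooth data, which is what makes the subsequent lossless stage effective.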

### Floating-Point Predictor

Preprocessing filter that improves compression by removing predictable patterns in floating-point data.

```python { .api }
def floatpred_encode(data, *, axis=-1, dist=1, out=None):
    """
    Return floating-point predictor encoded data.

    Parameters:
    - data: NDArray - Floating-point data to encode (float32/float64)
    - axis: int - Axis along which to apply predictor (default -1)
    - dist: int - Predictor distance (default 1)
    - out: NDArray | None - Pre-allocated output buffer

    Returns:
    NDArray: Predictor encoded data (same shape and dtype as input)
    """

def floatpred_decode(data, *, axis=-1, dist=1, out=None):
    """
    Return floating-point predictor decoded data.

    Parameters:
    - data: NDArray - Predictor encoded data
    - axis: int - Axis along which predictor was applied (default -1)
    - dist: int - Predictor distance used (default 1)
    - out: NDArray | None - Pre-allocated output buffer

    Returns:
    NDArray: Decoded floating-point data
    """

def floatpred_check(data):
    """
    Check if data is floating-point predictor encoded.

    Parameters:
    - data: bytes | bytearray | mmap.mmap | NDArray - Data to check

    Returns:
    None: Always returns None (predictor is a transform, not a format)
    """
```
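
The following pure-Python sketch illustrates the kind of transform a floating-point predictor performs, modeled on the TIFF-style scheme: the bytes of each value are regrouped into byte planes (most significant bytes first), then each byte is replaced by its difference from the previous byte. It assumes a single row of float32 values and a distance of 1. That `floatpred_encode` follows exactly this layout is an assumption, and the function names are hypothetical.

```python
import struct


def floatpred_row_encode(values):
    """Encode one row of float32 values: byte planes, then byte deltas."""
    n = len(values)
    raw = struct.pack(f">{n}f", *values)  # big-endian, 4 bytes per value
    # Group bytes by significance: all first bytes, then all second bytes, ...
    planes = bytearray(raw[4 * i + b] for b in range(4) for i in range(n))
    # Walk backwards so each delta uses the still-unmodified previous byte.
    for i in range(len(planes) - 1, 0, -1):
        planes[i] = (planes[i] - planes[i - 1]) & 0xFF
    return bytes(planes)


def floatpred_row_decode(encoded):
    """Invert floatpred_row_encode and return a list of floats."""
    n = len(encoded) // 4
    planes = bytearray(encoded)
    # Running sum (modulo 256) undoes the byte deltas.
    for i in range(1, len(planes)):
        planes[i] = (planes[i] + planes[i - 1]) & 0xFF
    # Interleave the byte planes back into 4-byte values.
    raw = bytes(planes[b * n + i] for i in range(n) for b in range(4))
    return list(struct.unpack(f">{n}f", raw))


row = [1.0, 1.5, 2.0, 2.5]
encoded = floatpred_row_encode(row)
assert floatpred_row_decode(encoded) == row  # lossless round trip
```

Neighbouring values in smooth data tend to share sign, exponent, and high mantissa bytes, so the encoded stream contains long runs of zeros that a general-purpose compressor handles well.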

### JETRAW Scientific Image Compression

High-performance lossless compression specifically optimized for scientific image data including X-ray, microscopy, and other detector data.

```python { .api }
def jetraw_encode(data, *, identifier=None, out=None):
    """
    Return JETRAW encoded image data.

    Parameters:
    - data: NDArray - Image data to compress (typically uint16 detector data)
    - identifier: str | None - Optional identifier string
    - out: bytes | bytearray | None - Pre-allocated output buffer

    Returns:
    bytes | bytearray: JETRAW compressed data
    """

def jetraw_decode(data, *, out=None):
    """
    Return decoded JETRAW image data.

    Parameters:
    - data: bytes | bytearray | mmap.mmap - JETRAW compressed data
    - out: NDArray | None - Pre-allocated output buffer

    Returns:
    NDArray: Decoded image data
    """

def jetraw_check(data):
    """
    Check if data is JETRAW encoded.

    Parameters:
    - data: bytes | bytearray | mmap.mmap - Data to check

    Returns:
    bool | None: True if JETRAW signature detected
    """
```

### LERC Limited Error Raster Compression

Lossy/lossless compression specifically designed for raster data with configurable error bounds.

```python { .api }
def lerc_encode(data, *, tolerance=None, version=None, out=None):
    """
    Return LERC encoded raster data.

    Parameters:
    - data: NDArray - Raster data to compress (integer or floating-point)
    - tolerance: float | None - Maximum error tolerance (0.0 for lossless)
    - version: int | None - LERC version (2 or 4, default 4)
    - out: bytes | bytearray | None - Pre-allocated output buffer

    Returns:
    bytes | bytearray: LERC compressed data
    """

def lerc_decode(data, *, out=None):
    """
    Return decoded LERC raster data.

    Parameters:
    - data: bytes | bytearray | mmap.mmap - LERC compressed data
    - out: NDArray | None - Pre-allocated output buffer

    Returns:
    NDArray: Decoded raster array
    """

def lerc_check(data):
    """
    Check if data is LERC encoded.

    Parameters:
    - data: bytes | bytearray | mmap.mmap - Data to check

    Returns:
    bool | None: True if LERC signature detected
    """
```

### SZIP Scientific Data Compression

NASA's adaptive entropy encoder designed for scientific datasets, particularly satellite and remote sensing data.

```python { .api }
def szip_encode(data, *, coding=None, pixels_per_block=None, bits_per_pixel=None, out=None):
    """
    Return SZIP encoded scientific data.

    Parameters:
    - data: NDArray - Scientific data to compress (integer types)
    - coding: str | None - Coding method ('ec' for entropy coding, 'nn' for nearest neighbor)
    - pixels_per_block: int | None - Pixels per compression block (8, 16, 32)
    - bits_per_pixel: int | None - Bits per pixel in input data
    - out: bytes | bytearray | None - Pre-allocated output buffer

    Returns:
    bytes | bytearray: SZIP compressed data
    """

def szip_decode(data, *, out=None):
    """
    Return decoded SZIP scientific data.

    Parameters:
    - data: bytes | bytearray | mmap.mmap - SZIP compressed data
    - out: NDArray | None - Pre-allocated output buffer

    Returns:
    NDArray: Decoded scientific data array
    """

def szip_check(data):
    """
    Check if data is SZIP encoded.

    Parameters:
    - data: bytes | bytearray | mmap.mmap - Data to check

    Returns:
    bool | None: True if SZIP signature detected
    """
```
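
SZIP belongs to the Rice-coding family of entropy coders, in which a parameter `k` is adapted per block of samples. The sketch below shows a basic Rice code and a per-block choice of `k`. It is a teaching example under simplifying assumptions (bit strings instead of packed bits, unary part written as ones terminated by a zero, no preprocessing step), not the SZIP bitstream, and all names are hypothetical.

```python
def rice_encode(value, k):
    """Return the Rice code of a non-negative integer as a bit string."""
    quotient, remainder = value >> k, value & ((1 << k) - 1)
    low_bits = format(remainder, f"0{k}b") if k else ""
    return "1" * quotient + "0" + low_bits


def rice_decode(bits, k):
    """Decode a single Rice code produced by rice_encode."""
    quotient = bits.index("0")  # length of the unary prefix
    remainder = int(bits[quotient + 1:quotient + 1 + k], 2) if k else 0
    return (quotient << k) | remainder


def best_k(block, max_k=15):
    """Pick the k that minimises the total code length for one block."""
    return min(
        range(max_k + 1),
        key=lambda k: sum((v >> k) + 1 + k for v in block),
    )


block = [8, 9, 10, 11, 9, 8, 10, 12]
k = best_k(block)
codes = [rice_encode(v, k) for v in block]
assert [rice_decode(c, k) for c in codes] == block
print(k, sum(len(c) for c in codes), "bits")
```

Choosing `k` per block is what makes the scheme adaptive: blocks of small values get short codes, and the block size plays the role that `pixels_per_block` plays in the signature above.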

### PCODEC Numerical Sequence Compression

Lossless codec for numerical sequences, suited to columnar data in analytical workloads.

```python { .api }
def pcodec_encode(data, *, level=None, out=None):
    """
    Return PCODEC encoded columnar data.

    Parameters:
    - data: NDArray - Columnar data to compress
    - level: int | None - Compression level (0-12, default 8)
    - out: bytes | bytearray | None - Pre-allocated output buffer

    Returns:
    bytes | bytearray: PCODEC compressed data
    """

def pcodec_decode(data, *, out=None):
    """
    Return decoded PCODEC columnar data.

    Parameters:
    - data: bytes | bytearray | mmap.mmap - PCODEC compressed data
    - out: NDArray | None - Pre-allocated output buffer

    Returns:
    NDArray: Decoded columnar data array
    """

def pcodec_check(data):
    """
    Check if data is PCODEC encoded.

    Parameters:
    - data: bytes | bytearray | mmap.mmap - Data to check

    Returns:
    bool | None: True if PCODEC signature detected
    """
```

## Usage Examples

### Climate Data Compression

```python
import imagecodecs
import numpy as np

# Simulate climate model output (temperature data)
time_steps, lat, lon = 365, 180, 360
temperature = np.random.normal(15.0, 20.0, (time_steps, lat, lon)).astype(np.float32)

# Error-bounded compression with 0.1°C tolerance
zfp_compressed = imagecodecs.zfp_encode(temperature, tolerance=0.1)
zfp_decoded = imagecodecs.zfp_decode(
    zfp_compressed,
    shape=temperature.shape,
    dtype=temperature.dtype
)

# Verify error bound
max_error = np.max(np.abs(temperature - zfp_decoded))
print(f"Max error: {max_error:.3f}°C (tolerance: 0.1°C)")
print(f"Compression ratio: {temperature.nbytes / len(zfp_compressed):.1f}x")

# Alternative with SPERR
sperr_compressed = imagecodecs.sperr_encode(
    temperature,
    mode='pwe',
    quality=0.1  # 0.1°C point-wise error
)
sperr_decoded = imagecodecs.sperr_decode(sperr_compressed)
```

### Medical Imaging Data

```python
import imagecodecs
import numpy as np

# Simulate 3D medical scan (CT or MRI)
scan = np.random.randint(0, 4096, (256, 256, 128), dtype=np.uint16)

# Lossless compression with LERC
lerc_lossless = imagecodecs.lerc_encode(scan, tolerance=0.0)
lerc_decoded = imagecodecs.lerc_decode(lerc_lossless)
assert np.array_equal(scan, lerc_decoded)

# Near-lossless with small tolerance
lerc_lossy = imagecodecs.lerc_encode(scan, tolerance=1.0)  # 1 HU tolerance
lerc_lossy_decoded = imagecodecs.lerc_decode(lerc_lossy)

print(f"Original size: {scan.nbytes} bytes")
print(f"Lossless LERC: {len(lerc_lossless)} bytes ({len(lerc_lossless)/scan.nbytes:.2%})")
print(f"Lossy LERC: {len(lerc_lossy)} bytes ({len(lerc_lossy)/scan.nbytes:.2%})")
```

### Satellite Data Processing

```python
import imagecodecs
import numpy as np

# Simulate satellite imagery (multispectral)
bands, height, width = 8, 1024, 1024
# The upper bound of randint is exclusive, so 65536 covers the full uint16 range
satellite_data = np.random.randint(0, 65536, (bands, height, width), dtype=np.uint16)

# SZIP compression optimized for remote sensing
compressed_bands = []
for band in satellite_data:
    compressed = imagecodecs.szip_encode(
        band,
        coding='ec',  # Entropy coding
        pixels_per_block=16,
        bits_per_pixel=16
    )
    compressed_bands.append(compressed)

# Calculate total compression
original_size = satellite_data.nbytes
compressed_size = sum(len(band) for band in compressed_bands)
print(f"SZIP compression ratio: {original_size / compressed_size:.1f}x")

# Decode bands
decoded_bands = []
for compressed in compressed_bands:
    decoded = imagecodecs.szip_decode(compressed)
    decoded_bands.append(decoded)

reconstructed = np.stack(decoded_bands)
assert np.array_equal(satellite_data, reconstructed)
```

### Floating-Point Predictor Usage

```python
import imagecodecs
import numpy as np

# Scientific simulation data with smooth gradients
x = np.linspace(0, 10, 1000)
y = np.linspace(0, 10, 1000)
X, Y = np.meshgrid(x, y)
field = np.sin(X) * np.cos(Y) + 0.1 * np.random.random((1000, 1000))
field = field.astype(np.float32)

# Apply floating-point predictor before compression
predicted = imagecodecs.floatpred_encode(field, axis=1)  # Predict along rows

# Compress the predicted data
compressed = imagecodecs.zlib_encode(predicted.tobytes(), level=9)

# Compare with direct compression
direct_compressed = imagecodecs.zlib_encode(field.tobytes(), level=9)

print(f"Direct compression: {len(direct_compressed)} bytes")
print(f"With predictor: {len(compressed)} bytes")
print(f"Improvement: {len(direct_compressed) / len(compressed):.1f}x")

# Decompress and decode
decompressed_bytes = imagecodecs.zlib_decode(compressed)
predicted_restored = np.frombuffer(decompressed_bytes, dtype=np.float32).reshape(field.shape)
field_restored = imagecodecs.floatpred_decode(predicted_restored, axis=1)

# Verify exact reconstruction (lossless)
assert np.array_equal(field, field_restored)
```

### Quality vs Compression Trade-offs

```python
import imagecodecs
import numpy as np

# Generate test scientific dataset
data = np.random.exponential(2.0, (512, 512, 64)).astype(np.float32)

# Test different error tolerances with ZFP
tolerances = [0.001, 0.01, 0.1, 1.0]
for tol in tolerances:
    compressed = imagecodecs.zfp_encode(data, tolerance=tol)
    decoded = imagecodecs.zfp_decode(compressed, shape=data.shape, dtype=data.dtype)

    compression_ratio = data.nbytes / len(compressed)
    max_error = np.max(np.abs(data - decoded))
    mse = np.mean((data - decoded) ** 2)

    print(f"Tolerance {tol:5.3f}: {compression_ratio:5.1f}x compression, "
          f"max error {max_error:.3f}, MSE {mse:.6f}")

# Test different bit rates with ZFP
rates = [1.0, 2.0, 4.0, 8.0]
for rate in rates:
    compressed = imagecodecs.zfp_encode(data, rate=rate)
    decoded = imagecodecs.zfp_decode(compressed, shape=data.shape, dtype=data.dtype)

    actual_rate = len(compressed) * 8 / data.size
    max_error = np.max(np.abs(data - decoded))

    print(f"Target rate {rate:3.1f} bpv: actual {actual_rate:.1f} bpv, "
          f"max error {max_error:.3f}")
```

## Performance Considerations

### Algorithm Selection
- **ZFP**: Best for regularly gridded floating-point arrays (1D-4D); supports fixed-precision, fixed-rate, and error-tolerance modes
- **SPERR**: Targets 2D/3D floating-point fields; supports fixed-rate, PSNR, and point-wise error modes
- **SZ3**: Error-bounded compression with an absolute tolerance; suited to large floating-point datasets
- **LERC**: Designed for raster/GIS data; accepts integer and floating-point input and is lossless at a tolerance of 0.0
- **SZIP**: Lossless entropy coding for integer data; widely used for satellite and remote sensing data

### Optimization Guidelines
- Apply the floating-point predictor before a general-purpose compressor (such as zlib) when the data is smooth
- Choose the error tolerance from the measurement precision of the data
- Consider data characteristics (smooth vs noisy, regular vs irregular) when picking a codec
- Balance compression ratio against reconstruction speed for your use case

### Memory Management
- Pre-allocate output buffers (the `out` parameter) for large datasets
- Process data in chunks in memory-constrained environments
- Use the smallest data type that meets your precision needs (float32 vs float64)
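
The chunking advice above can be sketched as follows. The example uses `zlib` from the Python standard library as a stand-in codec so that it runs without imagecodecs; the function names and the byte-based chunking are illustrative assumptions. With an array codec you would typically split along an array axis instead and keep each chunk's shape for decoding.

```python
import zlib


def encode_in_chunks(buffer, chunk_size, encode=zlib.compress):
    """Yield one encoded block per `chunk_size` bytes of `buffer`."""
    view = memoryview(buffer)  # slicing a memoryview does not copy the data
    for start in range(0, len(view), chunk_size):
        yield encode(view[start:start + chunk_size])


def decode_chunks(blocks, decode=zlib.decompress):
    """Decode blocks in order and join them back into one bytes object."""
    return b"".join(decode(block) for block in blocks)


data = bytes(range(256)) * 100  # 25,600 bytes of sample data
blocks = list(encode_in_chunks(data, chunk_size=4096))
assert decode_chunks(blocks) == data
print(len(blocks), "blocks,", sum(len(b) for b in blocks), "bytes total")
```

Because the encoder is a generator, only one chunk's compressed output needs to be held at a time if blocks are written out as they are produced.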

## Constants and Configuration

### ZFP Constants

```python { .api }
class ZFP:
    available: bool

    class EXEC:
        SERIAL = 0
        OMP = 1     # OpenMP parallel execution
        CUDA = 2    # CUDA GPU execution

    class MODE:
        EXPERT = 0      # Expert mode with custom parameters
        FIXED_RATE = 1  # Fixed bit rate mode
        FIXED_PRECISION = 2  # Fixed precision mode
        FIXED_ACCURACY = 3   # Fixed accuracy/tolerance mode
```

### SPERR Constants

```python { .api }
class SPERR:
    available: bool

    class MODE:
        RATE = 'rate'   # Fixed bit rate
        PSNR = 'psnr'   # Peak signal-to-noise ratio
        PWE = 'pwe'     # Point-wise error bound
```
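
Since each constants class carries an `available` flag, a build can be probed before a codec is used. The sketch below is a hedged example: `codec_availability` is a hypothetical helper, and it assumes that the other codecs in this section expose the same `available` attribute as `ZFP` and `SPERR`. Anything that cannot be imported or queried is reported as unavailable.

```python
import importlib


def codec_availability(names=("ZFP", "SPERR", "SZ3", "LERC", "SZIP", "PCODEC")):
    """Return {codec_name: bool} for the given constants-class names."""
    try:
        imagecodecs = importlib.import_module("imagecodecs")
    except ImportError:
        # Package not installed: nothing is available.
        return {name: False for name in names}

    result = {}
    for name in names:
        try:
            result[name] = bool(getattr(imagecodecs, name).available)
        except Exception:
            # Missing class or attribute is treated as "not available".
            result[name] = False
    return result


for name, ok in codec_availability().items():
    print(f"{name:8s} {'available' if ok else 'missing'}")
```

Checking up front lets an application pick a fallback codec instead of failing at the first encode call.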

## Error Handling

```python { .api }
class ZfpError(Exception):
    """ZFP codec exception."""

class SperrError(Exception):
    """SPERR codec exception."""

class Sz3Error(Exception):
    """SZ3 codec exception."""

class FloatpredError(Exception):
    """Floating-point predictor exception."""

class LercError(Exception):
    """LERC codec exception."""

class SzipError(Exception):
    """SZIP codec exception."""

class PcodecError(Exception):
    """PCODEC codec exception."""
```
