Tessl Tile for pypi/imagecodecs@2025.8.0

# Lossless Compression

General-purpose lossless compression algorithms optimized for different data types and use cases. These codecs provide high-performance compression without data loss, making them ideal for scientific computing, data archival, and scenarios where exact data reconstruction is required.

## Capabilities

### ZLIB/DEFLATE Compression

Industry-standard deflate compression with zlib wrapper, widely compatible and efficient for general-purpose data compression.

```python { .api }
def zlib_encode(data, level=None, *, out=None):
    """
    Return ZLIB encoded data.

    Parameters:
    - data: bytes | bytearray | mmap.mmap - Input data to compress
    - level: int | None - Compression level (0-9, default 6). Higher values = better compression, slower speed
    - out: bytes | bytearray | None - Pre-allocated output buffer

    Returns:
    bytes | bytearray: ZLIB compressed data with header and checksum
    """

def zlib_decode(data, *, out=None):
    """
    Return decoded ZLIB data.

    Parameters:
    - data: bytes | bytearray | mmap.mmap - ZLIB compressed data
    - out: bytes | bytearray | None - Pre-allocated output buffer

    Returns:
    bytes | bytearray: Decompressed data
    """

def zlib_check(data):
    """
    Check if data is ZLIB encoded.

    Parameters:
    - data: bytes | bytearray | mmap.mmap - Data to check

    Returns:
    bool | None: True if ZLIB format detected, None if uncertain
    """

def zlib_crc32(data, value=None):
    """
    Return CRC32 checksum.

    Parameters:
    - data: bytes | bytearray | mmap.mmap - Data to checksum
    - value: int | None - Initial CRC value for incremental calculation

    Returns:
    int: CRC32 checksum value
    """

def zlib_adler32(data, value=None):
    """
    Return Adler-32 checksum.

    Parameters:
    - data: bytes | bytearray | mmap.mmap - Data to checksum
    - value: int | None - Initial Adler-32 value for incremental calculation

    Returns:
    int: Adler-32 checksum value
    """
```
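
The "header and checksum" mentioned for `zlib_encode` can be inspected directly. The sketch below uses Python's standard-library `zlib` module as a stand-in producer, on the assumption that `imagecodecs.zlib_encode` emits the same RFC 1950 layout; the helper name `describe_zlib_stream` is hypothetical and not part of imagecodecs.

```python
import struct
import zlib


def describe_zlib_stream(stream: bytes) -> dict:
    """Split a zlib (RFC 1950) stream into its header fields and trailer."""
    cmf, flg = stream[0], stream[1]
    return {
        "method": cmf & 0x0F,                            # 8 means DEFLATE
        "window_bits": (cmf >> 4) + 8,                   # log2 of the LZ77 window size
        "header_ok": (cmf * 256 + flg) % 31 == 0,        # FCHECK rule from RFC 1950
        "adler32": struct.unpack(">I", stream[-4:])[0],  # big-endian trailer
    }


payload = b"lossless " * 1000
stream = zlib.compress(payload, 6)  # stand-in for imagecodecs.zlib_encode (assumption)
info = describe_zlib_stream(stream)
```

The trailer is the Adler-32 of the uncompressed data, which is why a decoder can verify integrity after inflating.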

### GZIP Compression

GZIP format compression compatible with gzip command-line tool and HTTP compression.

```python { .api }
def gzip_encode(data, level=None, *, out=None):
    """
    Return GZIP encoded data.

    Parameters:
    - data: bytes | bytearray | mmap.mmap - Input data to compress
    - level: int | None - Compression level (0-9, default 6)
    - out: bytes | bytearray | None - Pre-allocated output buffer

    Returns:
    bytes | bytearray: GZIP compressed data with header and trailer
    """

def gzip_decode(data, *, out=None):
    """
    Return decoded GZIP data.

    Parameters:
    - data: bytes | bytearray | mmap.mmap - GZIP compressed data
    - out: bytes | bytearray | None - Pre-allocated output buffer

    Returns:
    bytes | bytearray: Decompressed data
    """

def gzip_check(data):
    """
    Check if data is GZIP encoded.

    Parameters:
    - data: bytes | bytearray | mmap.mmap - Data to check

    Returns:
    bool: True if GZIP magic number detected
    """
```
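
To make the "header and trailer" concrete, here is a minimal sketch that reads the gzip magic number and the 8-byte trailer defined in RFC 1952. It uses the standard-library `gzip` module as a stand-in for `gzip_encode` (an assumption about equivalence, not something this page confirms), and `read_gzip_trailer` is a hypothetical helper.

```python
import gzip
import struct
import zlib


def read_gzip_trailer(stream: bytes) -> tuple[int, int]:
    """Return (crc32, isize) from the last 8 bytes of a single-member gzip stream."""
    crc, isize = struct.unpack("<II", stream[-8:])  # both fields are little-endian
    return crc, isize


payload = b"HTTP body " * 500
stream = gzip.compress(payload, compresslevel=6)  # stand-in for gzip_encode (assumption)

has_magic = stream[:2] == b"\x1f\x8b"  # gzip magic number
crc, isize = read_gzip_trailer(stream)
```

Unlike the zlib wrapper, gzip stores a CRC-32 rather than an Adler-32, plus the uncompressed length modulo 2**32.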

### BLOSC High-Performance Compression

Blocking, shuffling meta-compressor optimized for binary numerical data, with multi-threading and a choice of internal compression algorithms.

```python { .api }
def blosc_encode(data, level=None, *, compressor=None, shuffle=None, typesize=None, blocksize=None, numthreads=None, out=None):
    """
    Return BLOSC encoded data.

    Parameters:
    - data: bytes | bytearray | mmap.mmap - Input data to compress
    - level: int | None - Compression level (0-9, default 5)
    - compressor: str | None - Compression algorithm:
        'blosclz' (default), 'lz4', 'lz4hc', 'snappy', 'zlib', 'zstd'
    - shuffle: int | None - Shuffle filter:
        0 = no shuffle, 1 = byte shuffle, 2 = bit shuffle
    - typesize: int | None - Element size in bytes for shuffle optimization
    - blocksize: int | None - Block size in bytes (default auto-determined)
    - numthreads: int | None - Number of threads for compression
    - out: bytes | bytearray | None - Pre-allocated output buffer

    Returns:
    bytes | bytearray: BLOSC compressed data
    """

def blosc_decode(data, *, numthreads=None, out=None):
    """
    Return decoded BLOSC data.

    Parameters:
    - data: bytes | bytearray | mmap.mmap - BLOSC compressed data
    - numthreads: int | None - Number of threads for decompression
    - out: bytes | bytearray | None - Pre-allocated output buffer

    Returns:
    bytes | bytearray: Decompressed data
    """

def blosc_check(data):
    """
    Check if data is BLOSC encoded.

    Parameters:
    - data: bytes | bytearray | mmap.mmap - Data to check

    Returns:
    None: Always returns None (format detected by attempting decompression)
    """
```
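
The `shuffle` and `typesize` parameters are easier to reason about with a small model of what a byte shuffle does. The following is a pure-Python illustration of the idea, not Blosc's implementation: `byte_shuffle` and `byte_unshuffle` are hypothetical helpers, and standard-library `zlib` stands in for the inner compressor.

```python
import struct
import zlib


def byte_shuffle(data: bytes, typesize: int) -> bytes:
    """Group byte 0 of every element, then byte 1, and so on."""
    if len(data) % typesize:
        raise ValueError("data length must be a multiple of typesize")
    return b"".join(data[i::typesize] for i in range(typesize))


def byte_unshuffle(data: bytes, typesize: int) -> bytes:
    """Invert byte_shuffle, restoring the original element layout."""
    if len(data) % typesize:
        raise ValueError("data length must be a multiple of typesize")
    count = len(data) // typesize
    out = bytearray(len(data))
    for i in range(typesize):
        out[i::typesize] = data[i * count:(i + 1) * count]
    return bytes(out)


values = struct.pack("<1000I", *range(1000))  # 1000 little-endian uint32 values
shuffled = byte_shuffle(values, 4)

plain_size = len(zlib.compress(values, 6))
shuffled_size = len(zlib.compress(shuffled, 6))
```

For slowly varying integers the high-order bytes are nearly constant, so grouping them produces long runs that the inner compressor handles well. This only works when `typesize` matches the real element width.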

### ZSTD (ZStandard) Compression

Modern compression algorithm providing excellent compression ratios with fast decompression speeds.

```python { .api }
def zstd_encode(data, level=None, *, out=None):
    """
    Return ZSTD encoded data.

    Parameters:
    - data: bytes | bytearray | mmap.mmap - Input data to compress
    - level: int | None - Compression level (1-22, default 3).
                         Higher values = better compression, slower speed
    - out: bytes | bytearray | None - Pre-allocated output buffer

    Returns:
    bytes | bytearray: ZSTD compressed data
    """

def zstd_decode(data, *, out=None):
    """
    Return decoded ZSTD data.

    Parameters:
    - data: bytes | bytearray | mmap.mmap - ZSTD compressed data
    - out: bytes | bytearray | None - Pre-allocated output buffer

    Returns:
    bytes | bytearray: Decompressed data
    """

def zstd_check(data):
    """
    Check if data is ZSTD encoded.

    Parameters:
    - data: bytes | bytearray | mmap.mmap - Data to check

    Returns:
    bool | None: True if ZSTD magic number detected
    """
```

### LZ4 Fast Compression

Ultra-fast compression algorithm optimized for speed over compression ratio.

```python { .api }
def lz4_encode(data, level=None, *, out=None):
    """
    Return LZ4 encoded data.

    Parameters:
    - data: bytes | bytearray | mmap.mmap - Input data to compress
    - level: int | None - Compression level (1-12, default 1).
                         Higher values = better compression, slower speed
    - out: bytes | bytearray | None - Pre-allocated output buffer

    Returns:
    bytes | bytearray: LZ4 compressed data
    """

def lz4_decode(data, *, out=None):
    """
    Return decoded LZ4 data.

    Parameters:
    - data: bytes | bytearray | mmap.mmap - LZ4 compressed data
    - out: bytes | bytearray | None - Pre-allocated output buffer (size must be known)

    Returns:
    bytes | bytearray: Decompressed data
    """

def lz4_check(data):
    """
    Check if data is LZ4 encoded.

    Parameters:
    - data: bytes | bytearray | mmap.mmap - Data to check

    Returns:
    bool | None: True if LZ4 magic number detected
    """
```
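
Because `lz4_decode` needs the output size to be known when a buffer is pre-allocated, one common approach is to store the uncompressed length next to the payload. The sketch below shows that pattern with hypothetical helpers `pack_with_size` and `unpack_with_size`. It takes the codec as a callable and uses standard-library `zlib` as a stand-in so it runs without imagecodecs; how the size would be passed to `lz4_decode` is not shown here and would need checking against the installed version.

```python
import struct
import zlib
from typing import Callable

SIZE_HEADER = struct.Struct("<Q")  # 8-byte little-endian uncompressed length


def pack_with_size(encode: Callable[[bytes], bytes], data: bytes) -> bytes:
    """Prefix the encoded payload with the length of the original data."""
    return SIZE_HEADER.pack(len(data)) + encode(data)


def unpack_with_size(decode: Callable[[bytes, int], bytes], blob: bytes) -> bytes:
    """Read the stored length, then decode with the output size known up front."""
    (size,) = SIZE_HEADER.unpack_from(blob)
    data = decode(blob[SIZE_HEADER.size:], size)
    if len(data) != size:
        raise ValueError("decoded length does not match the stored size")
    return data


# Stand-in codec from the standard library (assumption: any one-shot codec fits).
# For zlib the size hint becomes the initial output buffer size.
original = bytes(range(256)) * 64
blob = pack_with_size(zlib.compress, original)
restored = unpack_with_size(
    lambda payload, size: zlib.decompress(payload, 15, size), blob
)
```

Checking the decoded length against the stored one also catches truncated or mismatched payloads early.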

### LZ4F Frame Format

LZ4 compression with frame format that includes metadata and content checksums for safe streaming.

```python { .api }
def lz4f_encode(data, level=None, *, out=None):
    """
    Return LZ4F (LZ4 Frame format) encoded data.

    Parameters:
    - data: bytes | bytearray | mmap.mmap - Input data to compress
    - level: int | None - Compression level (0-12, default 0)
    - out: bytes | bytearray | None - Pre-allocated output buffer

    Returns:
    bytes | bytearray: LZ4F compressed data with frame header and footer
    """

def lz4f_decode(data, *, out=None):
    """
    Return decoded LZ4F data.

    Parameters:
    - data: bytes | bytearray | mmap.mmap - LZ4F compressed data
    - out: bytes | bytearray | None - Pre-allocated output buffer

    Returns:
    bytes | bytearray: Decompressed data
    """

def lz4f_check(data):
    """
    Check if data is LZ4F encoded.

    Parameters:
    - data: bytes | bytearray | mmap.mmap - Data to check

    Returns:
    bool | None: True if LZ4F magic number detected
    """
```

### LZMA/XZ Compression

High compression ratio algorithm used in 7-Zip and XZ utilities.

```python { .api }
def lzma_encode(data, level=None, *, out=None):
    """
    Return LZMA encoded data.

    Parameters:
    - data: bytes | bytearray | mmap.mmap - Input data to compress
    - level: int | None - Compression level (0-9, default 6)
    - out: bytes | bytearray | None - Pre-allocated output buffer

    Returns:
    bytes | bytearray: LZMA compressed data
    """

def lzma_decode(data, *, out=None):
    """
    Return decoded LZMA data.

    Parameters:
    - data: bytes | bytearray | mmap.mmap - LZMA compressed data
    - out: bytes | bytearray | None - Pre-allocated output buffer

    Returns:
    bytes | bytearray: Decompressed data
    """

def lzma_check(data):
    """
    Check if data is LZMA encoded.

    Parameters:
    - data: bytes | bytearray | mmap.mmap - Data to check

    Returns:
    bool | None: True if LZMA signature detected
    """
```
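
Several of the `*_check` functions above are described as detecting a magic number or signature. As a rough sketch of that idea, the table below lists the leading bytes that the gzip, Zstandard, LZ4 frame, and XZ container specifications define. `sniff_format` is a hypothetical helper, the test streams come from the standard library, and whether `lzma_encode` emits the XZ container is an assumption this page does not confirm.

```python
import gzip
import lzma
from typing import Optional

# Leading bytes defined by each container format's specification.
MAGIC_NUMBERS = {
    "gzip": b"\x1f\x8b",
    "zstd": b"\x28\xb5\x2f\xfd",
    "lz4f": b"\x04\x22\x4d\x18",
    "xz": b"\xfd7zXZ\x00",
}


def sniff_format(data: bytes) -> Optional[str]:
    """Return the name of the first matching magic number, or None."""
    for name, magic in MAGIC_NUMBERS.items():
        if data.startswith(magic):
            return name
    return None


xz_stream = lzma.compress(b"archive me" * 100)    # FORMAT_XZ is the stdlib default
gzip_stream = gzip.compress(b"archive me" * 100)
```

A match only says the data starts like that format; it does not prove the rest of the stream is valid, which is one reason a check can reasonably return `None` instead of `False`.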

### BROTLI Compression

Google's compression algorithm optimized for web content and text compression.

```python { .api }
def brotli_encode(data, level=None, *, mode=None, lgwin=None, out=None):
    """
    Return BROTLI encoded data.

    Parameters:
    - data: bytes | bytearray | mmap.mmap - Input data to compress
    - level: int | None - Compression level (0-11, default 6)
    - mode: int | None - Compression mode (0=generic, 1=text, 2=font)
    - lgwin: int | None - Window size (10-24, default 22)
    - out: bytes | bytearray | None - Pre-allocated output buffer

    Returns:
    bytes | bytearray: BROTLI compressed data
    """

def brotli_decode(data, *, out=None):
    """
    Return decoded BROTLI data.

    Parameters:
    - data: bytes | bytearray | mmap.mmap - BROTLI compressed data
    - out: bytes | bytearray | None - Pre-allocated output buffer

    Returns:
    bytes | bytearray: Decompressed data
    """

def brotli_check(data):
    """
    Check if data is BROTLI encoded.

    Parameters:
    - data: bytes | bytearray | mmap.mmap - Data to check

    Returns:
    None: Always returns None (no reliable magic number)
    """
```

### SNAPPY Compression

Fast compression algorithm developed by Google for high-speed compression/decompression.

```python { .api }
def snappy_encode(data, *, out=None):
    """
    Return SNAPPY encoded data.

    Parameters:
    - data: bytes | bytearray | mmap.mmap - Input data to compress
    - out: bytes | bytearray | None - Pre-allocated output buffer

    Returns:
    bytes | bytearray: SNAPPY compressed data
    """

def snappy_decode(data, *, out=None):
    """
    Return decoded SNAPPY data.

    Parameters:
    - data: bytes | bytearray | mmap.mmap - SNAPPY compressed data
    - out: bytes | bytearray | None - Pre-allocated output buffer

    Returns:
    bytes | bytearray: Decompressed data
    """

def snappy_check(data):
    """
    Check if data is SNAPPY encoded.

    Parameters:
    - data: bytes | bytearray | mmap.mmap - Data to check

    Returns:
    None: Always returns None (no magic number)
    """
```
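
For the formats whose `*_check` function always returns `None` (BLOSC, BROTLI, and SNAPPY above), trial decoding is the remaining way to identify a payload. A minimal sketch of that fallback follows. `detect_by_decoding` is a hypothetical helper, and the decoders in the demo are standard-library stand-ins; in practice one would presumably pass functions such as `imagecodecs.snappy_decode` instead.

```python
import gzip
import lzma
import zlib
from typing import Callable, Optional


def detect_by_decoding(
    data: bytes, decoders: dict[str, Callable[[bytes], bytes]]
) -> Optional[str]:
    """Return the name of the first decoder that accepts the data, else None."""
    for name, decode in decoders.items():
        try:
            decode(data)
        except Exception:  # codecs raise different error types on bad input
            continue
        return name
    return None


# Standard-library stand-ins so the sketch runs without imagecodecs.
decoders = {
    "zlib": zlib.decompress,
    "gzip": gzip.decompress,
    "lzma": lzma.decompress,
}
sample = lzma.compress(b"no magic number needed")
detected = detect_by_decoding(sample, decoders)
```

Trial decoding costs a full decompression per attempt, so it is worth ordering the candidates from most to least likely and checking magic numbers first where they exist.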

## Usage Patterns

### Basic Compression

```python
import imagecodecs
import numpy as np

# Compress array data.
# Note: uniformly random bytes are close to incompressible, so expect
# every ratio below to be near (or slightly above) 100%.
data = np.random.randint(0, 256, 10000, dtype=np.uint8).tobytes()

# Try different algorithms
zlib_compressed = imagecodecs.zlib_encode(data, level=9)
zstd_compressed = imagecodecs.zstd_encode(data, level=3)
lz4_compressed = imagecodecs.lz4_encode(data, level=1)

print(f"Original size: {len(data)}")
print(f"ZLIB size: {len(zlib_compressed)} ({len(zlib_compressed)/len(data):.2%})")
print(f"ZSTD size: {len(zstd_compressed)} ({len(zstd_compressed)/len(data):.2%})")
print(f"LZ4 size: {len(lz4_compressed)} ({len(lz4_compressed)/len(data):.2%})")
```
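
How much any of these codecs can save depends on the redundancy in the input, which matters when reading the ratios printed above. The sketch below contrasts random and repetitive input of the same length. It uses standard-library `zlib` as a stand-in codec, and `compressed_fraction` is a hypothetical helper.

```python
import os
import zlib


def compressed_fraction(data: bytes, level: int = 6) -> float:
    """Return compressed size divided by original size (lower is better)."""
    return len(zlib.compress(data, level)) / len(data)


random_bytes = os.urandom(10_000)      # no redundancy to exploit
repetitive_bytes = b"ACGT" * 2_500     # same length, highly redundant

random_fraction = compressed_fraction(random_bytes)
repetitive_fraction = compressed_fraction(repetitive_bytes)
```

Benchmarking on representative data rather than synthetic random bytes gives a much better picture of which codec and level to pick.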

### High-Performance Scientific Data

```python
import imagecodecs
import numpy as np

# Scientific array compression with BLOSC
data = np.random.random((1000, 1000)).astype(np.float32)
data_bytes = data.tobytes()

# Optimize for floating-point data
compressed = imagecodecs.blosc_encode(
    data_bytes,
    level=5,
    compressor='zstd',
    shuffle=1,  # Byte shuffle for better compression
    typesize=4,  # float32 = 4 bytes
    numthreads=4  # Multi-threaded compression
)

# Decompress with multi-threading
decompressed = imagecodecs.blosc_decode(compressed, numthreads=4)
recovered = np.frombuffer(decompressed, dtype=np.float32).reshape(1000, 1000)

# Lossless round trip: every element must match exactly
assert np.array_equal(data, recovered)
print(f"Compressed size: {len(compressed)/len(data_bytes):.2%} of original")
```

### Stream Processing

```python
import imagecodecs

# Incremental checksum calculation
crc = 0    # CRC-32 starts from 0
adler = 1  # Adler-32 starts from 1

data_chunks = [b"chunk1", b"chunk2", b"chunk3"]
for chunk in data_chunks:
    # Feed the running value back in to continue the checksum
    crc = imagecodecs.zlib_crc32(chunk, crc)
    adler = imagecodecs.zlib_adler32(chunk, adler)

print(f"Final CRC32: {crc:08x}")
print(f"Final Adler32: {adler:08x}")
```
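
A useful sanity check for the incremental pattern is that the running checksum over the chunks equals the checksum of the concatenated data. The sketch below demonstrates this with the standard-library `zlib.crc32` and `zlib.adler32`, which take the same optional running value; that `zlib_crc32` and `zlib_adler32` behave identically is an assumption here.

```python
import zlib

chunks = [b"chunk1", b"chunk2", b"chunk3"]

crc = 0      # CRC-32 starts from 0
adler = 1    # Adler-32 starts from 1
for chunk in chunks:
    crc = zlib.crc32(chunk, crc)
    adler = zlib.adler32(chunk, adler)

# One-shot checksums over the same bytes, for comparison
whole = b"".join(chunks)
crc_whole = zlib.crc32(whole)
adler_whole = zlib.adler32(whole)
```

This property is what makes chunked verification of large files possible without holding the whole file in memory.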

## Constants and Configuration

### ZLIB Constants

```python { .api }
class ZLIB:
    available: bool = True

    class COMPRESSION:
        NO_COMPRESSION = 0
        BEST_SPEED = 1
        BEST_COMPRESSION = 9
        DEFAULT_COMPRESSION = 6

    class STRATEGY:
        DEFAULT_STRATEGY = 0
        FILTERED = 1
        HUFFMAN_ONLY = 2
        RLE = 3
        FIXED = 4
```

### BLOSC Constants

```python { .api }
class BLOSC:
    available: bool

    class SHUFFLE:
        NOSHUFFLE = 0
        SHUFFLE = 1
        BITSHUFFLE = 2

    class COMPRESSOR:
        BLOSCLZ = 'blosclz'
        LZ4 = 'lz4'
        LZ4HC = 'lz4hc'
        SNAPPY = 'snappy'
        ZLIB = 'zlib'
        ZSTD = 'zstd'
```

### ZSTD Constants

```python { .api }
class ZSTD:
    available: bool

    class STRATEGY:
        FAST = 1
        DFAST = 2
        GREEDY = 3
        LAZY = 4
        LAZY2 = 5
        BTLAZY2 = 6
        BTOPT = 7
        BTULTRA = 8
        BTULTRA2 = 9
```

## Performance Guidelines

### Algorithm Selection
- **LZ4**: Fastest compression/decompression, moderate compression ratio
- **SNAPPY**: Very fast, good for real-time applications
- **ZLIB**: Balanced speed and compression, widely compatible
- **ZSTD**: Excellent compression ratio with good speed
- **BLOSC**: Best for numerical/scientific data with shuffle filters
- **BROTLI**: Best for text and web content
- **LZMA**: Highest compression ratio, slower speed

### Optimization Tips
- Use appropriate compression levels (higher = better compression, slower speed)
- Enable shuffle filters for BLOSC with numerical data
- Use multi-threading when available (BLOSC, JPEG XL, AVIF)
- Pre-allocate output buffers to reduce memory allocations
- Choose typesize parameter in BLOSC to match your data element size

### Memory Considerations
- Pre-allocate output buffers when processing large amounts of data
- Use memory-mapped input for very large files
- Consider streaming approaches for data larger than available RAM
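
The encode functions documented on this page take a whole buffer at once, so a streaming approach needs a codec object that keeps state between chunks. As a minimal sketch of that idea, the example below uses the standard-library `zlib.compressobj` rather than imagecodecs; `compress_stream` and its `chunk_size` parameter are hypothetical.

```python
import io
import zlib
from typing import BinaryIO


def compress_stream(
    src: BinaryIO, dst: BinaryIO, chunk_size: int = 64 * 1024, level: int = 6
) -> int:
    """Compress src into dst chunk by chunk; return the number of bytes read."""
    compressor = zlib.compressobj(level)
    total = 0
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break
        total += len(chunk)
        dst.write(compressor.compress(chunk))
    dst.write(compressor.flush())  # emit buffered data and the Adler-32 trailer
    return total


# In-memory streams keep the demo self-contained; real use would pass open files.
source = io.BytesIO(b"sensor reading\n" * 100_000)
target = io.BytesIO()
bytes_read = compress_stream(source, target)
```

Only one chunk plus the compressor's internal window is held in memory at a time, and the result is an ordinary zlib stream that a one-shot decoder can read.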
