
# Blosc

Python bindings for Blosc, a high-performance compression library. It is optimized for binary and numerical data, supports multiple compressors (blosclz, lz4, lz4hc, snappy, zlib, zstd), and provides configurable shuffle filters that work well on time series, sparse data, and regularly spaced numerical arrays.

## Package Information

- **Package Name**: blosc
- **Language**: Python
- **Installation**: `pip install blosc`
- **Supported Python**: 3.9+

## Core Imports

```python
import blosc
```

## Basic Usage

```python
import array

import blosc

# Basic compression and decompression
data = b'0123456789' * 1000
compressed = blosc.compress(data, typesize=1)
decompressed = blosc.decompress(compressed)
assert decompressed == data

# Working with numerical arrays
a = array.array('i', range(1000000))
a_bytes = a.tobytes()
# typesize should match the item size of the data ('i' is usually 4 bytes)
compressed_array = blosc.compress(a_bytes, typesize=a.itemsize, cname='lz4')
decompressed_array = blosc.decompress(compressed_array)

# Configuration
blosc.set_nthreads(4)   # Use 4 threads
blosc.set_blocksize(0)  # Automatic blocksize

# Get compression information
nbytes, cbytes, blocksize = blosc.get_cbuffer_sizes(compressed)
clib = blosc.get_clib(compressed)
```

## Capabilities

### Core Compression Functions

Primary compression and decompression operations on bytes-like objects, with configurable compression parameters.

```python { .api }
def compress(bytesobj, typesize=8, clevel=9, shuffle=blosc.SHUFFLE, cname='blosclz'):
    """
    Compress bytesobj with specified parameters.

    Parameters:
    - bytesobj: bytes-like object supporting buffer interface
    - typesize: int, data type size (1-255)
    - clevel: int, compression level 0-9 (0=no compression, 9=max)
    - shuffle: int, shuffle filter (NOSHUFFLE, SHUFFLE, BITSHUFFLE)
    - cname: str, compressor name ('blosclz', 'lz4', 'lz4hc', 'snappy', 'zlib', 'zstd')

    Returns:
    bytes: Compressed data

    Raises:
    TypeError: If bytesobj doesn't support buffer interface
    ValueError: If parameters out of range or cname invalid
    """

def decompress(bytes_like, as_bytearray=False):
    """
    Decompress bytes-like compressed object.

    Parameters:
    - bytes_like: bytes-like object with compressed data
    - as_bytearray: bool, return bytearray instead of bytes

    Returns:
    bytes or bytearray: Decompressed data

    Raises:
    TypeError: If bytes_like doesn't support buffer protocol
    """
```

### Memory Pointer Functions

Low-level compression and decompression using memory addresses, for integration with NumPy arrays and ctypes.

```python { .api }
def compress_ptr(address, items, typesize=8, clevel=9, shuffle=blosc.SHUFFLE, cname='blosclz'):
    """
    Compress data at memory address.

    Parameters:
    - address: int, memory pointer to data
    - items: int, number of items of typesize to compress
    - typesize: int, size of each data item
    - clevel: int, compression level 0-9
    - shuffle: int, shuffle filter
    - cname: str, compressor name

    Returns:
    bytes: Compressed data

    Raises:
    TypeError: If address is not an int
    ValueError: If items is negative or total size exceeds limits
    """

def decompress_ptr(bytes_like, address):
    """
    Decompress data directly into memory address.

    Parameters:
    - bytes_like: bytes-like object with compressed data
    - address: int, memory pointer where decompressed data is written

    Returns:
    int: Number of bytes written

    Raises:
    TypeError: If address is not an int or bytes_like is invalid
    """
```

### NumPy Array Functions

High-level functions for compressing and decompressing NumPy arrays using pickle serialization.

```python { .api }
def pack_array(array, clevel=9, shuffle=blosc.SHUFFLE, cname='blosclz'):
    """
    Pack (compress) a NumPy array.

    Parameters:
    - array: ndarray, NumPy array to compress
    - clevel: int, compression level 0-9
    - shuffle: int, shuffle filter
    - cname: str, compressor name

    Returns:
    bytes: Packed array data

    Raises:
    TypeError: If array doesn't have dtype and shape attributes
    ValueError: If array size exceeds limits or parameters are invalid
    """

def unpack_array(packed_array, **kwargs):
    """
    Unpack (decompress) a packed NumPy array.

    Parameters:
    - packed_array: bytes, packed array data
    - **kwargs: Additional parameters for pickle.loads

    Returns:
    ndarray: Decompressed NumPy array

    Raises:
    TypeError: If packed_array is not bytes
    """
```

### Buffer Information Functions

Functions to inspect compressed buffer properties and validate compressed data.

```python { .api }
def get_cbuffer_sizes(bytesobj):
    """
    Get information about compressed buffer.

    Parameters:
    - bytesobj: bytes, compressed buffer

    Returns:
    tuple: (uncompressed_bytes, compressed_bytes, blocksize)
    """

def cbuffer_validate(bytesobj):
    """
    Validate compressed buffer safety.

    Parameters:
    - bytesobj: bytes, compressed buffer to validate

    Returns:
    bool: True if buffer is safe to decompress
    """

def get_clib(bytesobj):
    """
    Get compression library name from compressed buffer.

    Parameters:
    - bytesobj: bytes, compressed buffer

    Returns:
    str: Name of compression library used
    """
```

### Configuration Functions

Functions to configure Blosc behavior, including threading and block sizes.

```python { .api }
def set_nthreads(nthreads):
    """
    Set number of threads for Blosc operations.

    Parameters:
    - nthreads: int, number of threads (1 to MAX_THREADS)

    Returns:
    int: Previous number of threads

    Raises:
    ValueError: If nthreads exceeds MAX_THREADS
    """

def set_blocksize(blocksize):
    """
    Force specific blocksize (0 for automatic).

    Parameters:
    - blocksize: int, blocksize in bytes (0 for automatic)
    """

def get_blocksize():
    """
    Get current blocksize setting.

    Returns:
    int: Current blocksize (0 means automatic)
    """

def set_releasegil(gilstate):
    """
    Set whether to release Python GIL during operations.

    Parameters:
    - gilstate: bool, True to release GIL during compression/decompression

    Returns:
    bool: Previous GIL release state
    """
```

### Utility Functions

System detection, resource management, and version information functions.

```python { .api }
def detect_number_of_cores():
    """
    Detect number of CPU cores in system.

    Returns:
    int: Number of cores detected
    """

def free_resources():
    """
    Free memory temporaries and thread resources.

    Returns:
    None
    """

def print_versions():
    """
    Print versions of blosc and all dependencies.

    Returns:
    None
    """
```

### Compressor Information Functions

Functions to query available compressors and their properties.

```python { .api }
def compressor_list():
    """
    Get list of available compressors.

    Returns:
    list: List of compressor names
    """

def code_to_name(code):
    """
    Convert compressor code to name.

    Parameters:
    - code: int, compressor code

    Returns:
    str: Compressor name
    """

def name_to_code(name):
    """
    Convert compressor name to code.

    Parameters:
    - name: str, compressor name

    Returns:
    int: Compressor code
    """

def clib_info(cname):
    """
    Get compression library information.

    Parameters:
    - cname: str, compressor name

    Returns:
    tuple: (library_name, version)
    """
```

### Testing Function

```python { .api }
def test():
    """
    Run blosc test suite.

    Returns:
    None
    """
```

### Low-Level Functions

Functions for initializing and cleaning up Blosc resources (called automatically):

```python { .api }
def init():
    """
    Initialize Blosc library.

    Returns:
    None

    Note: Called automatically on package import
    """

def destroy():
    """
    Destroy Blosc resources and cleanup.

    Returns:
    None

    Note: Called automatically on program exit
    """
```

## Constants

### Version Information

```python { .api }
__version__: str        # Python package version
VERSION_STRING: str     # Blosc C library version
VERSION_DATE: str       # Blosc C library date
blosclib_version: str   # Combined version string
```

### Size Limits

```python { .api }
MAX_BUFFERSIZE: int  # Maximum buffer size for compression
MAX_THREADS: int     # Maximum number of threads
MAX_TYPESIZE: int    # Maximum type size (255)
```

### Shuffle Filters

```python { .api }
NOSHUFFLE: int   # No shuffle filter (0)
SHUFFLE: int     # Byte shuffle filter (1)
BITSHUFFLE: int  # Bit shuffle filter (2)
```

### Legacy Constants

Backward compatibility constants with the `BLOSC_` prefix:

```python { .api }
BLOSC_VERSION_STRING: str  # Alias for VERSION_STRING
BLOSC_VERSION_DATE: str    # Alias for VERSION_DATE
BLOSC_MAX_BUFFERSIZE: int  # Alias for MAX_BUFFERSIZE
BLOSC_MAX_THREADS: int     # Alias for MAX_THREADS
BLOSC_MAX_TYPESIZE: int    # Alias for MAX_TYPESIZE
```

## Runtime Variables

Current state variables updated by configuration functions:

```python { .api }
nthreads: int        # Current number of threads in use
ncores: int          # Number of cores detected on system
cnames: list         # List of available compressor names
cname2clib: dict     # Map compressor names to libraries
clib_versions: dict  # Map libraries to versions
filters: dict        # Map shuffle constants to string names
```

## Error Handling

Common exceptions raised by blosc functions:

- **TypeError**: Raised when the input does not support the buffer protocol or an address is not an int
- **ValueError**: Raised when parameters are out of valid ranges:
  - `clevel` not in 0-9 range
  - `typesize` not in 1-MAX_TYPESIZE range
  - `cname` not in available compressors
  - `shuffle` not NOSHUFFLE, SHUFFLE, or BITSHUFFLE
  - `nthreads` exceeds MAX_THREADS
  - Buffer size exceeds MAX_BUFFERSIZE

## Performance Notes

- **Shuffle filters**: SHUFFLE works best for integer data, BITSHUFFLE for floating-point data
- **Compressor selection**: 'lz4' for speed, 'zstd' for compression ratio, 'blosclz' for a balance of the two
- **Threading**: The optimal thread count is often slightly below the CPU core count
- **Block size**: Automatic sizing (0) is usually optimal; manual sizing is for expert use
- **GIL release**: Beneficial for large chunks with a ThreadPool, with a small penalty for small blocks
- **Type size**: Should match the actual data type size for optimal shuffle performance
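
To give some intuition for why the shuffle filter and a matching type size matter, here is a pure-Python sketch of a byte shuffle, paired with `zlib` from the standard library instead of Blosc. It is an illustration of the idea only: Blosc's own filter is implemented in C and works block by block, and the function names `byte_shuffle` and `byte_unshuffle` are hypothetical.

```python
import struct
import zlib


def byte_shuffle(raw: bytes, typesize: int) -> bytes:
    """Group the k-th byte of every item together."""
    if len(raw) % typesize:
        raise ValueError("buffer length must be a multiple of typesize")
    return b"".join(raw[k::typesize] for k in range(typesize))


def byte_unshuffle(shuffled: bytes, typesize: int) -> bytes:
    """Invert byte_shuffle by interleaving the byte planes again."""
    n = len(shuffled) // typesize
    out = bytearray(len(shuffled))
    for k in range(typesize):
        out[k::typesize] = shuffled[k * n:(k + 1) * n]
    return bytes(out)


values = range(0, 400_000, 4)  # regularly spaced 4-byte integers
raw = struct.pack(f"<{len(values)}i", *values)

plain = len(zlib.compress(raw))
shuffled = len(zlib.compress(byte_shuffle(raw, 4)))
print(f"raw={len(raw)} zlib={plain} shuffle+zlib={shuffled}")
```

After shuffling, the slowly changing high-order bytes sit next to each other and form long repetitive runs, which is what lets the compressor do better on regularly spaced numerical data.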