# Buffer Operations

Advanced buffer management for zero-copy operations, efficient batch processing, and high-performance data handling in compression and decompression workflows.

## Capabilities

### Buffer Segments

Individual buffer segments that provide efficient access to portions of larger buffers without copying data.

```python { .api }
class BufferSegment:
    @property
    def offset(self) -> int:
        """Offset of this segment within the parent buffer."""

    def __len__(self) -> int:
        """Get segment length in bytes."""

    def tobytes(self) -> bytes:
        """
        Convert segment to bytes.

        Returns:
            bytes: Copy of segment data
        """
```

**Usage Example:**

```python
import zstandard as zstd

# Buffer segments are typically returned by compression operations
compressor = zstd.ZstdCompressor()
result = compressor.multi_compress_to_buffer([b"data1", b"data2", b"data3"])

# Access individual segments
for i, segment in enumerate(result):
    print(f"Segment {i}: offset={segment.offset}, length={len(segment)}")
    data = segment.tobytes()
    process_data(data)  # process_data stands in for application-specific handling
```

### Buffer Collections

Collections of buffer segments that provide efficient iteration and access patterns.

```python { .api }
class BufferSegments:
    def __len__(self) -> int:
        """Get number of segments in collection."""

    def __getitem__(self, i: int) -> BufferSegment:
        """
        Get segment by index.

        Parameters:
        - i: int, segment index

        Returns:
            BufferSegment: Segment at index
        """
```

**Usage Example:**

```python
import zstandard as zstd

# BufferSegments collections are returned by some operations
compressor = zstd.ZstdCompressor()
result = compressor.multi_compress_to_buffer([b"data1", b"data2"])

# Iterate over segments
for segment in result:
    data = segment.tobytes()
    print(f"Segment data: {len(data)} bytes")

# Access by index
first_segment = result[0]
second_segment = result[1]
```

### Buffers with Segments

Buffers that contain multiple segments, providing both the raw data and segment boundary information.

```python { .api }
class BufferWithSegments:
    @property
    def size(self) -> int:
        """Total buffer size in bytes."""

    def __init__(self, data: bytes, segments: bytes):
        """
        Create buffer with segment information.

        Parameters:
        - data: bytes, raw buffer data
        - segments: bytes, segment boundary information
        """

    def __len__(self) -> int:
        """Get number of segments."""

    def __getitem__(self, i: int) -> BufferSegment:
        """
        Get segment by index.

        Parameters:
        - i: int, segment index

        Returns:
            BufferSegment: Segment at index
        """

    def segments(self):
        """Get segments iterator."""

    def tobytes(self) -> bytes:
        """
        Convert entire buffer to bytes.

        Returns:
            bytes: Complete buffer data
        """
```

**Usage Example:**

```python
import struct

import zstandard as zstd

# Create buffer with segments manually (advanced usage)
data = b"concatenated data from multiple sources"

# The segments argument is assumed here to be a packed array of
# (offset, length) pairs of unsigned 64-bit little-endian integers,
# one pair per segment within data.
segments = struct.pack("<QQQQ", 0, 12, 12, 27)  # two segments covering data

buffer = zstd.BufferWithSegments(data, segments)

print(f"Buffer size: {buffer.size} bytes")
print(f"Number of segments: {len(buffer)}")

# Access segments
for i in range(len(buffer)):
    segment = buffer[i]
    segment_data = segment.tobytes()
    print(f"Segment {i}: {len(segment_data)} bytes")

# Get all data
all_data = buffer.tobytes()
```

### Buffer with Segments Collections

Collections of multiple buffers with segments, used for batch operations and efficient data management.

```python { .api }
class BufferWithSegmentsCollection:
    def __init__(self, *args):
        """
        Create collection of buffers with segments.

        Parameters:
        - *args: BufferWithSegments objects
        """

    def __len__(self) -> int:
        """Get total number of segments across all buffers."""

    def __getitem__(self, i: int) -> BufferSegment:
        """
        Get segment by global index across all buffers.

        Parameters:
        - i: int, global segment index

        Returns:
            BufferSegment: Segment at index
        """

    def size(self) -> int:
        """
        Get total size of all buffers.

        Returns:
            int: Total size in bytes
        """
```

**Usage Example:**

```python
import zstandard as zstd

# Collections are typically returned by multi-threaded operations
compressor = zstd.ZstdCompressor()
data_items = [b"item1", b"item2", b"item3", b"item4"]

# Multi-compress returns a collection
collection = compressor.multi_compress_to_buffer(data_items, threads=2)

print(f"Collection size: {collection.size()} bytes")
print(f"Number of items: {len(collection)}")

# Access compressed items
for i in range(len(collection)):
    segment = collection[i]
    compressed_data = segment.tobytes()
    print(f"Item {i}: {len(compressed_data)} bytes compressed")
```

### Batch Compression with Buffers

Efficient batch compression that returns results in buffer collections for optimal memory usage.

```python { .api }
class ZstdCompressor:
    def multi_compress_to_buffer(
        self,
        data,
        threads: int = 0
    ) -> BufferWithSegmentsCollection:
        """
        Compress multiple data items to buffer collection.

        Parameters:
        - data: list[bytes], BufferWithSegments, or BufferWithSegmentsCollection
        - threads: int, number of threads (0 or 1 = single thread,
          negative = number of logical CPUs)

        Returns:
            BufferWithSegmentsCollection: Compressed data in buffer collection
        """
```

**Usage Example:**

```python
import zstandard as zstd

compressor = zstd.ZstdCompressor(level=5)

# Prepare data for batch compression
documents = [
    b'{"id": 1, "text": "First document"}',
    b'{"id": 2, "text": "Second document"}',
    b'{"id": 3, "text": "Third document"}',
    b'{"id": 4, "text": "Fourth document"}',
]

# Compress in parallel
result = compressor.multi_compress_to_buffer(documents, threads=4)

# Process results efficiently
total_original = sum(len(doc) for doc in documents)
total_compressed = result.size()

print(f"Compressed {total_original} bytes to {total_compressed} bytes")
print(f"Compression ratio: {total_original / total_compressed:.2f}:1")

# Extract individual compressed documents
compressed_docs = []
for i in range(len(result)):
    segment = result[i]
    compressed_docs.append(segment.tobytes())
```

### Batch Decompression with Buffers

Efficient batch decompression using buffer collections for high-throughput processing.

```python { .api }
class ZstdDecompressor:
    def multi_decompress_to_buffer(
        self,
        frames,
        decompressed_sizes: bytes = b"",
        threads: int = 0
    ) -> BufferWithSegmentsCollection:
        """
        Decompress multiple frames to buffer collection.

        Parameters:
        - frames: list[bytes], BufferWithSegments, or BufferWithSegmentsCollection
        - decompressed_sizes: bytes, expected decompressed sizes (optional optimization)
        - threads: int, number of threads (0 or 1 = single thread,
          negative = number of logical CPUs)

        Returns:
            BufferWithSegmentsCollection: Decompressed data in buffer collection
        """
```

**Usage Example:**

```python
import zstandard as zstd

decompressor = zstd.ZstdDecompressor()

# Compressed frames from previous example
compressed_frames = compressed_docs

# Decompress in parallel
result = decompressor.multi_decompress_to_buffer(compressed_frames, threads=4)

print(f"Decompressed {len(compressed_frames)} frames")
print(f"Total decompressed size: {result.size()} bytes")

# Extract decompressed data
decompressed_docs = []
for i in range(len(result)):
    segment = result[i]
    decompressed_docs.append(segment.tobytes())

# Verify round-trip
for i, (original, decompressed) in enumerate(zip(documents, decompressed_docs)):
    assert original == decompressed, f"Mismatch in document {i}"
```
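The `decompressed_sizes` argument above lets the decompressor skip probing each frame header when the caller already knows the original sizes. A minimal sketch of building that argument with the standard library, assuming the expected layout is a packed array of unsigned 64-bit integers (one per frame, native byte order):

```python
import struct

# Hypothetical known original sizes for three compressed frames
frame_sizes = [1024, 2048, 512]

# Pack one unsigned 64-bit integer per frame
decompressed_sizes = struct.pack("=" + "Q" * len(frame_sizes), *frame_sizes)

# Would then be passed as:
# result = decompressor.multi_decompress_to_buffer(
#     frames, decompressed_sizes=decompressed_sizes, threads=4)
```

Each entry must match its frame's true decompressed size, or decompression of that frame will fail.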

### Zero-Copy Operations

Advanced usage patterns that minimize memory copying for maximum performance.

**Usage Example:**

```python
import zstandard as zstd

def process_large_dataset(data_items):
    """Process large dataset with minimal memory copying."""
    compressor = zstd.ZstdCompressor(level=3)

    # Compress in batches to manage memory
    batch_size = 1000
    all_results = []

    for i in range(0, len(data_items), batch_size):
        batch = data_items[i:i + batch_size]

        # Multi-compress returns BufferWithSegmentsCollection
        compressed_batch = compressor.multi_compress_to_buffer(batch, threads=4)

        # Process segments without copying unless necessary
        # (need_to_store, store_data, and process_segment_in_place are
        # placeholders for application-specific logic)
        for j in range(len(compressed_batch)):
            segment = compressed_batch[j]

            # Only copy if we need to persist the data
            if need_to_store(j):
                data = segment.tobytes()
                store_data(i + j, data)
            else:
                # Use segment directly for temporary operations
                process_segment_in_place(segment)

        all_results.append(compressed_batch)

    return all_results

def stream_compress_with_buffers(input_stream, output_stream):
    """Stream compression using buffers for efficiency."""
    compressor = zstd.ZstdCompressor()

    # Read chunks and compress in batches
    chunks = []
    chunk_size = 64 * 1024  # 64KB chunks

    while True:
        chunk = input_stream.read(chunk_size)
        if not chunk:
            break

        chunks.append(chunk)

        # Process in batches of 100 chunks
        if len(chunks) >= 100:
            result = compressor.multi_compress_to_buffer(chunks, threads=2)

            # Write compressed data
            for i in range(len(result)):
                segment = result[i]
                output_stream.write(segment.tobytes())

            chunks = []

    # Process remaining chunks
    if chunks:
        result = compressor.multi_compress_to_buffer(chunks, threads=2)
        for i in range(len(result)):
            segment = result[i]
            output_stream.write(segment.tobytes())
```

### Memory Management

Buffer operations provide efficient memory usage patterns for high-performance applications.

**Memory Usage Example:**

```python
import zstandard as zstd

def analyze_buffer_memory():
    """Analyze memory usage of buffer operations."""
    compressor = zstd.ZstdCompressor()

    # Large dataset
    data = [b"x" * 1024 for _ in range(1000)]  # 1000 x 1KB items

    print(f"Original data: {sum(len(item) for item in data)} bytes")
    print(f"Compressor memory: {compressor.memory_size()} bytes")

    # Compress to buffer collection
    result = compressor.multi_compress_to_buffer(data, threads=4)

    print(f"Compressed size: {result.size()} bytes")
    print(f"Number of segments: {len(result)}")

    # Efficient iteration without copying
    for i, segment in enumerate(result):
        # segment.tobytes() copies data - avoid if possible
        size = len(segment)  # No copy required
        offset = segment.offset  # No copy required

        if i < 5:  # Show first few
            print(f"Segment {i}: size={size}, offset={offset}")
```

## Performance Considerations

- Buffer operations minimize memory copying for better performance
- Multi-threaded operations return buffer collections for efficient parallel processing
- Segments provide zero-copy access to portions of larger buffers
- Use `tobytes()` only when you need a copy of the data
- Buffer collections enable efficient batch processing of large datasets
- Memory usage is optimized for high-throughput scenarios
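The zero-copy behavior described above mirrors Python's built-in `memoryview`: slicing a view references the underlying buffer without copying, and only an explicit conversion (like `tobytes()`) allocates a copy. A stdlib-only sketch of the same tradeoff:

```python
data = bytes(range(64))
view = memoryview(data)

# Slicing a memoryview yields another view over the same buffer - no copy
segment = view[16:32]

# An explicit conversion (analogous to tobytes()) is what copies
copied = bytes(segment)

assert segment.obj is data    # the slice still references the original buffer
assert copied == data[16:32]  # the copy holds the same 16 bytes
```

The same rule of thumb applies to buffer segments: inspect `len()` and `offset` freely, and defer `tobytes()` until a real copy is required.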