# Core Utilities

Core classes and utilities for buffer management and error handling in cramjam.

## Imports

```python { .api }
from cramjam import Buffer, File, CompressionError, DecompressionError, BufferProtocol
```

## Buffer Class

```python { .api }
class Buffer:
    """Buffer class implementing both readable and writable buffer protocols."""

    def __init__(self, data: BufferProtocol | None = None, copy: bool | None = True) -> None:
        """Initialize buffer.

        Args:
            data: Anything implementing the buffer protocol (optional)
            copy: Whether to make a copy of the provided data (default: True)
        """
```

The Buffer class provides an in-memory buffer with a file-like interface for efficient data operations.

### Basic Usage

```python { .api }
import cramjam

# Create empty buffer
buffer = cramjam.Buffer()

# Create buffer from data
buffer = cramjam.Buffer(b"Hello World")

# Create buffer without copying (references original data)
data = bytearray(b"Original data")
buffer = cramjam.Buffer(data, copy=False)
```

### Read Operations

```python { .api }
def read(self, n_bytes: int | None = -1) -> bytes:
    """Read from buffer at current position.

    Args:
        n_bytes: Number of bytes to read, -1 for all remaining

    Returns:
        bytes: Data read from buffer
    """

def readinto(self, output: BufferProtocol) -> int:
    """Read from buffer into another buffer object.

    Args:
        output: Buffer protocol object to read data into

    Returns:
        int: Number of bytes read
    """
```

### Write Operations

```python { .api }
def write(self, input: BufferProtocol) -> int:
    """Write bytes to the buffer.

    Args:
        input: Data implementing Buffer Protocol to write

    Returns:
        int: Number of bytes written
    """
```

### Position Operations

```python { .api }
def seek(self, position: int, whence: int | None = 0) -> int:
    """Seek to position within the buffer.

    Args:
        position: Target position
        whence: 0 (from start), 1 (from current), 2 (from end)

    Returns:
        int: New position
    """

def tell(self) -> int:
    """Get current position of the buffer."""

def seekable(self) -> bool:
    """Check if buffer is seekable (always True for compatibility)."""
```

### Size Operations

```python { .api }
def len(self) -> int:
    """Get length of the underlying buffer."""

def set_len(self, size: int) -> None:
    """Set buffer length. Truncates if smaller, null-fills if larger."""

def truncate(self) -> None:
    """Truncate the buffer."""

# Magic methods for convenience
def __len__(self) -> int:
    """Get buffer length."""

def __bool__(self) -> bool:
    """Check if buffer has content."""
```

### Memory Management

```python { .api }
def get_view_reference(self) -> None | Any:
    """Get PyObject this Buffer references as view.

    Returns:
        None if Buffer owns its data, PyObject reference otherwise
    """

def get_view_reference_count(self) -> None | int:
    """Get reference count of PyObject this Buffer references.

    Returns:
        None if Buffer owns its data, reference count otherwise
    """
```

### Buffer Usage Examples

```python { .api }
import cramjam

# Create and manipulate buffer
buffer = cramjam.Buffer()
buffer.write(b"Hello ")
buffer.write(b"World!")
buffer.seek(0)
data = buffer.read()  # b"Hello World!"

# Use as compression target
source = b"Data to compress" * 1000
output_buffer = cramjam.Buffer()
cramjam.gzip.compress_into(source, output_buffer)

# Read compressed data
output_buffer.seek(0)
compressed_data = output_buffer.read()
```

## File Class

```python { .api }
class File:
    """File-like object owned on Rust side."""

    def __init__(self, path: str, read: bool | None = None, write: bool | None = None,
                 truncate: bool | None = None, append: bool | None = None) -> None:
        """Open file with specified modes.

        Args:
            path: File path string
            read: Enable read mode (optional)
            write: Enable write mode (optional)
            truncate: Enable truncate mode (optional)
            append: Enable append mode (optional)
        """
```

### File Operations

The File class provides the same interface as Buffer but operates on actual files:

```python { .api }
# Read operations
def read(self, n_bytes: int | None = None) -> bytes:
    """Read from file at current position."""

def readinto(self, output: BufferProtocol) -> int:
    """Read from file into buffer object."""

# Write operations
def write(self, input: BufferProtocol) -> int:
    """Write bytes to file."""

# Position operations
def seek(self, position: int, whence: int | None = 0) -> int:
    """Seek to position within file."""

def tell(self) -> int:
    """Get current file position."""

def seekable(self) -> bool:
    """Check if file is seekable (always True)."""

# Size operations
def len(self) -> int:
    """Get file length in bytes."""

def set_len(self, size: int) -> None:
    """Set file length. Truncates if smaller, null-fills if larger."""

def truncate(self) -> None:
    """Truncate the file."""
```

### File Usage Examples

```python { .api }
import cramjam

# Open file for reading and writing
file_obj = cramjam.File("data.bin", read=True, write=True)

# Write compressed data directly to file
source_data = b"Large dataset" * 10000
cramjam.zstd.compress_into(source_data, file_obj)

# Read back and decompress
file_obj.seek(0)
compressed_data = file_obj.read()
decompressed = cramjam.zstd.decompress(compressed_data)

# Append mode for logs
log_file = cramjam.File("compressed.log", write=True, append=True)
log_entry = b"Log entry data"
compressed_entry = cramjam.gzip.compress(log_entry)
log_file.write(compressed_entry)
```

## Exception Classes

### CompressionError

```python { .api }
class CompressionError(Exception):
    """Cramjam-specific exception for failed compression operations."""
```

Raised when compression operations fail due to:

- Invalid input data
- Insufficient output buffer space
- Algorithm-specific limitations

```python { .api }
import cramjam

# Placeholder input; substitute the data you need to compress
data = b"Data to compress"

try:
    # Attempt compression
    result = cramjam.brotli.compress(data)
except cramjam.CompressionError as e:
    print(f"Compression failed: {e}")
```

### DecompressionError

```python { .api }
class DecompressionError(Exception):
    """Cramjam-specific exception for failed decompression operations."""
```

Raised when decompression operations fail due to:

- Corrupted compressed data
- Wrong decompression algorithm
- Truncated input

```python { .api }
import cramjam

# Placeholder for corrupted input: these bytes are not a valid gzip stream
corrupted_data = b"not a valid gzip stream"

try:
    # Attempt decompression
    result = cramjam.gzip.decompress(corrupted_data)
except cramjam.DecompressionError as e:
    print(f"Decompression failed: {e}")
```

## BufferProtocol Type

```python { .api }
BufferProtocol = Any  # Type alias for buffer protocol objects
```

Type alias representing objects that implement the Python buffer protocol:

- `bytes` - Immutable byte strings
- `bytearray` - Mutable byte arrays
- `memoryview` - Memory view objects
- Custom objects implementing the `__buffer__` method

All cramjam functions accept `BufferProtocol` objects as input, so callers can pass whichever byte container they already hold. Where possible, the buffer protocol lets cramjam read that data without copying it.

## Memory Management Best Practices

### Performance Tips

1. **Use bytearray for inputs** when possible - avoids double allocation on the Rust side
2. **Pre-allocate buffers** for `*_into` functions to avoid repeated allocations
3. **Use copy=False** in the Buffer constructor when it is safe to reference the original data
4. **Monitor reference counts** with the Buffer memory management methods (`get_view_reference`, `get_view_reference_count`) when working with large datasets

### Memory-Efficient Patterns

```python { .api }
import cramjam

# Efficient: compress straight into an output buffer
source = bytearray(b"Large data" * 100000)  # bytearray is faster
output = cramjam.Buffer()  # Output buffer that receives the compressed bytes
bytes_written = cramjam.zstd.compress_into(source, output)

# Memory view pattern for zero-copy operations
large_data = bytearray(1024 * 1024)  # 1 MiB buffer
view = memoryview(large_data)[1000:2000]  # Slice without copying
compressed = cramjam.lz4.compress(view)

# Reference pattern (be careful about data lifetime)
original_data = bytearray(b"Persistent data")
buffer = cramjam.Buffer(original_data, copy=False)  # References original
# Ensure original_data stays alive while buffer is in use
```