# Core Utilities

Core classes and utilities for buffer management and error handling in cramjam.

## Imports

```python { .api }
from cramjam import Buffer, File, CompressionError, DecompressionError, BufferProtocol
```

## Buffer Class

```python { .api }
class Buffer:
    """Buffer class implementing both readable and writable buffer protocols."""

    def __init__(self, data: BufferProtocol | None = None, copy: bool | None = True) -> None:
        """Initialize buffer.

        Args:
            data: Anything implementing the buffer protocol (optional)
            copy: Whether to make a copy of the provided data (default: True)
        """
```

The `Buffer` class provides an in-memory buffer with a file-like interface (read, write, seek) for efficient data operations.

### Basic Usage

```python { .api }
import cramjam

# Create empty buffer
buffer = cramjam.Buffer()

# Create buffer from data
buffer = cramjam.Buffer(b"Hello World")

# Create buffer without copying (references original data)
data = bytearray(b"Original data")
buffer = cramjam.Buffer(data, copy=False)
```

### Read Operations

```python { .api }
def read(self, n_bytes: int | None = -1) -> bytes:
    """Read from buffer at current position.

    Args:
        n_bytes: Number of bytes to read, -1 for all remaining

    Returns:
        bytes: Data read from buffer
    """

def readinto(self, output: BufferProtocol) -> int:
    """Read from buffer into another buffer object.

    Args:
        output: Buffer protocol object to read data into

    Returns:
        int: Number of bytes read
    """
```

### Write Operations

```python { .api }
def write(self, input: BufferProtocol) -> int:
    """Write bytes to the buffer.

    Args:
        input: Data implementing Buffer Protocol to write

    Returns:
        int: Number of bytes written
    """
```

### Position Operations

```python { .api }
def seek(self, position: int, whence: int | None = 0) -> int:
    """Seek to position within the buffer.

    Args:
        position: Target position
        whence: 0 (from start), 1 (from current), 2 (from end)

    Returns:
        int: New position
    """

def tell(self) -> int:
    """Get current position of the buffer."""

def seekable(self) -> bool:
    """Check if buffer is seekable (always True for compatibility)."""
```

### Size Operations

```python { .api }
def len(self) -> int:
    """Get length of the underlying buffer."""

def set_len(self, size: int) -> None:
    """Set buffer length. Truncates if smaller, null-fills if larger."""

def truncate(self) -> None:
    """Truncate the buffer."""

# Magic methods for convenience
def __len__(self) -> int:
    """Get buffer length."""

def __bool__(self) -> bool:
    """Check if buffer has content."""
```

### Memory Management

```python { .api }
def get_view_reference(self) -> None | Any:
    """Get PyObject this Buffer references as view.

    Returns:
        None if Buffer owns its data, PyObject reference otherwise
    """

def get_view_reference_count(self) -> None | int:
    """Get reference count of PyObject this Buffer references.

    Returns:
        None if Buffer owns its data, reference count otherwise
    """
```

### Buffer Usage Examples

```python { .api }
import cramjam

# Create and manipulate buffer
buffer = cramjam.Buffer()
buffer.write(b"Hello ")
buffer.write(b"World!")
buffer.seek(0)
data = buffer.read()  # b"Hello World!"

# Use as compression target
source = b"Data to compress" * 1000
output_buffer = cramjam.Buffer()
cramjam.gzip.compress_into(source, output_buffer)

# Read compressed data
output_buffer.seek(0)
compressed_data = output_buffer.read()
```

## File Class

```python { .api }
class File:
    """File-like object owned on Rust side."""

    def __init__(self, path: str, read: bool | None = None, write: bool | None = None,
                 truncate: bool | None = None, append: bool | None = None) -> None:
        """Open file with specified modes.

        Args:
            path: File path string
            read: Enable read mode (optional)
            write: Enable write mode (optional)
            truncate: Enable truncate mode (optional)
            append: Enable append mode (optional)
        """
```

### File Operations

The `File` class provides the same interface as `Buffer` but operates on files on disk:

```python { .api }
# Read operations
def read(self, n_bytes: int | None = None) -> bytes:
    """Read from file at current position."""

def readinto(self, output: BufferProtocol) -> int:
    """Read from file into buffer object."""

# Write operations
def write(self, input: BufferProtocol) -> int:
    """Write bytes to file."""

# Position operations
def seek(self, position: int, whence: int | None = 0) -> int:
    """Seek to position within file."""

def tell(self) -> int:
    """Get current file position."""

def seekable(self) -> bool:
    """Check if file is seekable (always True)."""

# Size operations
def len(self) -> int:
    """Get file length in bytes."""

def set_len(self, size: int) -> None:
    """Set file length. Truncates if smaller, null-fills if larger."""

def truncate(self) -> None:
    """Truncate the file."""
```

### File Usage Examples

```python { .api }
import cramjam

# Open file for reading and writing
file_obj = cramjam.File("data.bin", read=True, write=True)

# Write compressed data directly to file
source_data = b"Large dataset" * 10000
cramjam.zstd.compress_into(source_data, file_obj)

# Read back and decompress
file_obj.seek(0)
compressed_data = file_obj.read()
decompressed = cramjam.zstd.decompress(compressed_data)

# Append mode for logs
log_file = cramjam.File("compressed.log", write=True, append=True)
log_entry = b"Log entry data"
compressed_entry = cramjam.gzip.compress(log_entry)
log_file.write(compressed_entry)
```

## Exception Classes

### CompressionError

```python { .api }
class CompressionError(Exception):
    """Cramjam-specific exception for failed compression operations."""
```

Raised when a compression operation fails, for example because of:
- Invalid input data
- Insufficient output buffer space
- Algorithm-specific limitations

```python { .api }
import cramjam

# Placeholder input so the snippet runs; substitute the data you want to compress
invalid_data = b"example input"

try:
    # Attempt compression
    result = cramjam.brotli.compress(invalid_data)
except cramjam.CompressionError as e:
    print(f"Compression failed: {e}")
```

### DecompressionError

```python { .api }
class DecompressionError(Exception):
    """Cramjam-specific exception for failed decompression operations."""
```

Raised when a decompression operation fails, for example because of:
- Corrupted compressed data
- Wrong decompression algorithm
- Truncated input

```python { .api }
import cramjam

# Placeholder input: these bytes are not a valid gzip stream
corrupted_data = b"not valid gzip data"

try:
    # Attempt decompression
    result = cramjam.gzip.decompress(corrupted_data)
except cramjam.DecompressionError as e:
    print(f"Decompression failed: {e}")
```

## BufferProtocol Type

```python { .api }
BufferProtocol = Any  # Type alias for buffer protocol objects
```

Type alias representing objects that implement the Python buffer protocol:
- `bytes` - Immutable byte strings
- `bytearray` - Mutable byte arrays
- `memoryview` - Memory view objects
- Other objects that expose the buffer protocol, such as custom classes implementing the `__buffer__` method (PEP 688, Python 3.12+)

All cramjam functions accept `BufferProtocol` objects as input, so callers can pass whichever container already holds the data, and the buffer protocol lets cramjam read it without an intermediate copy where possible.
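
Because `BufferProtocol` is only an alias for `Any`, a type checker will not reject unsupported objects. A quick runtime check, sketched here with the standard library only, is to see whether `memoryview` accepts the object. The helper `supports_buffer_protocol` is hypothetical and not part of cramjam.

```python
from array import array


def supports_buffer_protocol(obj: object) -> bool:
    """Return True if `obj` can be wrapped in a memoryview.

    Hypothetical helper for illustration; not part of cramjam.
    """
    try:
        memoryview(obj)
    except TypeError:
        # memoryview raises TypeError for objects without buffer support
        return False
    return True


candidates = {
    "bytes": b"abc",
    "bytearray": bytearray(b"abc"),
    "memoryview": memoryview(b"abc"),
    "array": array("B", [1, 2, 3]),
    "str": "abc",
    "int": 123,
}

results = {name: supports_buffer_protocol(value) for name, value in candidates.items()}
```

Here `str` and `int` are the only entries reported as unsupported; text must be encoded to bytes before it is passed to a compression function.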

## Memory Management Best Practices

### Performance Tips

1. **Use `bytearray` for inputs** when possible - avoids double allocation on the Rust side
2. **Pre-allocate buffers** for `*_into` functions to avoid repeated allocations
3. **Use `copy=False`** in the `Buffer` constructor when it is safe to reference the original data
4. **Monitor reference counts** with the `Buffer` memory management methods when working with large datasets

### Memory-Efficient Patterns

```python { .api }
import cramjam

# Efficient: compress into an existing output buffer
source = bytearray(b"Large data" * 100000)  # bytearray avoids double allocation
output = cramjam.Buffer()  # Output buffer that receives the compressed bytes
bytes_written = cramjam.zstd.compress_into(source, output)

# Memory view pattern for zero-copy operations
large_data = bytearray(1024 * 1024)  # 1 MiB buffer
view = memoryview(large_data)[1000:2000]  # Slice without copying
compressed = cramjam.lz4.compress(view)

# Reference pattern (be careful about data lifetime)
original_data = bytearray(b"Persistent data")
buffer = cramjam.Buffer(original_data, copy=False)  # References original
# Ensure original_data stays alive while buffer is in use
```
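
The memory view pattern depends on the fact that slicing a `memoryview` shares memory with the original object, whereas slicing a `bytearray` directly makes a copy. The following standard-library sketch, independent of cramjam, shows the difference:

```python
large_data = bytearray(1024 * 1024)

# Slicing a memoryview creates a window onto the same memory
view = memoryview(large_data)[1000:2000]
shares_memory = view.obj is large_data

# Slicing the bytearray itself allocates an independent copy
copied = large_data[1000:2000]

# A later write to the original is visible through the view but not the copy
large_data[1000] = 0xFF
seen_through_view = view[0]
seen_through_copy = copied[0]
```

This is also why the lifetime warning in the reference pattern matters: a view is only valid while the underlying object exists and keeps its size.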