# Serialization
Efficient binary serialization of chunked arrays for storage and network transfer. This functionality provides an optimized method for converting array data to binary format chunk by chunk, essential for large-scale data processing and distributed computing workflows.
## Capabilities
### Chunked Binary Serialization
Compute binary representation of an image divided into a grid of cutouts, with optimized performance for specific memory layouts.
```python { .api }
def tobytes(
    image: NDArray,
    chunk_size: tuple[int, int, int],
    order: str = "C"
) -> list[bytes]:
    """
    Compute bytes with image divided into grid of cutouts.

    Args:
        image: Input image array
        chunk_size: Size of each chunk (x, y, z)
        order: Memory order ("C" or "F", default: "C")

    Returns:
        Resultant binaries indexed by gridpoint in fortran order
    """
```
**Usage Example:**
```python
import fastremap
import numpy as np

# Create a sample 3D image
image = np.random.randint(0, 255, size=(128, 128, 64), dtype=np.uint8)

# Divide into 64x64x64 chunks and serialize
chunk_size = (64, 64, 64)
binaries = fastremap.tobytes(image, chunk_size, order="C")

# Result is a list of bytes objects
print(f"Number of chunks: {len(binaries)}")
print(f"First chunk size: {len(binaries[0])} bytes")

# For Fortran-ordered output
binaries_f = fastremap.tobytes(image, chunk_size, order="F")
```
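
The returned list is indexed by gridpoint in Fortran order, so the x index varies fastest. A minimal decoding sketch, continuing the example above; it assumes the image divides evenly into chunks and that gridpoint (x, y, z) maps to flat index `x + y*gx + z*gx*gy`:

```python
import numpy as np

# Grid dimensions for a (128, 128, 64) image with (64, 64, 64) chunks
gx, gy, gz = 2, 2, 1

# Flat Fortran-order index of gridpoint (x=1, y=0, z=0)
i = 1 + 0 * gx + 0 * gx * gy

# Rebuild the cutout; dtype, shape, and order must match the producer
chunk = np.frombuffer(binaries[i], dtype=np.uint8).reshape((64, 64, 64), order="C")

# The cutout should equal the corresponding slice of the original image
assert np.array_equal(chunk, image[64:128, 0:64, 0:64])
```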
### Performance Optimization
The `tobytes` function is significantly faster under specific conditions:
- **Matching memory layout**: When the input image is F-contiguous and F order is requested, or C-contiguous and C order is requested
- **Large images**: Performance benefits are most pronounced when the image is larger than a single chunk
- **Efficient chunking**: Avoids the overhead of iterating and calling `tobytes` on each chunk individually
**Performance Example:**
```python
import fastremap
import numpy as np
import time

# Large Fortran-ordered image
large_image = np.random.random((512, 512, 256)).astype(np.float32, order='F')
chunk_size = (64, 64, 64)

# Optimized fastremap approach
start = time.time()
fast_chunks = fastremap.tobytes(large_image, chunk_size, order="F")
fast_time = time.time() - start

# Manual chunking approach (for comparison)
start = time.time()
manual_chunks = []
for z in range(0, 256, 64):
    for y in range(0, 512, 64):
        for x in range(0, 512, 64):
            chunk = large_image[x:x+64, y:y+64, z:z+64]
            manual_chunks.append(chunk.tobytes(order='F'))
manual_time = time.time() - start

print(f"fastremap time: {fast_time:.3f}s")
print(f"Manual time: {manual_time:.3f}s")
print(f"Speedup: {manual_time/fast_time:.1f}x faster")
```
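
Since the manual loop also walks gridpoints with x varying fastest, matching the Fortran gridpoint order of the return value, the two results should be byte-identical. This is a behavioral assumption worth verifying:

```python
# Same number of chunks, and each chunk byte-for-byte identical
assert len(fast_chunks) == len(manual_chunks)
assert all(a == b for a, b in zip(fast_chunks, manual_chunks))
```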
### Use Cases
#### Distributed Computing
```python
import fastremap
import numpy as np

# Prepare large dataset for distributed processing
dataset = np.random.random((1024, 1024, 512)).astype(np.float32)

# Chunk into manageable pieces for worker nodes
chunk_size = (128, 128, 128)
chunks = fastremap.tobytes(dataset, chunk_size, order="C")

# Each chunk can now be sent to different worker processes
for i, chunk_data in enumerate(chunks):
    # Send chunk_data to worker i
    # worker_pool.submit(process_chunk, chunk_data, i)
    pass
```
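
On the receiving end, each worker reconstructs its chunk from the raw bytes with `np.frombuffer`. A minimal sketch assuming evenly dividing chunks; `process_chunk` is a hypothetical worker function, not part of fastremap:

```python
import numpy as np
from concurrent.futures import ProcessPoolExecutor

def process_chunk(chunk_data: bytes, index: int) -> float:
    # dtype, shape, and order must match what the producer used:
    # float32 data, (128, 128, 128) cutouts, C order
    chunk = np.frombuffer(chunk_data, dtype=np.float32).reshape(
        (128, 128, 128), order="C"
    )
    return float(chunk.mean())  # stand-in for real per-chunk work

with ProcessPoolExecutor() as pool:
    results = list(pool.map(process_chunk, chunks, range(len(chunks))))
```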
#### Efficient Storage
```python
import fastremap
import numpy as np
import pickle

# Large scientific dataset, Fortran-ordered to match the requested
# output order below (see Memory Layout Considerations)
data = np.random.random((2048, 2048, 1024)).astype(np.float32, order='F')

# Chunk and serialize for efficient storage
chunk_size = (256, 256, 256)
serialized_chunks = fastremap.tobytes(data, chunk_size, order="F")

# Record everything needed to reconstruct the array later
metadata = {
    'original_shape': data.shape,
    'chunk_size': chunk_size,
    'dtype': data.dtype,
    'order': 'F',
    'num_chunks': len(serialized_chunks)
}

# Save metadata and chunks
with open('data_metadata.pkl', 'wb') as f:
    pickle.dump(metadata, f)

for i, chunk in enumerate(serialized_chunks):
    with open(f'chunk_{i:04d}.bin', 'wb') as f:
        f.write(chunk)
```
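
Loading reverses the process. A sketch assuming the chunks divide the volume evenly and that the files were written in Fortran gridpoint order (x varying fastest):

```python
import pickle
import numpy as np

with open('data_metadata.pkl', 'rb') as f:
    meta = pickle.load(f)

cx, cy, cz = meta['chunk_size']
sx, sy, sz = meta['original_shape']
restored = np.zeros((sx, sy, sz), dtype=meta['dtype'], order=meta['order'])

# Walk gridpoints in Fortran order: x fastest, then y, then z
i = 0
for z in range(0, sz, cz):
    for y in range(0, sy, cy):
        for x in range(0, sx, cx):
            with open(f'chunk_{i:04d}.bin', 'rb') as f:
                raw = f.read()
            block = np.frombuffer(raw, dtype=meta['dtype']).reshape(
                (cx, cy, cz), order=meta['order']
            )
            restored[x:x+cx, y:y+cy, z:z+cz] = block
            i += 1
```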
#### Memory Layout Considerations
```python
import fastremap
import numpy as np

# For C-contiguous arrays, use C order for best performance
c_array = np.random.random((100, 200, 300)).astype(np.float32, order='C')
c_chunks = fastremap.tobytes(c_array, (50, 50, 50), order="C")  # Optimal

# For F-contiguous arrays, use F order for best performance
f_array = np.random.random((100, 200, 300)).astype(np.float32, order='F')
f_chunks = fastremap.tobytes(f_array, (50, 50, 50), order="F")  # Optimal

# Mixed orders work but may be slower
mixed_chunks = fastremap.tobytes(c_array, (50, 50, 50), order="F")  # Suboptimal
```
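
When the layout of an incoming array is not known in advance, its contiguity flags can select the matching order. A small helper sketch; `tobytes_native` is an illustrative name, not part of fastremap:

```python
import fastremap
import numpy as np

def tobytes_native(image: np.ndarray, chunk_size: tuple[int, int, int]) -> list[bytes]:
    # Request the order matching the array's memory layout so the optimized
    # path is taken; arrays that are both (or neither) default to C
    if image.flags["F_CONTIGUOUS"] and not image.flags["C_CONTIGUOUS"]:
        return fastremap.tobytes(image, chunk_size, order="F")
    return fastremap.tobytes(image, chunk_size, order="C")

chunks = tobytes_native(f_array, (50, 50, 50))  # takes the optimized F path
```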
## Types
```python { .api }
NDArray = np.ndarray
```