# I/O and Conversion Functions

Functions for saving, loading, and converting sparse arrays between different formats and libraries. These operations enable interoperability with NumPy, SciPy, and file-based storage systems.

## Capabilities

### File I/O Operations

Functions for persistent storage of sparse arrays using compressed formats.

```python { .api }
def save_npz(filename, matrix, compressed=True):
    """
    Save a sparse array to NumPy .npz format.

    Stores one sparse array per file, written as its coordinate and
    data arrays together with its shape. Files are meant to be read
    back with load_npz() and are not binary compatible with
    scipy.sparse.save_npz().

    Parameters:
    - filename: str or file-like, output file path or file object
    - matrix: sparse array to save
    - compressed: bool, whether to compress the file (default True)

    Returns:
    None (saves to file)
    """

def load_npz(filename):
    """
    Load sparse array from NumPy .npz format.

    Reconstructs sparse array from stored coordinate and data information.
    Returns the sparse array in COO format.

    Parameters:
    - filename: str or file-like, input file path or file object

    Returns:
    Sparse COO array loaded from file
    """
```
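To make the "coordinate and data information" idea concrete, the sketch below writes a COO-style array to an in-memory `.npz` archive with plain NumPy and rebuilds the dense array from it. The helper names and the archive keys (`coords`, `data`, `shape`) are assumptions chosen for illustration; they are not guaranteed to match the layout that `save_npz` actually writes.

```python
import io

import numpy as np


def write_coo_npz(fileobj, coords, data, shape):
    """Write coordinate/value arrays to a compressed .npz archive (illustrative layout)."""
    np.savez_compressed(fileobj, coords=coords, data=data, shape=np.asarray(shape))


def read_coo_npz_dense(fileobj):
    """Read the archive back and scatter the stored values into a dense array."""
    with np.load(fileobj) as archive:
        coords = archive["coords"]
        data = archive["data"]
        shape = tuple(int(n) for n in archive["shape"])
    dense = np.zeros(shape, dtype=data.dtype)
    dense[tuple(coords)] = data  # one index array per dimension
    return dense


# The 2x3 array [[1, 0, 3], [0, 2, 0]] has three stored values.
coords = np.array([[0, 0, 1], [0, 2, 1]])  # row indices, then column indices
data = np.array([1, 3, 2])

buffer = io.BytesIO()
write_coo_npz(buffer, coords, data, (2, 3))
buffer.seek(0)
restored = read_coo_npz_dense(buffer)
print(restored)
```

Only the three stored values and their indices are written, which is why file size tracks the number of stored elements rather than the full shape.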

### NumPy Conversion

Functions for converting between sparse arrays and NumPy dense arrays.

```python { .api }
def asnumpy(a):
    """
    Convert sparse array to NumPy dense array.

    Creates dense NumPy array representation with zeros filled in.
    Equivalent to calling .todense() on sparse array.

    Parameters:
    - a: sparse array, input sparse array to convert

    Returns:
    numpy.ndarray with dense representation of sparse array
    """
```

### Type Conversion Utilities

Functions for converting between different data types while preserving sparsity.

```python { .api }
def astype(a, dtype):
    """
    Cast sparse array to specified data type.

    Converts array elements to new data type while preserving
    sparse structure. May affect precision or range of values.

    Parameters:
    - a: sparse array, input array to convert
    - dtype: numpy dtype, target data type

    Returns:
    Sparse array with elements cast to new data type
    """

def can_cast(from_, to):
    """
    Check if casting between data types is safe.

    Determines whether values can be cast from one data type
    to another without loss of precision or range.

    Parameters:
    - from_: numpy dtype, source data type
    - to: numpy dtype, target data type

    Returns:
    bool, True if casting is safe, False otherwise
    """

def result_type(*arrays_and_dtypes):
    """
    Determine result data type for operations on multiple arrays.

    Computes the common data type that would result from operations
    involving multiple arrays or data types.

    Parameters:
    - arrays_and_dtypes: sparse arrays or numpy dtypes

    Returns:
    numpy dtype, common result type for operations
    """
```

### Array Property Access

Functions for accessing real and imaginary components of complex arrays.

```python { .api }
def real(a):
    """
    Extract real part of complex sparse array.

    For real arrays, returns copy of original array.
    For complex arrays, returns sparse array containing only real components.

    Parameters:
    - a: sparse array, input array (real or complex)

    Returns:
    Sparse array containing real parts of input elements
    """

def imag(a):
    """
    Extract imaginary part of complex sparse array.

    For real arrays, returns sparse array of zeros with same shape.
    For complex arrays, returns sparse array containing only imaginary components.

    Parameters:
    - a: sparse array, input array (real or complex)

    Returns:
    Sparse array containing imaginary parts of input elements
    """
```

### Testing and Validation Functions

Functions for testing array properties and validity.

```python { .api }
def isfinite(x):
    """
    Test element-wise for finite values (not inf or NaN).

    Parameters:
    - x: sparse array, input array to test

    Returns:
    Sparse boolean array, True where elements are finite
    """

def isinf(x):
    """
    Test element-wise for positive or negative infinity.

    Parameters:
    - x: sparse array, input array to test

    Returns:
    Sparse boolean array, True where elements are infinite
    """

def isnan(x):
    """
    Test element-wise for NaN (Not a Number) values.

    Parameters:
    - x: sparse array, input array to test

    Returns:
    Sparse boolean array, True where elements are NaN
    """

def isneginf(x):
    """
    Test element-wise for negative infinity.

    Parameters:
    - x: sparse array, input array to test

    Returns:
    Sparse boolean array, True where elements are negative infinity
    """

def isposinf(x):
    """
    Test element-wise for positive infinity.

    Parameters:
    - x: sparse array, input array to test

    Returns:
    Sparse boolean array, True where elements are positive infinity
    """
```

### Special Functions

Advanced utility functions for element-wise operations.

```python { .api }
def elemwise(func, *args, **kwargs):
    """
    Apply arbitrary function element-wise to sparse arrays.

    Applies custom function to corresponding elements of input arrays.
    The function is called on NumPy arrays of stored values rather than
    on individual scalars, so it should be vectorized and support
    broadcasting (as NumPy ufuncs do).

    Parameters:
    - func: callable, function to apply element-wise
    - args: sparse arrays, input arrays for function
    - kwargs: additional keyword arguments for function

    Returns:
    Sparse array with function applied element-wise
    """
```

## Usage Examples

### File I/O Operations

```python
import os
import tempfile

import numpy as np
import sparse

# Create test sparse arrays
array1 = sparse.COO.from_numpy(np.array([[1, 0, 3], [0, 2, 0]]))
array2 = sparse.random((5, 5), density=0.3)
array3 = sparse.eye(4)

with tempfile.TemporaryDirectory() as tmpdir:
    # Save and reload a single array
    path = os.path.join(tmpdir, "array1.npz")
    sparse.save_npz(path, array1)
    loaded_array1 = sparse.load_npz(path)
    print(f"Arrays equal: {np.array_equal(array1.todense(), loaded_array1.todense())}")

    # save_npz stores one array per file, so write each named array separately
    named_arrays = {"main_array": array1, "random_array": array2, "identity": array3}
    for name, array in named_arrays.items():
        sparse.save_npz(os.path.join(tmpdir, f"{name}.npz"), array)

    # Load the arrays back into a dictionary keyed by name
    loaded_data = {
        name: sparse.load_npz(os.path.join(tmpdir, f"{name}.npz"))
        for name in named_arrays
    }
    print(f"Loaded arrays: {list(loaded_data.keys())}")

# The temporary directory and its files are removed on exit from the with block
print("File I/O preserves sparsity and data integrity")
```

### NumPy Conversion

```python
import numpy as np
import sparse

# Create sparse array
sparse_array = sparse.COO.from_numpy(
    np.array([[1, 0, 3, 0], [0, 2, 0, 4], [5, 0, 0, 6]])
)

# Convert to dense NumPy array
dense_array = sparse.asnumpy(sparse_array)
dense_array_alt = sparse_array.todense()  # Alternative method

print(f"Sparse array nnz: {sparse_array.nnz}")
print(f"Dense array shape: {dense_array.shape}")
print(f"Arrays identical: {np.array_equal(dense_array, dense_array_alt)}")

# Memory usage comparison
sparse_memory = sparse_array.data.nbytes + sparse_array.coords.nbytes
dense_memory = dense_array.nbytes
print(f"Sparse memory: {sparse_memory} bytes")
print(f"Dense memory: {dense_memory} bytes")
print(f"Compression ratio: {dense_memory / sparse_memory:.1f}x")
```

### Type Conversion

```python
import numpy as np
import sparse

# Create arrays with different data types
int_array = sparse.COO.from_numpy(np.array([[1, 0, 3], [0, 2, 0]], dtype=np.int32))
float_array = sparse.astype(int_array, np.float64)
complex_array = sparse.astype(float_array, np.complex128)

print(f"Original dtype: {int_array.dtype}")
print(f"Float dtype: {float_array.dtype}")
print(f"Complex dtype: {complex_array.dtype}")

# Check casting safety
safe_int_to_float = sparse.can_cast(np.int32, np.float64)
unsafe_float_to_int = sparse.can_cast(np.float64, np.int32)

print(f"Safe int32 -> float64: {safe_int_to_float}")  # True
print(f"Safe float64 -> int32: {unsafe_float_to_int}")  # False

# Determine result types for operations
# (named promoted_dtype to avoid shadowing the result_type function)
promoted_dtype = sparse.result_type(int_array, float_array)
print(f"Result type for int32 + float64: {promoted_dtype}")  # float64
```

### Complex Number Handling

```python
import numpy as np
import sparse

# Create complex sparse array
real_part = sparse.random((3, 4), density=0.5)
imag_part = sparse.random((3, 4), density=0.3)
complex_array = real_part + 1j * imag_part

print(f"Complex array dtype: {complex_array.dtype}")
print(f"Complex array nnz: {complex_array.nnz}")

# Extract components
real_component = sparse.real(complex_array)
imag_component = sparse.imag(complex_array)

print(f"Real component nnz: {real_component.nnz}")
print(f"Imaginary component nnz: {imag_component.nnz}")

# Verify reconstruction
reconstructed = real_component + 1j * imag_component
print(f"Reconstruction accurate: {np.allclose(complex_array.todense(), reconstructed.todense())}")

# Real arrays
real_array = sparse.random((2, 3), density=0.4)
real_from_real = sparse.real(real_array)  # Copy of original
imag_from_real = sparse.imag(real_array)  # Array of zeros

print(f"Real from real equal: {np.array_equal(real_array.todense(), real_from_real.todense())}")
print(f"Imag from real nnz: {imag_from_real.nnz}")  # Should be 0
```

### Array Validation and Testing

```python
import numpy as np
import sparse

# Create array with special values
test_data = np.array([[1.0, np.inf, 3.0], [0.0, np.nan, -np.inf]])
special_array = sparse.COO.from_numpy(test_data)

# Test for different conditions
finite_mask = sparse.isfinite(special_array)
inf_mask = sparse.isinf(special_array)
nan_mask = sparse.isnan(special_array)
neginf_mask = sparse.isneginf(special_array)
posinf_mask = sparse.isposinf(special_array)

print("Special value detection:")
print(f"Finite values: {np.sum(finite_mask.todense())}")  # Count finite
print(f"Infinite values: {np.sum(inf_mask.todense())}")  # Count inf
print(f"NaN values: {np.sum(nan_mask.todense())}")  # Count NaN
print(f"Negative inf: {np.sum(neginf_mask.todense())}")  # Count -inf
print(f"Positive inf: {np.sum(posinf_mask.todense())}")  # Count +inf

# Use masks for filtering
finite_only = sparse.where(finite_mask, special_array, 0)
print(f"Finite-only array nnz: {finite_only.nnz}")
```

### Custom Element-wise Functions

```python
import numpy as np
import sparse

# Define custom functions; each receives NumPy arrays of values,
# so the bodies use vectorized NumPy operations
def sigmoid(x):
    """Sigmoid activation function"""
    return 1 / (1 + np.exp(-x))

def custom_transform(x, scale=1.0, offset=0.0):
    """Custom transformation with parameters"""
    return scale * np.tanh(x) + offset

# Apply custom functions element-wise
input_array = sparse.random((10, 10), density=0.2)

# Simple function application
sigmoid_result = sparse.elemwise(sigmoid, input_array)
print(f"Sigmoid result nnz: {sigmoid_result.nnz}")

# Function with additional parameters
transformed = sparse.elemwise(custom_transform, input_array, scale=2.0, offset=0.5)
print(f"Transformed result nnz: {transformed.nnz}")

# Multi-argument custom function
def weighted_sum(x, y, w1=0.5, w2=0.5):
    return w1 * x + w2 * y

array_a = sparse.random((5, 5), density=0.3)
array_b = sparse.random((5, 5), density=0.3)

weighted_result = sparse.elemwise(weighted_sum, array_a, array_b, w1=0.7, w2=0.3)
print(f"Weighted sum result nnz: {weighted_result.nnz}")
```

### Advanced Conversion Scenarios

```python
import numpy as np
import sparse

# Batch conversion and type management
arrays = [sparse.random((20, 20), density=0.05) for _ in range(5)]

# Convert all to same type
common_dtype = np.float32
converted_arrays = [sparse.astype(arr, common_dtype) for arr in arrays]

# Verify type consistency
dtypes = [arr.dtype for arr in converted_arrays]
print(f"All same dtype: {all(dt == common_dtype for dt in dtypes)}")

# Memory usage comparison
original_memory = sum(arr.data.nbytes + arr.coords.nbytes for arr in arrays)
converted_memory = sum(arr.data.nbytes + arr.coords.nbytes for arr in converted_arrays)
print(f"Memory change: {converted_memory / original_memory:.2f}x")

# Result type prediction for operations
result_types = []
for i in range(len(arrays)):
    for j in range(i + 1, len(arrays)):
        rt = sparse.result_type(arrays[i], arrays[j])
        result_types.append(rt)

print(f"Operation result types: {set(str(rt) for rt in result_types)}")
```

### Integration with SciPy Sparse

```python
import numpy as np
import sparse

# Demonstrate a typical workflow: round-trip through SciPy via a dense intermediate
try:
    from scipy import sparse as sp

    # Create sparse array
    sparse_coo = sparse.random((100, 100), density=0.02)

    # Convert to SciPy format via dense (for demonstration)
    dense_temp = sparse.asnumpy(sparse_coo)
    scipy_csr = sp.csr_matrix(dense_temp)

    # Convert back to sparse via dense
    back_to_sparse = sparse.COO.from_numpy(scipy_csr.toarray())

    print(f"Round-trip successful: {np.allclose(sparse_coo.todense(), back_to_sparse.todense())}")
    print(f"SciPy CSR nnz: {scipy_csr.nnz}")
    print(f"Sparse COO nnz: {sparse_coo.nnz}")

except ImportError:
    print("SciPy not available for integration example")
```

### Performance Considerations

```python
import os
import tempfile
import time

import numpy as np
import sparse

# Efficient I/O for large arrays
large_array = sparse.random((10000, 10000), density=0.001)
print(f"Large array: {large_array.shape}, nnz: {large_array.nnz}")

# Reserve a temporary path; the handle is closed before the file is written
with tempfile.NamedTemporaryFile(suffix='.npz', delete=False) as f:
    path = f.name

# File I/O timing
start_time = time.perf_counter()
sparse.save_npz(path, large_array)
save_time = time.perf_counter() - start_time

start_time = time.perf_counter()
loaded = sparse.load_npz(path)
load_time = time.perf_counter() - start_time

print(f"Save time: {save_time:.3f}s")
print(f"Load time: {load_time:.3f}s")
print(f"Data integrity: {np.array_equal(large_array.coords, loaded.coords)}")

os.unlink(path)  # Clean up

# Type conversion efficiency
start_time = time.perf_counter()
converted = sparse.astype(large_array, np.float32)
conversion_time = time.perf_counter() - start_time
print(f"Type conversion time: {conversion_time:.3f}s")
```

## Performance and Compatibility Notes

### File I/O Efficiency
- **NPZ format**: Compressed storage that writes only the stored coordinates and values
- **One array per file**: `save_npz()` writes a single sparse array per file; use separate files to store several arrays
- **Cross-platform**: Compatible across different operating systems and Python versions

### Type Conversion Considerations
- **Precision loss**: Converting to lower precision types may lose information
- **Memory usage**: Different data types have different memory requirements
- **Compatibility**: Result types follow NumPy casting rules
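
The three points above can be checked with plain NumPy, whose casting rules the result types are said to follow. This is a minimal sketch using dense NumPy arrays only; the specific values are chosen for illustration.

```python
import numpy as np

# "Safe" casting rules: widening is allowed, narrowing is not
print(np.can_cast(np.float32, np.float64))  # True
print(np.can_cast(np.float64, np.float32))  # False

# 2**24 + 1 is the first positive integer float32 cannot represent exactly
value = np.array([16_777_217], dtype=np.int64)
as_float32 = value.astype(np.float32)
print(int(as_float32[0]))  # 16777216

# Halving the item size halves the bytes used by the stored values
values64 = np.ones(1000, dtype=np.float64)
values32 = values64.astype(np.float32)
print(values64.nbytes, values32.nbytes)  # 8000 4000

# Mixed-type operations promote to a type that can hold both operands
print(np.result_type(np.int32, np.float64))  # float64
```

For a sparse array the saving applies to the stored values only; the index arrays keep their own dtype, so the overall reduction is smaller than 2x.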

### Integration Tips
- Use `asnumpy()` for NumPy compatibility when dense representation is acceptable
- File I/O preserves sparsity structure efficiently
- Custom functions passed to `elemwise()` should be vectorized: they receive NumPy arrays of values, not individual scalars
- Checking with `can_cast()` before converting helps avoid unexpected precision loss
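
Whether a dense representation is "acceptable" can be estimated before converting. The sketch below uses only the standard library; the helper names are hypothetical, and the 8-byte index size is an assumption (the library may choose a narrower index type), so treat the sparse figure as an upper-end estimate.

```python
import math


def dense_nbytes(shape, itemsize):
    """Bytes needed to store every element of a dense array."""
    return math.prod(shape) * itemsize


def coo_nbytes(nnz, ndim, itemsize, index_itemsize=8):
    """Bytes for COO storage: one value plus `ndim` indices per stored element.

    Assumes 8-byte (int64) indices unless index_itemsize says otherwise.
    """
    return nnz * (itemsize + ndim * index_itemsize)


shape = (10_000, 10_000)
nnz = 100_000  # density 0.001
dense = dense_nbytes(shape, itemsize=8)
sparse_est = coo_nbytes(nnz, ndim=len(shape), itemsize=8)

print(f"Dense:  {dense / 1e6:.1f} MB")       # 800.0 MB
print(f"Sparse: {sparse_est / 1e6:.1f} MB")  # 2.4 MB
```

A check like this before calling `asnumpy()` avoids accidentally allocating hundreds of megabytes for an array that fits in a few.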