Sparse n-dimensional arrays for the PyData ecosystem with multiple backend implementations
npx @tessl/cli install tessl/pypi-sparse@0.17.00
# Sparse

A sparse n-dimensional array library for the PyData ecosystem that provides efficient storage and computation for arrays with many zero elements. It offers multiple backend implementations (Numba, Finch, MLIR) and supports the Array API standard for interoperability with NumPy and other array libraries.

## Package Information

- **Package Name**: sparse
- **Language**: Python
- **Package Type**: library
- **Installation**: `pip install sparse`
- **Optional Backends**: `pip install sparse[finch]` or `pip install sparse[mlir]`

## Core Imports

```python
import sparse
```

Common imports for working with sparse arrays:

```python
from sparse import COO, DOK, GCXS
import sparse
```

## Basic Usage

```python
import sparse
import numpy as np

# Create a sparse COO array from dense data
dense_array = np.array([[1, 0, 2], [0, 0, 3], [4, 0, 0]])
sparse_array = sparse.COO.from_numpy(dense_array)

# Create sparse arrays directly
coords = [[0, 1, 2], [0, 2, 0]]  # row, col indices
data = [1, 3, 4]  # values at those positions
shape = (3, 3)
coo_array = sparse.COO(coords, data, shape)

# Perform operations
result = sparse_array + coo_array
dense_result = result.todense()

# Mathematical operations work like NumPy
sum_result = sparse.sum(sparse_array, axis=0)
dot_product = sparse.dot(sparse_array, sparse_array.T)

print(f"Original array shape: {sparse_array.shape}")
print(f"Number of stored elements: {sparse_array.nnz}")
print(f"Density: {sparse_array.density:.2%}")
```

## Architecture

The sparse library uses a multi-backend architecture:

- **Backend System**: Switchable backends via `SPARSE_BACKEND` environment variable
  - **Numba Backend** (default): Production-ready with JIT compilation
  - **Finch Backend**: Experimental tensor compiler backend
  - **MLIR Backend**: Research-grade MLIR compiler integration
- **Array Formats**: Multiple sparse storage formats for different use cases
  - **COO**: Coordinate format for general-purpose sparse arrays
  - **DOK**: Dictionary format for efficient construction and modification
  - **GCXS**: Compressed formats (CSR/CSC) for memory-efficient storage
- **Array API Compliance**: Implements Array API standard v2024.12 for NumPy compatibility

## Capabilities

### Core Array Classes

The main sparse array types, each with a different storage strategy and different performance characteristics for various sparse data patterns.

```python { .api }
class SparseArray:
    """Abstract base class for all sparse arrays"""
    def __init__(self, shape, fill_value=None): ...
    @property
    def nnz(self): ...
    @property
    def ndim(self): ...
    @property
    def size(self): ...
    @property
    def density(self): ...
    @property
    def T(self): ...
    @property
    def real(self): ...
    @property
    def imag(self): ...
    def todense(self): ...
    def astype(self, dtype, casting="unsafe", copy=True): ...

class COO(SparseArray):
    """Coordinate format sparse array - main user-facing class"""
    def __init__(self, coords, data=None, shape=None, has_duplicates=True, sorted=False, prune=False, cache=False, fill_value=None, idx_dtype=None): ...
    @classmethod
    def from_numpy(cls, x, fill_value=None, idx_dtype=None): ...
    @classmethod
    def from_scipy_sparse(cls, x, /, *, fill_value=None): ...
    @property
    def T(self): ...
    @property
    def mT(self): ...
    def todense(self): ...
    def copy(self, deep=True): ...
    def transpose(self, axes=None): ...
    def dot(self, other): ...
    def tocsr(self): ...
    def tocsc(self): ...

class DOK(SparseArray):
    """Dictionary of Keys format - efficient for construction"""
    def __init__(self, shape, data=None, dtype=None, fill_value=None): ...
    @classmethod
    def from_scipy_sparse(cls, x, /, *, fill_value=None): ...
    @classmethod
    def from_coo(cls, x): ...
    @classmethod
    def from_numpy(cls, x): ...
    def to_coo(self): ...
    def __getitem__(self, key): ...
    def __setitem__(self, key, value): ...

class GCXS(SparseArray):
    """Generalized Compressed Sparse format (CSR/CSC)"""
    def __init__(self, arg, shape=None, compressed_axes=None, prune=False, fill_value=None, idx_dtype=None): ...
    @classmethod
    def from_numpy(cls, x, compressed_axes=None, fill_value=None, idx_dtype=None): ...
    @classmethod
    def from_coo(cls, x, compressed_axes=None, idx_dtype=None): ...
    @classmethod
    def from_scipy_sparse(cls, x, /, *, fill_value=None): ...
    @property
    def T(self): ...
    @property
    def mT(self): ...
    def tocoo(self): ...
    def todok(self): ...
    def change_compressed_axes(self, new_compressed_axes): ...
```

[Core Array Classes](./core-arrays.md)

### Array Creation Functions

Functions for creating sparse arrays from various inputs, including conversion from dense arrays, construction of special matrices, and generation of empty arrays.

```python { .api }
def asarray(obj, /, *, dtype=None, format="coo", copy=False, device=None): ...
def zeros(shape, dtype=float, format="coo", *, device=None, **kwargs): ...
def ones(shape, dtype=float, format="coo", *, device=None, **kwargs): ...
def eye(N, M=None, k=0, dtype=float, format="coo", *, device=None, **kwargs): ...
def full(shape, fill_value, dtype=None, format="coo", order="C", *, device=None, **kwargs): ...
def empty(shape, dtype=float, format="coo", *, device=None, **kwargs): ...
def random(shape, density=None, nnz=None, random_state=None, data_rvs=None, format="coo", fill_value=None, idx_dtype=None, **kwargs): ...
def zeros_like(a, dtype=None, shape=None, format=None, *, device=None, **kwargs): ...
def ones_like(a, dtype=None, shape=None, format=None, *, device=None, **kwargs): ...
def full_like(a, fill_value, dtype=None, shape=None, format=None, *, device=None, **kwargs): ...
def empty_like(a, dtype=None, shape=None, format=None, *, device=None, **kwargs): ...
```

[Array Creation](./array-creation.md)

### Mathematical Operations

Mathematical functions including arithmetic, trigonometric, exponential, and comparison operations that preserve sparsity when possible.

```python { .api }
# Arithmetic operations
def add(x1, x2): ...
def subtract(x1, x2): ...
def multiply(x1, x2): ...
def divide(x1, x2): ...
def pow(x1, x2): ...

# Trigonometric functions
def sin(x): ...
def cos(x): ...
def tan(x): ...

# Exponential, logarithmic, and root functions
def exp(x): ...
def log(x): ...
def sqrt(x): ...
```

[Mathematical Operations](./math-operations.md)

### Linear Algebra Operations

Matrix and tensor operations for sparse arrays, including dot products, matrix multiplication, outer and Kronecker products, tensor contraction, and Einstein summation.

```python { .api }
def dot(a, b): ...
def matmul(x1, x2): ...
def outer(a, b): ...
def kron(a, b): ...
def tensordot(a, b, axes=2): ...
def einsum(subscripts, *operands): ...
```

[Linear Algebra](./linear-algebra.md)

### Array Manipulation Functions

Functions for reshaping, indexing, slicing, and reorganizing sparse arrays while maintaining their sparsity structure.

```python { .api }
def reshape(a, shape): ...
def transpose(a, axes=None): ...
def moveaxis(a, source, destination): ...
def squeeze(a, axis=None): ...
def expand_dims(a, axis): ...
def concatenate(arrays, axis=0): ...
def stack(arrays, axis=0): ...
```

[Array Manipulation](./array-manipulation.md)

### Reduction and Aggregation Operations

Functions for computing statistics and aggregations along specified axes, including standard reductions and NaN-aware variants.

```python { .api }
def sum(a, axis=None, keepdims=False): ...
def mean(a, axis=None, keepdims=False): ...
def max(a, axis=None, keepdims=False): ...
def min(a, axis=None, keepdims=False): ...
def var(a, axis=None, keepdims=False): ...
def std(a, axis=None, keepdims=False): ...
def nansum(a, axis=None, keepdims=False): ...
```

[Reductions](./reductions.md)

### I/O and Conversion Functions

Functions for saving, loading, and converting sparse arrays between different formats and libraries.

```python { .api }
def save_npz(file, *args, **kwargs): ...
def load_npz(file): ...
def asnumpy(a): ...
```

[I/O and Conversion](./io-conversion.md)

## Data Types

Sparse supports all NumPy data types and provides type conversion utilities:

```python { .api }
# Integer types
int8, int16, int32, int64
uint8, uint16, uint32, uint64

# Floating point types
float16, float32, float64

# Complex types
complex64, complex128

# Boolean type
bool

# Type utilities
def astype(a, dtype): ...
def can_cast(from_, to): ...
def result_type(*arrays_and_dtypes): ...
```

## Configuration

### Backend Selection

```python
import os

# Set backend before importing sparse
os.environ['SPARSE_BACKEND'] = 'Numba'  # Default
# os.environ['SPARSE_BACKEND'] = 'Finch'  # Requires sparse[finch]
# os.environ['SPARSE_BACKEND'] = 'MLIR'  # Requires sparse[mlir]

import sparse
```

### Version Information

```python
import sparse

print(sparse.__version__)  # Version string
print(sparse.__version_tuple__)  # Version tuple
print(sparse.__array_api_version__)  # Array API version
```