Tessl Tile for pypi/sparse@0.17.0

# Reduction and Aggregation Operations

Functions for computing statistics and aggregations along specified axes, including standard reductions and NaN-aware variants. These operations compute summary statistics from the stored elements of a sparse array, without converting it to a dense array first.

## Capabilities

### Standard Reduction Operations

Core statistical functions that operate along specified axes or across entire arrays.

```python { .api }
def sum(a, axis=None, keepdims=False):
    """
    Compute sum of array elements along specified axis.

    Parameters:
    - a: sparse array, input array
    - axis: int or tuple, axis/axes along which to sum (None for all elements)
    - keepdims: bool, whether to preserve dimensions in result

    Returns:
    Sparse array or scalar with sum of elements
    """

def prod(a, axis=None, keepdims=False):
    """
    Compute product of array elements along specified axis.

    Parameters:
    - a: sparse array, input array
    - axis: int or tuple, axis/axes along which to compute product
    - keepdims: bool, whether to preserve dimensions in result

    Returns:
    Sparse array or scalar with product of elements
    """

def mean(a, axis=None, keepdims=False):
    """
    Compute arithmetic mean along specified axis.

    Parameters:
    - a: sparse array, input array
    - axis: int or tuple, axis/axes along which to compute mean
    - keepdims: bool, whether to preserve dimensions in result

    Returns:
    Sparse array or scalar with mean values
    """

def var(a, axis=None, keepdims=False, ddof=0):
    """
    Compute variance along specified axis.

    Parameters:
    - a: sparse array, input array
    - axis: int or tuple, axis/axes along which to compute variance
    - keepdims: bool, whether to preserve dimensions in result
    - ddof: int, delta degrees of freedom for sample variance

    Returns:
    Sparse array or scalar with variance values
    """

def std(a, axis=None, keepdims=False, ddof=0):
    """
    Compute standard deviation along specified axis.

    Parameters:
    - a: sparse array, input array
    - axis: int or tuple, axis/axes along which to compute std
    - keepdims: bool, whether to preserve dimensions in result
    - ddof: int, delta degrees of freedom for sample std

    Returns:
    Sparse array or scalar with standard deviation values
    """
```
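
The `ddof` parameter changes the divisor of the variance from N to N - ddof. The arithmetic can be sketched with plain NumPy standing in for the sparse call, assuming `sparse.var` follows NumPy's `ddof` convention. The array `x` is illustrative only.

```python
import numpy as np

# Illustrative data; zeros count as ordinary observations
x = np.array([2.0, 0.0, 0.0, 6.0])
n = x.size

# Sum of squared deviations from the mean
squared_dev = ((x - x.mean()) ** 2).sum()

population_var = squared_dev / n      # ddof=0 divides by N
sample_var = squared_dev / (n - 1)    # ddof=1 divides by N - 1

# Both values should agree with NumPy's own implementation
print(population_var, np.var(x))
print(sample_var, np.var(x, ddof=1))
```

The standard deviation is the square root of the variance computed with the same `ddof`.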

### Min/Max Operations

Functions for finding minimum and maximum values and their locations.

```python { .api }
def max(a, axis=None, keepdims=False):
    """
    Find maximum values along specified axis.

    Parameters:
    - a: sparse array, input array
    - axis: int or tuple, axis/axes along which to find maximum
    - keepdims: bool, whether to preserve dimensions in result

    Returns:
    Sparse array or scalar with maximum values
    """

def min(a, axis=None, keepdims=False):
    """
    Find minimum values along specified axis.

    Parameters:
    - a: sparse array, input array
    - axis: int or tuple, axis/axes along which to find minimum
    - keepdims: bool, whether to preserve dimensions in result

    Returns:
    Sparse array or scalar with minimum values
    """

def argmax(a, axis=None, keepdims=False):
    """
    Find indices of maximum values along axis.

    Parameters:
    - a: sparse array, input array
    - axis: int, axis along which to find argmax (None for global)
    - keepdims: bool, whether to preserve dimensions in result

    Returns:
    Array with indices of maximum values
    """

def argmin(a, axis=None, keepdims=False):
    """
    Find indices of minimum values along axis.

    Parameters:
    - a: sparse array, input array
    - axis: int, axis along which to find argmin (None for global)
    - keepdims: bool, whether to preserve dimensions in result

    Returns:
    Array with indices of minimum values
    """
```

### Boolean Reductions

Logical reduction operations for boolean arrays and conditions.

```python { .api }
def all(a, axis=None, keepdims=False):
    """
    Test whether all array elements along axis evaluate to True.

    Parameters:
    - a: sparse array, input array (typically boolean)
    - axis: int or tuple, axis/axes along which to test
    - keepdims: bool, whether to preserve dimensions in result

    Returns:
    Sparse boolean array or scalar, True where all elements are True
    """

def any(a, axis=None, keepdims=False):
    """
    Test whether any array element along axis evaluates to True.

    Parameters:
    - a: sparse array, input array (typically boolean)
    - axis: int or tuple, axis/axes along which to test
    - keepdims: bool, whether to preserve dimensions in result

    Returns:
    Sparse boolean array or scalar, True where any element is True
    """
```

### NaN-Aware Reductions

Specialized reduction functions that ignore NaN values in computations.

```python { .api }
def nansum(a, axis=None, keepdims=False):
    """
    Compute sum along axis, ignoring NaN values.

    Parameters:
    - a: sparse array, input array
    - axis: int or tuple, axis/axes along which to sum
    - keepdims: bool, whether to preserve dimensions in result

    Returns:
    Sparse array or scalar with sum ignoring NaN values
    """

def nanprod(a, axis=None, keepdims=False):
    """
    Compute product along axis, ignoring NaN values.

    Parameters:
    - a: sparse array, input array
    - axis: int or tuple, axis/axes along which to compute product
    - keepdims: bool, whether to preserve dimensions in result

    Returns:
    Sparse array or scalar with product ignoring NaN values
    """

def nanmean(a, axis=None, keepdims=False):
    """
    Compute mean along axis, ignoring NaN values.

    Parameters:
    - a: sparse array, input array
    - axis: int or tuple, axis/axes along which to compute mean
    - keepdims: bool, whether to preserve dimensions in result

    Returns:
    Sparse array or scalar with mean ignoring NaN values
    """

def nanmax(a, axis=None, keepdims=False):
    """
    Find maximum along axis, ignoring NaN values.

    Parameters:
    - a: sparse array, input array
    - axis: int or tuple, axis/axes along which to find maximum
    - keepdims: bool, whether to preserve dimensions in result

    Returns:
    Sparse array or scalar with maximum ignoring NaN values
    """

def nanmin(a, axis=None, keepdims=False):
    """
    Find minimum along axis, ignoring NaN values.

    Parameters:
    - a: sparse array, input array
    - axis: int or tuple, axis/axes along which to find minimum
    - keepdims: bool, whether to preserve dimensions in result

    Returns:
    Sparse array or scalar with minimum ignoring NaN values
    """

def nanreduce(a, method, identity=None, axis=None, keepdims=False, **kwargs):
    """
    Generic reduction that ignores NaN values.

    Parameters:
    - a: sparse array, input array
    - method: numpy.ufunc, reduction to apply (for example np.add or np.maximum)
    - identity: scalar, value substituted for NaN; inferred from method if omitted
    - axis: int or tuple, axis/axes along which to reduce
    - keepdims: bool, whether to preserve dimensions in result
    - **kwargs: extra keyword arguments passed on to the ufunc reduction

    Returns:
    Result of reducing with method along axis, ignoring NaN values
    """
```

## Usage Examples

### Basic Reductions

```python
import sparse
import numpy as np

# Create test array
test_array = sparse.COO.from_numpy(
    np.array([[1, 0, 3, 0], [5, 2, 0, 4], [0, 0, 6, 1]])
)
print(f"Test array shape: {test_array.shape}")
print(f"Test array nnz: {test_array.nnz}")

# Global reductions (entire array) return scalars rather than sparse arrays,
# so the results are printed directly instead of calling .todense()
total_sum = sparse.sum(test_array)
mean_value = sparse.mean(test_array)
max_value = sparse.max(test_array)
min_value = sparse.min(test_array)

print(f"Total sum: {total_sum}")       # 22
print(f"Mean: {mean_value:.2f}")       # 1.83
print(f"Max: {max_value}")             # 6
print(f"Min: {min_value}")             # 0 (implicit zeros count as values)
```

### Axis-Specific Reductions

```python
# Row-wise reductions (axis=1)
row_sums = sparse.sum(test_array, axis=1)
row_means = sparse.mean(test_array, axis=1)
row_max = sparse.max(test_array, axis=1)

print(f"Row sums shape: {row_sums.shape}")      # (3,)
print(f"Row sums: {row_sums.todense()}")        # [4, 11, 7]
print(f"Row means: {row_means.todense()}")      # [1.0, 2.75, 1.75]

# Column-wise reductions (axis=0)
col_sums = sparse.sum(test_array, axis=0)
col_means = sparse.mean(test_array, axis=0)

print(f"Column sums shape: {col_sums.shape}")   # (4,)
print(f"Column sums: {col_sums.todense()}")     # [6, 2, 9, 5]
```

### Keepdims Parameter

```python
# Compare results with and without keepdims
row_sums_keepdims = sparse.sum(test_array, axis=1, keepdims=True)
row_sums_no_keepdims = sparse.sum(test_array, axis=1, keepdims=False)

print(f"With keepdims: {row_sums_keepdims.shape}")        # (3, 1)
print(f"Without keepdims: {row_sums_no_keepdims.shape}")  # (3,)

# Keepdims is useful for broadcasting
normalized = test_array / row_sums_keepdims  # Broadcasting works
print(f"Normalized array shape: {normalized.shape}")
```

### Multiple Axis Reductions

```python
# Create 3D array for multi-axis reductions
array_3d = sparse.random((4, 5, 6), density=0.2)

# Reduce along multiple axes
sum_axes_01 = sparse.sum(array_3d, axis=(0, 1))    # Sum over first two axes
mean_axes_02 = sparse.mean(array_3d, axis=(0, 2))  # Mean over first and last axes

print(f"Original shape: {array_3d.shape}")       # (4, 5, 6)
print(f"Sum axes (0,1): {sum_axes_01.shape}")    # (6,)
print(f"Mean axes (0,2): {mean_axes_02.shape}")  # (5,)

# All axes - equivalent to global reduction (both results are scalars)
sum_all_axes = sparse.sum(array_3d, axis=(0, 1, 2))
sum_global = sparse.sum(array_3d)
print(f"All axes equal global: {np.isclose(sum_all_axes, sum_global)}")
```

### Statistical Measures

```python
# Variance and standard deviation
data = sparse.random((100, 50), density=0.1)

variance = sparse.var(data, axis=0)    # Column-wise variance
std_dev = sparse.std(data, axis=0)     # Column-wise standard deviation

# Sample standard deviation, via the array method, which accepts ddof.
# The module-level function may name this parameter differently
# (the array API standard calls it `correction`).
std_sample = data.std(axis=0, ddof=1)

# Averaging over a whole array yields a scalar, so it can be formatted directly
print("Population std vs sample std:")
print(f"Population: {sparse.mean(std_dev):.4f}")
print(f"Sample: {sparse.mean(std_sample):.4f}")

# Verify relationship: std = sqrt(var)
print(f"Std² ≈ Var: {np.allclose((std_dev ** 2).todense(), variance.todense())}")
```

### Index Finding Operations

```python
# Find locations of extreme values
large_array = sparse.random((20, 30), density=0.05)

# Global argmax/argmin
global_max_idx = sparse.argmax(large_array)
global_min_idx = sparse.argmin(large_array)

print(f"Global max index: {global_max_idx}")
print(f"Global min index: {global_min_idx}")

# Axis-specific argmax/argmin
row_max_indices = sparse.argmax(large_array, axis=1)  # Max in each row
col_max_indices = sparse.argmax(large_array, axis=0)  # Max in each column

print(f"Row max indices shape: {row_max_indices.shape}")     # (20,)
print(f"Column max indices shape: {col_max_indices.shape}")  # (30,)
```

### Boolean Reductions

```python
# Create boolean conditions
condition_array = sparse.greater(test_array, 2)
print("Elements > 2:")
print(condition_array.todense())

# Boolean reductions
any_gt_2 = sparse.any(condition_array)           # Any element > 2?
all_gt_2 = sparse.all(condition_array)           # All elements > 2?

any_rows = sparse.any(condition_array, axis=1)   # Any > 2 in each row?
all_cols = sparse.all(condition_array, axis=0)   # All > 2 in each column?

# Global results are scalars; per-axis results are sparse arrays
print(f"Any > 2: {any_gt_2}")                    # True
print(f"All > 2: {all_gt_2}")                    # False
print(f"Any per row: {any_rows.todense()}")      # [True, True, True]
print(f"All per column: {all_cols.todense()}")   # [False, False, False, False]
```

### NaN-Aware Reductions

```python
# Create array with NaN values
array_with_nan = sparse.COO.from_numpy(
    np.array([[1.0, np.nan, 3.0], [4.0, 2.0, np.nan], [np.nan, 5.0, 6.0]])
)

# Compare standard vs NaN-aware reductions
regular_sum = sparse.sum(array_with_nan, axis=1)
nan_aware_sum = sparse.nansum(array_with_nan, axis=1)

regular_mean = sparse.mean(array_with_nan, axis=1)
nan_aware_mean = sparse.nanmean(array_with_nan, axis=1)

print("Regular vs NaN-aware reductions:")
print(f"Regular sum: {regular_sum.todense()}")        # Contains NaN
print(f"NaN-aware sum: {nan_aware_sum.todense()}")    # Ignores NaN
print(f"Regular mean: {regular_mean.todense()}")      # Contains NaN
print(f"NaN-aware mean: {nan_aware_mean.todense()}")  # Ignores NaN
```

### Custom Reductions

```python
# nanreduce expects a NumPy ufunc (an object with .reduce and .identity),
# so an arbitrary Python function cannot be passed as the reduction.
# A geometric mean can instead be built from a NaN-skipping sum of logs.

# Shift values away from zero so the logarithm is finite everywhere
positive_array = sparse.random((10, 10), density=0.1) + 0.1

# Sum the logs along axis 0 with np.add, skipping any NaN entries
log_sum = sparse.nanreduce(np.log(positive_array), np.add, axis=0)

# Divide by the column length (valid here because no entries are NaN)
custom_result = np.exp(log_sum / positive_array.shape[0])
print(f"Custom geometric mean shape: {custom_result.shape}")
```

### Large-Scale Reductions

```python
# Efficient reductions on large sparse arrays
large_sparse = sparse.random((10000, 5000), density=0.001)  # Very sparse

# These operations are memory efficient due to sparsity
row_sums_large = sparse.sum(large_sparse, axis=1)
col_means_large = sparse.mean(large_sparse, axis=0)

print(f"Large array: {large_sparse.shape}, density: {large_sparse.density:.4%}")
print(f"Row sums nnz: {row_sums_large.nnz} / {row_sums_large.size}")
print(f"Col means nnz: {col_means_large.nnz} / {col_means_large.size}")

# Global statistics are single scalar values, so no .todense() is needed
global_stats = {
    'sum': sparse.sum(large_sparse),
    'mean': sparse.mean(large_sparse),
    'std': sparse.std(large_sparse),
    'max': sparse.max(large_sparse),
    'min': sparse.min(large_sparse)
}

print("Global statistics:", global_stats)
```

### Performance Considerations for Sparse Reductions

```python
# Demonstrating how reductions change density
original = sparse.random((1000, 1000), density=0.01)
print(f"Original density: {original.density:.2%}")

# Reducing along either axis usually gives a denser result than the input
axis0_reduction = sparse.sum(original, axis=0)  # Often denser
axis1_reduction = sparse.sum(original, axis=1)  # Often denser
global_reduction = sparse.sum(original)         # Single scalar value

print(f"Axis-0 reduction nnz: {axis0_reduction.nnz} / {axis0_reduction.size}")
print(f"Axis-1 reduction nnz: {axis1_reduction.nnz} / {axis1_reduction.size}")
print(f"Global reduction: {global_reduction}")
```

## Performance and Memory Considerations

### Computational Efficiency

- **Sparse structure**: Reductions compute on the stored (non-zero) elements and account for the fill value separately
- **Axis selection**: Different axes may have different computational costs
- **Memory usage**: Results are typically denser than the input, because each output element combines many input elements
- **Keepdims**: Can enable efficient broadcasting in subsequent operations

### Optimization Tips

- Reduce along specific axes when only per-row or per-column results are needed, so the result remains a sparse array
- Consider using `keepdims=True` when the result will be used for broadcasting
- NaN-aware functions have additional overhead but handle missing data correctly
- Boolean reductions (`any`, `all`) only need the stored elements and the fill value, so they stay cheap on very sparse inputs
