# Reduction and Aggregation Operations

Functions for computing statistics and aggregations along specified axes, including standard reductions and NaN-aware variants. These operations compute summary statistics efficiently by working on the stored elements of sparse data rather than on a dense copy.

## Capabilities

### Standard Reduction Operations

Core statistical functions that operate along specified axes or across entire arrays.

```python { .api }
def sum(a, axis=None, keepdims=False):
    """
    Compute sum of array elements along specified axis.

    Parameters:
    - a: sparse array, input array
    - axis: int or tuple, axis/axes along which to sum (None for all elements)
    - keepdims: bool, whether to preserve dimensions in result

    Returns:
    Sparse array or scalar with sum of elements
    """

def prod(a, axis=None, keepdims=False):
    """
    Compute product of array elements along specified axis.

    Parameters:
    - a: sparse array, input array
    - axis: int or tuple, axis/axes along which to compute product
    - keepdims: bool, whether to preserve dimensions in result

    Returns:
    Sparse array or scalar with product of elements
    """

def mean(a, axis=None, keepdims=False):
    """
    Compute arithmetic mean along specified axis.

    Parameters:
    - a: sparse array, input array
    - axis: int or tuple, axis/axes along which to compute mean
    - keepdims: bool, whether to preserve dimensions in result

    Returns:
    Sparse array or scalar with mean values
    """

def var(a, axis=None, keepdims=False, ddof=0):
    """
    Compute variance along specified axis.

    Parameters:
    - a: sparse array, input array
    - axis: int or tuple, axis/axes along which to compute variance
    - keepdims: bool, whether to preserve dimensions in result
    - ddof: int, delta degrees of freedom for sample variance

    Returns:
    Sparse array or scalar with variance values
    """

def std(a, axis=None, keepdims=False, ddof=0):
    """
    Compute standard deviation along specified axis.

    Parameters:
    - a: sparse array, input array
    - axis: int or tuple, axis/axes along which to compute std
    - keepdims: bool, whether to preserve dimensions in result
    - ddof: int, delta degrees of freedom for sample std

    Returns:
    Sparse array or scalar with standard deviation values
    """
```
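
The `ddof` parameter changes the divisor of the variance from N to N - ddof. The following is a minimal plain-Python sketch of that arithmetic, independent of the library's implementation; the helper name `variance` is hypothetical and exists only for illustration.

```python
import math

def variance(values, ddof=0):
    """Variance with divisor N - ddof (hypothetical helper, not the sparse API)."""
    n = len(values)
    mean = sum(values) / n
    squared_deviations = sum((v - mean) ** 2 for v in values)
    return squared_deviations / (n - ddof)

# One matrix row, with the implicit zeros written out explicitly
row = [5, 2, 0, 4]

population_var = variance(row)        # divisor N = 4
sample_var = variance(row, ddof=1)    # divisor N - 1 = 3
population_std = math.sqrt(population_var)

print(population_var, sample_var, population_std)
```

With `ddof=0` the result describes the data as a full population; `ddof=1` gives the usual unbiased sample estimate, which is always the larger of the two.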

### Min/Max Operations

Functions for finding minimum and maximum values and their locations.

```python { .api }
def max(a, axis=None, keepdims=False):
    """
    Find maximum values along specified axis.

    Parameters:
    - a: sparse array, input array
    - axis: int or tuple, axis/axes along which to find maximum
    - keepdims: bool, whether to preserve dimensions in result

    Returns:
    Sparse array or scalar with maximum values
    """

def min(a, axis=None, keepdims=False):
    """
    Find minimum values along specified axis.

    Parameters:
    - a: sparse array, input array
    - axis: int or tuple, axis/axes along which to find minimum
    - keepdims: bool, whether to preserve dimensions in result

    Returns:
    Sparse array or scalar with minimum values
    """

def argmax(a, axis=None, keepdims=False):
    """
    Find indices of maximum values along axis.

    Parameters:
    - a: sparse array, input array
    - axis: int, axis along which to find argmax (None for global)
    - keepdims: bool, whether to preserve dimensions in result

    Returns:
    Array with indices of maximum values
    """

def argmin(a, axis=None, keepdims=False):
    """
    Find indices of minimum values along axis.

    Parameters:
    - a: sparse array, input array
    - axis: int, axis along which to find argmin (None for global)
    - keepdims: bool, whether to preserve dimensions in result

    Returns:
    Array with indices of minimum values
    """
```
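
Because unstored elements count as zeros, the extreme value of a sparse array can be an implicit zero rather than any stored value. The sketch below illustrates this in plain Python, assuming a fill value of 0 and using a hypothetical dict-of-coordinates representation (not the library's internal format).

```python
def sparse_max(stored, shape, fill_value=0):
    """Maximum over all elements of a 2-D array given only its stored entries.

    `stored` maps (row, col) -> value; every other position holds `fill_value`.
    Hypothetical helper for illustration only.
    """
    total_elements = shape[0] * shape[1]
    best = max(stored.values()) if stored else fill_value
    # If at least one position is not stored, the fill value takes part too
    if len(stored) < total_elements:
        best = max(best, fill_value)
    return best

all_negative = {(0, 0): -3.0, (1, 2): -1.5}
print(sparse_max(all_negative, shape=(2, 3)))   # 0, an implicit zero wins

fully_stored = {(0, 0): -3.0, (0, 1): -1.5}
print(sparse_max(fully_stored, shape=(1, 2)))   # -1.5, no implicit zeros exist
```

The same reasoning applies to `min` on arrays whose stored values are all positive.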

### Boolean Reductions

Logical reduction operations for boolean arrays and conditions.

```python { .api }
def all(a, axis=None, keepdims=False):
    """
    Test whether all array elements along axis evaluate to True.

    Parameters:
    - a: sparse array, input array (typically boolean)
    - axis: int or tuple, axis/axes along which to test
    - keepdims: bool, whether to preserve dimensions in result

    Returns:
    Sparse boolean array or scalar, True where all elements are True
    """

def any(a, axis=None, keepdims=False):
    """
    Test whether any array element along axis evaluates to True.

    Parameters:
    - a: sparse array, input array (typically boolean)
    - axis: int or tuple, axis/axes along which to test
    - keepdims: bool, whether to preserve dimensions in result

    Returns:
    Sparse boolean array or scalar, True where any element is True
    """
```
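
For an array whose fill value is False (or 0), `any` only needs to inspect stored elements, while `all` can be True only if every position is stored and truthy. The following is a conceptual sketch in plain Python; the helper names and the dict-of-coordinates stand-in for a sparse array are hypothetical.

```python
def sparse_any(stored):
    """True if any stored value is truthy; unstored positions are False."""
    return any(stored.values())

def sparse_all(stored, shape):
    """True only if no position is unstored and every stored value is truthy."""
    total_elements = 1
    for dim in shape:
        total_elements *= dim
    return len(stored) == total_elements and all(stored.values())

# Mask for "element > 2" on a 3x4 array: only the True entries are stored
mask = {(0, 2): True, (1, 0): True, (1, 3): True, (2, 2): True}

print(sparse_any(mask))            # True
print(sparse_all(mask, (3, 4)))    # False: 8 of 12 positions are implicit False
```

Python's built-in `any` and `all` stop at the first decisive element, which is the kind of short-circuiting a boolean reduction can exploit.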

### NaN-Aware Reductions

Specialized reduction functions that ignore NaN values in computations.

```python { .api }
def nansum(a, axis=None, keepdims=False):
    """
    Compute sum along axis, ignoring NaN values.

    Parameters:
    - a: sparse array, input array
    - axis: int or tuple, axis/axes along which to sum
    - keepdims: bool, whether to preserve dimensions in result

    Returns:
    Sparse array or scalar with sum ignoring NaN values
    """

def nanprod(a, axis=None, keepdims=False):
    """
    Compute product along axis, ignoring NaN values.

    Parameters:
    - a: sparse array, input array
    - axis: int or tuple, axis/axes along which to compute product
    - keepdims: bool, whether to preserve dimensions in result

    Returns:
    Sparse array or scalar with product ignoring NaN values
    """

def nanmean(a, axis=None, keepdims=False):
    """
    Compute mean along axis, ignoring NaN values.

    Parameters:
    - a: sparse array, input array
    - axis: int or tuple, axis/axes along which to compute mean
    - keepdims: bool, whether to preserve dimensions in result

    Returns:
    Sparse array or scalar with mean ignoring NaN values
    """

def nanmax(a, axis=None, keepdims=False):
    """
    Find maximum along axis, ignoring NaN values.

    Parameters:
    - a: sparse array, input array
    - axis: int or tuple, axis/axes along which to find maximum
    - keepdims: bool, whether to preserve dimensions in result

    Returns:
    Sparse array or scalar with maximum ignoring NaN values
    """

def nanmin(a, axis=None, keepdims=False):
    """
    Find minimum along axis, ignoring NaN values.

    Parameters:
    - a: sparse array, input array
    - axis: int or tuple, axis/axes along which to find minimum
    - keepdims: bool, whether to preserve dimensions in result

    Returns:
    Sparse array or scalar with minimum ignoring NaN values
    """

def nanreduce(a, method, identity=None, axis=None, keepdims=False):
    """
    Generic reduction that ignores NaN values.

    Parameters:
    - a: sparse array, input array
    - method: numpy.ufunc, reduction to apply (for example np.add or np.multiply)
    - identity: scalar, value substituted for NaN before reducing (defaults to method.identity)
    - axis: int or tuple, axis/axes along which to reduce
    - keepdims: bool, whether to preserve dimensions in result

    Returns:
    Result of reducing with method along axis, ignoring NaN values
    """
```
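
The NaN-aware variants behave as if NaN entries were dropped before reducing. The sketch below shows the `nansum` and `nanmean` semantics on a single row in plain Python; the helpers are hypothetical and do not reflect how the library implements these functions.

```python
import math

def nansum_row(values):
    """Sum of the values that are not NaN."""
    return sum(v for v in values if not math.isnan(v))

def nanmean_row(values):
    """Mean over the values that are not NaN (NaN if none remain)."""
    kept = [v for v in values if not math.isnan(v)]
    return sum(kept) / len(kept) if kept else math.nan

row = [4.0, 2.0, math.nan]

print(sum(row))            # nan: a single NaN poisons the plain sum
print(nansum_row(row))     # 6.0
print(nanmean_row(row))    # 3.0
```

Note that the NaN-aware mean divides by the number of non-NaN values (2 here), not by the row length.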

## Usage Examples

### Basic Reductions

```python
import sparse
import numpy as np

# Create test array
test_array = sparse.COO.from_numpy(
    np.array([[1, 0, 3, 0], [5, 2, 0, 4], [0, 0, 6, 1]])
)
print(f"Test array shape: {test_array.shape}")
print(f"Test array nnz: {test_array.nnz}")

# Global reductions (entire array) return plain scalars rather than
# sparse arrays, so no .todense() call is needed on the results
total_sum = sparse.sum(test_array)
mean_value = sparse.mean(test_array)
max_value = sparse.max(test_array)
min_value = sparse.min(test_array)

print(f"Total sum: {total_sum}")  # 22
print(f"Mean: {mean_value:.2f}")  # 1.83
print(f"Max: {max_value}")  # 6
print(f"Min: {min_value}")  # 0 (implicit zeros count as elements)
```
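
The global mean divides by the total number of elements, not by the number of stored values. The arithmetic for the test array can be checked in plain Python, without the library:

```python
rows = [[1, 0, 3, 0], [5, 2, 0, 4], [0, 0, 6, 1]]

values = [v for row in rows for v in row]
stored = [v for v in values if v != 0]      # what a sparse format keeps

total = sum(stored)                          # zeros add nothing to the sum
mean_all = total / len(values)               # 22 / 12, zeros included
mean_stored = total / len(stored)            # 22 / 7, zeros ignored

print(total)                  # 22
print(round(mean_all, 2))     # 1.83
print(round(mean_stored, 2))  # 3.14
```

The value 1.83 matches the mean reported for the test array; 3.14 is what a mean over stored values only would give, which is not what `mean` computes.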

### Axis-Specific Reductions

```python
# Row-wise reductions (axis=1)
row_sums = sparse.sum(test_array, axis=1)
row_means = sparse.mean(test_array, axis=1)
row_max = sparse.max(test_array, axis=1)

print(f"Row sums shape: {row_sums.shape}")  # (3,)
print(f"Row sums: {row_sums.todense()}")  # [4, 11, 7]
print(f"Row means: {row_means.todense()}")  # [1.0, 2.75, 1.75]

# Column-wise reductions (axis=0)
col_sums = sparse.sum(test_array, axis=0)
col_means = sparse.mean(test_array, axis=0)

print(f"Column sums shape: {col_sums.shape}")  # (4,)
print(f"Column sums: {col_sums.todense()}")  # [6, 2, 9, 5]
```

### Keepdims Parameter

```python
# Compare results with and without keepdims
row_sums_keepdims = sparse.sum(test_array, axis=1, keepdims=True)
row_sums_no_keepdims = sparse.sum(test_array, axis=1, keepdims=False)

print(f"With keepdims: {row_sums_keepdims.shape}")  # (3, 1)
print(f"Without keepdims: {row_sums_no_keepdims.shape}")  # (3,)

# Keepdims useful for broadcasting
normalized = test_array / row_sums_keepdims  # Broadcasting works
print(f"Normalized array shape: {normalized.shape}")
```

### Multiple Axis Reductions

```python
# Create 3D array for multi-axis reductions
array_3d = sparse.random((4, 5, 6), density=0.2)

# Reduce along multiple axes
sum_axes_01 = sparse.sum(array_3d, axis=(0, 1))  # Sum over first two axes
mean_axes_02 = sparse.mean(array_3d, axis=(0, 2))  # Mean over first and last axes

print(f"Original shape: {array_3d.shape}")  # (4, 5, 6)
print(f"Sum axes (0,1): {sum_axes_01.shape}")  # (6,)
print(f"Mean axes (0,2): {mean_axes_02.shape}")  # (5,)

# All axes - equivalent to global reduction (both results are scalars)
sum_all_axes = sparse.sum(array_3d, axis=(0, 1, 2))
sum_global = sparse.sum(array_3d)
print(f"All axes equal global: {np.isclose(sum_all_axes, sum_global)}")
```
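
The result shapes above follow the NumPy-style rule: reduced axes are dropped, or kept with length 1 when `keepdims=True`. The following is a small plain-Python sketch of that rule; `reduced_shape` is a hypothetical helper, not part of the library.

```python
def reduced_shape(shape, axis=None, keepdims=False):
    """Shape of a reduction result under NumPy-style axis/keepdims rules."""
    ndim = len(shape)
    if axis is None:
        axes = set(range(ndim))              # reduce over every axis
    elif isinstance(axis, int):
        axes = {axis % ndim}                 # modulo handles negative axes
    else:
        axes = {ax % ndim for ax in axis}
    if keepdims:
        return tuple(1 if i in axes else n for i, n in enumerate(shape))
    return tuple(n for i, n in enumerate(shape) if i not in axes)

print(reduced_shape((4, 5, 6), axis=(0, 1)))            # (6,)
print(reduced_shape((4, 5, 6), axis=(0, 2)))            # (5,)
print(reduced_shape((4, 5, 6), axis=1, keepdims=True))  # (4, 1, 6)
print(reduced_shape((4, 5, 6)))                         # ()
```

An empty shape `()` corresponds to a single scalar value, which is why a reduction over all axes matches the global reduction.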

### Statistical Measures

```python
# Variance and standard deviation
data = sparse.random((100, 50), density=0.1)

variance = sparse.var(data, axis=0)  # Column-wise variance
std_dev = sparse.std(data, axis=0)  # Column-wise standard deviation
std_sample = sparse.std(data, axis=0, ddof=1)  # Sample standard deviation

# The mean over a 1-D result is a scalar, so it can be formatted directly
print("Population std vs sample std:")
print(f"Population: {sparse.mean(std_dev):.4f}")
print(f"Sample: {sparse.mean(std_sample):.4f}")

# Verify relationship: std = sqrt(var)
print(f"Std² ≈ Var: {np.allclose((std_dev ** 2).todense(), variance.todense())}")
```

### Index Finding Operations

```python
# Find locations of extreme values
large_array = sparse.random((20, 30), density=0.05)

# Global argmax/argmin
global_max_idx = sparse.argmax(large_array)
global_min_idx = sparse.argmin(large_array)

print(f"Global max index: {global_max_idx}")
print(f"Global min index: {global_min_idx}")

# Axis-specific argmax/argmin
row_max_indices = sparse.argmax(large_array, axis=1)  # Max in each row
col_max_indices = sparse.argmax(large_array, axis=0)  # Max in each column

print(f"Row max indices shape: {row_max_indices.shape}")  # (20,)
print(f"Column max indices shape: {col_max_indices.shape}")  # (30,)
```
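
If the global `argmax` result is a flat index in row-major (C) order, as it is in NumPy, it can be converted to per-axis coordinates. Whether `sparse.argmax` follows this convention is an assumption here. The helper below is hypothetical and mirrors what `numpy.unravel_index` does.

```python
def unravel(flat_index, shape):
    """Convert a row-major flat index into a tuple of per-axis coordinates."""
    coords = []
    for dim in reversed(shape):
        # Peel off the fastest-varying axis first
        flat_index, position = divmod(flat_index, dim)
        coords.append(position)
    return tuple(reversed(coords))

# In a (20, 30) array, flat index 95 is row 3, column 5 (3 * 30 + 5)
print(unravel(95, (20, 30)))    # (3, 5)
print(unravel(0, (20, 30)))     # (0, 0)
print(unravel(599, (20, 30)))   # (19, 29)
```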

### Boolean Reductions

```python
# Create boolean conditions
condition_array = sparse.greater(test_array, 2)
print("Elements > 2:")
print(condition_array.todense())

# Global boolean reductions return scalars
any_gt_2 = sparse.any(condition_array)  # Any element > 2?
all_gt_2 = sparse.all(condition_array)  # All elements > 2?

any_rows = sparse.any(condition_array, axis=1)  # Any > 2 in each row?
all_cols = sparse.all(condition_array, axis=0)  # All > 2 in each column?

print(f"Any > 2: {any_gt_2}")  # True
print(f"All > 2: {all_gt_2}")  # False
print(f"Any per row: {any_rows.todense()}")  # [True, True, True]
print(f"All per column: {all_cols.todense()}")  # [False, False, False, False]
```

### NaN-Aware Reductions

```python
# Create array with NaN values
array_with_nan = sparse.COO.from_numpy(
    np.array([[1.0, np.nan, 3.0], [4.0, 2.0, np.nan], [np.nan, 5.0, 6.0]])
)

# Compare standard vs NaN-aware reductions
regular_sum = sparse.sum(array_with_nan, axis=1)
nan_aware_sum = sparse.nansum(array_with_nan, axis=1)

regular_mean = sparse.mean(array_with_nan, axis=1)
nan_aware_mean = sparse.nanmean(array_with_nan, axis=1)

print("Regular vs NaN-aware reductions:")
print(f"Regular sum: {regular_sum.todense()}")  # Contains NaN
print(f"NaN-aware sum: {nan_aware_sum.todense()}")  # Ignores NaN
print(f"Regular mean: {regular_mean.todense()}")  # Contains NaN
print(f"NaN-aware mean: {nan_aware_mean.todense()}")  # Ignores NaN
```

### Custom Reductions

```python
# nanreduce expects a NumPy ufunc (it relies on the ufunc's reduce method and
# identity), so an arbitrary Python callable such as a geometric-mean function
# cannot be passed directly. The geometric mean exp(mean(log(x))) is instead
# built from a NaN-skipping sum of logarithms.

# Shift values away from zero so the logarithm is defined everywhere
positive_array = sparse.random((10, 10), density=0.1) + 0.1

# NaN-skipping sum of logs per column, then divide by the column length
log_sum = sparse.nanreduce(np.log(positive_array), np.add, axis=0)
custom_result = np.exp(log_sum / positive_array.shape[0])
print(f"Custom geometric mean shape: {custom_result.shape}")
```

### Large-Scale Reductions

```python
# Efficient reductions on large sparse arrays
large_sparse = sparse.random((10000, 5000), density=0.001)  # Very sparse

# These operations are memory efficient due to sparsity
row_sums_large = sparse.sum(large_sparse, axis=1)
col_means_large = sparse.mean(large_sparse, axis=0)

print(f"Large array: {large_sparse.shape}, density: {large_sparse.density:.4%}")
print(f"Row sums nnz: {row_sums_large.nnz} / {row_sums_large.size}")
print(f"Col means nnz: {col_means_large.nnz} / {col_means_large.size}")

# Global statistics are single scalar values (no .todense() needed)
global_stats = {
    'sum': sparse.sum(large_sparse),
    'mean': sparse.mean(large_sparse),
    'std': sparse.std(large_sparse),
    'max': sparse.max(large_sparse),
    'min': sparse.min(large_sparse)
}

print("Global statistics:", global_stats)
```

### Performance Considerations for Sparse Reductions

```python
# Demonstrating how reductions change density
original = sparse.random((1000, 1000), density=0.01)
print(f"Original density: {original.density:.2%}")

# Reductions along different axes have different density implications
axis0_reduction = sparse.sum(original, axis=0)  # Often denser
axis1_reduction = sparse.sum(original, axis=1)  # Often denser
global_reduction = sparse.sum(original)  # Single scalar value

print(f"Axis-0 reduction nnz: {axis0_reduction.nnz} / {axis0_reduction.size}")
print(f"Axis-1 reduction nnz: {axis1_reduction.nnz} / {axis1_reduction.size}")
print(f"Global reduction: {global_reduction}")
```
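
A rough estimate shows why reduced results are often denser. If stored elements are placed independently with density p, a sum over n elements has at least one stored input with probability 1 - (1 - p)^n. The sketch below evaluates that expression under the independence assumption, which is only approximately true for uniformly random sparsity and may not hold for real data; `expected_output_density` is a hypothetical helper.

```python
def expected_output_density(p, n):
    """Probability that a reduction over n elements sees at least one stored value."""
    return 1.0 - (1.0 - p) ** n

# 1000 x 1000 input at 1% density, reduced along one axis (n = 1000)
print(f"{expected_output_density(0.01, 1000):.4%}")

# 10000 x 5000 input at 0.1% density
print(f"rows: {expected_output_density(0.001, 5000):.2%}")    # each row sums 5000 elements
print(f"cols: {expected_output_density(0.001, 10000):.2%}")   # each column sums 10000 elements
```

Under these assumptions almost every entry of the reduced array is non-zero, even though the input is 99% or more empty.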

## Performance and Memory Considerations

### Computational Efficiency

- **Sparse structure**: Operations only compute on stored (non-zero) elements
- **Axis selection**: Different axes may have different computational costs
- **Memory usage**: Reductions typically produce denser results than inputs
- **Keepdims**: Can enable efficient broadcasting in subsequent operations
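
As a conceptual sketch of the first point, a row sum over a coordinate-format array visits each stored entry once and never touches implicit zeros, so its cost scales with the number of stored elements. The coordinate and value lists below are a simplified stand-in for a COO layout, not the library's actual data structures, and `coo_row_sums` is a hypothetical helper.

```python
from collections import defaultdict

def coo_row_sums(rows, data, n_rows):
    """Sum along axis=1 using only stored entries (cost proportional to nnz)."""
    sums = defaultdict(int)
    for r, value in zip(rows, data):
        sums[r] += value
    # Rows without stored entries keep the implicit sum of 0
    return [sums.get(r, 0) for r in range(n_rows)]

# Stored entries of [[1, 0, 3, 0], [5, 2, 0, 4], [0, 0, 6, 1]]
rows = [0, 0, 1, 1, 1, 2, 2]
cols = [0, 2, 0, 1, 3, 2, 3]   # not needed for a row sum, shown for completeness
data = [1, 3, 5, 2, 4, 6, 1]

print(coo_row_sums(rows, data, n_rows=3))   # [4, 11, 7]
```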

### Optimization Tips

- Use axis-specific reductions when possible for better memory efficiency
- Consider using `keepdims=True` when the result will be used for broadcasting
- NaN-aware functions have additional overhead but handle missing data correctly
- Boolean reductions (`any`, `all`) can short-circuit for efficiency