# Utility Functions

Testing, benchmarking, and introspection utilities for performance analysis and development support. These functions help developers evaluate Bottleneck's performance benefits and validate functionality.

## Capabilities

### Performance Benchmarking

Comprehensive benchmarking tools to compare Bottleneck performance against NumPy equivalents.

```python { .api }
def bench():
    """
    Run comprehensive performance benchmark comparing Bottleneck vs NumPy.

    Executes a full benchmark suite testing all Bottleneck functions against
    their NumPy equivalents across different array sizes, shapes, and data types.
    Results show speed ratios (NumPy time / Bottleneck time) where higher
    values indicate better Bottleneck performance.

    Returns:
        bool, always returns True after printing benchmark results

    Performance Metrics Displayed:
        - Function names and speed ratios for different array configurations
        - Array shapes: (100,), (1000,1000) with various axes
        - Data types: float64 arrays with and without NaN values
        - Speed ratios > 1.0 indicate Bottleneck is faster than NumPy
    """

def bench_detailed(func_name, fraction_nan=0.3):
    """
    Run detailed benchmark for a specific function with customizable parameters.

    Provides in-depth performance analysis for a single function, including
    timing breakdowns, memory usage, and parameter sensitivity analysis.

    Parameters:
        - func_name: str, name of Bottleneck function to benchmark
          (e.g., 'nanmean', 'move_median', 'rankdata')
        - fraction_nan: float, fraction of array elements to set as NaN
          for testing NaN-handling performance (default: 0.3)

    Returns:
        None (prints detailed benchmark results)

    Benchmark Details Include:
        - Timing statistics across multiple runs
        - Memory allocation patterns
        - Performance scaling with array size
        - Impact of NaN density on performance
        - Comparison with NumPy/SciPy equivalents
    """
```
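
The speed ratio described above (NumPy time / Bottleneck time) can be reproduced by hand for a single function. The following is a minimal sketch, not Bottleneck's own benchmark code: `speed_ratio` is a hypothetical helper built on the standard-library `timeit` module, and the demonstration at the bottom assumes NumPy and Bottleneck are both installed.

```python
import timeit


def speed_ratio(reference, candidate, data, repeat=5, number=20):
    """Return reference time / candidate time; values above 1.0 favor candidate.

    Hypothetical helper for illustration, not part of Bottleneck.
    """
    # Use the best of several repeats to reduce scheduling noise.
    ref_time = min(timeit.repeat(lambda: reference(data), repeat=repeat, number=number))
    cand_time = min(timeit.repeat(lambda: candidate(data), repeat=repeat, number=number))
    return ref_time / cand_time


if __name__ == "__main__":
    try:
        import numpy as np
        import bottleneck as bn
    except ImportError:
        print("NumPy and Bottleneck are required for this comparison")
    else:
        arr = np.random.default_rng(0).standard_normal(100_000)
        arr[::10] = np.nan  # make every tenth element NaN
        ratio = speed_ratio(np.nanmean, bn.nanmean, arr)
        print(f"nanmean speed ratio: {ratio:.1f}")
```

The helper takes plain callables, so the same measurement works for any pair of functions that accept the same input.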

### Testing Framework

Built-in test suite execution for functionality validation.

```python { .api }
def test():
    """
    Run the complete Bottleneck test suite.

    Executes all unit tests to verify correct functionality across different
    array configurations, data types, and edge cases. Uses pytest framework
    for comprehensive testing coverage.

    Returns:
        bool, True if all tests pass, False if any test fails

    Test Coverage Includes:
        - Correctness verification against NumPy reference implementations
        - Edge case handling (empty arrays, all-NaN arrays, single elements)
        - Data type compatibility (int32, int64, float32, float64)
        - Multi-dimensional array operations
        - Axis parameter validation
        - Memory layout handling (C-contiguous, Fortran-contiguous, strided)
        - Input validation and error handling
    """
```
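
The idea of checking an implementation against a reference on edge cases can be sketched in a few lines. This is an illustration of the technique only, not the code of Bottleneck's test suite: `same_result`, `reference_nanmean`, `check_against_reference`, and `EDGE_CASES` are hypothetical names, and the reference is written in plain Python so the sketch runs without NumPy.

```python
import math


def same_result(a, b, rel_tol=1e-9):
    """Compare two floats, treating NaN as equal to NaN."""
    if math.isnan(a) and math.isnan(b):
        return True
    return math.isclose(a, b, rel_tol=rel_tol)


def reference_nanmean(values):
    """Plain-Python reference: mean of the non-NaN values, NaN if there are none."""
    kept = [v for v in values if not math.isnan(v)]
    return sum(kept) / len(kept) if kept else math.nan


def check_against_reference(candidate, reference, cases):
    """Return the inputs on which candidate and reference disagree."""
    return [case for case in cases if not same_result(candidate(case), reference(case))]


# Edge cases in the spirit of the coverage list: empty, all-NaN, single element, mixed.
EDGE_CASES = [[], [math.nan, math.nan], [7.0], [1.0, 2.0, math.nan, 4.0, 5.0]]

# A candidate that ignores NaN handling disagrees on some of the cases.
naive_mean = lambda values: sum(values) / len(values) if values else math.nan
print(len(check_against_reference(naive_mean, reference_nanmean, EDGE_CASES)))
```

To exercise a Bottleneck function this way, the candidate could be a small wrapper that converts each case to a float array before calling, for example, `bn.nanmean`.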

### Function Introspection

Utility functions for exploring and categorizing Bottleneck's API.

```python { .api }
def get_functions(module_name, as_string=False):
    """
    Get list of functions from specified Bottleneck module.

    Provides programmatic access to function lists for testing, documentation,
    or dynamic function discovery purposes.

    Parameters:
        - module_name: str, module name to query:
            - 'reduce': statistical reduction functions
            - 'move': moving window functions
            - 'nonreduce': array manipulation functions
            - 'nonreduce_axis': axis-based manipulation functions
            - 'all': all functions from all modules
        - as_string: bool, return function names as strings instead of
          function objects (default: False)

    Returns:
        list, function objects or function name strings

    Available Modules:
        - 'reduce': [nansum, nanmean, nanstd, nanvar, nanmin, nanmax, ...]
        - 'move': [move_sum, move_mean, move_std, move_var, move_min, ...]
        - 'nonreduce': [replace]
        - 'nonreduce_axis': [partition, argpartition, rankdata, nanrankdata, push]
    """
```
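
A list of function objects, such as the one `get_functions()` returns with `as_string=False`, lends itself to name-based dispatch. The sketch below shows one way to do that; `build_registry` and `call_by_name` are hypothetical helpers, and functions from the standard-library `math` module stand in for Bottleneck functions so the example runs without Bottleneck installed.

```python
import math


def build_registry(functions):
    """Map each function's __name__ to the function object."""
    return {func.__name__: func for func in functions}


def call_by_name(registry, name, *args, **kwargs):
    """Look up a function by name and call it, with a clear error for unknown names."""
    try:
        func = registry[name]
    except KeyError:
        raise ValueError(
            f"unknown function {name!r}; choose from {sorted(registry)}"
        ) from None
    return func(*args, **kwargs)


# Stand-in functions (assumption: any list of named callables works the same way).
registry = build_registry([math.sqrt, math.floor])
print(call_by_name(registry, "sqrt", 9.0))  # 3.0
```

With Bottleneck available, the registry could instead be built from `bn.get_functions('reduce')` and queried with names such as `'nanmean'`.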

## Usage Examples

### Performance Analysis

```python
import bottleneck as bn

# Run comprehensive benchmark to see overall performance gains
print("Running comprehensive benchmark...")
bn.bench()

# Output will show performance ratios like:
#                   no NaN      no NaN         NaN      no NaN         NaN
#                   (100,) (1000,1000) (1000,1000) (1000,1000) (1000,1000)
#                   axis=0      axis=0      axis=0      axis=1      axis=1
# nansum              29.7         1.4         1.6         2.0         2.1
# nanmean             99.0         2.0         1.8         3.2         2.5
# move_mean         6264.3        66.2       111.9       361.1       246.5
```

### Detailed Function Benchmarking

```python
import bottleneck as bn

# Analyze specific function performance with different NaN densities
print("Benchmarking nanmean with 10% NaN values:")
bn.bench_detailed('nanmean', fraction_nan=0.1)

print("\nBenchmarking nanmean with 50% NaN values:")
bn.bench_detailed('nanmean', fraction_nan=0.5)

# Benchmark moving window functions
print("\nBenchmarking move_median performance:")
bn.bench_detailed('move_median', fraction_nan=0.2)

# Compare different functions
functions_to_test = ['nansum', 'nanmean', 'nanstd', 'nanmedian']
for func in functions_to_test:
    print(f"\n=== {func} ===")
    bn.bench_detailed(func, fraction_nan=0.3)
```

### Function Discovery and Testing

```python
import bottleneck as bn
import numpy as np

# Discover available functions by category
reduce_funcs = bn.get_functions('reduce', as_string=True)
move_funcs = bn.get_functions('move', as_string=True)
all_funcs = bn.get_functions('all', as_string=True)

print("Reduction functions:", reduce_funcs)
print("Moving window functions:", move_funcs)
print("Total functions available:", len(all_funcs))

# Get function objects for dynamic usage
move_function_objects = bn.get_functions('move', as_string=False)
for func in move_function_objects:
    print(f"Function: {func.__name__}, Module: {func.__module__}")

# Test specific function categories
print("\nTesting reduction functions...")
test_data = np.array([1, 2, np.nan, 4, 5])  # shared input for every function
reduce_functions = bn.get_functions('reduce')
for func in reduce_functions[:3]:  # Test first 3 functions
    try:
        result = func(test_data)
        print(f"{func.__name__}([1, 2, nan, 4, 5]) = {result}")
    except Exception as e:
        print(f"Error testing {func.__name__}: {e}")
```

### Development and Validation Workflow

```python
import timeit

import bottleneck as bn
import numpy as np

# Complete development workflow example
def validate_bottleneck_installation():
    """Comprehensive validation of Bottleneck installation and performance."""

    print("=== Bottleneck Installation Validation ===")

    # 1. Run test suite
    print("1. Running test suite...")
    test_result = bn.test()
    print(f"   Tests passed: {test_result}")

    # 2. Verify basic functionality
    print("\n2. Testing basic functionality...")
    test_data = np.array([1, 2, np.nan, 4, 5])

    # Test core functions
    results = {
        'nanmean': bn.nanmean(test_data),
        'nansum': bn.nansum(test_data),
        'move_mean': bn.move_mean(test_data, window=3, min_count=1),
        'rankdata': bn.rankdata([3, 1, 4, 1, 5])
    }

    for func_name, result in results.items():
        print(f"   {func_name}: {result}")

    # 3. Check performance benefits
    print("\n3. Quick performance check...")
    large_data = np.random.randn(10000)
    large_data[::10] = np.nan  # Add some NaN values

    # Time each call as the best of several repeats; a single wall-clock
    # measurement is noisy and can even round down to zero.
    number = 100
    numpy_time = min(timeit.repeat(
        lambda: np.nanmean(large_data), repeat=5, number=number)) / number
    bn_time = min(timeit.repeat(
        lambda: bn.nanmean(large_data), repeat=5, number=number)) / number

    speedup = numpy_time / bn_time
    print(f"   NumPy nanmean: {numpy_time:.6f}s")
    print(f"   Bottleneck nanmean: {bn_time:.6f}s")
    print(f"   Speedup: {speedup:.1f}x")

    # 4. Function coverage check
    print("\n4. Function coverage:")
    all_functions = bn.get_functions('all', as_string=True)
    by_category = {
        'reduce': bn.get_functions('reduce', as_string=True),
        'move': bn.get_functions('move', as_string=True),
        'nonreduce': bn.get_functions('nonreduce', as_string=True),
        'nonreduce_axis': bn.get_functions('nonreduce_axis', as_string=True)
    }

    for category, functions in by_category.items():
        print(f"   {category}: {len(functions)} functions")

    print(f"   Total: {len(all_functions)} functions")

    return test_result and speedup > 1.0

# Run validation
is_working = validate_bottleneck_installation()
print(f"\nBottleneck working properly: {is_working}")
```

### Continuous Integration Testing

```python
import sys

import bottleneck as bn
import numpy as np

def ci_test_bottleneck():
    """Lightweight test for CI/CD pipelines."""

    # Essential functionality test
    try:
        # Test basic operations
        data = np.array([1, 2, np.nan, 4, 5])

        assert bn.nanmean(data) == 3.0
        assert bn.nansum(data) == 12.0
        assert not bn.anynan(np.array([1, 2, 3]))
        assert bn.allnan(np.array([np.nan, np.nan]))

        # Test moving window
        series = np.array([1, 2, 3, 4, 5])
        ma = bn.move_mean(series, window=3, min_count=1)
        assert len(ma) == len(series)

        # Test ranking
        ranks = bn.rankdata([1, 3, 2])
        expected = np.array([1.0, 3.0, 2.0])
        assert np.allclose(ranks, expected)

        print("✓ All essential functions working")
        return True

    except Exception as e:
        print(f"✗ Error in essential functionality: {e}")
        return False

# Use in CI pipeline
if __name__ == "__main__":
    success = ci_test_bottleneck()
    sys.exit(0 if success else 1)
```

## Performance Optimization Tips

When using Bottleneck's utility functions for optimization:

1. **Use bench() periodically** to verify performance benefits on your specific hardware and data patterns

2. **Profile with bench_detailed()** when optimizing critical code paths to understand the impact of:
   - Array size and shape
   - NaN density in your data
   - Memory layout (C vs Fortran order)

3. **Validate with test()** after any environment changes (Python version, NumPy version, compilation flags)

4. **Monitor function coverage** with get_functions() to see which accelerated functions your installed version provides

These utilities are meant for development and testing workflows rather than production code paths: get_functions() is a cheap lookup, whereas bench(), bench_detailed(), and test() run full benchmark or test suites and take correspondingly longer.
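
One way to notice the environment changes mentioned in tip 3 is to record a small fingerprint of the interpreter and package versions and compare it between runs. The sketch below uses only the standard library; `environment_fingerprint` and `environment_changed` are hypothetical helpers, and the choice of packages to track is an assumption.

```python
import platform
from importlib import metadata


def environment_fingerprint(packages=("numpy", "bottleneck")):
    """Collect the Python version and the installed versions of the given packages."""
    fingerprint = {"python": platform.python_version()}
    for name in packages:
        try:
            fingerprint[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            fingerprint[name] = None  # package is not installed
    return fingerprint


def environment_changed(previous, current):
    """Return the sorted keys whose values differ between two fingerprints."""
    keys = set(previous) | set(current)
    return sorted(key for key in keys if previous.get(key) != current.get(key))


current = environment_fingerprint()
print(current)
```

If `environment_changed()` reports any keys against a previously stored fingerprint, that is a reasonable moment to rerun `bn.test()` and `bn.bench()`.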