# Bottleneck

Fast NumPy array functions written in C for high-performance numerical computing. Bottleneck provides optimized implementations of statistical functions, moving window operations, ranking functions, and array manipulation utilities, delivering significant speed improvements over standard NumPy implementations.

## Package Information

- **Package Name**: bottleneck
- **Language**: Python
- **Installation**: `pip install bottleneck`

## Core Imports

```python
import bottleneck as bn
```

For testing and benchmarking:

```python
import bottleneck as bn

# Run benchmarks
bn.bench()

# Run tests
bn.test()
```

## Basic Usage

```python
import bottleneck as bn
import numpy as np

# Create sample data with NaN values
data = np.array([1, 2, np.nan, 4, 5])

# Statistical functions ignore NaN values
mean_val = bn.nanmean(data)  # 3.0
sum_val = bn.nansum(data)    # 12.0
std_val = bn.nanstd(data)    # 1.58... (ddof=0)

# Moving window operations
series = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
moving_avg = bn.move_mean(series, window=3, min_count=1)
# array([1. , 1.5, 2. , 3. , 4. , 5. , 6. , 7. , 8. , 9. ])

# Ranking operations (ties share the average rank)
ranks = bn.rankdata([3, 1, 4, 1, 5])  # array([3. , 1.5, 4. , 1.5, 5. ])

# Array manipulation
arr = np.array([1, 2, np.nan, 4, 5])
bn.replace(arr, np.nan, 0)  # Replaces NaN with 0 in-place
# arr is now array([1., 2., 0., 4., 5.])
```

## Architecture

Bottleneck uses a dual-implementation approach for maximum performance:

- **Fast Path**: C-optimized implementations for supported data types (int32, int64, float32, float64)
- **Slow Path**: Pure Python/NumPy fallback implementations for all other data types
- **Automatic Selection**: Functions automatically choose the optimal implementation based on input array properties

The library automatically falls back to slower implementations for:

- Unsupported data types (e.g., float16, complex types)
- Byte-swapped arrays
- Special array layouts that can't be optimized

This design ensures compatibility with all NumPy arrays while providing significant performance gains where possible.

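The fallback rules above can be checked from Python. The following is a minimal sketch rather than a description of Bottleneck's internals: the array values are made up, and which code path each call takes is inferred from the lists above, not observed directly. It only shows that a supported dtype, an unsupported dtype, and a byte-swapped array all give the same answer.

```python
import bottleneck as bn
import numpy as np

# float64 is one of the dtypes listed for the C fast path.
native = np.array([1.0, np.nan, 3.0], dtype=np.float64)

# float16 is listed as unsupported, so this call is expected to use the fallback.
half = native.astype(np.float16)

# Same values stored in the opposite byte order from the machine's native one.
swapped = native.astype(native.dtype.newbyteorder("S"))

native_mean = bn.nanmean(native)
half_mean = bn.nanmean(half)
swapped_mean = bn.nanmean(swapped)

# The three results agree; only the speed of the call should differ.
print(native_mean, half_mean, swapped_mean)
```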
## Capabilities

### Reduction Functions

Statistical and aggregation functions that reduce an array along a given axis, or over the whole array when `axis=None`. Most have NaN-aware variants that skip missing values.

```python { .api }
def nansum(a, axis=None): ...
def nanmean(a, axis=None): ...
def nanstd(a, axis=None, ddof=0): ...
def nanvar(a, axis=None, ddof=0): ...
def nanmin(a, axis=None): ...
def nanmax(a, axis=None): ...
def nanargmin(a, axis=None): ...
def nanargmax(a, axis=None): ...
def median(a, axis=None): ...
def nanmedian(a, axis=None): ...
def ss(a, axis=None): ...
def anynan(a, axis=None): ...
def allnan(a, axis=None): ...
```

[Reduction Functions](./reduction-functions.md)

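A short sketch of how the `axis` and `ddof` parameters in the signatures above behave, using a small made-up 2-D array. It also contrasts `median`, which propagates NaN, with `nanmedian`, which skips it.

```python
import bottleneck as bn
import numpy as np

a = np.array([[1.0, np.nan, 3.0],
              [4.0, 5.0, 6.0]])

col_sums = bn.nansum(a, axis=0)          # NaN skipped per column: [5., 5., 9.]
row_means = bn.nanmean(a, axis=1)        # [2., 5.]
total = bn.nansum(a)                     # axis=None reduces everything: 19.0

# ddof=1 divides by (number of non-NaN values - 1), i.e. the sample std.
row_std = bn.nanstd(a, axis=1, ddof=1)   # [sqrt(2), 1.]

# median has no "nan" prefix: a NaN in the slice makes the result NaN.
row_median = bn.median(a, axis=1)        # [nan, 5.]
row_nanmedian = bn.nanmedian(a, axis=1)  # [2., 5.]

rows_with_nan = bn.anynan(a, axis=1)     # [True, False]
```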
### Moving Window Functions

High-performance moving window operations for time series analysis and sequential data processing, supporting customizable window sizes and minimum count requirements.

```python { .api }
def move_sum(a, window, min_count=None, axis=-1): ...
def move_mean(a, window, min_count=None, axis=-1): ...
def move_std(a, window, min_count=None, axis=-1, ddof=0): ...
def move_var(a, window, min_count=None, axis=-1, ddof=0): ...
def move_min(a, window, min_count=None, axis=-1): ...
def move_max(a, window, min_count=None, axis=-1): ...
def move_argmin(a, window, min_count=None, axis=-1): ...
def move_argmax(a, window, min_count=None, axis=-1): ...
def move_median(a, window, min_count=None, axis=-1): ...
def move_rank(a, window, min_count=None, axis=-1): ...
```

[Moving Window Functions](./moving-window-functions.md)

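The effect of `min_count` is easiest to see next to a naive reference. The sketch below is illustrative only: `move_sum_reference` is a hypothetical helper written for this example (it is not part of Bottleneck), covers 1-D float input only, and is far slower than the library call it mirrors.

```python
import bottleneck as bn
import numpy as np


def move_sum_reference(a, window, min_count=None):
    """Naive trailing-window sum for 1-D float arrays (hypothetical helper)."""
    if min_count is None:
        min_count = window  # default: every slot in the window must be non-NaN
    out = np.full(a.shape, np.nan)
    for i in range(a.size):
        chunk = a[max(0, i - window + 1): i + 1]  # window ends at index i
        valid = chunk[~np.isnan(chunk)]
        if valid.size >= min_count:
            out[i] = valid.sum()
    return out


series = np.array([1.0, np.nan, 3.0, 4.0, 5.0])

strict = bn.move_sum(series, window=2)                # [nan, nan, nan, 7., 9.]
lenient = bn.move_sum(series, window=2, min_count=1)  # [1., 1., 3., 7., 9.]
```

With the default `min_count`, any window that is incomplete or contains a NaN yields NaN; lowering `min_count` trades that strictness for fewer gaps in the output.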
### Array Manipulation Functions

Utilities for array transformation, ranking, and data manipulation operations that maintain array structure while modifying values or order.

```python { .api }
def replace(a, old, new): ...
def rankdata(a, axis=None): ...
def nanrankdata(a, axis=None): ...
def partition(a, kth, axis=-1): ...
def argpartition(a, kth, axis=-1): ...
def push(a, n=None, axis=-1): ...
```

[Array Manipulation Functions](./array-manipulation-functions.md)

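A brief sketch of the less self-explanatory functions above, on made-up data. The comments describe the behavior as the author of this example understands it; the partition result is checked by its ordering property, not by an exact output, because the order within each side of the pivot is not guaranteed.

```python
import bottleneck as bn
import numpy as np

gappy = np.array([np.nan, 1.0, np.nan, np.nan, 4.0, np.nan])

# push forward-fills NaN with the most recent non-NaN value.
filled = bn.push(gappy)            # [nan, 1., 1., 1., 4., 4.]
filled_once = bn.push(gappy, n=1)  # fill at most one step: [nan, 1., 1., nan, 4., 4.]

# nanrankdata leaves NaN in place; ties share the average rank.
ranks = bn.nanrankdata(np.array([3.0, np.nan, 1.0, 1.0]))  # [3., nan, 1.5, 1.5]

# After partitioning, index kth holds the value it would have in sorted order,
# with smaller-or-equal values before it and larger-or-equal values after it.
values = np.array([7, 1, 5, 3, 9])
part = bn.partition(values, kth=2)
idx = bn.argpartition(values, kth=2)

# replace modifies its argument in place, so inspect the array afterwards.
arr = np.array([1.0, np.nan, 3.0])
bn.replace(arr, np.nan, 0.0)
```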
### Utility Functions

Testing, benchmarking, and introspection utilities for performance analysis and development support.

```python { .api }
def bench(): ...
def bench_detailed(func_name, fraction_nan=0.3): ...
def test(): ...
def get_functions(module_name, as_string=False): ...
```

[Utility Functions](./utility-functions.md)

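As a hedged illustration of the introspection helper, the sketch below lists functions by group. The group names `"reduce"` and `"move"` are an assumption based on how the library organizes its functions and are not stated in this document; check the linked utility reference before relying on them.

```python
import bottleneck as bn

# "reduce" and "move" are assumed group names (not confirmed by this document).
reduce_names = bn.get_functions("reduce", as_string=True)  # names as strings
move_funcs = bn.get_functions("move")                      # function objects

print(reduce_names)
print([f.__name__ for f in move_funcs])
```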
## Performance Notes

Bottleneck is typically much faster than the equivalent NumPy or SciPy calls. The ranges below are indicative only: actual speedups depend on array shape, dtype, axis, and the fraction of NaN values, and can be measured on your own machine with `bn.bench()`.

- **Reduction operations**: 2x to 100x+ faster for NaN-aware functions
- **Moving window operations**: 10x to 1000x+ faster than equivalent NumPy implementations
- **Ranking functions**: 2x to 50x faster than SciPy equivalents
- **Memory efficiency**: Optimized algorithms use less memory and reduce allocation overhead

Performance gains are most significant for:

- Large arrays (1000+ elements)
- Operations involving NaN values
- Moving window calculations on time series data
- Repeated statistical computations on similar data types

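Speedup figures like those above are best confirmed on the data you actually have. The sketch below times one function against its NumPy counterpart with the standard-library `timeit` module; the array size, NaN fraction, and repeat count are arbitrary choices for illustration, and no particular ratio is implied.

```python
import timeit

import bottleneck as bn
import numpy as np

rng = np.random.default_rng(0)
data = rng.standard_normal(100_000)
data[rng.random(data.size) < 0.1] = np.nan  # roughly 10% NaN (arbitrary choice)

# Confirm both libraries agree on the value before comparing timings.
bn_result = bn.nanmean(data)
np_result = np.nanmean(data)

bn_time = timeit.timeit(lambda: bn.nanmean(data), number=200)
np_time = timeit.timeit(lambda: np.nanmean(data), number=200)

print(f"bn.nanmean: {bn_time:.4f} s, np.nanmean: {np_time:.4f} s")
print(f"ratio (NumPy / Bottleneck): {np_time / bn_time:.1f}x")
```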