# pybaselines

A comprehensive Python library providing over 60 algorithms for baseline correction of experimental data from scientific techniques such as Raman, FTIR, NMR, XRD, XRF, PIXE, MALDI-TOF, and LIBS spectroscopy. pybaselines offers multiple mathematical approaches including polynomial fitting, spline methods, Whittaker smoothing, morphological operations, and classification-based techniques for removing baseline interference from experimental measurements.

## Package Information

- **Package Name**: pybaselines
- **Language**: Python
- **Installation**: `pip install pybaselines`
- **Dependencies**: NumPy (>=1.20), SciPy (>=1.6)
- **Optional Dependencies**: pentapy (>=1.1), numba (>=0.53)

## Core Imports

```python
import pybaselines
```

Object-oriented interface:

```python
from pybaselines import Baseline
```

Functional interface:

```python
from pybaselines import whittaker, polynomial, smooth, morphological, spline, classification, misc, optimizers
```

2D baseline correction:

```python
from pybaselines import Baseline2D
```

## Basic Usage

```python
import numpy as np
from pybaselines import Baseline

# Generate sample data with baseline
x = np.linspace(0, 1000, 1000)
signal_peaks = 100 * np.exp(-((x - 200) / 50)**2) + 50 * np.exp(-((x - 800) / 100)**2)
baseline_drift = 10 + 0.01 * x + 0.00001 * x**2
noise = np.random.normal(0, 2, x.shape)
data = signal_peaks + baseline_drift + noise

# Initialize baseline corrector
baseline_fitter = Baseline(x_data=x)

# Apply Asymmetric Least Squares (AsLS) method
baseline, params = baseline_fitter.asls(data, lam=1e6, p=1e-2)

# Get corrected signal
corrected_signal = data - baseline

# Results
print(f"Baseline fitted with {len(params['weights'])} data points")
print(f"Final tolerance: {params['tol_history'][-1]:.2e}")
```

Functional interface:

```python
from pybaselines.whittaker import asls

# Direct function call (reuses x and data from the example above)
baseline, params = asls(data, lam=1e6, p=1e-2, x_data=x)
corrected_signal = data - baseline
```

## Architecture

pybaselines provides both functional and object-oriented interfaces:

### Object-Oriented Interface
- **`Baseline`**: Main 1D baseline correction class that inherits methods from all algorithm modules
- **`Baseline2D`**: 2D baseline correction class for image and 2D spectroscopic data

### Modular Structure
- **Algorithm Modules**: Each algorithm category is implemented as a separate module with consistent parameter patterns
- **Utility Functions**: Common operations like padding, window optimization, and mathematical utilities
- **Return Patterns**: All methods return a `(baseline, params)` tuple for consistency

This design lets users choose between convenient object-oriented access and direct function calls while keeping a consistent interface across all algorithms.

## Capabilities

### Whittaker-Smoothing Methods

Penalized least squares methods using Whittaker smoothing for iteratively reweighted baseline fitting. These methods excel at handling spectra with varying peak widths and baseline curvature.

```python { .api }
def asls(data, lam=1e6, p=1e-2, diff_order=2, max_iter=50, tol=1e-3, weights=None, x_data=None):
    """Asymmetric Least Squares baseline correction."""

def airpls(data, lam=1e6, diff_order=2, max_iter=50, tol=1e-3, weights=None, x_data=None, normalize_weights=False):
    """Adaptive iteratively reweighted Penalized Least Squares."""

def arpls(data, lam=1e5, diff_order=2, max_iter=50, tol=1e-3, weights=None, x_data=None):
    """Asymmetrically reweighted Penalized Least Squares."""
```

[Whittaker Methods](./whittaker.md)

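To make the idea of iteratively reweighted Whittaker smoothing concrete, here is a minimal sketch of the asymmetric least squares scheme written directly with SciPy sparse matrices. It is not the pybaselines implementation; the function name `asls_sketch` is hypothetical, and only the `lam`, `p`, `max_iter`, and `tol` roles are taken from the signature above.

```python
import numpy as np
from scipy import sparse
from scipy.sparse.linalg import spsolve


def asls_sketch(y, lam=1e6, p=1e-2, max_iter=50, tol=1e-3):
    """Minimal AsLS: Whittaker smoothing with asymmetric reweighting."""
    y = np.asarray(y, dtype=float)
    n = y.size
    # Second-order difference matrix D with shape (n - 2, n).
    D = sparse.diags([1.0, -2.0, 1.0], [0, 1, 2], shape=(n - 2, n))
    penalty = lam * (D.T @ D)
    w = np.ones(n)
    for _ in range(max_iter):
        # Solve (W + lam * D'D) z = W y for the current weights.
        W = sparse.diags(w)
        z = spsolve((W + penalty).tocsc(), w * y)
        # Points above the fit (likely peaks) get the small weight p.
        w_new = np.where(y > z, p, 1 - p)
        if np.linalg.norm(w_new - w) / np.linalg.norm(w) < tol:
            break
        w = w_new
    return z, w


# Noise-free synthetic data: one peak on a linear baseline.
x = np.linspace(0, 1000, 500)
true_baseline = 10 + 0.01 * x
data = 100 * np.exp(-((x - 300) / 40) ** 2) + true_baseline

baseline, weights = asls_sketch(data)
print(f"mean abs error: {np.mean(np.abs(baseline - true_baseline)):.3f}")
```

Larger `lam` makes the baseline stiffer, while smaller `p` makes the fit ignore points above it more strongly.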
### Polynomial Fitting Methods

Polynomial-based baseline correction using various fitting strategies and outlier handling approaches. Suitable for simple baseline shapes and when interpretable parameters are desired.

```python { .api }
def poly(data, poly_order=2, weights=None, return_coef=False, x_data=None):
    """Basic polynomial baseline fitting."""

def modpoly(data, poly_order=2, tol=1e-3, max_iter=250, weights=None, use_original=False, mask_initial_peaks=True, num_std=1.0, x_data=None, return_coef=False):
    """Modified polynomial with iterative peak masking."""

def quant_reg(data, poly_order=2, quantile=0.05, tol=1e-6, max_iter=1000, weights=None, eps=None, x_data=None, return_coef=False):
    """Quantile regression polynomial baseline."""
```

[Polynomial Methods](./polynomial.md)

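The iterative strategy behind modified polynomial fitting can be sketched with NumPy alone: fit a polynomial, clip the data wherever it lies above the fit, and repeat. This is a simplified illustration under that assumption, not the library's code, and `modpoly_sketch` is a hypothetical name.

```python
import numpy as np
from numpy.polynomial import polynomial as P


def modpoly_sketch(x, y, poly_order=2, max_iter=250, tol=1e-3):
    """Iteratively fit a polynomial while clipping the data to the fit."""
    x = np.asarray(x, dtype=float)
    y_work = np.asarray(y, dtype=float).copy()
    # Map x onto [-1, 1] to keep the least-squares problem well conditioned.
    x_scaled = 2 * (x - x.min()) / (x.max() - x.min()) - 1
    baseline = np.zeros_like(y_work)
    for _ in range(max_iter):
        coef = P.polyfit(x_scaled, y_work, poly_order)
        new_baseline = P.polyval(x_scaled, coef)
        # Clip peaks: where the data sits above the fit, replace it by the fit.
        y_work = np.minimum(y_work, new_baseline)
        change = np.linalg.norm(new_baseline - baseline) / np.linalg.norm(new_baseline)
        baseline = new_baseline
        if change < tol:
            break
    return baseline


# Noise-free synthetic data: one peak on a quadratic baseline.
x = np.linspace(0, 1000, 500)
true_baseline = 10 + 0.01 * x + 1e-5 * x**2
data = 100 * np.exp(-((x - 200) / 50) ** 2) + true_baseline

baseline = modpoly_sketch(x, data)
peak_index = int(np.argmax(data))
print(f"data at peak: {data[peak_index]:.1f}, baseline at peak: {baseline[peak_index]:.1f}")
```

A single unweighted polynomial fit is pulled upward by the peak; the repeated clipping step is what pushes the fit back down toward the underlying drift.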
### Smoothing-Based Methods

Algorithms that use smoothing operations and iterative filtering to separate baseline from signal. Effective for noisy data and spectra with complex peak structures.

```python { .api }
def snip(data, max_half_window=None, decreasing=False, smooth_half_window=None, filter_order=2, x_data=None, pad_kwargs=None, **kwargs):
    """Statistics-sensitive Non-linear Iterative Peak-clipping (SNIP)."""

def swima(data, min_half_window=3, max_half_window=None, smooth_half_window=None, x_data=None, pad_kwargs=None, **kwargs):
    """Small-window moving average (SWiMA) baseline."""

def ipsa(data, half_window=None, max_iter=500, tol=None, roi=None, original_criteria=False, x_data=None, pad_kwargs=None, **kwargs):
    """Iterative Polynomial Smoothing Algorithm (IPSA)."""
```

[Smoothing Methods](./smooth.md)

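The peak-clipping step at the heart of SNIP is short enough to sketch in NumPy. The version below is a bare-bones illustration that assumes an increasing window and leaves out padding, smoothing, and any data transform; `snip_sketch` is a hypothetical name, not a pybaselines function.

```python
import numpy as np


def snip_sketch(y, max_half_window=30):
    """Minimal SNIP-style clipping with an increasing half-window."""
    baseline = np.asarray(y, dtype=float).copy()
    n = baseline.size
    for m in range(1, max_half_window + 1):
        # Average of the two neighbours m points away (interior points only).
        neighbour_mean = 0.5 * (baseline[: n - 2 * m] + baseline[2 * m:])
        # Keep whichever is lower: the current value or the neighbour average.
        baseline[m: n - m] = np.minimum(baseline[m: n - m], neighbour_mean)
    return baseline


# Noise-free synthetic data on an index axis: one peak on a linear baseline.
x = np.arange(300)
data = 100 * np.exp(-((x - 150) / 10) ** 2) + 10 + 0.02 * x

baseline = snip_sketch(data, max_half_window=40)
print(f"data at peak: {data[150]:.1f}, baseline at peak: {baseline[150]:.1f}")
```

The half-window should be at least as wide as the widest peak's half-width; otherwise the clipping cannot reach underneath that peak.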
### Morphological Methods

Mathematical morphology operations for baseline estimation using structuring elements and morphological transformations. Particularly effective for chromatographic and mass spectrometry data.

```python { .api }
def mor(data, half_window=None, x_data=None, pad_kwargs=None, **kwargs):
    """Morphological baseline using opening operation."""

def rolling_ball(data, half_window=None, x_data=None, pad_kwargs=None, **kwargs):
    """Rolling ball baseline algorithm."""

def tophat(data, half_window=None, x_data=None, pad_kwargs=None, **kwargs):
    """Top-hat morphological baseline."""
```

[Morphological Methods](./morphological.md)

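A morphological opening is an erosion (sliding minimum) followed by a dilation (sliding maximum) with the same flat structuring element. The sketch below shows that operation with NumPy only; it is a simplified stand-in for what the methods above build on, and `opening_sketch` is a hypothetical name.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def opening_sketch(y, half_window=25):
    """Morphological opening (erosion, then dilation) with a flat window."""
    y = np.asarray(y, dtype=float)
    width = 2 * half_window + 1
    # Edge-pad so each output keeps the input length.
    padded = np.pad(y, half_window, mode="edge")
    eroded = sliding_window_view(padded, width).min(axis=1)
    padded = np.pad(eroded, half_window, mode="edge")
    return sliding_window_view(padded, width).max(axis=1)


# Noise-free synthetic data on an index axis: one peak on a linear baseline.
x = np.arange(300)
data = 100 * np.exp(-((x - 150) / 10) ** 2) + 10 + 0.02 * x

baseline = opening_sketch(data, half_window=25)
print(f"data at peak: {data[150]:.1f}, baseline at peak: {baseline[150]:.1f}")
```

The opening never exceeds the data, and any feature narrower than the window is removed, which is why the half-window is the key parameter for this family.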
### Spline-Based Methods

Penalized spline methods that combine the flexibility of splines with various penalty functions. They offer a good balance between smoothness and data fidelity.

```python { .api }
def pspline_asls(data, lam=1e3, p=1e-2, num_knots=100, spline_degree=3, diff_order=2, max_iter=50, tol=1e-3, weights=None, x_data=None):
    """Penalized spline version of AsLS."""

def mixture_model(data, lam=1e5, p=1e-2, num_knots=100, spline_degree=3, diff_order=3, max_iter=50, tol=1e-3, weights=None, symmetric=False, num_bins=None):
    """Mixture model baseline using splines."""

def irsqr(data, lam=100, quantile=0.05, num_knots=100, spline_degree=3, diff_order=3, max_iter=100, tol=1e-6, weights=None, x_data=None):
    """Iterative reweighted spline quantile regression."""
```

[Spline Methods](./spline.md)

### Classification-Based Methods

Methods that classify data points as belonging to either the baseline or the peaks using statistical and heuristic approaches. Useful for automated baseline detection with little manual parameter tuning.

```python { .api }
def fabc(data, lam=1e5, diff_order=2, weights=None, weights_as_mask=False, x_data=None):
    """Fully automatic baseline correction."""

def dietrich(data, smooth_half_window=None, interp_half_window=None, max_iter=100, tol=1e-3, x_data=None, pad_kwargs=None, **kwargs):
    """Dietrich classification baseline."""

def cwt_br(data, poly_order=2, scales=None, num_std=1., ridge_kwargs=None, x_data=None):
    """Continuous wavelet transform baseline recognition."""
```

[Classification Methods](./classification.md)

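The classify-then-fit idea can be illustrated with a generic rule: label points whose local variability is low as baseline, then interpolate through them. The sketch below is not any of the specific algorithms listed above; the rolling-standard-deviation threshold and the name `classify_and_interpolate` are assumptions made for illustration.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def classify_and_interpolate(x, y, half_window=10, num_std=2.0):
    """Label low-variance points as baseline, then interpolate across peaks."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    width = 2 * half_window + 1
    padded = np.pad(y, half_window, mode="reflect")
    rolling_std = sliding_window_view(padded, width).std(axis=1)
    # Hypothetical threshold rule: a multiple of the median rolling std.
    is_baseline = rolling_std < num_std * np.median(rolling_std)
    is_baseline[[0, -1]] = True  # anchor both ends so interpolation spans all x
    baseline = np.interp(x, x[is_baseline], y[is_baseline])
    return baseline, is_baseline


# Synthetic data: one peak on a linear baseline, plus noise.
rng = np.random.default_rng(0)
x = np.arange(300, dtype=float)
data = 100 * np.exp(-((x - 150) / 10) ** 2) + 10 + 0.02 * x + rng.normal(0, 1, x.size)

baseline, mask = classify_and_interpolate(x, data)
print(f"{mask.sum()} of {mask.size} points classified as baseline")
```

The returned mask plays the same role as a binary weight array: points marked `False` are treated as peaks and do not influence the fitted baseline.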
### Miscellaneous Methods

Specialized algorithms for specific use cases, including sparsity-based simultaneous baseline estimation and denoising (BEADS) and interpolation between manually selected baseline points.

```python { .api }
def beads(data, lam_0=0.5, lam_1=5, lam_2=4, asymmetry=0.1, max_iter=15, tol=1e-3, x_data=None):
    """Baseline Estimation And Denoising with Sparsity (BEADS)."""

def interp_pts(data, baseline_points, interp_method='linear', x_data=None):
    """Interpolate baseline from manually selected points."""
```

[Miscellaneous Methods](./misc.md)

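Interpolating a baseline through hand-picked points amounts to a one-dimensional interpolation over the measurement's x-values. The sketch below shows the linear case with `numpy.interp`; it assumes the anchor points are given as `(x, y)` pairs, and `interp_points_sketch` is a hypothetical helper rather than the library function.

```python
import numpy as np


def interp_points_sketch(x, baseline_points):
    """Linearly interpolate a baseline through hand-picked (x, y) anchors."""
    points = np.asarray(baseline_points, dtype=float)
    order = np.argsort(points[:, 0])  # np.interp needs increasing x-values
    return np.interp(x, points[order, 0], points[order, 1])


x = np.linspace(0, 100, 11)
anchors = [(100, 20), (0, 10), (50, 12)]  # order does not matter
baseline = interp_points_sketch(x, anchors)
print(baseline)
```

Outside the range covered by the anchors, `numpy.interp` holds the first and last anchor values constant, so it helps to include points near both ends of the data.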
### Optimization Methods

Advanced methods for parameter optimization, collaborative baseline correction, and adaptive parameter selection for improved baseline fitting performance.

```python { .api }
def collab_pls(data, average_dataset=True, method='asls', method_kwargs=None, x_data=None):
    """Collaborative Penalized Least Squares."""

def adaptive_minmax(data, x_data=None, poly_order=None, method='modpoly', weights=None, constrained_fraction=0.01, constrained_weight=1e5, estimation_poly_order=2, method_kwargs=None):
    """Adaptive min-max baseline correction."""

def optimize_extended_range(data, x_data=None, method='asls', side='both', width_scale=0.1, height_scale=1., sigma_scale=1./12., min_value=2, max_value=8, step=1, pad_kwargs=None, method_kwargs=None):
    """Optimize parameters using extended range."""
```

[Optimization Methods](./optimizers.md)

### Two-Dimensional Methods

2D versions of most of the baseline correction algorithms, for processing images, 2D spectra, and other two-dimensional experimental data.

```python { .api }
class Baseline2D:
    """Main interface for 2D baseline correction algorithms."""
    def __init__(self, x_data=None, z_data=None, check_finite=True, assume_sorted=False, output_dtype=None):
        """Initialize 2D baseline corrector."""
```

[Two-Dimensional Methods](./two-d.md)

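As a rough picture of what a 2D baseline looks like, the sketch below fits a polynomial surface to gridded data by ordinary least squares using NumPy only. It does not use `Baseline2D`; the helper name `poly2d_baseline` and the convention that `data` has shape `(len(x), len(z))` are assumptions made for this example.

```python
import numpy as np
from numpy.polynomial import polynomial as P


def poly2d_baseline(x, z, data, order=2):
    """Least-squares polynomial surface with terms x**i * z**j, i, j <= order."""
    x = np.asarray(x, dtype=float)
    z = np.asarray(z, dtype=float)
    data = np.asarray(data, dtype=float)
    # polyvander2d expects paired coordinates, so expand the axes to a grid.
    xx, zz = np.meshgrid(x, z, indexing="ij")
    vander = P.polyvander2d(xx.ravel(), zz.ravel(), [order, order])
    coef, *_ = np.linalg.lstsq(vander, data.ravel(), rcond=None)
    return (vander @ coef).reshape(data.shape)


# Smooth background surface plus one narrow 2D peak.
x = np.linspace(-1, 1, 30)
z = np.linspace(-1, 1, 20)
xx, zz = np.meshgrid(x, z, indexing="ij")
surface = 1 + 2 * xx + 0.5 * zz**2 + xx * zz
data = surface + 5 * np.exp(-(xx**2 + zz**2) / 0.01)

fit_clean = poly2d_baseline(x, z, surface)  # recovers the surface itself
fit_peaky = poly2d_baseline(x, z, data)     # biased upward by the peak
print(f"max bias from the peak: {np.max(np.abs(fit_peaky - surface)):.3f}")
```

A plain least-squares surface is pulled toward peaks, which is the same problem the 1D algorithms address with reweighting, clipping, or masking.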
## Common Types

```python { .api }
import numpy as np

# Common return type for all baseline correction methods
BaselineResult = tuple[np.ndarray, dict]

# Weight arrays for iterative methods
WeightArray = np.ndarray

# Parameter dictionaries contain method-specific results
class ParameterDict(dict):
    """
    Dictionary containing method results and convergence information.

    Common keys:
    - 'weights': Final weight array used for fitting
    - 'tol_history': Convergence tolerance history for iterative methods
    - Method-specific parameters vary by algorithm
    """

# Common parameter types
class PaddingKwargs(dict):
    """
    Padding parameters for edge handling in windowed methods.

    Common keys:
    - 'mode': Padding mode ('reflect', 'constant', 'edge', etc.)
    - 'constant_values': Values to use for constant padding
    """
```
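
The padding modes named above match the modes of `numpy.pad`. As a small illustration, and assuming the padding keywords are ultimately passed through to `numpy.pad` (an assumption, not something confirmed here), the three listed modes extend the edges of an array as follows.

```python
import numpy as np

data = np.array([1.0, 2.0, 4.0, 8.0])
half_window = 2  # number of points added on each side

# Mirror the data about each end point, without repeating the end point.
reflect = np.pad(data, half_window, mode="reflect")
# Repeat the first and last values.
edge = np.pad(data, half_window, mode="edge")
# Fill with a fixed value given by constant_values.
constant = np.pad(data, half_window, mode="constant", constant_values=0.0)

print(reflect)   # [4. 2. 1. 2. 4. 8. 4. 2.]
print(edge)      # [1. 1. 1. 2. 4. 8. 8. 8.]
print(constant)  # [0. 0. 1. 2. 4. 8. 0. 0.]
```

The choice matters most near the ends of the data, where windowed methods would otherwise have fewer points to work with.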