# Classification-Based Methods

Classification-based baseline correction algorithms label each data point as belonging to either a baseline or a peak region, using statistical, morphological, or signal-processing criteria. Because they rely on pattern recognition and thresholding to separate baseline from peaks, they are best suited to data with well-defined spectral features.
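
The classify-then-estimate idea can be illustrated with a minimal, dependency-free sketch. It assumes the simplest possible classifier (a fixed threshold) and plain linear interpolation across the points flagged as peaks; `interpolate_baseline` is a hypothetical helper written for this illustration and is not part of pybaselines.

```python
def interpolate_baseline(y, is_baseline):
    """Linearly interpolate y across points not flagged as baseline.

    Hypothetical helper for illustration only; not a pybaselines function.
    """
    n = len(y)
    known = [i for i in range(n) if is_baseline[i]]
    if not known:
        raise ValueError("at least one point must be classified as baseline")
    baseline = list(y)
    # Outside the first/last baseline point, hold the nearest baseline value.
    for i in range(known[0]):
        baseline[i] = y[known[0]]
    for i in range(known[-1] + 1, n):
        baseline[i] = y[known[-1]]
    # Between consecutive baseline points, draw a straight line.
    for left, right in zip(known, known[1:]):
        for i in range(left + 1, right):
            frac = (i - left) / (right - left)
            baseline[i] = y[left] + frac * (y[right] - y[left])
    return baseline


# Flat baseline of 1.0 with a single peak in the middle.
y = [1.0, 1.0, 5.0, 9.0, 5.0, 1.0, 1.0]
mask = [value < 2.0 for value in y]  # crude fixed-threshold classifier
baseline = interpolate_baseline(y, mask)
corrected = [yi - bi for yi, bi in zip(y, baseline)]
```

The methods documented below differ mainly in how they build the mask; the step from mask to baseline is usually some form of interpolation or fitting through the baseline points.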

## Capabilities

### Dietrich Classification Baseline

Statistical classification method using local statistics and interpolation for baseline estimation.

```python { .api }
def dietrich(data, smooth_half_window=None, interp_half_window=None, max_iter=100, tol=1e-3, x_data=None, pad_kwargs=None, **kwargs):
    """
    Dietrich classification baseline correction.

    Parameters:
    - data (array-like): Input y-values to fit baseline
    - smooth_half_window (int, optional): Half-window size for initial smoothing
    - interp_half_window (int, optional): Half-window size for interpolation
    - max_iter (int): Maximum iterations for convergence
    - tol (float): Convergence tolerance for baseline changes
    - x_data (array-like, optional): Input x-values
    - pad_kwargs (dict, optional): Padding parameters for edge handling
    - **kwargs: Additional padding and processing parameters

    Returns:
    tuple: (baseline, params) with classification statistics and convergence history
    """
```
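
As a rough illustration of statistical classification, the sketch below repeatedly thresholds a derivative-based measure at mean plus a multiple of the standard deviation until the set of baseline points stops changing. It is a simplified stand-in and not the pybaselines implementation: the function name, the `num_std` parameter, and the choice of the larger squared neighbor difference as the measure are all assumptions made for this example.

```python
import statistics


def iterative_threshold_mask(y, num_std=3.0, max_iter=100):
    """Return True for points classified as baseline.

    Hypothetical sketch: the measure is the larger of the squared backward
    and forward differences, thresholded at mean + num_std * std.
    """
    n = len(y)
    power = []
    for i in range(n):
        candidates = []
        if i > 0:
            candidates.append((y[i] - y[i - 1]) ** 2)
        if i < n - 1:
            candidates.append((y[i + 1] - y[i]) ** 2)
        power.append(max(candidates))

    mask = [True] * n
    for _ in range(max_iter):
        kept = [p for p, m in zip(power, mask) if m]
        cutoff = statistics.mean(kept) + num_std * statistics.pstdev(kept)
        new_mask = [m and p <= cutoff for p, m in zip(power, mask)]
        if new_mask == mask:  # converged: no further points were removed
            break
        mask = new_mask
    return mask


# Small alternating "noise" with a three-point peak centered at index 30.
y = [0.1 * (i % 2) for i in range(60)]
y[29:32] = [5.0, 10.0, 5.0]
mask = iterative_threshold_mask(y)
peak_indices = [i for i, m in enumerate(mask) if not m]
```

In this sketch the two baseline points adjacent to the peak are also flagged, because their difference to the peak flank is large. Erring on the side of excluding points near a peak is generally harmless when the baseline is later interpolated across the gap.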

### Golotvin Classification

Morphological classification approach using local minima detection for baseline region identification.

```python { .api }
def golotvin(data, half_window=None, x_data=None, pad_kwargs=None, **kwargs):
    """
    Golotvin classification baseline using morphological operations.

    Parameters:
    - data (array-like): Input y-values to fit baseline
    - half_window (int, optional): Half-window size for morphological operations
    - x_data (array-like, optional): Input x-values
    - pad_kwargs (dict, optional): Padding parameters for edge handling
    - **kwargs: Additional morphological processing parameters

    Returns:
    tuple: (baseline, params) with morphological operation details
    """
```
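
One way to picture a window-based morphological classifier is to compare the rolling maximum (dilation) with the rolling minimum (erosion) over `2 * half_window + 1` points and treat a small difference as a sign of a baseline region. The sketch below shows that idea only; `rolling_range`, `classify_by_range`, and the `max_range` cutoff are hypothetical names introduced here, and how the cutoff is chosen in practice is left open.

```python
def rolling_range(y, half_window):
    """Rolling maximum minus rolling minimum, with windows truncated at the edges."""
    ranges = []
    for i in range(len(y)):
        window = y[max(0, i - half_window): i + half_window + 1]
        ranges.append(max(window) - min(window))
    return ranges


def classify_by_range(y, half_window, max_range):
    """Flag a point as baseline when its local range is below max_range."""
    return [r < max_range for r in rolling_range(y, half_window)]


y = [0.0, 0.2, 0.1, 0.0, 0.2, 3.0, 6.0, 3.0, 0.1, 0.2, 0.0, 0.1]
mask = classify_by_range(y, half_window=1, max_range=0.5)
```

A larger `half_window` widens the region excluded around each peak, which is why the half-window is usually tied to the expected peak width.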

### Standard Deviation Distribution

Classification based on local standard deviation patterns to identify baseline regions.

```python { .api }
def std_distribution(data, half_window=None, x_data=None, pad_kwargs=None, **kwargs):
    """
    Standard deviation distribution baseline classification.

    Parameters:
    - data (array-like): Input y-values to fit baseline
    - half_window (int, optional): Half-window size for standard deviation calculation
    - x_data (array-like, optional): Input x-values
    - pad_kwargs (dict, optional): Padding parameters for windowing operations
    - **kwargs: Additional statistical processing parameters

    Returns:
    tuple: (baseline, params) with local variance statistics
    """
```
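
The role of `half_window` in a standard-deviation-based classifier can be sketched as follows. This is an illustrative simplification, not the library's algorithm: it assumes that most windows contain only noise, so the median rolling standard deviation approximates the noise level, and it uses a hypothetical `factor` parameter to set the cutoff.

```python
import statistics


def rolling_std(y, half_window):
    """Population standard deviation in a centered window, truncated at the edges."""
    return [
        statistics.pstdev(y[max(0, i - half_window): i + half_window + 1])
        for i in range(len(y))
    ]


def classify_by_std(y, half_window, factor=2.0):
    """Flag points whose rolling std is at most factor times the median rolling std.

    `factor` is a hypothetical tuning knob introduced for this sketch.
    """
    stds = rolling_std(y, half_window)
    cutoff = factor * statistics.median(stds)
    return [s <= cutoff for s in stds]


y = [0.0, 0.1, 0.0, 0.1, 0.0, 0.1, 4.0, 8.0, 4.0, 0.1, 0.0, 0.1, 0.0, 0.1, 0.0]
mask = classify_by_std(y, half_window=1)
peak_indices = [i for i, m in enumerate(mask) if not m]
```

The median-based cutoff only works when peaks occupy less than half of the windows; for peak-dense data a different noise estimate would be needed.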

### Fast Chromatographic Baseline (FastChrom)

Rapid classification method optimized for chromatographic data with threshold-based peak detection.

```python { .api }
def fastchrom(data, half_window=None, threshold=None, x_data=None, pad_kwargs=None, **kwargs):
    """
    Fast chromatographic baseline correction with threshold classification.

    Parameters:
    - data (array-like): Input y-values to fit baseline
    - half_window (int, optional): Half-window size for local analysis
    - threshold (float, optional): Classification threshold for peak detection
    - x_data (array-like, optional): Input x-values
    - pad_kwargs (dict, optional): Padding parameters for windowing
    - **kwargs: Additional threshold and processing parameters

    Returns:
    tuple: (baseline, params) with threshold statistics and classification results
    """
```

### Fully Automatic Baseline Correction (FABC)

Automated classification and correction method requiring minimal parameter tuning for robust baseline estimation.

```python { .api }
def fabc(data, lam=1e5, diff_order=2, weights=None, weights_as_mask=False, x_data=None):
    """
    Fully automatic baseline correction using intelligent classification.

    Parameters:
    - data (array-like): Input y-values to fit baseline
    - lam (float): Smoothing parameter for regularization
    - diff_order (int): Order of difference penalty matrix
    - weights (array-like, optional): Initial weight array or mask
    - weights_as_mask (bool): Whether to treat weights as binary mask
    - x_data (array-like, optional): Input x-values

    Returns:
    tuple: (baseline, params) with automatic parameter selection results
    """
```
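
To make `lam` and `diff_order` more concrete, the sketch below evaluates the roughness term of a penalized least squares fit, assuming the two parameters play their usual roles: `lam` scales the sum of squared finite differences of order `diff_order`. The helper names are hypothetical, and the sketch covers only the penalty, not the weighted fit itself.

```python
def finite_difference(values, order):
    """Apply the first-difference operator `order` times."""
    diffs = list(values)
    for _ in range(order):
        diffs = [b - a for a, b in zip(diffs, diffs[1:])]
    return diffs


def roughness_penalty(baseline, lam, diff_order=2):
    """lam times the sum of squared differences of the given order."""
    return lam * sum(d * d for d in finite_difference(baseline, diff_order))


line = [2.0 * i + 1.0 for i in range(10)]   # straight line
curve = [float(i * i) for i in range(10)]   # parabola

penalty_line = roughness_penalty(line, lam=1e5, diff_order=2)
penalty_curve = roughness_penalty(curve, lam=1e5, diff_order=2)
penalty_curve_third = roughness_penalty(curve, lam=1e5, diff_order=3)
```

With `diff_order=2` a straight line costs nothing while a parabola is penalized, and with `diff_order=3` the parabola is free as well. A higher difference order therefore lets the baseline follow more curvature for the same `lam`.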

### Continuous Wavelet Transform Baseline Recognition (CWT-BR)

Advanced signal processing approach using wavelet transforms for multi-scale baseline-peak classification.

```python { .api }
def cwt_br(data, poly_order=2, scales=None, num_std=1.0, ridge_kwargs=None, x_data=None):
    """
    Continuous wavelet transform baseline recognition.

    Parameters:
    - data (array-like): Input y-values to fit baseline
    - poly_order (int): Order of polynomial for final baseline fitting
    - scales (array-like, optional): Wavelet scales for multi-resolution analysis
    - num_std (float): Number of standard deviations for ridge detection threshold
    - ridge_kwargs (dict, optional): Additional parameters for ridge detection
    - x_data (array-like, optional): Input x-values

    Returns:
    tuple: (baseline, params) with wavelet analysis results and ridge detection info
    """
```

## Usage Examples

### Automatic baseline correction with FABC:

```python
import numpy as np
from pybaselines.classification import fabc

# Sample chromatographic data with multiple peaks
rng = np.random.default_rng(0)  # seeded so the example is reproducible
x = np.linspace(0, 500, 2000)
baseline_true = 10 + 0.02 * x + 0.00005 * x**2
peak1 = 150 * np.exp(-((x - 100) / 15)**2)
peak2 = 200 * np.exp(-((x - 250) / 20)**2)
peak3 = 120 * np.exp(-((x - 400) / 12)**2)
data = baseline_true + peak1 + peak2 + peak3 + rng.normal(0, 2, len(x))

# Baseline correction with FABC; lam controls how smooth the fitted baseline is
baseline, params = fabc(data, lam=1e5)
corrected = data - baseline

# Compare the estimate with the known synthetic baseline
print(f"Max absolute baseline error: {np.abs(baseline - baseline_true).max():.2f}")
```

### Wavelet-based classification:

```python
import numpy as np
from pybaselines.classification import cwt_br

# Multi-scale wavelet analysis for complex spectra (reuses `data` from the FABC example)
scales = np.arange(1, 20)  # wavelet scales 1 through 19
baseline, params = cwt_br(data, poly_order=3, scales=scales, num_std=1.5)
corrected = data - baseline

# Inspect what the method reports; key names can vary between pybaselines versions
print(f"Returned parameter keys: {sorted(params)}")
```

### Fast chromatographic analysis:

```python
from pybaselines.classification import fastchrom

# Rapid baseline correction for high-throughput analysis
baseline, params = fastchrom(data, half_window=20, threshold=0.1)
corrected = data - baseline

# Optimized for speed while maintaining accuracy
```

### Statistical classification with standard deviation:

```python
from pybaselines.classification import std_distribution

# Identify baseline regions based on local variance
baseline, params = std_distribution(data, half_window=25)
corrected = data - baseline

# Works well when baseline regions have consistent noise levels
```

### Morphological classification:

```python
from pybaselines.classification import golotvin

# Use morphological operations to find baseline points
baseline, params = golotvin(data, half_window=15)
corrected = data - baseline

# Effective for data with clear morphological differences
```

### Iterative statistical classification:

```python
from pybaselines.classification import dietrich

# Iterative approach with statistical smoothing and interpolation
baseline, params = dietrich(data, smooth_half_window=10, interp_half_window=30, max_iter=50)
corrected = data - baseline

print(f"Converged in {len(params.get('tol_history', []))} iterations")
```