# Optimization Methods

Advanced baseline correction algorithms that use optimization strategies, parameter tuning, and collaborative approaches to improve baseline estimation. These methods enhance existing algorithms through automated parameter selection, multi-dataset collaboration, and adaptive parameter adjustment based on data characteristics.

## Capabilities

### Collaborative Penalized Least Squares (Collab-PLS)

Enhances baseline correction for groups of related measurements by sharing weighting information across multiple datasets, so that every dataset informs the baseline fit of each individual one.

```python { .api }
def collab_pls(data, average_dataset=True, method='asls', method_kwargs=None, x_data=None):
    """
    Collaborative Penalized Least Squares for enhanced baseline correction.

    Uses collaborative filtering principles to improve baseline estimation by
    combining weighting information from multiple related datasets.

    Parameters:
    - data (array-like): A list or 2D array of related datasets, shape (M, N)
      for M datasets of N points each; information is pooled across all of them
    - average_dataset (bool): If True, fit the averaged dataset once and apply
      the resulting weights to each individual dataset; if False, fit each
      dataset separately and average the resulting weights
    - method (str): Base baseline correction method to enhance
      Options: 'asls', 'airpls', 'arpls', 'iarpls', etc.
    - method_kwargs (dict, optional): Parameters for the base correction method
    - x_data (array-like, optional): Input x-values, shared by all datasets

    Returns:
        tuple: (baselines, params) with one baseline per input dataset
        Additional keys: 'average_weights' plus the base method's usual output
    """
```
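
The collaborative idea can be sketched without the library: estimate one shared baseline from the average of several related datasets, then reuse it for each individual dataset. In this NumPy-only sketch, `modpoly_like` is a deliberately simplified, hypothetical stand-in for the penalized least-squares solvers that `collab_pls` actually wraps:

```python
import numpy as np

def modpoly_like(y, x, poly_order=2, iterations=50):
    """Iterative polynomial baseline: refit after clipping points above the fit."""
    work = np.asarray(y, dtype=float).copy()
    for _ in range(iterations):
        fit = np.polyval(np.polyfit(x, work, poly_order), x)
        work = np.minimum(work, fit)  # suppress peaks so the next fit hugs the baseline
    return fit

rng = np.random.default_rng(0)
x = np.linspace(0, 100, 500)
true_base = 5 + 0.1 * x
datasets = [true_base + 40 * np.exp(-((x - 30 - 10 * i) / 3) ** 2)
            + rng.normal(0, 0.2, x.size) for i in range(4)]

# Collaborative step: one shared baseline shape from the averaged data,
# offset per dataset -- a crude analogue of sharing weights across datasets
mean_signal = np.mean(datasets, axis=0)
shared_base = modpoly_like(mean_signal, x)
per_dataset = [shared_base + np.median(d - shared_base) for d in datasets]
```

Averaging suppresses peaks that sit at different positions in each dataset, which is why the shared fit is more robust than fitting each noisy dataset in isolation.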
### Optimize Extended Range

Automatically selects a method's main fitting parameter by extending the data with a region whose ideal baseline is known, testing parameter values across a range, and keeping the value that best reproduces that known extension.

```python { .api }
def optimize_extended_range(data, x_data=None, method='asls', side='both', width_scale=0.1, height_scale=1.0, sigma_scale=1.0/12.0, min_value=2, max_value=8, step=1, pad_kwargs=None, method_kwargs=None):
    """
    Optimize baseline correction parameters using extended range testing.

    Extends the data on one or both sides with a linear baseline region plus a
    known Gaussian peak, then systematically tests parameter values and selects
    the one with the lowest error on the extension.

    Parameters:
    - data (array-like): Input y-values to fit baseline
    - x_data (array-like, optional): Input x-values
    - method (str): Baseline correction method to optimize
      Options: 'asls', 'airpls', 'arpls', 'modpoly', etc.
    - side (str): Which side(s) of the data to extend
      Options: 'both', 'left', 'right'
    - width_scale (float): Fraction of the data length used for the width of the added region
    - height_scale (float): Scale factor for the height of the added Gaussian peak
    - sigma_scale (float): Scale factor for the added Gaussian's sigma relative to the added region
    - min_value (int): Minimum of the tested range; for Whittaker-type methods
      the values are log10(lam) (e.g. 2 tests lam=1e2), for polynomial methods
      they are polynomial orders
    - max_value (int): Maximum of the tested range
    - step (int): Step size between tested values
    - pad_kwargs (dict, optional): Padding parameters for edge handling
    - method_kwargs (dict, optional): Additional method-specific parameters

    Returns:
        tuple: (baseline, params) with optimization results
        Additional keys: 'optimal_parameter' (best tested value) and 'min_rmse'
    """
```
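
The core trick, scoring each candidate setting by its error on an extension whose true baseline is known, can be sketched in plain NumPy. Here the polynomial order stands in for the searched parameter, and `clipped_polyfit` is a hypothetical, simplified baseline fitter rather than the library's solver:

```python
import numpy as np

def clipped_polyfit(xv, yv, order, iterations=20):
    """Simplified iterative baseline fit: refit after clipping points above the fit."""
    work = yv.copy()
    for _ in range(iterations):
        fit = np.polyval(np.polyfit(xv, work, order), xv)
        work = np.minimum(work, fit)
    return fit

rng = np.random.default_rng(1)
x = np.linspace(0, 100, 400)
signal = 2 + 0.05 * x + 30 * np.exp(-((x - 50) / 4) ** 2) + rng.normal(0, 0.1, x.size)

# Extend both sides by continuing the edge trend linearly: on the extension
# the ideal baseline is known, so the fit error there is an objective score
n_pad, n_edge = 80, 60
dx = x[1] - x[0]
left_x = x[0] - dx * np.arange(n_pad, 0, -1)
right_x = x[-1] + dx * np.arange(1, n_pad + 1)
left_y = np.polyval(np.polyfit(x[:n_edge], signal[:n_edge], 1), left_x)
right_y = np.polyval(np.polyfit(x[-n_edge:], signal[-n_edge:], 1), right_x)
x_ext = np.concatenate([left_x, x, right_x])
y_ext = np.concatenate([left_y, signal, right_y])

best_order, best_rmse = None, np.inf
for order in range(1, 6):  # the tested range, like min_value..max_value
    fit = clipped_polyfit(x_ext, y_ext, order)
    residual = np.concatenate([fit[:n_pad] - left_y, fit[-n_pad:] - right_y])
    rmse = float(np.sqrt(np.mean(residual ** 2)))
    if rmse < best_rmse:
        best_order, best_rmse = order, rmse
```

Because the extension carries no peaks, a parameter that over- or under-smooths the real data shows up directly as error on the extension.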
### Adaptive MinMax

Automatically chooses polynomial fitting parameters by comparing constrained and unconstrained fits at multiple polynomial orders, adapting to the data's range and distribution.

```python { .api }
def adaptive_minmax(data, x_data=None, poly_order=None, method='modpoly', weights=None, constrained_fraction=0.01, constrained_weight=1e5, estimation_poly_order=2, method_kwargs=None):
    """
    Adaptive min-max baseline correction with automatic parameter adjustment.

    Fits baselines at two polynomial orders, both with and without constraining
    the fit to follow the data at the edges, and combines the results for a
    robust baseline-peak separation.

    Parameters:
    - data (array-like): Input y-values to fit baseline
    - x_data (array-like, optional): Input x-values
    - poly_order (int or sequence of two ints, optional): Polynomial order(s) to use
      If None, automatically estimated from the data
    - method (str): Base polynomial method to use
      Options: 'modpoly', 'imodpoly'
    - weights (array-like, optional): Initial weight array
    - constrained_fraction (float): Fraction of points at each end of the data
      that the constrained fits are forced to follow
    - constrained_weight (float): Weight applied to those constrained points
    - estimation_poly_order (int): Polynomial order for the initial estimate
      used when poly_order is None
    - method_kwargs (dict, optional): Additional parameters for the base method

    Returns:
        tuple: (baseline, params) with the selected parameters
        Additional keys: 'poly_order' (the orders used) and the constrained weights
    """
```
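
One simple way to automate the `poly_order` choice (a hedged sketch of the general idea, not the exact heuristic the library uses): raise the order until the residual error stops improving meaningfully, then keep the smallest adequate order. The `pick_poly_order` helper below is hypothetical:

```python
import numpy as np

def pick_poly_order(xv, yv, max_order=6, tol=0.05):
    """Smallest order whose residual RMSE is within tol of the next order's."""
    rmses = []
    for order in range(max_order + 1):
        fit = np.polyval(np.polyfit(xv, yv, order), xv)
        rmses.append(np.sqrt(np.mean((yv - fit) ** 2)))
    for order in range(max_order):
        if rmses[order] <= rmses[order + 1] * (1 + tol):
            return order  # the next order adds < tol relative improvement
    return max_order

rng = np.random.default_rng(2)
x = np.linspace(-1, 1, 300)
y = 1 + 0.5 * x + 2.0 * x ** 2 + rng.normal(0, 0.05, x.size)  # quadratic baseline
order = pick_poly_order(x, y)
```

On a noisy quadratic baseline this plateaus at order 2: going to order 3 only fits noise, so the relative RMSE improvement falls below the tolerance.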
### Custom Baseline Correction

Provides customizable baseline correction with region-specific treatment and optional subsampling, which is useful when different parts of a dataset need different amounts of fitting flexibility.

```python { .api }
def custom_bc(data, x_data=None, method='asls', regions=((None, None),), sampling=1, lam=None, diff_order=2, method_kwargs=None):
    """
    Customized baseline correction with region-specific parameter control.

    Enables fine-grained control over baseline correction by subsampling
    specific regions of the data before fitting, then interpolating the
    baseline back onto the full grid.

    Parameters:
    - data (array-like): Input y-values to fit baseline
    - x_data (array-like, optional): Input x-values
    - method (str): Baseline correction method to apply
      Options: 'asls', 'airpls', 'arpls', 'modpoly', 'imodpoly', etc.
    - regions (tuple of tuples): Index ranges to treat specially
      Each tuple: (start, end) indices; use (None, None) for the entire dataset
    - sampling (int): Subsampling factor within the given regions
      sampling=1 uses all points, sampling=2 uses every 2nd point, etc.
    - lam (float, optional): Smoothing parameter for re-smoothing the
      interpolated baseline (method-dependent)
    - diff_order (int): Order of the difference penalty for the re-smoothing step
    - method_kwargs (dict, optional): Additional method-specific parameters

    Returns:
        tuple: (baseline, params) with region-specific correction details;
        additional keys depend on the base method
    """
```
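
The sampling idea in `custom_bc`, fitting on a subsampled grid for speed and interpolating the baseline back onto every point, is easy to illustrate. This is a NumPy sketch under simplified assumptions (an iterative clipped polynomial stands in for the library's penalized fits):

```python
import numpy as np

def sampled_baseline(xv, yv, sampling=8, poly_order=1, iterations=20):
    """Fit a clipped-polynomial baseline on every `sampling`-th point,
    then linearly interpolate it back onto the full grid."""
    xs, work = xv[::sampling], yv[::sampling].copy()
    for _ in range(iterations):
        fit = np.polyval(np.polyfit(xs, work, poly_order), xs)
        work = np.minimum(work, fit)  # clip peaks above the current fit
    return np.interp(xv, xs, fit)     # back onto the full grid

rng = np.random.default_rng(3)
x = np.linspace(0, 200, 2000)
y = 3 + 0.02 * x + 25 * np.exp(-((x - 120) / 5) ** 2) + rng.normal(0, 0.2, x.size)

base = sampled_baseline(x, y, sampling=8)  # fits 250 points instead of 2000
corrected = y - base
```

For smooth baselines on equally spaced data, the subsampled fit is nearly indistinguishable from the full fit at a fraction of the cost.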
## Usage Examples

### Collaborative baseline correction with multiple datasets:

```python
import numpy as np
from pybaselines.optimizers import collab_pls

# Multiple related spectroscopic datasets (e.g., time series or batch measurements)
rng = np.random.default_rng(0)
x = np.linspace(0, 1000, 1000)
datasets = []
for i in range(5):
    baseline = 10 + 0.02 * x + 0.00001 * x**2 + 2 * np.sin(0.01 * x * (1 + 0.1 * i))
    peaks = 100 * np.exp(-((x - 300 - 50 * i) / 30)**2)
    noise = rng.normal(0, 1, len(x))
    datasets.append(baseline + peaks + noise)

# Collaborative correction leverages information across all datasets;
# one baseline is returned per input dataset
baselines, params = collab_pls(datasets, average_dataset=True, method='asls',
                               method_kwargs={'lam': 1e6, 'p': 0.01})
print(f"Fitted {len(baselines)} baselines using shared weighting")
```

### Automatic parameter optimization:

```python
from pybaselines.optimizers import optimize_extended_range

# Sample complex spectroscopic data
x = np.linspace(0, 2000, 2000)
baseline = 50 + 0.01 * x + 0.000005 * x**2
peaks = (200 * np.exp(-((x - 400) / 60)**2) +
         150 * np.exp(-((x - 800) / 40)**2) +
         180 * np.exp(-((x - 1200) / 50)**2) +
         120 * np.exp(-((x - 1600) / 35)**2))
data = baseline + peaks + np.random.normal(0, 2, len(x))

# Automatically find the optimal smoothing parameter; for 'asls' the tested
# values are log10(lam), so this searches lam from 1e3 to 1e7
fit, params = optimize_extended_range(data, method='asls',
                                      min_value=3, max_value=7, step=1)

print(f"Optimal parameter: {params['optimal_parameter']}")
print(f"Minimum RMSE on the extended region: {params['min_rmse']:.3f}")
```

### Adaptive baseline correction based on data characteristics:

```python
from pybaselines.optimizers import adaptive_minmax

# Characterize the dynamic range of the data from the previous example
data_range = np.max(data) - np.min(data)
baseline_level = np.percentile(data, 5)

# Automatically adapt the polynomial order to the data
fit, params = adaptive_minmax(data, method='modpoly',
                              constrained_fraction=0.02,
                              constrained_weight=1e5)

print(f"Polynomial order(s) used: {params.get('poly_order')}")
print(f"Data range: {data_range:.1f}, baseline level: {baseline_level:.1f}")
```

### Region-specific custom baseline correction:

```python
from pybaselines.optimizers import custom_bc

# custom_bc varies the treatment of regions for a single method; to use a
# different method per region, slice the data manually and stitch the results
regions = [(0, 300), (300, 700), (700, 1200), (1200, 2000)]
region_methods = ['asls', 'airpls', 'modpoly', 'asls']
region_params = [
    {'lam': 1e5, 'p': 0.01},   # Region 1: gentle smoothing
    {'lam': 1e6},              # Region 2: automatic airPLS
    {'poly_order': 2},         # Region 3: polynomial fitting
    {'lam': 1e7, 'p': 0.001}   # Region 4: strong smoothing
]

# Apply region-specific correction
baseline_total = np.zeros_like(data)
for i, (start, end) in enumerate(regions):
    region_data = data[start:end]
    region_x = x[start:end]

    baseline_region, params_region = custom_bc(
        region_data, x_data=region_x, method=region_methods[i],
        regions=((None, None),), method_kwargs=region_params[i]
    )
    baseline_total[start:end] = baseline_region

corrected = data - baseline_total
print(f"Applied {len(regions)} different correction strategies")
```

### Collaborative optimization with parameter tuning:

```python
# Combine the collaborative approach with parameter optimization
datasets_subset = datasets[:3]  # Use a subset of the earlier datasets

# First, fit the related datasets collaboratively for consistent weighting
baselines_collab, params_collab = collab_pls(datasets_subset, method='asls')

# Then tune the smoothing parameter on a single dataset via extended range
# testing (lam is searched automatically, so only pass the remaining options)
baseline_final, params_final = optimize_extended_range(
    datasets_subset[0], method='asls', method_kwargs={'p': 0.01}
)

print("Combined collaborative fitting with parameter optimization")
print(f"Minimum RMSE from optimization: {params_final.get('min_rmse', 'N/A')}")
```
```