A library of algorithms for the baseline correction of experimental data.
Advanced baseline correction algorithms that use optimization strategies, parameter tuning, and collaborative approaches to improve baseline estimation. These methods enhance existing algorithms through automated parameter selection, multi-dataset collaboration, and adaptive parameter adjustment based on data characteristics.
Enhances baseline correction by leveraging shared information across multiple related datasets, fitting them collectively instead of one at a time.
def collab_pls(data, average_dataset=True, method='asls', method_kwargs=None, x_data=None):
"""
Collaborative Penalized Least Squares for enhanced baseline correction.
Improves baseline estimation for a set of related measurements by deriving
shared weights from every entry and applying them to each individual fit.
Parameters:
- data (array-like, shape (M, N)): A set of M related datasets, each with N y-values
Supply a 2D array or a list of equal-length 1D arrays; the entries are fit collectively
- average_dataset (bool): If True, fit the averaged dataset to derive the shared weights;
if False, fit each entry individually and average the resulting weights
- method (str): Base baseline correction method to enhance
Options: 'asls', 'airpls', 'arpls', 'iarpls', etc.
- method_kwargs (dict, optional): Parameters for the base correction method
- x_data (array-like, optional): Input x-values (single array or list matching data)
Returns:
tuple: (baselines, params) where baselines has shape (M, N), one baseline per entry
Additional keys: 'average_weights' (the shared weights used across entries), plus any keys returned by the base method
"""

Automatically determines optimal parameters by testing a method across an extended parameter range and selecting the best-performing value.
def optimize_extended_range(data, x_data=None, method='asls', side='both', width_scale=0.1, height_scale=1.0, sigma_scale=1.0/12.0, min_value=2, max_value=8, step=1, pad_kwargs=None, method_kwargs=None):
"""
Optimize baseline correction parameters using extended range testing.
Extends the data with a known baseline and Gaussian peak, then systematically
varies the method's key parameter, selecting the value whose fit best
reproduces the known extension.
Parameters:
- data (array-like): Input y-values to fit baseline
- x_data (array-like, optional): Input x-values
- method (str): Baseline correction method to optimize
Options: 'asls', 'airpls', 'arpls', 'modpoly', etc.
- side (str): Side(s) of the data to extend for the optimization test
Options: 'both', 'left', 'right'
- width_scale (float): Fraction of the data length used as the width of each added extension
- height_scale (float): Scale factor for the height of the Gaussian peak added to the extension
- sigma_scale (float): Scale factor for the sigma of the added Gaussian peak
- min_value (int): Minimum of the tested range; the exponent of 10 for lam with
Whittaker-type methods, or the polynomial order with polynomial methods
- max_value (int): Maximum of the tested range, interpreted the same way as min_value
- step (int): Step size for sweeping the tested range
- pad_kwargs (dict, optional): Padding parameters for edge handling
- method_kwargs (dict, optional): Additional method-specific parameters
Returns:
tuple: (baseline, params) with optimization results and the best parameter
Additional keys: 'optimal_parameter' (the best tested value) and 'min_rmse' (its root-mean-square error), plus keys from the base method
"""

Automatically adapts to the data's range and characteristics, analyzing its minimum and maximum values to choose polynomial fitting parameters.
def adaptive_minmax(data, x_data=None, poly_order=None, method='modpoly', weights=None, constrained_fraction=0.01, constrained_weight=1e5, estimation_poly_order=2, method_kwargs=None):
"""
Adaptive min-max baseline correction with automatic parameter adjustment.
Fits baselines using two polynomial orders and both constrained and unconstrained
weighting, then takes the point-wise maximum of the candidates for a robust estimate.
Parameters:
- data (array-like): Input y-values to fit baseline
- x_data (array-like, optional): Input x-values
- poly_order (int, Sequence(int, int), or None): Polynomial order(s) for baseline fitting
If None, automatically determined from the data; a single value v is treated as (v, v + 1)
- method (str): Polynomial method used for the fits
Options: 'modpoly', 'imodpoly'
- weights (array-like, optional): Initial weight array
- constrained_fraction (float): Fraction of points at each edge of the data used for the constrained fits
- constrained_weight (float): Weight applied to those edge points in the constrained fits
- estimation_poly_order (int): Polynomial order used to estimate the appropriate poly_order when it is None
- method_kwargs (dict, optional): Additional parameters for base method
Returns:
tuple: (baseline, params) with adaptive parameter selection results
Additional keys: 'poly_order' (the two selected polynomial orders) and 'constrained_weights', plus keys from the base method
"""

Provides customizable baseline correction with region-specific sampling strategies for complex datasets.
def custom_bc(data, x_data=None, method='asls', regions=((None, None),), sampling=1, lam=None, diff_order=2, method_kwargs=None):
"""
Customized baseline correction with region-specific sampling control.
Reduces the points within each specified region by its sampling step, fits the
baseline on the reduced data, then interpolates back onto the full grid.
Parameters:
- data (array-like): Input y-values to fit baseline
- x_data (array-like, optional): Input x-values
- method (str): Baseline correction method to apply
Options: 'asls', 'airpls', 'arpls', 'modpoly', 'imodpoly', etc.
- regions (tuple of tuples): Index ranges of the data to resample before fitting
Each tuple: (start_idx, end_idx); use (None, None) for the entire dataset
- sampling (int or sequence of int): Sampling step for each region
sampling=1: use all points, sampling=2: use every 2nd point, etc.
A sequence assigns one step per region
- lam (float, optional): If given, the interpolated baseline is smoothed by
Whittaker smoothing with this value; if None, no extra smoothing is applied
- diff_order (int): Difference order for the optional Whittaker smoothing
- method_kwargs (dict, optional): Additional parameters passed to the base method
Returns:
tuple: (baseline, params) with the baseline evaluated on the full input grid
Additional keys: 'x_fit' and 'y_fit' (the resampled values used for fitting), plus keys from the base method
"""

import numpy as np
from pybaselines.optimizers import collab_pls
# Multiple related spectroscopic datasets (e.g., time series or batch measurements)
x = np.linspace(0, 1000, 1000)
datasets = []
for i in range(5):
    baseline = 10 + 0.02 * x + 0.00001 * x**2 + 2 * np.sin(0.01 * x * (1 + 0.1 * i))
    peaks = 100 * np.exp(-((x - 300 - 50*i) / 30)**2)
    noise = np.random.normal(0, 1, len(x))
    datasets.append(baseline + peaks + noise)
# Collaborative correction leverages information across all datasets
baselines, params = collab_pls(datasets, average_dataset=True, method='asls',
                               method_kwargs={'lam': 1e6, 'p': 0.01})
print(f"Computed {len(baselines)} baselines collaboratively")

from pybaselines.optimizers import optimize_extended_range
# Sample complex spectroscopic data
x = np.linspace(0, 2000, 2000)
baseline = 50 + 0.01 * x + 0.000005 * x**2
peaks = (200 * np.exp(-((x - 400) / 60)**2) +
         150 * np.exp(-((x - 800) / 40)**2) +
         180 * np.exp(-((x - 1200) / 50)**2) +
         120 * np.exp(-((x - 1600) / 35)**2))
data = baseline + peaks + np.random.normal(0, 2, len(x))
# Automatically find the optimal smoothing parameter; for Whittaker-type methods
# such as asls, min_value and max_value are exponents of 10 for lam
baseline, params = optimize_extended_range(data, method='asls',
                                           min_value=3, max_value=7, step=1)
optimal_log_lam = params['optimal_parameter']
print(f"Optimal smoothing parameter: lam = 1e{optimal_log_lam:.0f}")
print(f"Best fit RMSE: {params['min_rmse']:.3g}")

from pybaselines.optimizers import adaptive_minmax
# Data with varying dynamic range
data_range = np.max(data) - np.min(data)
baseline_level = np.percentile(data, 5)
# Automatically adapt to data characteristics
baseline, params = adaptive_minmax(data, method='modpoly',
                                   constrained_fraction=0.02,
                                   constrained_weight=1e5)
selected_orders = params['poly_order']
print(f"Selected polynomial orders: {selected_orders}")
print(f"Data range: {data_range:.1f}, baseline level: {baseline_level:.1f}")from pybaselines.optimizers import custom_bc

from pybaselines.optimizers import custom_bc
# Define different regions requiring different treatment
regions = [(0, 300), (300, 700), (700, 1200), (1200, 2000)]
region_methods = ['asls', 'airpls', 'modpoly', 'asls']
region_params = [
    {'lam': 1e5, 'p': 0.01},   # Region 1: gentle smoothing
    {'lam': 1e6},              # Region 2: automatic airPLS
    {'poly_order': 2},         # Region 3: polynomial fitting
    {'lam': 1e7, 'p': 0.001},  # Region 4: strong smoothing
]
# Apply region-specific correction
baseline_total = np.zeros_like(data)
for i, (start, end) in enumerate(regions):
    region_data = data[start:end]
    region_x = x[start:end]
    # Fit each slice independently with its own method and parameters
    baseline_region, params_region = custom_bc(
        region_data, x_data=region_x, method=region_methods[i],
        regions=((None, None),), method_kwargs=region_params[i]
    )
    baseline_total[start:end] = baseline_region
corrected = data - baseline_total
print(f"Applied {len(regions)} different correction strategies")

# Combine the collaborative approach with parameter optimization
datasets_subset = datasets[:3]  # Use a subset for optimization
# First, estimate baselines collaboratively across the related datasets
baselines_collab, params_collab = collab_pls(datasets_subset, method='asls',
                                             method_kwargs={'lam': 1e6, 'p': 0.01})
# Then tune the smoothing parameter for a single dataset via extended-range testing;
# lam itself is the swept parameter, so only the remaining kwargs are passed
baseline_final, params_final = optimize_extended_range(
    data, method='asls', method_kwargs={'p': 0.01}
)
print("Combined collaborative correction with parameter optimization")
print(f"Best log10(lam): {params_final['optimal_parameter']}, RMSE: {params_final['min_rmse']:.3g}")

Install with Tessl CLI
npx tessl i tessl/pypi-pybaselines