A library of algorithms for the baseline correction of experimental data.
—
Baseline correction algorithms that classify data points as belonging to baseline or peak regions using statistical, morphological, or signal processing approaches. By relying on pattern recognition and thresholding to separate the two components, these methods are best suited to data with well-defined spectral features.
Statistical classification method using local statistics and interpolation for baseline estimation.
def dietrich(data, smooth_half_window=None, interp_half_window=None, max_iter=100, tol=1e-3, x_data=None, pad_kwargs=None, **kwargs):
"""
Dietrich classification baseline correction.
Parameters:
- data (array-like): Input y-values to fit baseline
- smooth_half_window (int, optional): Half-window size for initial smoothing
- interp_half_window (int, optional): Half-window size for interpolation
- max_iter (int): Maximum iterations for convergence
- tol (float): Convergence tolerance for baseline changes
- x_data (array-like, optional): Input x-values
- pad_kwargs (dict, optional): Padding parameters for edge handling
- **kwargs: Additional padding and processing parameters
Returns:
tuple: (baseline, params) with classification statistics and convergence history
"""

Morphological classification approach using local minima detection for baseline region identification.
def golotvin(data, half_window=None, x_data=None, pad_kwargs=None, **kwargs):
"""
Golotvin classification baseline using morphological operations.
Parameters:
- data (array-like): Input y-values to fit baseline
- half_window (int, optional): Half-window size for morphological operations
- x_data (array-like, optional): Input x-values
- pad_kwargs (dict, optional): Padding parameters for edge handling
- **kwargs: Additional morphological processing parameters
Returns:
tuple: (baseline, params) with morphological operation details
"""

Classification based on local standard deviation patterns to identify baseline regions.
def std_distribution(data, half_window=None, x_data=None, pad_kwargs=None, **kwargs):
"""
Standard deviation distribution baseline classification.
Parameters:
- data (array-like): Input y-values to fit baseline
- half_window (int, optional): Half-window size for standard deviation calculation
- x_data (array-like, optional): Input x-values
- pad_kwargs (dict, optional): Padding parameters for windowing operations
- **kwargs: Additional statistical processing parameters
Returns:
tuple: (baseline, params) with local variance statistics
"""

Rapid classification method optimized for chromatographic data with threshold-based peak detection.
def fastchrom(data, half_window=None, threshold=None, x_data=None, pad_kwargs=None, **kwargs):
"""
Fast chromatographic baseline correction with threshold classification.
Parameters:
- data (array-like): Input y-values to fit baseline
- half_window (int, optional): Half-window size for local analysis
- threshold (float, optional): Classification threshold for peak detection
- x_data (array-like, optional): Input x-values
- pad_kwargs (dict, optional): Padding parameters for windowing
- **kwargs: Additional threshold and processing parameters
Returns:
tuple: (baseline, params) with threshold statistics and classification results
"""

Automated classification and correction method requiring minimal parameter tuning for robust baseline estimation.
def fabc(data, lam=1e5, diff_order=2, weights=None, weights_as_mask=False, x_data=None):
"""
Fully automatic baseline correction: classifies baseline points and fits them with Whittaker smoothing.
Parameters:
- data (array-like): Input y-values to fit baseline
- lam (float): Smoothing parameter for regularization
- diff_order (int): Order of difference penalty matrix
- weights (array-like, optional): Initial weight array or mask
- weights_as_mask (bool): Whether to treat weights as binary mask
- x_data (array-like, optional): Input x-values
Returns:
tuple: (baseline, params) with automatic parameter selection results
"""

Advanced signal processing approach using wavelet transforms for multi-scale baseline-peak classification.
def cwt_br(data, poly_order=2, scales=None, num_std=1.0, ridge_kwargs=None, x_data=None):
"""
Continuous wavelet transform baseline recognition.
Parameters:
- data (array-like): Input y-values to fit baseline
- poly_order (int): Order of polynomial for final baseline fitting
- scales (array-like, optional): Wavelet scales for multi-resolution analysis
- num_std (float): Number of standard deviations for ridge detection threshold
- ridge_kwargs (dict, optional): Additional parameters for ridge detection
- x_data (array-like, optional): Input x-values
Returns:
tuple: (baseline, params) with wavelet analysis results and ridge detection info
"""

import numpy as np
from pybaselines.classification import fabc
# Sample chromatographic data with multiple peaks
x = np.linspace(0, 500, 2000)
baseline_true = 10 + 0.02 * x + 0.00005 * x**2
peak1 = 150 * np.exp(-((x - 100) / 15)**2)
peak2 = 200 * np.exp(-((x - 250) / 20)**2)
peak3 = 120 * np.exp(-((x - 400) / 12)**2)
data = baseline_true + peak1 + peak2 + peak3 + np.random.normal(0, 2, len(x))
# Fully automatic baseline correction
baseline, params = fabc(data, lam=1e5)
corrected = data - baseline
print("FABC automatically determined optimal parameters")

from pybaselines.classification import cwt_br
# Multi-scale wavelet analysis for complex spectra
scales = np.arange(1, 20) # Define wavelet scales
baseline, params = cwt_br(data, poly_order=3, scales=scales, num_std=1.5)
corrected = data - baseline
print(f"Best wavelet scale: {params.get('best_scale')}")

from pybaselines.classification import fastchrom
# Rapid baseline correction for high-throughput analysis
baseline, params = fastchrom(data, half_window=20, threshold=0.1)
corrected = data - baseline
# Optimized for speed while maintaining accuracy

from pybaselines.classification import std_distribution
# Identify baseline regions based on local variance
baseline, params = std_distribution(data, half_window=25)
corrected = data - baseline
# Works well when baseline regions have consistent noise levels

from pybaselines.classification import golotvin
# Use morphological operations to find baseline points
baseline, params = golotvin(data, half_window=15)
corrected = data - baseline
# Effective for data with clear morphological differences

from pybaselines.classification import dietrich
# Iterative approach with statistical smoothing and interpolation
baseline, params = dietrich(data, smooth_half_window=10, interp_half_window=30, max_iter=50)
corrected = data - baseline
print(f"Converged in {len(params.get('tol_history', []))} iterations")

Install with Tessl CLI
npx tessl i tessl/pypi-pybaselines