A library of algorithms for the baseline correction of experimental data.
Baseline correction algorithms designed for 2D data arrays such as images, spectroscopic maps, chromatographic surfaces, and other spatially-resolved measurements. These methods extend 1D algorithms to handle spatial correlation and provide consistent baseline correction across both dimensions while preserving important spatial features.
The Baseline2D class is the main interface for 2D baseline correction and provides access to the two-dimensional variants of the baseline correction algorithms.
class Baseline2D:
    """
    Main interface for 2D baseline correction algorithms.

    Provides object-oriented access to two-dimensional versions of most 1D baseline
    correction methods, with additional capabilities for handling spatial data.

    Parameters:
    - x_data (array-like, optional): x-coordinates of the 2D grid
    - z_data (array-like, optional): z-coordinates of the 2D grid
    - check_finite (bool, default=True): Check for finite values in input data
    - assume_sorted (bool, default=False): Assume coordinate arrays are sorted
    - output_dtype (type, optional): Data type for output arrays

    Attributes:
    - x (numpy.ndarray): x-coordinates for the 2D grid
    - x_domain (numpy.ndarray): [x_min, x_max] coordinate range
    - z (numpy.ndarray): z-coordinates for the 2D grid
    - z_domain (numpy.ndarray): [z_min, z_max] coordinate range

    Methods:
    - 2D adaptations of most 1D baseline correction methods
    - _get_method(method_name): Access methods by string name for programmatic use
    """

Two-dimensional versions of Whittaker-smoothing based algorithms that apply penalized least squares with 2D smoothness constraints.
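# Available 2D Whittaker methods (listed with a _2d suffix; the usage examples
# call them as Baseline2D methods without it, e.g. asls):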
# - asls_2d: 2D Asymmetric Least Squares
# - iasls_2d: 2D Improved AsLS
# - airpls_2d: 2D Adaptive Iteratively Reweighted PLS
# - arpls_2d: 2D Asymmetrically Reweighted PLS
# - drpls_2d: 2D Doubly Reweighted PLS
# - iarpls_2d: 2D Improved arPLS
# - aspls_2d: 2D Adaptive Smoothness PLS
# - psalsa_2d: 2D Peaked Signal's AsLS Algorithm
# - derpsalsa_2d: 2D Derivative Peak-Screening AsLS
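The penalized least squares idea behind the methods listed above can be sketched in plain NumPy. This is only an illustration under stated assumptions, not pybaselines' implementation: the names `diff_penalty` and `asls_2d_sketch` are hypothetical, the system is built densely with Kronecker products (so it only suits small grids), and the `lam` values are picked for this toy grid rather than taken from the library defaults.

```python
import numpy as np


def diff_penalty(n, order=2):
    """Return D.T @ D for the order-th finite-difference matrix D on n points."""
    d = np.diff(np.eye(n), n=order, axis=0)
    return d.T @ d


def asls_2d_sketch(y, lam=(1e2, 1e2), p=0.01, diff_order=2, max_iter=20, tol=1e-3):
    """Dense 2D asymmetric least squares (hypothetical helper, small grids only)."""
    rows, cols = y.shape
    # Kronecker products lift the 1D penalties onto the row-major flattened grid:
    # the first term smooths along axis 0 (x), the second along axis 1 (z).
    penalty = (
        lam[0] * np.kron(diff_penalty(rows, diff_order), np.eye(cols))
        + lam[1] * np.kron(np.eye(rows), diff_penalty(cols, diff_order))
    )
    y_flat = y.ravel()
    weights = np.ones(y_flat.size)
    for _ in range(max_iter):
        # Weighted penalized least squares: (W + P) b = W y
        baseline = np.linalg.solve(np.diag(weights) + penalty, weights * y_flat)
        # Points above the baseline (likely peaks) receive the small weight p.
        new_weights = np.where(y_flat > baseline, p, 1 - p)
        if np.abs(new_weights - weights).sum() / np.abs(weights).sum() < tol:
            break
        weights = new_weights
    return baseline.reshape(rows, cols)


# Toy data: a tilted plane plus one Gaussian peak in the middle of the grid.
x = np.linspace(0, 1, 15)
z = np.linspace(0, 1, 12)
X, Z = np.meshgrid(x, z, indexing="ij")
true_baseline = 2 + X + 0.5 * Z
data = true_baseline + 5 * np.exp(-((X - 0.5) ** 2 + (Z - 0.5) ** 2) / 0.01)
fit = asls_2d_sketch(data)
```

Passing different values in the `lam` tuple smooths one axis more strongly than the other, which is the same idea as the `(lam_x, lam_z)` form documented for the 2D methods.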
def asls_2d(data, lam=1e6, p=1e-2, diff_order=2, max_iter=50, tol=1e-3, weights=None, x_data=None, z_data=None):
    """
    2D Asymmetric Least Squares baseline correction.

    Applies AsLS algorithm to 2D data with smoothness penalties in both dimensions,
    preserving spatial correlation while correcting baseline variations.

    Parameters:
    - data (array-like): 2D input array to fit baseline (shape: [x_points, z_points])
    - lam (float or tuple): Smoothing parameter(s). If float, same for both dimensions.
      If tuple: (lam_x, lam_z) for dimension-specific smoothing
    - p (float): Asymmetry parameter for peak-baseline separation
    - diff_order (int or tuple): Order of difference penalty. Single int or (order_x, order_z)
    - max_iter (int): Maximum iterations for convergence
    - tol (float): Convergence tolerance for iterative fitting
    - weights (array-like, optional): 2D weight array matching data dimensions
    - x_data (array-like, optional): x-coordinate values
    - z_data (array-like, optional): z-coordinate values

    Returns:
    tuple: (baseline_2d, params) with 2D baseline array and processing parameters
    """

Polynomial surface fitting methods that extend 1D polynomial approaches to 2D surfaces with various robustness strategies.
# Available 2D Polynomial methods:
# - poly_2d: 2D polynomial surface fitting
# - modpoly_2d: 2D modified polynomial with iterative masking
# - imodpoly_2d: 2D improved modified polynomial
# - penalized_poly_2d: 2D penalized polynomial with robust fitting
# - quant_reg_2d: 2D quantile regression polynomial
# - goldindec_2d: 2D Goldindec algorithm
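As a minimal sketch of what fitting a polynomial surface involves, the following example solves a (optionally weighted) least-squares problem over all `x**i * z**j` terms using NumPy's 2D Vandermonde helper. The function name `poly_surface_fit` is hypothetical and the coordinates are used as given (here already in [-1, 1]); the library may scale coordinates and handle weights differently.

```python
import numpy as np
from numpy.polynomial import polynomial as P


def poly_surface_fit(y, x, z, poly_order=(2, 2), weights=None):
    """Least-squares fit of a tensor-product polynomial surface (hypothetical helper)."""
    X, Z = np.meshgrid(x, z, indexing="ij")
    # Each column of the Vandermonde matrix holds one x**i * z**j term.
    vander = P.polyvander2d(X.ravel(), Z.ravel(), poly_order)
    # Weighted least squares: scale rows by the square root of the weights.
    w = np.ones(y.size) if weights is None else np.sqrt(np.ravel(weights))
    coef, *_ = np.linalg.lstsq(vander * w[:, None], y.ravel() * w, rcond=None)
    # coef[i, j] multiplies x**i * z**j, the layout polyval2d expects.
    coef = coef.reshape(poly_order[0] + 1, poly_order[1] + 1)
    return P.polyval2d(X, Z, coef), coef


# Noise-free bilinear surface, so the fit should recover the coefficients.
x_axis = np.linspace(-1, 1, 30)
z_axis = np.linspace(-1, 1, 20)
Xg, Zg = np.meshgrid(x_axis, z_axis, indexing="ij")
plane = 1 + 2 * Xg - 0.5 * Zg + 0.3 * Xg * Zg
surface, coef = poly_surface_fit(plane, x_axis, z_axis, poly_order=(1, 1))
```

The iterative variants in the list above build on a fit like this one by masking or reweighting points that sit above the current surface and then refitting.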
def poly_2d(data, poly_order=(2, 2), weights=None, return_coef=False, x_data=None, z_data=None):
    """
    2D polynomial surface baseline fitting.

    Fits polynomial surfaces to 2D data for baseline correction, allowing
    different polynomial orders in each dimension.

    Parameters:
    - data (array-like): 2D input array to fit baseline
    - poly_order (int or tuple): Polynomial order. If int, same for both dimensions.
      If tuple: (order_x, order_z) for each dimension
    - weights (array-like, optional): 2D weight array for data points
    - return_coef (bool): Whether to return surface coefficients
    - x_data (array-like, optional): x-coordinate values
    - z_data (array-like, optional): z-coordinate values

    Returns:
    tuple: (baseline_surface, params) with optional coefficient matrix
    """

Smoothing-based 2D algorithms using morphological and statistical operations adapted for spatial data.
# Available 2D Smooth methods:
# - noise_median_2d: 2D noise-median smoothing
# - snip_2d: 2D SNIP algorithm
# - swima_2d: 2D small-window moving average
# - ipsa_2d: 2D Iterative Polynomial Smoothing
# - ria_2d: 2D Range Independent Algorithm
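To give a feel for the statistical smoothing these methods rely on, here is a small sketch of a plain 2D moving-median baseline, loosely in the spirit of the noise-median entry above. It is not SNIP and not the library's code: `median_baseline_2d` is a hypothetical name, and the reflect padding and window sizes are assumptions made for the example.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def median_baseline_2d(y, half_window=(5, 5)):
    """Moving-median baseline over a rectangular window (hypothetical helper)."""
    hx, hz = half_window
    # Pad by reflection so every point has a full (2*hx+1, 2*hz+1) window.
    padded = np.pad(y, ((hx, hx), (hz, hz)), mode="reflect")
    windows = sliding_window_view(padded, (2 * hx + 1, 2 * hz + 1))
    # The median ignores features that fill less than half of the window.
    return np.median(windows, axis=(-2, -1))


# Flat background with a single-pixel spike.
flat = np.full((20, 20), 3.0)
flat[10, 10] += 50.0
median_fit = median_baseline_2d(flat, half_window=(2, 2))
spike_only = flat - median_fit
```

The window has to be wider than the features you want to keep out of the baseline, which is the role the half-window parameters play in the methods listed above.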
def snip_2d(data, max_half_window=None, decreasing=False, smooth_half_window=None, filter_order=2, x_data=None, z_data=None):
    """
    2D Statistics-sensitive Non-linear Iterative Peak-clipping (SNIP) algorithm.

    Applies SNIP baseline correction to 2D data using morphological operations
    that preserve spatial features while removing baseline variations.

    Parameters:
    - data (array-like): 2D input array to correct
    - max_half_window (int or tuple): Maximum half-window size for operations
      If int, same for both dimensions
    - decreasing (bool): Whether to use decreasing window sizes
    - smooth_half_window (int or tuple): Smoothing window size
    - filter_order (int): Order of smoothing filter
    - x_data (array-like, optional): x-coordinate values
    - z_data (array-like, optional): z-coordinate values

    Returns:
    tuple: (baseline_2d, params) with spatial processing details
    """

Morphological operations extended to 2D for baseline correction using structural elements and spatial filtering.
# Available 2D Morphological methods:
# - mpls_2d: 2D morphological penalized least squares
# - mor_2d: 2D morphological opening
# - imor_2d: 2D improved morphological baseline
# - mormol_2d: 2D morphological with mollification
# - amormol_2d: 2D averaging morphological with mollification
# - rolling_ball_2d: 2D rolling ball baseline
# - mwmv_2d: 2D moving window minimum value
# - tophat_2d: 2D top-hat morphological baseline
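The common building block of the morphological methods above is a grey-scale opening: a moving minimum (erosion) followed by a moving maximum (dilation). The sketch below implements it with a flat rectangular window in NumPy. The names `_window_filter` and `opening_baseline_2d` are hypothetical, and the flat window and edge padding are assumptions; the library's rolling ball and related methods may add smoothing or use other structuring elements.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def _window_filter(y, half_window, reducer):
    """Apply reducer (np.min or np.max) over a sliding rectangular window."""
    hx, hz = half_window
    padded = np.pad(y, ((hx, hx), (hz, hz)), mode="edge")
    windows = sliding_window_view(padded, (2 * hx + 1, 2 * hz + 1))
    return reducer(windows, axis=(-2, -1))


def opening_baseline_2d(y, half_window=(3, 3)):
    """Morphological opening: erosion (min) followed by dilation (max)."""
    eroded = _window_filter(y, half_window, np.min)
    return _window_filter(eroded, half_window, np.max)


# Flat background with a 3x3 block peak, narrower than the 7x7 window.
block = np.full((30, 30), 2.0)
block[14:17, 14:17] += 10.0
opened = opening_baseline_2d(block, half_window=(3, 3))
```

An opening never exceeds the input, so features narrower than the window are removed while the background underneath is kept; this is why the half-window should be chosen larger than the widest peak.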
def rolling_ball_2d(data, half_window=None, x_data=None, z_data=None):
    """
    2D rolling ball baseline correction.

    Applies morphological rolling ball operation in 2D to estimate baseline
    by simulating a ball rolling under the data surface.

    Parameters:
    - data (array-like): 2D input array to correct
    - half_window (int or tuple): Half-size of rolling ball structuring element
      If int, same for both dimensions. If tuple: (half_window_x, half_window_z)
    - x_data (array-like, optional): x-coordinate values
    - z_data (array-like, optional): z-coordinate values

    Returns:
    tuple: (baseline_2d, params) with morphological operation details
    """

Spline-based methods extended to 2D using tensor product B-splines for flexible surface modeling.
# Available 2D Spline methods:
# - mixture_model_2d: 2D mixture model splines
# - irsqr_2d: 2D iterative reweighted spline quantile regression
# - pspline_asls_2d: 2D penalized spline AsLS
# - pspline_iasls_2d: 2D penalized spline IAsLS
# - pspline_airpls_2d: 2D penalized spline airPLS
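The tensor product construction shared by the spline methods above can be sketched as a penalized B-spline (P-spline) surface fit: one 1D basis per axis, a coefficient matrix between them, and difference penalties on that matrix. This is an unweighted illustration only, without the reweighting steps the listed methods add; `bspline_basis` and `pspline_surface` are hypothetical names, and the knot layout, `lam` and `diff_order` values are assumptions for the example.

```python
import numpy as np


def bspline_basis(x, num_knots=8, degree=3):
    """Uniform B-spline basis on [x.min(), x.max()] via the Cox-de Boor recursion."""
    inner = np.linspace(x.min(), x.max(), num_knots)
    step = inner[1] - inner[0]
    # Extend the uniform knots by `degree` steps on each side (P-spline style).
    knots = np.concatenate([
        inner[0] - step * np.arange(degree, 0, -1),
        inner,
        inner[-1] + step * np.arange(1, degree + 1),
    ])
    xc = x[:, None]
    # Degree 0: indicator of the half-open knot interval containing each x.
    basis = ((xc >= knots[:-1]) & (xc < knots[1:])).astype(float)
    for d in range(1, degree + 1):
        left = (xc - knots[:-d - 1]) / (knots[d:-1] - knots[:-d - 1])
        right = (knots[d + 1:] - xc) / (knots[d + 1:] - knots[1:-d])
        basis = left * basis[:, :-1] + right * basis[:, 1:]
    return basis  # shape: (len(x), num_knots + degree - 1)


def pspline_surface(y, x, z, num_knots=(8, 8), degree=3, lam=(1e-3, 1e-3), diff_order=2):
    """Penalized tensor-product spline fit of y ~ Bx @ C @ Bz.T (hypothetical helper)."""
    bx = bspline_basis(x, num_knots[0], degree)
    bz = bspline_basis(z, num_knots[1], degree)
    nx, nz = bx.shape[1], bz.shape[1]
    dx = np.diff(np.eye(nx), n=diff_order, axis=0)
    dz = np.diff(np.eye(nz), n=diff_order, axis=0)
    # Normal equations for the row-major flattened coefficient matrix C.
    lhs = (
        np.kron(bx.T @ bx, bz.T @ bz)
        + lam[0] * np.kron(dx.T @ dx, np.eye(nz))
        + lam[1] * np.kron(np.eye(nx), dz.T @ dz)
    )
    coef = np.linalg.solve(lhs, (bx.T @ y @ bz).ravel()).reshape(nx, nz)
    return bx @ coef @ bz.T


# Smooth test surface on a 40x30 grid.
xs = np.linspace(0, 3, 40)
zs = np.linspace(0, 3, 30)
Xs, Zs = np.meshgrid(xs, zs, indexing="ij")
smooth = np.sin(Xs) * np.cos(Zs)
spline_fit = pspline_surface(smooth, xs, zs)
```

More knots make the surface more flexible, while larger `lam` values pull neighbouring coefficients together; the documented `num_knots`, `spline_degree`, `lam` and `diff_order` parameters control the same trade-off.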
def mixture_model_2d(data, lam=1e5, p=1e-2, num_knots=(10, 10), spline_degree=(3, 3), diff_order=(3, 3), max_iter=50, tol=1e-3, weights=None):
    """
    2D mixture model baseline using tensor product splines.

    Estimates 2D baseline using spline surfaces with mixture model approach
    for optimal baseline-peak separation across spatial dimensions.

    Parameters:
    - data (array-like): 2D input array to fit baseline
    - lam (float or tuple): Smoothing parameter(s) for spline regularization
    - p (float): Asymmetry parameter for mixture model
    - num_knots (int or tuple): Number of knots in each dimension
    - spline_degree (int or tuple): Degree of spline basis in each dimension
    - diff_order (int or tuple): Order of difference penalty in each dimension
    - max_iter (int): Maximum iterations for convergence
    - tol (float): Convergence tolerance
    - weights (array-like, optional): 2D weight array

    Returns:
    tuple: (baseline_surface, params) with spline fitting details
    """

import numpy as np
from pybaselines.two_d import Baseline2D
# Create sample 2D spectroscopic map data
x = np.linspace(0, 100, 50)
z = np.linspace(0, 100, 50)
X, Z = np.meshgrid(x, z, indexing='ij')
# Create 2D baseline surface with spatial variation
baseline_2d = 10 + 0.1*X + 0.05*Z + 0.001*X*Z
# Add 2D peaks (e.g., spatial features in spectroscopic map)
peak1 = 50 * np.exp(-((X-25)**2 + (Z-25)**2) / 200)
peak2 = 40 * np.exp(-((X-75)**2 + (Z-75)**2) / 150)
data_2d = baseline_2d + peak1 + peak2 + np.random.normal(0, 1, X.shape)
# Initialize 2D baseline correction
baseline_2d_corrector = Baseline2D(x_data=x, z_data=z)
# Apply 2D AsLS baseline correction
baseline_est, params = baseline_2d_corrector.asls(data_2d, lam=1e5, p=0.01)
corrected_2d = data_2d - baseline_est
print(f"2D baseline shape: {baseline_est.shape}")
print(f"Returned parameter keys: {sorted(params)}")

# Rolling ball method for 2D chromatographic data
baseline_morph, params_morph = baseline_2d_corrector.rolling_ball(data_2d, half_window=5)
corrected_morph = data_2d - baseline_morph
# 2D SNIP for spectroscopic imaging
baseline_snip, params_snip = baseline_2d_corrector.snip(data_2d, max_half_window=8)
corrected_snip = data_2d - baseline_snip

# Different smoothing in x and z directions
baseline_aniso, params_aniso = baseline_2d_corrector.asls(
    data_2d,
    lam=(1e5, 1e6),  # Stronger smoothing in z-direction
    diff_order=(2, 3)  # Different penalty orders
)
corrected_aniso = data_2d - baseline_aniso
print("Applied anisotropic smoothing with dimension-specific parameters")

# Flexible spline surface baseline
baseline_spline, params_spline = baseline_2d_corrector.mixture_model(
    data_2d,
    num_knots=(8, 8),  # 8x8 knot grid
    spline_degree=(3, 3),  # Cubic splines in both directions
    lam=1e4
)
corrected_spline = data_2d - baseline_spline
print("Fitted a cubic spline surface on an 8x8 knot grid")

# For large spectroscopic imaging datasets
large_data = np.random.randn(200, 200) + np.sin(np.linspace(0, 10, 200))[:, None]
# The corrector above is tied to the 50x50 grid, so use one without fixed coordinates
large_corrector = Baseline2D()
# Use efficient 2D polynomial for large data
baseline_large, params_large = large_corrector.poly(
    large_data,
    poly_order=(3, 3)  # Moderate polynomial order for efficiency
)
# Alternative: downsample for initial correction, then refine
downsampled = large_data[::4, ::4]  # Downsample by factor of 4
baseline_down, _ = large_corrector.asls(downsampled, lam=1e5)
# Upsample baseline back to full resolution
# (interp2d has been removed from recent SciPy releases; RectBivariateSpline handles regular grids)
from scipy.interpolate import RectBivariateSpline
coarse = np.arange(0, 200, 4)
spline = RectBivariateSpline(coarse, coarse, baseline_down, kx=3, ky=3)
# Indices 197-199 lie past the last coarse sample (196) and take the edge value
baseline_upsampled = spline(np.arange(200), np.arange(200))
print("Efficient processing of large 2D datasets using downsampling approach")

# Apply different methods to different spatial regions
region1_data = data_2d[:25, :25]  # Top-left quadrant
region2_data = data_2d[25:, 25:]  # Bottom-right quadrant
# Each region needs a corrector whose coordinates match its 25x25 shape
corrector_r1 = Baseline2D(x_data=x[:25], z_data=z[:25])
corrector_r2 = Baseline2D(x_data=x[25:], z_data=z[25:])
# Different methods for different regions
baseline_r1, _ = corrector_r1.asls(region1_data, lam=1e5)
baseline_r2, _ = corrector_r2.rolling_ball(region2_data, half_window=3)
# Combine regional corrections (the other two quadrants are left at zero here)
baseline_combined = np.zeros_like(data_2d)
baseline_combined[:25, :25] = baseline_r1
baseline_combined[25:, 25:] = baseline_r2
# Transitions between regions are not smoothed here; interpolate across the boundaries if needed
print("Applied region-specific 2D baseline correction methods")

Install with Tessl CLI
npx tessl i tessl/pypi-pybaselines