Spatial econometric regression models for analyzing geographically-related data interactions.
Ordinary least squares regression with comprehensive spatial and non-spatial diagnostic capabilities. spreg provides both base OLS estimation and full diagnostic models with extensive testing options.
Core OLS estimation without diagnostics, providing essential regression coefficients and variance-covariance matrices with optional robust standard error corrections.
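As a rough illustration of the quantities a core OLS estimator like the one described above reports (coefficients, residuals, fitted values, sigma squared, and the variance-covariance matrix), the following is a minimal plain-NumPy sketch. The helper name `ols_by_hand` and the simulated data are hypothetical and only for illustration; this is not spreg's implementation, and it assumes the `sig2n_k` flag switches the sigma-squared denominator between n-k and n as documented below.

```python
import numpy as np

def ols_by_hand(y, x, sig2n_k=False):
    """Plain-NumPy OLS sketch; x must already contain any constant column."""
    n, k = x.shape
    xtx_inv = np.linalg.inv(x.T @ x)
    betas = xtx_inv @ (x.T @ y)           # k x 1 coefficient vector
    predy = x @ betas                     # n x 1 fitted values
    u = y - predy                         # n x 1 residuals
    denom = n - k if sig2n_k else n       # degrees-of-freedom choice
    sig2 = float((u ** 2).sum() / denom)  # residual variance estimate
    vm = sig2 * xtx_inv                   # k x k variance-covariance matrix
    return {"betas": betas, "u": u, "predy": predy,
            "sig2": sig2, "vm": vm, "n": n, "k": k}

# Simulated data with a known data-generating process (illustrative only)
rng = np.random.default_rng(0)
n = 50
x = np.hstack((np.ones((n, 1)), rng.normal(size=(n, 2))))
true_betas = np.array([[1.0], [2.0], [-0.5]])
y = x @ true_betas + rng.normal(scale=0.1, size=(n, 1))

fit = ols_by_hand(y, x, sig2n_k=True)
print("Coefficients:", fit["betas"].ravel())
print("Standard errors:", np.sqrt(np.diag(fit["vm"])))
```

The constant column is added by hand here because, like the base estimator, the sketch adds nothing to `x` on its own.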
class BaseOLS:
    def __init__(self, y, x, robust=None, gwk=None, sig2n_k=False):
        """
        Ordinary least squares estimation (no diagnostics or constant added).

        Parameters:
        - y (array): nx1 dependent variable
        - x (array): nxk independent variables, excluding constant
        - robust (str, optional): 'white' for White correction, 'hac' for HAC correction
        - gwk (pysal W object, optional): Kernel spatial weights for HAC estimation
        - sig2n_k (bool): If True, use n-k for sigma^2 estimation; if False, use n

        Attributes:
        - betas (array): kx1 estimated coefficients
        - u (array): nx1 residuals
        - predy (array): nx1 predicted values
        - vm (array): kxk variance-covariance matrix
        - sig2 (float): Sigma squared
        - n (int): Number of observations
        - k (int): Number of parameters
        """

Complete OLS implementation with spatial and non-spatial diagnostic tests, supporting SLX specifications and regime-based analysis.
class OLS:
    def __init__(self, y, x, w=None, robust=None, gwk=None, sig2n_k=False,
                 nonspat_diag=True, spat_diag=False, moran=False,
                 white_test=False, vif=False, slx_lags=0, slx_vars='All',
                 regimes=None, vm=False, constant_regi='one', cols2regi='all',
                 regime_err_sep=False, cores=False, name_y=None, name_x=None,
                 name_w=None, name_ds=None, latex=False):
        """
        Ordinary least squares with extensive diagnostics.

        Parameters:
        - y (array): nx1 dependent variable
        - x (array): nxk independent variables (constant added automatically)
        - w (pysal W object, optional): Spatial weights for spatial diagnostics
        - robust (str, optional): 'white' or 'hac' for robust standard errors
        - gwk (pysal W object, optional): Kernel weights for HAC estimation
        - sig2n_k (bool): Use n-k for sigma^2 estimation
        - nonspat_diag (bool): Compute non-spatial diagnostics (default True)
        - spat_diag (bool): Compute spatial diagnostics (requires w)
        - moran (bool): Compute Moran's I test on residuals
        - white_test (bool): Compute White's heteroskedasticity test
        - vif (bool): Compute variance inflation factors
        - slx_lags (int): Number of spatial lags of X to include
        - slx_vars (str/list): Variables to be spatially lagged ('All' or list)
        - regimes (list/Series, optional): Regime identifier for observations
        - vm (bool): Include variance-covariance matrix in output
        - constant_regi (str): 'one' (constant across regimes) or 'many'
        - cols2regi (str/list): Variables that vary by regime ('all' or list)
        - regime_err_sep (bool): Run separate regressions for each regime
        - cores (bool): Use multiprocessing for regime estimation
        - name_y, name_x, name_w, name_ds (str): Variable and dataset names
        - latex (bool): Format output for LaTeX

        Attributes:
        - All BaseOLS attributes plus:
        - r2 (float): R-squared
        - ar2 (float): Adjusted R-squared
        - f_stat (tuple): F-statistic (value, p-value)
        - t_stat (list): t-statistics with p-values for each coefficient
        - jarque_bera (dict): Jarque-Bera normality test results
        - breusch_pagan (dict): Breusch-Pagan heteroskedasticity test
        - white (dict): White heteroskedasticity test (if white_test=True)
        - koenker_bassett (dict): Koenker-Bassett test results
        - lm_error (tuple): LM test for spatial error as (statistic, p-value) (if spat_diag=True)
        - lm_lag (tuple): LM test for spatial lag as (statistic, p-value) (if spat_diag=True)
        - rlm_error (tuple): Robust LM test for spatial error as (statistic, p-value)
        - rlm_lag (tuple): Robust LM test for spatial lag as (statistic, p-value)
        - lm_sarma (tuple): LM test for SARMA specification as (statistic, p-value)
        - moran_res (tuple): Moran's I test on residuals as (I, standardized I, p-value) (if moran=True)
        - vif (dict): Variance inflation factors (if vif=True)
        - summary (str): Comprehensive formatted results
        - output (DataFrame): Formatted results table
        """

import numpy as np
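import spreg
from spreg.ols import BaseOLS  # imported from the submodule in case it is not re-exported at package level
from libpysal import weights

# Prepare data
n = 100
y = np.random.randn(n, 1)
x = np.random.randn(n, 3)

# Basic OLS without diagnostics; BaseOLS adds no constant, so add one manually
x_const = np.hstack((np.ones((n, 1)), x))
base_ols = BaseOLS(y, x_const)
print("Coefficients:", base_ols.betas.flatten())

# BaseOLS has no r2 attribute, so compute R-squared from the residuals
r2_manual = 1 - np.sum(base_ols.u ** 2) / np.sum((y - y.mean()) ** 2)
print("R-squared (manual):", r2_manual)

# Full OLS with non-spatial diagnostics (constant added automatically)
ols_model = spreg.OLS(y, x, nonspat_diag=True, name_y='y',
                      name_x=['x1', 'x2', 'x3'])
print(ols_model.summary)
print("R-squared:", ols_model.r2)
print("F-statistic:", ols_model.f_stat)

import numpy as np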
import spreg
from libpysal import weights

# Create spatial data
n = 49  # 7x7 grid
y = np.random.randn(n, 1)
x = np.random.randn(n, 2)
w = weights.lat2W(7, 7)  # 7x7 lattice weights

# OLS with spatial diagnostics
spatial_ols = spreg.OLS(y, x, w=w, spat_diag=True, moran=True,
                        name_y='y', name_x=['x1', 'x2'])
print(spatial_ols.summary)
print("LM Error test:", spatial_ols.lm_error)
print("LM Lag test:", spatial_ols.lm_lag)
print("Moran's I on residuals:", spatial_ols.moran_res)

# Check if spatial dependence is detected.
# Each LM result is a (statistic, p-value) tuple, so index 1 is the p-value.
if spatial_ols.lm_error[1] < 0.05:
    print("Spatial error dependence detected")
if spatial_ols.lm_lag[1] < 0.05:
    print("Spatial lag dependence detected")

import numpy as np
import spreg
from libpysal import weights

# Spatial lag of X (SLX) model
n = 100
y = np.random.randn(n, 1)
x = np.random.randn(n, 2)
w = weights.KNN.from_array(np.random.randn(n, 2), k=5)

# Include spatial lags of X variables
slx_model = spreg.OLS(y, x, w=w, slx_lags=1, slx_vars='All',
                      spat_diag=True, name_y='y', name_x=['x1', 'x2'])
print(slx_model.summary)
print("Number of coefficients (includes spatial lags):", slx_model.k)

import numpy as np
import spreg
from libpysal import weights

# OLS with White robust standard errors
n = 100
y = np.random.randn(n, 1)
x = np.random.randn(n, 2)

# White correction for heteroskedasticity
white_ols = spreg.OLS(y, x, robust='white', nonspat_diag=True,
                      name_y='y', name_x=['x1', 'x2'])
print(white_ols.summary)
print("Uses White-corrected standard errors")

# HAC correction requires kernel spatial weights with ones on the main diagonal.
# An adaptive triangular kernel satisfies this; a distance-band matrix does not.
coords = np.random.randn(n, 2)
w_kernel = weights.Kernel(coords, k=10, fixed=False, function='triangular')
hac_ols = spreg.OLS(y, x, robust='hac', gwk=w_kernel,
                    name_y='y', name_x=['x1', 'x2'])
print("Uses HAC-corrected standard errors")

import numpy as np
import spreg

# OLS with regimes
n = 100
y = np.random.randn(n, 1)
x = np.random.randn(n, 2)
# regimes is documented as a list or Series, so convert the NumPy array to a list
regimes = np.random.choice(['A', 'B', 'C'], n).tolist()

# Different intercepts and slopes by regime
regime_ols = spreg.OLS(y, x, regimes=regimes, constant_regi='many',
                       cols2regi='all', name_y='y', name_x=['x1', 'x2'],
                       name_regimes='region')
print(regime_ols.summary)
print("Number of regimes:", regime_ols.nr)
print("Chow test results:", regime_ols.chow)

# Separate regression for each regime
separate_ols = spreg.OLS(y, x, regimes=regimes, regime_err_sep=True,
                         name_y='y', name_x=['x1', 'x2'])
print("Individual regime results:", separate_ols.multi.keys())

- r2: Proportion of variance explained by the model
- ar2: Adjusted R-squared, penalized for number of parameters
- f_stat: Overall model significance test
- breusch_pagan: Tests for heteroskedasticity related to fitted values
- white: General heteroskedasticity test (if requested)
- koenker_bassett: Studentized version of Breusch-Pagan
- lm_error: Tests for spatial error dependence
- lm_lag: Tests for spatial lag dependence
- rlm_error, rlm_lag: Robust versions accounting for local misspecification
- lm_sarma: Joint test for both error and lag dependence
- moran_res: Moran's I test on regression residuals
- vif: Variance inflation factors for detecting multicollinearity

A VIF > 10 typically indicates problematic multicollinearity.
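To make the VIF rule of thumb concrete, here is a small plain-NumPy sketch of the textbook definition, VIF_j = 1 / (1 - R_j^2), where R_j^2 comes from regressing column j on the remaining regressors plus a constant. The helper `vif_by_hand` and the simulated columns are hypothetical; this is an assumption-labeled illustration and not necessarily how spreg computes its own `vif` output.

```python
import numpy as np

def vif_by_hand(x):
    """Textbook VIFs for the columns of x (n x k, without a constant)."""
    n, k = x.shape
    vifs = []
    for j in range(k):
        target = x[:, j]
        others = np.delete(x, j, axis=1)
        design = np.hstack((np.ones((n, 1)), others))
        # Auxiliary regression of column j on the other columns
        coef, *_ = np.linalg.lstsq(design, target, rcond=None)
        resid = target - design @ coef
        centered = target - target.mean()
        r2 = 1.0 - (resid @ resid) / (centered @ centered)
        vifs.append(1.0 / (1.0 - r2))
    return np.array(vifs)

# Simulated regressors: x3 is deliberately almost a copy of x1
rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
x3 = x1 + 0.1 * rng.normal(size=n)

vifs = vif_by_hand(np.column_stack((x1, x2, x3)))
for name, value in zip(["x1", "x2", "x3"], vifs):
    flag = "problematic" if value > 10 else "ok"
    print(f"{name}: VIF = {value:.1f} ({flag})")
```

In this setup the two nearly collinear columns should be flagged by the > 10 threshold, while the independent column should stay close to 1.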
Install with Tessl CLI
npx tessl i tessl/pypi-spreg