# spreg

A comprehensive Python library for spatial econometric regression analysis. spreg provides methods for analyzing processes where observations interact with one another across geographic space, offering simultaneous autoregressive spatial regression models, diagnostic tools, and regime-based analysis capabilities.

## Package Information

- **Package Name**: spreg
- **Language**: Python
- **Installation**: `pip install spreg`

## Core Imports

```python
import spreg
```

Common specific imports:

```python
from spreg import OLS, TSLS, GM_Error_Het, GM_Error_Hom
from spreg import GM_Combo_Het, GM_Combo_Hom
from spreg import ML_Error, ML_Lag
```

## Basic Usage

```python
import numpy as np
import spreg
from libpysal import weights

# Prepare data
n = 100
y = np.random.randn(n, 1)
x = np.random.randn(n, 2)
w = weights.lat2W(10, 10)  # rook contiguity on a 10x10 lattice (100 observations)

# Basic OLS with spatial diagnostics
ols_model = spreg.OLS(y, x, w=w, spat_diag=True, name_y='y', name_x=['x1', 'x2'])
print(ols_model.summary)

# Spatial error model with heteroskedasticity
# (pass the W object itself; the user-facing class expects it rather than w.sparse)
spatial_model = spreg.GM_Error_Het(y, x, w=w, name_y='y', name_x=['x1', 'x2'])
print(spatial_model.summary)

# Two-stage least squares for endogenous variables
yend = np.random.randn(n, 1)  # endogenous variable
q = np.random.randn(n, 1)  # instrument
tsls_model = spreg.TSLS(y, x, yend, q, name_y='y', name_x=['x1', 'x2'],
                        name_yend=['yend'], name_q=['instrument'])
print(tsls_model.summary)
```

## Architecture

spreg follows a consistent class-based architecture for regression models:

- **Base Classes**: Provide core estimation without diagnostics (e.g., `BaseOLS`, `BaseTSLS`)
- **Full Classes**: Extend base classes with comprehensive diagnostics and output formatting
- **Spatial Models**: Handle spatial dependence through error correlation (`*_Error`) or lag structures (`*_Lag`)
- **Regime Models**: Allow parameters to vary across spatial or categorical regimes
- **Diagnostic Functions**: Standalone functions for testing spatial autocorrelation, heteroskedasticity, and model specification

The library integrates with the PySAL ecosystem, using `libpysal.weights` objects for spatial relationships and supporting common NumPy/SciPy data structures.
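
To make the role of the weights object concrete, here is a dependency-free sketch of the information a rook-contiguity lattice carries and of the row-standardized spatial lag built from it. The helper names `rook_lattice_neighbors` and `spatial_lag` are hypothetical and exist only for this illustration; in practice `libpysal.weights` builds and stores these structures.

```python
def rook_lattice_neighbors(nrows, ncols):
    """Map each cell id to the ids of cells sharing an edge with it."""
    neighbors = {}
    for r in range(nrows):
        for c in range(ncols):
            cell = r * ncols + c
            adjacent = []
            if r > 0:
                adjacent.append(cell - ncols)  # cell above
            if c > 0:
                adjacent.append(cell - 1)      # cell to the left
            if c < ncols - 1:
                adjacent.append(cell + 1)      # cell to the right
            if r < nrows - 1:
                adjacent.append(cell + ncols)  # cell below
            neighbors[cell] = adjacent
    return neighbors


def spatial_lag(neighbors, values):
    """Row-standardized lag: the mean of each observation's neighbors."""
    return [sum(values[j] for j in neighbors[i]) / len(neighbors[i])
            for i in range(len(values))]


neighbors = rook_lattice_neighbors(3, 3)
values = [float(v) for v in range(9)]
lagged = spatial_lag(neighbors, values)
print(neighbors[4])  # neighbors of the centre cell of the 3x3 grid
print(lagged[0])     # corner cell: mean of the values at cells 1 and 3
```

The spatial models and diagnostics described below all operate on products of this kind (a weights matrix applied to `y`, `x`, or the residuals).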

## Capabilities

### OLS Regression Models

Ordinary least squares with extensive spatial and non-spatial diagnostic capabilities, supporting robust standard errors and spatial lag of X (SLX) specifications.

```python { .api }
class OLS:
    def __init__(self, y, x, w=None, robust=None, gwk=None, sig2n_k=False,
                 nonspat_diag=True, spat_diag=False, moran=False,
                 white_test=False, vif=False, slx_lags=0, slx_vars='All',
                 regimes=None, vm=False, constant_regi='one', cols2regi='all',
                 regime_err_sep=False, cores=False, name_y=None, name_x=None,
                 name_w=None, name_ds=None, latex=False): ...
```

[OLS Models](./ols-models.md)

### Two-Stage Least Squares

Two-stage least squares estimation for handling endogenous variables, with spatial diagnostic capabilities and regime-based analysis.

```python { .api }
class TSLS:
    def __init__(self, y, x, yend, q, h=None, robust=None, gwk=None, sig2n_k=False,
                 nonspat_diag=True, spat_diag=False, slx_lags=0, slx_vars='All',
                 regimes=None, vm=False, name_y=None, name_x=None, name_yend=None,
                 name_q=None, name_h=None, name_w=None, name_ds=None, latex=False): ...
```

[Two-Stage Least Squares](./tsls-models.md)
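
The two stages can be sketched without any library for the simplest case of one endogenous regressor (`yend`) and one instrument (`q`). This is an illustration of the estimator only, not spreg's implementation; `simple_ols` and `two_stage_least_squares` are hypothetical helper names, and the toy data are constructed so that the instrument is exactly uncorrelated with the error.

```python
def simple_ols(x, y):
    """Intercept and slope of a one-regressor least-squares fit."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def two_stage_least_squares(y, yend, q):
    """2SLS with one endogenous regressor and one instrument."""
    # Stage 1: project the endogenous regressor on the instrument.
    a0, a1 = simple_ols(q, yend)
    yend_hat = [a0 + a1 * qi for qi in q]
    # Stage 2: regress y on the fitted values from stage 1.
    return simple_ols(yend_hat, y)


# Toy data: y = 3 + 0.5 * yend + v, where v also drives yend,
# so yend is correlated with the error term but q is not.
q = [-1.0, -1.0, 1.0, 1.0]
v = [1.0, -1.0, 1.0, -1.0]
yend = [1 + 2 * qi + vi for qi, vi in zip(q, v)]
y = [3 + 0.5 * ye + vi for ye, vi in zip(yend, v)]

ols_const, ols_slope = simple_ols(yend, y)
iv_const, iv_slope = two_stage_least_squares(y, yend, q)
print(round(ols_slope, 6), round(iv_slope, 6))
```

On this data the plain OLS slope is 0.7, while the two-stage estimate recovers the true coefficient of 0.5, which is the bias that instrumenting is meant to remove.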

### Spatial Error Models

GMM estimation of spatial error models with options for heteroskedasticity, homoskedasticity, and combined spatial lag-error specifications (SARAR models).

```python { .api }
class GM_Error_Het:
    def __init__(self, y, x, w, max_iter=1, epsilon=0.0000001, step1c=False,
                 inv_method='power_exp', hard_bound=False, vm=False, name_y=None,
                 name_x=None, name_w=None, name_ds=None, latex=False): ...

class GM_Error_Hom:
    def __init__(self, y, x, w, hard_bound=False, vm=False, name_y=None,
                 name_x=None, name_w=None, name_ds=None, latex=False): ...

class GM_Combo_Het:
    def __init__(self, y, x, yend, q, w, w_lags=1, lag_q=True, max_iter=1,
                 epsilon=0.0000001, step1c=False, inv_method='power_exp',
                 hard_bound=False, vm=False, name_y=None, name_x=None,
                 name_yend=None, name_q=None, name_w=None, name_ds=None,
                 latex=False): ...
```

[Spatial Error Models](./spatial-error-models.md)
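
In a spatial error model the disturbances follow `u = lambda * W u + e`, so `u = (I - lambda * W)^-1 e`. The sketch below illustrates the idea behind a power-expansion inverse (the option name `inv_method='power_exp'` appears in the signatures above) and the spatial filter that undoes it. It is a toy under stated assumptions, not spreg's code: the helper names are hypothetical, the weights are a dense row-standardized ring, and the series is truncated at a fixed order.

```python
def mat_vec(w, v):
    """Multiply a dense matrix (list of rows) by a vector."""
    return [sum(wij * vj for wij, vj in zip(row, v)) for row in w]


def power_expansion_inverse(w, lam, e, order=60):
    """Approximate (I - lam*W)^-1 e by the series sum_k lam^k W^k e."""
    total = list(e)
    term = list(e)
    for _ in range(order):
        term = [lam * t for t in mat_vec(w, term)]
        total = [s + t for s, t in zip(total, term)]
    return total


def spatial_filter(w, lam, u):
    """Apply (I - lam*W) to u, i.e. compute u - lam * W u."""
    return [ui - lam * wu for ui, wu in zip(u, mat_vec(w, u))]


# Row-standardized weights for four observations on a ring.
w = [[0.0, 0.5, 0.0, 0.5],
     [0.5, 0.0, 0.5, 0.0],
     [0.0, 0.5, 0.0, 0.5],
     [0.5, 0.0, 0.5, 0.0]]
lam = 0.4
e = [1.0, -2.0, 0.5, 3.0]               # independent innovations
u = power_expansion_inverse(w, lam, e)  # spatially autocorrelated errors
recovered = spatial_filter(w, lam, u)   # filtering recovers the innovations
print([round(r, 6) for r in recovered])
```

Because the weights are row-standardized and `|lam| < 1`, the series converges and the filtered values match `e` up to truncation error.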

### Maximum Likelihood Models

Full information maximum likelihood estimation for spatial lag and error models with analytical derivatives and concentrated log-likelihood functions.

```python { .api }
class ML_Error:
    def __init__(self, y, x, w, epsilon=0.0000001, hard_bound=False, vm=False,
                 name_y=None, name_x=None, name_w=None, name_ds=None, latex=False): ...

class ML_Lag:
    def __init__(self, y, x, w, epsilon=0.0000001, hard_bound=False, vm=False,
                 name_y=None, name_x=None, name_w=None, name_ds=None, latex=False): ...
```

[Maximum Likelihood Models](./ml-models.md)

### Regime-Based Models

Spatial regression models allowing parameters to vary across regimes, with options for separate or joint estimation and extensive regime-specific diagnostics.

```python { .api }
class GM_Error_Het_Regimes:
    def __init__(self, y, x, regimes, w, constant_regi='many', cols2regi='all',
                 regime_err_sep=False, regime_lag_sep=False, cores=False,
                 max_iter=1, epsilon=0.0000001, step1c=False,
                 inv_method='power_exp', hard_bound=False, vm=False,
                 name_y=None, name_x=None, name_regimes=None, name_w=None,
                 name_ds=None, latex=False): ...
```

[Regime Models](./regime-models.md)
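
One way to picture "parameters varying across regimes" is as a block-structured design matrix in which each regime gets its own constant and its own copy of every regressor. The sketch below assumes that reading of `constant_regi='many'` with `cols2regi='all'`; `regime_design` is a hypothetical helper, and spreg's internal representation may differ.

```python
def regime_design(x_rows, regimes):
    """Expand rows of X so every regime gets its own constant and slopes."""
    labels = sorted(set(regimes))
    k = len(x_rows[0]) + 1  # +1 for the regime-specific constant
    expanded = []
    for row, regime in zip(x_rows, regimes):
        block = labels.index(regime)
        new_row = [0.0] * (k * len(labels))
        # Only the columns belonging to this observation's regime are non-zero.
        new_row[block * k:(block + 1) * k] = [1.0] + list(row)
        expanded.append(new_row)
    return labels, expanded


x_rows = [[2.0, 5.0], [3.0, 1.0], [4.0, 7.0]]
regimes = ["north", "south", "north"]
labels, design = regime_design(x_rows, regimes)
for row in design:
    print(row)
```

Fitting a single regression on the expanded matrix then yields one coefficient vector per regime.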

### Panel Data Models

Fixed effects and random effects panel data models with spatial error and lag specifications for analyzing spatially-correlated panel datasets.

```python { .api }
class Panel_FE_Error:
    def __init__(self, y, x, w, epsilon=0.0000001, hard_bound=False, vm=False,
                 name_y=None, name_x=None, name_w=None, name_ds=None,
                 latex=False): ...

class Panel_RE_Error:
    def __init__(self, y, x, w, epsilon=0.0000001, hard_bound=False, vm=False,
                 name_y=None, name_x=None, name_w=None, name_ds=None,
                 latex=False): ...
```

[Panel Models](./panel-models.md)

### SUR Models

Seemingly unrelated regressions with spatial error and lag structures for simultaneous equation systems with cross-equation correlation.

```python { .api }
class SUR:
    def __init__(self, bigy, bigX, df_name=None, sur_constant=True, name_bigy=None,
                 name_bigX=None, name_ds=None, vm=False, latex=False): ...
```

[SUR Models](./sur-models.md)

### Probit Models

Probit regression for binary choice models, with diagnostic tests for spatial dependence when a weights object `w` is supplied.

```python { .api }
class Probit:
    def __init__(self, y, x, w=None, optim='newton', scalem='phimean', maxiter=100,
                 vm=False, name_y=None, name_x=None, name_w=None, name_ds=None,
                 latex=False, hard_bound=False): ...
```

[Probit Models](./probit-models.md)

### Diagnostic Functions

Comprehensive diagnostic testing for spatial autocorrelation, heteroskedasticity, normality, multicollinearity, and model specification in spatial regression contexts.

```python { .api }
class LMtests:
    def __init__(self, ols, w, tests=["all"]): ...

class MoranRes:
    def __init__(self, ols, w, z=False): ...

class AKtest:
    def __init__(self, iv, w, case='nosp'): ...

def jarque_bera(reg): ...
def breusch_pagan(reg): ...
def white(reg): ...
```

[Diagnostic Functions](./diagnostics.md)
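
As an illustration of what a residual autocorrelation test measures, the sketch below computes the Moran's I statistic, `(n / S0) * (e'We) / (e'e)`, for residuals under row-standardized weights. It covers the statistic only; the expected value, variance, and z-score needed for inference are omitted. `morans_i` is a hypothetical helper for this example and is not spreg's `MoranRes`.

```python
def morans_i(residuals, neighbors):
    """Moran's I for residuals under row-standardized weights."""
    n = len(residuals)
    # With row-standardized weights every row sums to 1, so S0 = n.
    s0 = float(n)
    cross = 0.0
    for i, e_i in enumerate(residuals):
        lag_i = sum(residuals[j] for j in neighbors[i]) / len(neighbors[i])
        cross += e_i * lag_i
    return (n / s0) * cross / sum(e * e for e in residuals)


# Six observations on a ring; each has a left and a right neighbor.
ring = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}
clustered = [1.0, 1.0, 1.0, -1.0, -1.0, -1.0]
alternating = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]
print(morans_i(clustered, ring))    # positive: similar residuals sit together
print(morans_i(alternating, ring))  # negative: neighbors have opposite signs
```

Positive values indicate that similar residuals cluster in space, which is the pattern the spatial error and lag models above are designed to absorb.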

### Utility Functions

Core utilities for spatial operations, matrix computations, GMM optimization, and data generation for simulation studies.

```python { .api }
def get_lags(w, x, w_lags): ...
def get_spFilter(w, lamb, sf): ...
def optim_moments(moments_in, vcX, all_par, start, hard_bound): ...
def set_endog(y, x, w, yend, q, w_lags, lag_q, slx_lags, slx_vars): ...

class RegressionPropsY: ...
class RegressionPropsVM: ...
```

[Utilities](./utilities.md)

## Common Parameters

Most spreg models share these common parameters:

- `y` (array): n x 1 dependent variable
- `x` (array): n x k independent variables (constant added automatically unless suppressed)
- `w` (libpysal `W` object; sparse matrix for the `Base*` estimation classes): Spatial weights
- `regimes` (list/Series): Regime identifier for each observation
- `vm` (boolean): Include the variance-covariance matrix in the summary output
- `name_y`, `name_x`, `name_w`, `name_ds` (strings): Variable and dataset names for output
- `latex` (boolean): Format output for LaTeX
- `hard_bound` (boolean): Raise an exception if an estimated spatial parameter falls outside the bounds (-1, 1)

## Common Attributes

All spreg regression models provide these standard attributes:

- `betas` (array): Estimated coefficients
- `u` (array): Residuals
- `predy` (array): Predicted values
- `vm` (array): Variance-covariance matrix (if requested)
- `n` (int): Number of observations
- `k` (int): Number of parameters
- `output` (DataFrame): Formatted results table
- `summary` (string): Comprehensive summary with diagnostics
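
How the core attributes relate to each other can be shown with a one-regressor least-squares fit in plain Python. This is an illustration only: `fit_simple_ols` is a hypothetical helper, it returns plain lists rather than the column arrays spreg uses, and it assumes the constant is listed first among the coefficients.

```python
def fit_simple_ols(x, y):
    """One-regressor OLS with a constant, returning spreg-style pieces."""
    n = len(y)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    slope = (sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
             / sum((xi - mean_x) ** 2 for xi in x))
    betas = [mean_y - slope * mean_x, slope]      # constant first, then slope
    predy = [betas[0] + betas[1] * xi for xi in x]  # fitted values
    u = [yi - pi for yi, pi in zip(y, predy)]       # residuals: y - predy
    return {"betas": betas, "predy": predy, "u": u, "n": n, "k": len(betas)}


result = fit_simple_ols([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 4.0, 8.0])
print(result["betas"])
print(result["u"])
```

Here `k` counts the constant as a parameter, and the residuals sum to zero (up to floating-point error) because a constant is included.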
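
Spatial models additionally provide:
- `e_filtered` (array): Spatially filtered residuals
- `pr2` (float): Pseudo R-squared
- Spatial parameters: `rho` for lag models; for error models, lambda (exposed as `lam` on `ML_Error`, since `lambda` is a reserved word in Python, and as the last element of `betas` in the GM error models)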