# Spatial Error Models

GMM estimation of spatial error models with options for heteroskedasticity assumptions, endogenous variables, and combined spatial lag-error specifications (SARAR models).

## Capabilities

### Heteroskedastic Spatial Error Models

GMM spatial error models that allow for heteroskedasticity, using the Arraiz et al. methodology for robust estimation.

```python { .api }
class GM_Error_Het:
    def __init__(self, y, x, w, max_iter=1, epsilon=0.0000001, step1c=False,
                 inv_method='power_exp', hard_bound=False, vm=False, name_y=None,
                 name_x=None, name_w=None, name_ds=None, latex=False):
        """
        GMM spatial error model with heteroskedasticity.

        Parameters:
        - y (array): nx1 dependent variable
        - x (array): nxk independent variables (constant added automatically)
        - w (libpysal.weights.W): nxn spatial weights object
        - max_iter (int): Maximum iterations for steps 2a and 2b (default 1)
        - epsilon (float): Convergence criterion (default 1e-7)
        - step1c (bool): Include Step 1c from Arraiz et al. methodology
        - inv_method (str): 'power_exp' (default) or 'true_inv' for matrix inversion
        - hard_bound (bool): Raise exception if lambda outside [-1,1]
        - vm (bool): Include variance-covariance matrix in output
        - name_y, name_x, name_w, name_ds (str): Variable and dataset names
        - latex (bool): Format output for LaTeX

        Attributes:
        - betas (array): kx1 estimated coefficients (includes lambda)
        - u (array): nx1 residuals
        - predy (array): nx1 predicted values
        - e_filtered (array): nx1 spatially filtered residuals
        - vm (array): kxk variance-covariance matrix (if requested)
        - sig2 (float): Sigma squared
        - pr2 (float): Pseudo R-squared
        - iteration (int): Number of iterations performed
        - iter_stop (str): Convergence criterion reached
        - summary (str): Formatted results
        - output (DataFrame): Results table
        """

class GM_Endog_Error_Het:
    def __init__(self, y, x, yend, q, w, max_iter=1, epsilon=0.0000001,
                 step1c=False, inv_method='power_exp', hard_bound=False,
                 vm=False, name_y=None, name_x=None, name_yend=None,
                 name_q=None, name_w=None, name_ds=None, latex=False):
        """
        GMM spatial error model with heteroskedasticity and endogenous variables.

        Parameters:
        - y (array): nx1 dependent variable
        - x (array): nxk exogenous independent variables
        - yend (array): nxp endogenous variables
        - q (array): nxq external instruments
        - w (libpysal.weights.W): nxn spatial weights object
        - Additional parameters same as GM_Error_Het

        Attributes:
        - All GM_Error_Het attributes plus:
        - z (array): Combined exogenous and endogenous variables
        - h (array): All instruments (x and q combined)
        """

class GM_Combo_Het:
    def __init__(self, y, x, yend=None, q=None, w=None, w_lags=1, lag_q=True,
                 max_iter=1, epsilon=0.0000001, step1c=False,
                 inv_method='power_exp', hard_bound=False, vm=False,
                 name_y=None, name_x=None, name_yend=None, name_q=None,
                 name_w=None, name_ds=None, latex=False):
        """
        GMM spatial lag and error model (SARAR) with heteroskedasticity.

        Parameters:
        - y (array): nx1 dependent variable
        - x (array): nxk exogenous independent variables
        - yend (array, optional): nxp additional endogenous variables
          (the spatial lag Wy is added automatically)
        - q (array, optional): nxq external instruments for yend
          (spatial lags of x are added automatically, see w_lags)
        - w (libpysal.weights.W): nxn spatial weights object
        - w_lags (int): Orders of W to include as instruments (default 1)
        - lag_q (bool): Include spatial lags of additional instruments
        - Additional parameters same as GM_Error_Het

        Attributes:
        - All GM_Endog_Error_Het attributes plus:
        - rho (float): Spatial lag parameter (coefficient on Wy)
        - Contains both rho (lag) and lambda (error) parameters in betas
        """
```

### Homoskedastic Spatial Error Models

GMM spatial error models assuming homoskedasticity, using the Drukker et al. methodology for efficient estimation.

```python { .api }
class GM_Error_Hom:
    def __init__(self, y, x, w, hard_bound=False, vm=False, name_y=None,
                 name_x=None, name_w=None, name_ds=None, latex=False):
        """
        GMM spatial error model assuming homoskedasticity.

        Parameters:
        - y (array): nx1 dependent variable
        - x (array): nxk independent variables (constant added automatically)
        - w (libpysal.weights.W): nxn spatial weights object
        - hard_bound (bool): Raise exception if lambda outside [-1,1]
        - vm (bool): Include variance-covariance matrix
        - name_y, name_x, name_w, name_ds (str): Variable and dataset names
        - latex (bool): LaTeX formatting

        Attributes:
        - betas (array): kx1 estimated coefficients (includes lambda)
        - u (array): nx1 residuals
        - predy (array): nx1 predicted values
        - e_filtered (array): nx1 spatially filtered residuals
        - vm (array): kxk variance-covariance matrix (if requested)
        - sig2 (float): Sigma squared
        - pr2 (float): Pseudo R-squared
        - summary (str): Formatted results
        - output (DataFrame): Results table
        """

class GM_Endog_Error_Hom:
    def __init__(self, y, x, yend, q, w, hard_bound=False, vm=False,
                 name_y=None, name_x=None, name_yend=None, name_q=None,
                 name_w=None, name_ds=None, latex=False):
        """
        GMM spatial error model with homoskedasticity and endogenous variables.

        Parameters and attributes similar to GM_Endog_Error_Het but with
        homoskedasticity assumption for more efficient estimation.
        """

class GM_Combo_Hom:
    def __init__(self, y, x, yend=None, q=None, w=None, w_lags=1, lag_q=True,
                 hard_bound=False, vm=False, name_y=None, name_x=None,
                 name_yend=None, name_q=None, name_w=None, name_ds=None,
                 latex=False):
        """
        GMM spatial lag and error model (SARAR) assuming homoskedasticity.

        Parameters and attributes similar to GM_Combo_Het but with
        homoskedasticity assumption.
        """
```

### Wrapper Classes

Convenient wrapper classes that automatically select appropriate spatial error models based on specification.

```python { .api }
class GMM_Error:
    def __init__(self, y, x, w, yend=None, q=None, estimator='het',
                 add_wy=False, slx_lags=0, slx_vars='All', vm=False,
                 name_y=None, name_x=None, name_yend=None, name_q=None,
                 name_w=None, name_ds=None, latex=False, **kwargs):
        """
        Comprehensive wrapper for GMM spatial error models.

        Parameters:
        - y, x, w: Standard regression variables and spatial weights
        - yend (array, optional): Endogenous variables
        - q (array, optional): External instruments
        - estimator (str): 'het' (heteroskedastic), 'hom' (homoskedastic),
          or 'kp98' (Kelejian-Prucha 1998)
        - add_wy (bool): Include spatial lag of y (creates SARAR model)
        - slx_lags (int): Number of spatial lags of X to include
        - slx_vars (str/list): Variables to spatially lag
        - Additional naming and formatting parameters
        - **kwargs: Additional parameters passed to underlying estimator

        The wrapper automatically instantiates the appropriate model class
        based on the estimator choice and presence of endogenous variables.
        """
```
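
The dispatch described in the docstring can be pictured as a small lookup. This is a hedged sketch, not spreg's actual implementation: `pick_error_class` is a hypothetical helper, and it covers only the `'het'` and `'hom'` estimators whose classes are listed above (`'kp98'` and `slx_lags` are left out).

```python
def pick_error_class(estimator="het", has_endog=False, add_wy=False):
    """Return the name of the class a wrapper like GMM_Error might delegate to.

    Hypothetical helper for illustration only; the real wrapper's internal
    logic may differ.
    """
    suffix = {"het": "Het", "hom": "Hom"}.get(estimator)
    if suffix is None:
        raise ValueError(f"estimator not covered by this sketch: {estimator!r}")
    if add_wy:
        family = "GM_Combo"        # spatial lag + spatial error (SARAR)
    elif has_endog:
        family = "GM_Endog_Error"  # spatial error + endogenous regressors
    else:
        family = "GM_Error"        # pure spatial error
    return f"{family}_{suffix}"


print(pick_error_class("het"))                  # GM_Error_Het
print(pick_error_class("hom", has_endog=True))  # GM_Endog_Error_Hom
print(pick_error_class("het", add_wy=True))     # GM_Combo_Het
```

In this sketch `add_wy` takes precedence over `has_endog`, since a SARAR class can also carry additional endogenous variables.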

## Usage Examples

### Basic Spatial Error Model

```python
import numpy as np
import spreg
from libpysal import weights

# Generate spatial data
n = 49  # 7x7 grid
x = np.random.randn(n, 2)
w = weights.lat2W(7, 7)
w.transform = 'r'  # row-standardize so that |lambda| < 1 keeps (I - λW) invertible

# Create spatial error structure
lambda_true = 0.5
e = np.random.randn(n, 1)
# Spatial error: v = λWv + e, so v = (I - λW)^(-1)e
I_lW_inv = np.linalg.inv(np.eye(n) - lambda_true * w.full()[0])
v = I_lW_inv @ e

# Dependent variable with spatial error
y = 1 + 2 * x[:, 0:1] + 3 * x[:, 1:2] + v

# Estimate spatial error model (heteroskedastic); pass the W object itself
error_model = spreg.GM_Error_Het(y, x, w, name_y='y',
                                 name_x=['x1', 'x2'])

print(error_model.summary)
print(f"Estimated lambda: {error_model.betas[-1][0]:.3f} (true: {lambda_true})")
print(f"Pseudo R-squared: {error_model.pr2:.3f}")
```

### Spatial Error Model with Homoskedasticity

```python
import numpy as np
import spreg
from libpysal import weights

# Placeholder data on the same 7x7 grid (y has no spatial structure here;
# the point is only to show the call)
n = 49
x = np.random.randn(n, 2)
w = weights.lat2W(7, 7)
w.transform = 'r'  # row-standardize
y = np.random.randn(n, 1)  # simplified for demonstration

# Homoskedastic spatial error model (more efficient if assumption holds)
hom_error = spreg.GM_Error_Hom(y, x, w, name_y='y',
                               name_x=['x1', 'x2'])

print(hom_error.summary)
print("Assumes homoskedastic errors for efficiency")
```

### Spatial Error with Endogenous Variables

```python
import numpy as np
import spreg
from libpysal import weights

# Data with endogeneity and spatial error
n = 100
x = np.random.randn(n, 2)
z = np.random.randn(n, 2)  # external instruments
w = weights.KNN.from_array(np.random.randn(n, 2), k=5)
w.transform = 'r'  # row-standardize

# Spatially correlated error: v = (I - λW)^(-1)e with λ = 0.5
e = np.random.randn(n, 1)
v = np.linalg.inv(np.eye(n) - 0.5 * w.full()[0]) @ e

# Endogenous variable: driven by the instruments and correlated with the error
yend = 1.5 * z[:, 0:1] + 0.8 * z[:, 1:2] + 0.5 * e + np.random.randn(n, 1)

# Dependent variable with endogeneity and spatial error
y = 1 + x[:, 0:1] + 2 * x[:, 1:2] + 1.2 * yend + v

# Spatial error model with endogenous variables
endog_error = spreg.GM_Endog_Error_Het(y, x, yend, z, w,
                                       name_y='y', name_x=['x1', 'x2'],
                                       name_yend=['yend'],
                                       name_q=['z1', 'z2'])

print(endog_error.summary)
print("Handles both endogeneity and spatial error dependence")
```

### SARAR Model (Spatial Lag and Error)

```python
import numpy as np
import spreg
from libpysal import weights

# SARAR model: y = ρWy + Xβ + u, u = λWu + ε
n = 100
x = np.random.randn(n, 2)
w = weights.KNN.from_array(np.random.randn(n, 2), k=5)
w.transform = 'r'  # row-standardize
W = w.full()[0]
I = np.eye(n)

# Spatially correlated error: u = (I - λW)^(-1)ε
rho_true, lambda_true = 0.4, 0.3
u = np.linalg.inv(I - lambda_true * W) @ np.random.randn(n, 1)

# Reduced form of the lag model: y = (I - ρW)^(-1)(Xβ + u)
xb = 1 + x[:, 0:1] + 2 * x[:, 1:2]
y = np.linalg.inv(I - rho_true * W) @ (xb + u)

# SARAR model (spatial lag + spatial error). Wy is added as an endogenous
# variable and WX as its instruments automatically (w_lags=1), so no extra
# yend or q is supplied here.
sarar_model = spreg.GM_Combo_Het(y, x, yend=None, q=None, w=w, w_lags=1,
                                 name_y='y', name_x=['x1', 'x2'])

print(sarar_model.summary)
# betas is ordered [constant, x1, x2, Wy (rho), lambda]
print(f"Rho (spatial lag): {sarar_model.betas[-2][0]:.3f} (true: {rho_true})")
print(f"Lambda (spatial error): {sarar_model.betas[-1][0]:.3f} (true: {lambda_true})")
```

### Using the GMM_Error Wrapper

```python
import numpy as np
import spreg
from libpysal import weights

# Data setup
n = 100
x = np.random.randn(n, 2)
y = np.random.randn(n, 1)
w = weights.KNN.from_array(np.random.randn(n, 2), k=5)

# Use wrapper for automatic model selection
# Heteroskedastic spatial error
het_wrapper = spreg.GMM_Error(y, x, w, estimator='het',
                              name_y='y', name_x=['x1', 'x2'])

# Homoskedastic spatial error
hom_wrapper = spreg.GMM_Error(y, x, w, estimator='hom',
                              name_y='y', name_x=['x1', 'x2'])

# SARAR model using wrapper
sarar_wrapper = spreg.GMM_Error(y, x, w, add_wy=True, estimator='het',
                                name_y='y', name_x=['x1', 'x2'])

print("Wrapper automatically selects appropriate model class")
```

### Convergence and Iteration Control

```python
import numpy as np
import spreg
from libpysal import weights

# Control iteration for heteroskedastic models
n = 100
x = np.random.randn(n, 2)
y = np.random.randn(n, 1)
w = weights.KNN.from_array(np.random.randn(n, 2), k=5)

# Allow up to three iterations with a tighter convergence threshold
multi_iter = spreg.GM_Error_Het(y, x, w, max_iter=3,
                                epsilon=1e-8, step1c=True,
                                name_y='y', name_x=['x1', 'x2'])

print(multi_iter.summary)
print(f"Stopped after {multi_iter.iteration} iterations")
print(f"Convergence criterion: {multi_iter.iter_stop}")

# Alternative inversion method for numerical stability
true_inv = spreg.GM_Error_Het(y, x, w, inv_method='true_inv',
                              name_y='y', name_x=['x1', 'x2'])
print("Uses true matrix inversion instead of power expansion")
```

## Model Selection Guidelines

### Heteroskedastic vs Homoskedastic
- **Use heteroskedastic models** (`GM_Error_Het`) when:
  - Error variance is not constant across observations
  - Robust estimation is preferred (default choice)
  - Working with diverse spatial units (e.g., different sized regions)

- **Use homoskedastic models** (`GM_Error_Hom`) when:
  - Confident that error variance is constant
  - Seeking more efficient estimation
  - Working with regular spatial grids
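
One rough way to judge whether the constant-variance assumption is plausible is to regress squared residuals on the regressors and look at n·R² (the Koenker form of the Breusch-Pagan statistic). The sketch below uses NumPy only and simulated errors as stand-ins for residuals; `koenker_bp_stat` is a hypothetical helper, not part of spreg, and the check ignores spatial dependence.

```python
import numpy as np


def koenker_bp_stat(resid, X):
    """n * R^2 from regressing squared residuals on X (X includes a constant)."""
    u2 = resid.ravel() ** 2
    coef, *_ = np.linalg.lstsq(X, u2, rcond=None)
    fitted = X @ coef
    ss_res = np.sum((u2 - fitted) ** 2)
    ss_tot = np.sum((u2 - u2.mean()) ** 2)
    return len(u2) * (1.0 - ss_res / ss_tot)


rng = np.random.default_rng(0)
n = 500
x = rng.uniform(1.0, 5.0, size=n)
X = np.column_stack([np.ones(n), x])

u_hom = rng.standard_normal(n)      # constant variance
u_het = x * rng.standard_normal(n)  # standard deviation grows with x

stat_hom = koenker_bp_stat(u_hom, X)
stat_het = koenker_bp_stat(u_het, X)
print(f"homoskedastic errors:   {stat_hom:.2f}")
print(f"heteroskedastic errors: {stat_het:.2f}")
```

With one non-constant regressor the statistic is compared against a chi-squared distribution with one degree of freedom (5% critical value about 3.84); a large value argues for the `*_Het` classes.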

### Basic vs Endogenous vs SARAR
- **Basic spatial error** (`GM_Error_*`): Pure spatial error dependence
- **Endogenous spatial error** (`GM_Endog_Error_*`): Spatial error + endogenous variables
- **SARAR models** (`GM_Combo_*`): Both spatial lag and spatial error dependence

### Iteration and Convergence
- **max_iter**: Start with 1 (the default); increase it if the estimates have not stabilized
- **epsilon**: The default of 1e-7 is usually sufficient
- **step1c**: Set to True to follow the full Arraiz et al. methodology
- **inv_method**: Use 'true_inv' if the power expansion fails
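
The difference between the two `inv_method` options can be illustrated with plain NumPy. This is a minimal sketch of the general idea (a truncated power series versus a direct inverse of I - λW); `power_expansion_inverse` is a hypothetical helper, and spreg's internal implementation of `'power_exp'` may differ.

```python
import numpy as np


def power_expansion_inverse(W, lam, tol=1e-10, max_terms=1000):
    """Approximate (I - lam*W)^(-1) by the series I + lam*W + lam^2*W^2 + ..."""
    n = W.shape[0]
    total = np.eye(n)
    term = np.eye(n)
    for _ in range(max_terms):
        term = lam * (term @ W)
        total += term
        if np.abs(term).max() < tol:
            break
    return total


# Row-standardized weights for 5 units on a line (neighbors: left and right)
n = 5
W = np.zeros((n, n))
for i in range(n):
    for j in (i - 1, i + 1):
        if 0 <= j < n:
            W[i, j] = 1.0
W /= W.sum(axis=1, keepdims=True)

lam = 0.5
approx = power_expansion_inverse(W, lam)
exact = np.linalg.inv(np.eye(n) - lam * W)
print(np.abs(approx - exact).max())  # tiny when |lam| is well below 1
```

The series converges slowly as |λ| approaches 1 with row-standardized weights, which is the situation where a direct inverse is the safer choice.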

### Instrument Selection for SARAR
- Use spatial lags of X variables (Wx) as instruments for Wy
- Add higher-order spatial lags (W²x, W³x) for additional instruments (see `w_lags`)
- Ensure instruments are relevant (strong correlation with Wy)
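
The instrument construction and relevance check can be sketched with NumPy alone. Everything here is an illustrative assumption: the ring-shaped weights matrix, the simulated data, and the use of a first-stage R² as a relevance measure are not taken from spreg.

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 200, 2

# Hypothetical row-standardized weights: each unit's neighbors are the
# two adjacent units on a ring
W = np.zeros((n, n))
for i in range(n):
    W[i, (i - 1) % n] = 0.5
    W[i, (i + 1) % n] = 0.5

X = rng.standard_normal((n, k))
beta = np.array([[1.0], [2.0]])
rho = 0.4

# Spatial lag process: y = (I - rho*W)^(-1) (X beta + e)
e = rng.standard_normal((n, 1))
y = np.linalg.solve(np.eye(n) - rho * W, X @ beta + e)
Wy = W @ y

# Instruments: first- and second-order spatial lags of X
WX = W @ X
WWX = W @ WX
Q = np.hstack([WX, WWX])

# Relevance check: R^2 of the first-stage regression of Wy on [1, X, Q]
H = np.hstack([np.ones((n, 1)), X, Q])
coef, *_ = np.linalg.lstsq(H, Wy, rcond=None)
resid = Wy - H @ coef
r2 = 1.0 - (resid ** 2).sum() / ((Wy - Wy.mean()) ** 2).sum()
print(f"First-stage R^2 for Wy: {r2:.3f}")
```

A first-stage R² close to zero would signal weak instruments; in that case adding further lags rarely helps and the specification itself deserves a second look.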

## Diagnostic Interpretation

### Spatial Parameters
- **Lambda (λ)**: Spatial error parameter, should be in [-1,1] (set `hard_bound=True` to raise an exception otherwise)
- Positive λ indicates positive spatial error correlation
- λ near ±1 may indicate model misspecification
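
What a positive λ means, and what spatial filtering does to the errors, can be shown on simulated data. This is a NumPy-only sketch under assumed ring-shaped, row-standardized weights; it does not call spreg, and the filtered series is only analogous to the `e_filtered` attribute.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 400

# Hypothetical row-standardized weights: neighbors are the two adjacent
# units on a ring
W = np.zeros((n, n))
for i in range(n):
    W[i, (i - 1) % n] = 0.5
    W[i, (i + 1) % n] = 0.5

lam = 0.5
eps = rng.standard_normal((n, 1))

# Spatial error process: u = lam * W u + eps
u = np.linalg.solve(np.eye(n) - lam * W, eps)

# Positive lam: each unit's error is positively correlated with the
# average error of its neighbors
corr = np.corrcoef(u.ravel(), (W @ u).ravel())[0, 1]
print(f"corr(u, Wu) = {corr:.2f}")

# Spatial filtering removes the dependence and recovers the innovations
u_filtered = u - lam * (W @ u)
print(np.allclose(u_filtered, eps))  # True
```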

### Model Fit
- **Pseudo R-squared** (`pr2`): Reported because the standard R² does not apply after the spatial transformation
- Compare it only across spatial models fitted to the same data
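
A common definition of a pseudo R-squared is the squared correlation between observed and predicted values. The sketch below assumes that definition; `pseudo_r2` is a hypothetical helper and the numbers are made up for illustration.

```python
import numpy as np


def pseudo_r2(y, predy):
    """Squared Pearson correlation between observed and predicted values."""
    r = np.corrcoef(y.ravel(), predy.ravel())[0, 1]
    return float(r ** 2)


y = np.array([[1.0], [2.0], [3.0], [4.0], [5.0]])
predy_good = np.array([[1.1], [1.9], [3.2], [3.8], [5.1]])
predy_poor = np.array([[3.0], [1.0], [4.0], [2.0], [5.0]])

print(f"{pseudo_r2(y, predy_good):.3f}")
print(f"{pseudo_r2(y, predy_poor):.3f}")  # 0.250
```

Because it is a squared correlation, this measure is unchanged by rescaling or shifting the predictions, which is one reason it should only be compared across models fitted to the same data.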

### Convergence
- Check `iteration` and `iter_stop` attributes
- Non-convergence may indicate identification problems
- Try different iteration controls (`max_iter`, `epsilon`, `step1c`)
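
How `max_iter` and `epsilon` interact can be pictured with a generic stopping loop. This is a sketch of the idea only: `iterate_until_stable` and `toy_update` are hypothetical, and the stop labels need not match the strings spreg stores in `iter_stop`.

```python
def iterate_until_stable(update, start, max_iter=1, epsilon=1e-7):
    """Apply `update` repeatedly; stop on a small change or at max_iter.

    Returns (value, iteration, iter_stop).
    """
    value = start
    iteration = 0
    iter_stop = "max_iter reached"
    while iteration < max_iter:
        new_value = update(value)
        iteration += 1
        change = abs(new_value - value)
        value = new_value
        if change < epsilon:
            iter_stop = "change below epsilon"
            break
    return value, iteration, iter_stop


def toy_update(lam):
    """Toy update with fixed point 0.5: each step halves the distance to it."""
    return 0.5 * lam + 0.25


print(iterate_until_stable(toy_update, start=0.0, max_iter=1))
print(iterate_until_stable(toy_update, start=0.0, max_iter=100))
```

With `max_iter=1` the loop stops after a single update regardless of `epsilon`, which mirrors why the default setting reports one iteration; raising `max_iter` lets the change criterion decide instead.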