# OLS Models

Ordinary least squares regression with comprehensive spatial and non-spatial diagnostic capabilities. spreg provides both base OLS estimation and full diagnostic models with extensive testing options.

## Capabilities

### Base OLS Estimation

Core OLS estimation without diagnostics, providing essential regression coefficients and variance-covariance matrices with optional robust standard error corrections.

```python { .api }
class BaseOLS:
    def __init__(self, y, x, robust=None, gwk=None, sig2n_k=False):
        """
        Ordinary least squares estimation (no diagnostics or constant added).

        Parameters:
        - y (array): nx1 dependent variable
        - x (array): nxk independent variables, excluding constant
        - robust (str, optional): 'white' for White correction, 'hac' for HAC correction
        - gwk (pysal W object, optional): Kernel spatial weights for HAC estimation
        - sig2n_k (bool): If True, use n-k for sigma^2 estimation; if False, use n

        Attributes:
        - betas (array): kx1 estimated coefficients
        - u (array): nx1 residuals
        - predy (array): nx1 predicted values
        - vm (array): kxk variance-covariance matrix
        - sig2 (float): Sigma squared
        - n (int): Number of observations
        - k (int): Number of parameters
        """
```

### Full OLS with Diagnostics

Complete OLS implementation with spatial and non-spatial diagnostic tests, supporting SLX specifications and regime-based analysis.

```python { .api }
class OLS:
    def __init__(self, y, x, w=None, robust=None, gwk=None, sig2n_k=False,
                 nonspat_diag=True, spat_diag=False, moran=False,
                 white_test=False, vif=False, slx_lags=0, slx_vars='All',
                 regimes=None, vm=False, constant_regi='one', cols2regi='all',
                 regime_err_sep=False, cores=False, name_y=None, name_x=None,
                 name_w=None, name_ds=None, latex=False):
        """
        Ordinary least squares with extensive diagnostics.

        Parameters:
        - y (array): nx1 dependent variable
        - x (array): nxk independent variables (constant added automatically)
        - w (pysal W object, optional): Spatial weights for spatial diagnostics
        - robust (str, optional): 'white' or 'hac' for robust standard errors
        - gwk (pysal W object, optional): Kernel weights for HAC estimation
        - sig2n_k (bool): Use n-k for sigma^2 estimation
        - nonspat_diag (bool): Compute non-spatial diagnostics (default True)
        - spat_diag (bool): Compute spatial diagnostics (requires w)
        - moran (bool): Compute Moran's I test on residuals
        - white_test (bool): Compute White's heteroskedasticity test
        - vif (bool): Compute variance inflation factors
        - slx_lags (int): Number of spatial lags of X to include
        - slx_vars (str/list): Variables to be spatially lagged ('All' or list)
        - regimes (list/Series, optional): Regime identifier for observations
        - vm (bool): Include variance-covariance matrix in output
        - constant_regi (str): 'one' (constant across regimes) or 'many'
        - cols2regi (str/list): Variables that vary by regime ('all' or list)
        - regime_err_sep (bool): Run separate regressions for each regime
        - cores (bool): Use multiprocessing for regime estimation
        - name_y, name_x, name_w, name_ds (str): Variable and dataset names
        - latex (bool): Format output for LaTeX

        Attributes:
        - All BaseOLS attributes plus:
        - r2 (float): R-squared
        - ar2 (float): Adjusted R-squared
        - f_stat (tuple): F-statistic (value, p-value)
        - t_stat (list): t-statistics with p-values for each coefficient
        - jarque_bera (dict): Jarque-Bera normality test results
        - breusch_pagan (dict): Breusch-Pagan heteroskedasticity test
        - white (dict): White heteroskedasticity test (if white_test=True)
        - koenker_bassett (dict): Koenker-Bassett test results
        - lm_error (tuple): LM test for spatial error (statistic, p-value) (if spat_diag=True)
        - lm_lag (tuple): LM test for spatial lag (statistic, p-value) (if spat_diag=True)
        - rlm_error (tuple): Robust LM test for spatial error (statistic, p-value)
        - rlm_lag (tuple): Robust LM test for spatial lag (statistic, p-value)
        - lm_sarma (tuple): LM test for SARMA specification (statistic, p-value)
        - moran_res (tuple): Moran's I test on residuals (I, standardized I, p-value) (if moran=True)
        - vif (dict): Variance inflation factors (if vif=True)
        - summary (str): Comprehensive formatted results
        - output (DataFrame): Formatted results table
        """
```

## Usage Examples

### Basic OLS Regression

```python
import numpy as np
import spreg

# Prepare data
n = 100
y = np.random.randn(n, 1)
x = np.random.randn(n, 3)

# Basic OLS without diagnostics. BaseOLS adds no constant,
# so prepend a column of ones to estimate an intercept.
x_const = np.hstack((np.ones((n, 1)), x))
base_ols = spreg.BaseOLS(y, x_const)
print("Coefficients:", base_ols.betas.flatten())
# BaseOLS reports no fit statistics; R-squared needs manual calculation

# Full OLS with non-spatial diagnostics (constant added automatically)
ols_model = spreg.OLS(y, x, nonspat_diag=True, name_y='y',
                      name_x=['x1', 'x2', 'x3'])
print(ols_model.summary)
print("R-squared:", ols_model.r2)
print("F-statistic:", ols_model.f_stat)
```

### OLS with Spatial Diagnostics

```python
import numpy as np
import spreg
from libpysal import weights

# Create spatial data
n = 49  # 7x7 grid
y = np.random.randn(n, 1)
x = np.random.randn(n, 2)
w = weights.lat2W(7, 7)  # 7x7 lattice weights

# OLS with spatial diagnostics
spatial_ols = spreg.OLS(y, x, w=w, spat_diag=True, moran=True,
                        name_y='y', name_x=['x1', 'x2'])

print(spatial_ols.summary)
print("LM Error test:", spatial_ols.lm_error)
print("LM Lag test:", spatial_ols.lm_lag)
print("Moran's I on residuals:", spatial_ols.moran_res)

# Check if spatial dependence is detected.
# Each LM result is a (statistic, p-value) tuple, so index 1 is the p-value.
if spatial_ols.lm_error[1] < 0.05:
    print("Spatial error dependence detected")
if spatial_ols.lm_lag[1] < 0.05:
    print("Spatial lag dependence detected")
```

### OLS with SLX Specification

```python
import numpy as np
import spreg
from libpysal import weights

# Spatial lag of X (SLX) model
n = 100
y = np.random.randn(n, 1)
x = np.random.randn(n, 2)
w = weights.KNN.from_array(np.random.randn(n, 2), k=5)

# Include spatial lags of X variables
slx_model = spreg.OLS(y, x, w=w, slx_lags=1, slx_vars='All',
                      spat_diag=True, name_y='y', name_x=['x1', 'x2'])

print(slx_model.summary)
print("Number of coefficients (includes spatial lags):", slx_model.k)
```

### OLS with Robust Standard Errors

```python
import numpy as np
import spreg
from libpysal import weights

# OLS with White robust standard errors
n = 100
y = np.random.randn(n, 1)
x = np.random.randn(n, 2)

# White correction for heteroskedasticity
white_ols = spreg.OLS(y, x, robust='white', nonspat_diag=True,
                      name_y='y', name_x=['x1', 'x2'])

print(white_ols.summary)
print("Uses White-corrected standard errors")

# HAC correction requires kernel weights with ones on the diagonal.
# A distance-band W does not satisfy this, so build an adaptive
# triangular kernel over the (here random) point coordinates instead.
coords = np.random.randn(n, 2)
w_kernel = weights.Kernel(coords, k=15, fixed=False, function='triangular')
hac_ols = spreg.OLS(y, x, robust='hac', gwk=w_kernel,
                    name_y='y', name_x=['x1', 'x2'])
print("Uses HAC-corrected standard errors")
```

### Regime-Based OLS

```python
import numpy as np
import spreg

# OLS with regimes
n = 100
y = np.random.randn(n, 1)
x = np.random.randn(n, 2)
# regimes is documented as a list or Series, so convert the array to a list
regimes = np.random.choice(['A', 'B', 'C'], n).tolist()

# Different intercepts and slopes by regime
regime_ols = spreg.OLS(y, x, regimes=regimes, constant_regi='many',
                       cols2regi='all', name_y='y', name_x=['x1', 'x2'])

print(regime_ols.summary)
print("Number of regimes:", regime_ols.nr)
print("Chow test results:", regime_ols.chow)

# Separate regression for each regime; set constant_regi='many' and
# cols2regi='all' explicitly so every coefficient varies by regime
separate_ols = spreg.OLS(y, x, regimes=regimes, regime_err_sep=True,
                         constant_regi='many', cols2regi='all',
                         name_y='y', name_x=['x1', 'x2'])
print("Individual regime results:", separate_ols.multi.keys())
```

## Common Diagnostic Interpretations

### R-squared and Model Fit

- `r2`: Proportion of variance explained by the model
- `ar2`: Adjusted R-squared, penalized for number of parameters
- `f_stat`: Overall model significance test
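
`BaseOLS` exposes residuals but no fit statistics, so these two measures sometimes have to be computed by hand. The following is a minimal sketch using plain NumPy rather than spreg. The helper names `r_squared` and `adjusted_r_squared` are hypothetical, and it assumes `k` counts every estimated coefficient, including the constant.

```python
import numpy as np

def r_squared(y, u):
    """R-squared from the dependent variable y and residuals u (both n x 1)."""
    ss_res = float(np.sum(u ** 2))
    ss_tot = float(np.sum((y - y.mean()) ** 2))
    return 1.0 - ss_res / ss_tot

def adjusted_r_squared(r2, n, k):
    """Adjusted R-squared; k counts all coefficients, including the constant."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k)

# Synthetic example fitted with NumPy's least-squares solver
rng = np.random.default_rng(0)
n = 50
X = np.hstack((np.ones((n, 1)), rng.normal(size=(n, 2))))
y = X @ np.array([[1.0], [2.0], [-1.0]]) + rng.normal(size=(n, 1))
betas, *_ = np.linalg.lstsq(X, y, rcond=None)
u = y - X @ betas  # residuals, analogous to the `u` attribute

r2 = r_squared(y, u)
ar2 = adjusted_r_squared(r2, n, X.shape[1])
print("R-squared:", r2, "Adjusted R-squared:", ar2)
```

With a fitted `BaseOLS` model, the same helpers could be applied to its `u`, `n`, and `k` attributes together with the original `y`.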

### Heteroskedasticity Tests

- `breusch_pagan`: Tests for heteroskedasticity related to fitted values
- `white`: General heteroskedasticity test (if requested)
- `koenker_bassett`: Studentized version of Breusch-Pagan
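
To make the idea behind the studentized test concrete, the sketch below computes the textbook Koenker form of the statistic: n times the R-squared of an auxiliary regression of squared residuals on a set of variables. It is an illustration in plain NumPy, not spreg's implementation, and the function name `koenker_bassett_stat` is hypothetical. Under homoskedasticity the statistic is compared against a chi-squared distribution with as many degrees of freedom as there are auxiliary variables.

```python
import numpy as np

def koenker_bassett_stat(u, Z):
    """n * R^2 from regressing squared residuals on Z (a constant is added here)."""
    u2 = np.asarray(u, dtype=float).reshape(-1, 1) ** 2
    Z = np.asarray(Z, dtype=float)
    n = u2.shape[0]
    A = np.hstack((np.ones((n, 1)), Z))
    gamma, *_ = np.linalg.lstsq(A, u2, rcond=None)
    ss_res = np.sum((u2 - A @ gamma) ** 2)
    ss_tot = np.sum((u2 - u2.mean()) ** 2)
    return float(n * (1.0 - ss_res / ss_tot))

# Synthetic residuals whose spread grows with z (heteroskedastic by design)
rng = np.random.default_rng(1)
n = 200
z = rng.uniform(0.5, 3.0, size=(n, 1))
u = rng.normal(size=(n, 1)) * z

stat = koenker_bassett_stat(u, z)
print("Statistic:", stat, "degrees of freedom:", z.shape[1])
```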

### Spatial Dependence Tests

- `lm_error`: Tests for spatial error dependence
- `lm_lag`: Tests for spatial lag dependence
- `rlm_error`, `rlm_lag`: Robust versions accounting for local misspecification
- `lm_sarma`: Joint test for both error and lag dependence
- `moran_res`: Moran's I test on regression residuals
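
As a rough illustration of what Moran's I measures, the sketch below evaluates the standard formula I = (n / S0) * (u'Wu) / (u'u) on a small rook-contiguity lattice built with NumPy. It assumes zero-mean residuals and a dense weights matrix. Both helper names are hypothetical, and no inference (standardization or p-value) is attempted.

```python
import numpy as np

def lattice_rook_weights(nrows, ncols):
    """Dense binary rook-contiguity weights for an nrows x ncols grid."""
    n = nrows * ncols
    W = np.zeros((n, n))
    for r in range(nrows):
        for c in range(ncols):
            i = r * ncols + c
            if r + 1 < nrows:  # neighbor below
                j = (r + 1) * ncols + c
                W[i, j] = W[j, i] = 1.0
            if c + 1 < ncols:  # neighbor to the right
                j = r * ncols + c + 1
                W[i, j] = W[j, i] = 1.0
    return W

def morans_i_residuals(u, W):
    """Moran's I for zero-mean residuals u: (n / S0) * (u'Wu) / (u'u)."""
    u = np.asarray(u, dtype=float).reshape(-1, 1)
    n = u.shape[0]
    s0 = W.sum()
    return ((n / s0) * (u.T @ W @ u) / (u.T @ u)).item()

# A checkerboard of +1/-1 values is perfectly negatively autocorrelated
W = lattice_rook_weights(4, 4)
checkerboard = np.array([(-1.0) ** (r + c) for r in range(4) for c in range(4)])
i_checker = morans_i_residuals(checkerboard, W)
print("Moran's I (checkerboard):", i_checker)
```

Values near zero suggest no spatial pattern in the residuals, while clearly positive values indicate that similar residuals cluster together.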

### Multicollinearity

- `vif`: Variance inflation factors for detecting multicollinearity

A VIF > 10 typically indicates problematic multicollinearity.
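
For reference, the VIF of a regressor is 1 / (1 - R²), where R² comes from regressing that regressor on all the others. The sketch below applies this definition with plain NumPy. It is an illustration with a hypothetical function name, not the routine spreg uses internally, and it expects `X` without a constant column.

```python
import numpy as np

def variance_inflation_factors(X):
    """VIF for each column of X (X excludes the constant)."""
    X = np.asarray(X, dtype=float)
    n, k = X.shape
    vifs = []
    for j in range(k):
        target = X[:, [j]]
        # Regress column j on a constant plus the remaining columns
        others = np.hstack((np.ones((n, 1)), np.delete(X, j, axis=1)))
        coef, *_ = np.linalg.lstsq(others, target, rcond=None)
        resid = target - others @ coef
        r2 = 1.0 - np.sum(resid ** 2) / np.sum((target - target.mean()) ** 2)
        vifs.append(float(1.0 / (1.0 - r2)))
    return vifs

# x3 is almost an exact linear combination of x1 and x2; x4 is unrelated
rng = np.random.default_rng(2)
n = 200
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
x3 = x1 + x2 + 0.01 * rng.normal(size=n)
x4 = rng.normal(size=n)
vifs = variance_inflation_factors(np.column_stack((x1, x2, x3, x4)))
print("VIFs:", vifs)
```

In this synthetic case the first three regressors exceed the threshold because each can be predicted from the other two, while the unrelated fourth stays close to 1.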

## Model Selection Guidelines

1. **Start with basic OLS** with non-spatial diagnostics
2. **Add spatial diagnostics** if working with spatial data
3. **Check for spatial dependence**:
   - If LM Error is significant → consider spatial error model
   - If LM Lag is significant → consider spatial lag model
   - If both significant → use robust tests to distinguish
4. **Check for heteroskedasticity**: Use robust standard errors if detected
5. **Consider SLX specification** for spatially-lagged independent variables
6. **Use regime models** when parameters vary systematically across groups
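
The spatial dependence check in step 3 can be written as a small decision helper. This is a sketch of the rule as stated above, not part of spreg: the function name, its return labels, and the default significance level of 0.05 are assumptions. It takes plain p-values and returns `"undetermined"` when the robust tests do not single out one alternative.

```python
def suggest_spatial_model(lm_error_p, lm_lag_p,
                          rlm_error_p=None, rlm_lag_p=None, alpha=0.05):
    """Map LM test p-values to a suggested specification."""
    error_sig = lm_error_p < alpha
    lag_sig = lm_lag_p < alpha

    if not error_sig and not lag_sig:
        return "ols"
    if error_sig and not lag_sig:
        return "spatial error"
    if lag_sig and not error_sig:
        return "spatial lag"

    # Both LM tests are significant: fall back on the robust versions
    if rlm_error_p is None or rlm_lag_p is None:
        return "undetermined"
    robust_error_sig = rlm_error_p < alpha
    robust_lag_sig = rlm_lag_p < alpha
    if robust_error_sig and not robust_lag_sig:
        return "spatial error"
    if robust_lag_sig and not robust_error_sig:
        return "spatial lag"
    return "undetermined"

print(suggest_spatial_model(0.01, 0.40))
print(suggest_spatial_model(0.01, 0.02, rlm_error_p=0.30, rlm_lag_p=0.01))
```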