# Spline Functions

B-splines and cubic regression splines for modeling non-linear relationships in statistical models. Patsy provides implementations compatible with R (`bs`) and with R's mgcv package (`cr`, `cc`, `te`), allowing flexible smooth terms in formulas.

## Capabilities

### B-Splines

Generates a B-spline basis for non-linear curve fitting. A linear model fitted on the basis columns gives a smooth piecewise-polynomial approximation of the underlying function.

```python { .api }
def bs(x, df=None, knots=None, degree=3, include_intercept=False, lower_bound=None, upper_bound=None):
    """
    Generate B-spline basis for x, allowing non-linear fits.

    Parameters:
    - x: Array-like data to create spline basis for
    - df (int or None): Number of degrees of freedom (columns in output)
    - knots (array-like or None): Interior knot locations (default: equally spaced quantiles)
    - degree (int): Degree of the spline (default: 3 for cubic)
    - include_intercept (bool): Whether basis spans intercept term (default: False)
    - lower_bound (float or None): Lower boundary for spline
    - upper_bound (float or None): Upper boundary for spline

    Returns:
    2D array with basis functions as columns

    Note: Must specify at least one of df and knots
    """
```

#### Usage Examples

```python
import patsy
import numpy as np
import pandas as pd

# Sample data with non-linear relationship
x = np.linspace(0, 10, 100)
y = 2 * np.sin(x) + np.random.normal(0, 0.1, 100)
data = pd.DataFrame({'x': x, 'y': y})

# Basic B-spline with 4 degrees of freedom
design = patsy.dmatrix("bs(x, df=4)", data)
print(f"B-spline basis shape: {design.shape}")

# B-spline with custom knots
# (names not found in `data` are looked up in the calling scope)
knots = [2, 4, 6, 8]
design = patsy.dmatrix("bs(x, knots=knots)", data)

# Higher degree spline
design = patsy.dmatrix("bs(x, df=6, degree=5)", data)

# Include intercept in basis
# ("- 1" drops the formula's own intercept, which would otherwise be redundant)
design = patsy.dmatrix("bs(x, df=4, include_intercept=True) - 1", data)

# Complete model with B-splines
y_matrix, X_matrix = patsy.dmatrices("y ~ bs(x, df=5)", data)
```
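
As a rough check of how `df`, `knots`, and `degree` determine the size of the basis, the sketch below counts output columns. It assumes the usual B-spline counting rule (the one R's `bs` follows): without an intercept, explicit interior knots yield `len(knots) + degree` columns. The `- 1` in each formula only removes the model intercept so that the spline columns are all that remain.

```python
import numpy as np
import patsy

x = np.linspace(0, 10, 100)
data = {"x": x}

# df fixes the column count directly; patsy picks the interior knots itself
by_df = patsy.dmatrix("bs(x, df=4) - 1", data)

# Explicit interior knots: len(knots) + degree columns when no intercept is spanned
by_knots = patsy.dmatrix("bs(x, knots=(2, 4, 6, 8), degree=3) - 1", data)

# include_intercept=True adds one more basis function
with_icpt = patsy.dmatrix(
    "bs(x, knots=(2, 4, 6, 8), degree=3, include_intercept=True) - 1", data
)

print(by_df.shape, by_knots.shape, with_icpt.shape)
```

Specifying `df` is usually the more convenient choice; explicit knots are mainly useful when the breakpoints have a meaning in the problem domain.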

### Cubic Regression Splines

Natural cubic splines with optional constraints, compatible with MGCV's cubic regression splines.

```python { .api }
def cr(x, df=None, knots=None, lower_bound=None, upper_bound=None, constraints=None):
    """
    Generate natural cubic spline basis for x with optional constraints.

    Parameters:
    - x: Array-like data to create spline basis for
    - df (int or None): Number of degrees of freedom
    - knots (array-like or None): Interior knot locations
    - lower_bound (float or None): Lower boundary for spline
    - upper_bound (float or None): Upper boundary for spline
    - constraints (str, array-like, or None): 'center' for a centering constraint, or a 2-D array of linear constraints

    Returns:
    2D array with natural cubic spline basis functions
    """
```

#### Usage Examples

```python
import patsy
import numpy as np

# Basic cubic regression spline
x = np.linspace(-2, 2, 50)
y = x**3 + 0.5 * x + np.random.normal(0, 0.2, 50)
data = {'x': x, 'y': y}

# Natural cubic spline with 5 degrees of freedom
design = patsy.dmatrix("cr(x, df=5)", data)

# With centering constraint
design = patsy.dmatrix("cr(x, df=5, constraints='center')", data)

# Complete model
y_matrix, X_matrix = patsy.dmatrices("y ~ cr(x, df=6)", data)
```
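
The effect of `constraints='center'` can be checked numerically. The sketch below assumes the centering constraint is computed from the data passed to `dmatrix`, so each constrained column should average to roughly zero over that same data, while the unconstrained columns do not.

```python
import numpy as np
import patsy

x = np.linspace(-2, 2, 50)
data = {"x": x}

# "- 1" drops the model intercept so only the spline columns remain
plain = patsy.dmatrix("cr(x, df=5) - 1", data)
centered = patsy.dmatrix("cr(x, df=5, constraints='center') - 1", data)

plain_means = np.asarray(plain).mean(axis=0)
centered_means = np.asarray(centered).mean(axis=0)

print("column means without constraint:", np.round(plain_means, 3))
print("column means with centering:    ", np.round(centered_means, 3))
```

Centering is mostly useful when the model already contains an intercept, since it keeps the smooth term from competing with the intercept for the overall level of the response.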

### Cyclic Cubic Splines

Cubic splines with cyclic boundary conditions, useful for periodic data.

```python { .api }
def cc(x, df=None, knots=None, lower_bound=None, upper_bound=None, constraints=None):
    """
    Generate cyclic cubic spline basis for x with optional constraints.

    Parameters:
    - x: Array-like data to create spline basis for
    - df (int or None): Number of degrees of freedom
    - knots (array-like or None): Interior knot locations
    - lower_bound (float or None): Lower boundary for cyclic period
    - upper_bound (float or None): Upper boundary for cyclic period
    - constraints (str, array-like, or None): 'center' for a centering constraint, or a 2-D array of linear constraints

    Returns:
    2D array with cyclic cubic spline basis functions
    """
```

#### Usage Examples

```python
import patsy
import numpy as np

# Cyclic data (e.g., seasonal patterns, angles)
t = np.linspace(0, 2*np.pi, 100)
y = np.sin(2*t) + 0.5*np.cos(3*t) + np.random.normal(0, 0.1, 100)
data = {'t': t, 'y': y}

# Cyclic cubic spline
design = patsy.dmatrix("cc(t, df=8)", data)

# With explicit boundaries for the cyclic period (one full period, 0 to 2*pi;
# a rounded value such as 6.28 would leave the last data point outside the period)
design = patsy.dmatrix("cc(t, df=8, lower_bound=0, upper_bound=2*np.pi)", data)

# Complete model for seasonal data
y_matrix, X_matrix = patsy.dmatrices("y ~ cc(t, df=10)", data)
```
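
To illustrate what "cyclic boundary conditions" means in practice, the following sketch checks two things: that the basis rows at the two ends of the period coincide, and that a point shifted by one full period maps to the same basis row. The second check assumes that `cc` wraps out-of-range values back into the period when the design is rebuilt with `patsy.build_design_matrices`.

```python
import numpy as np
import patsy

t = np.linspace(0, 2 * np.pi, 100)
train = {"t": t}

# Bounds default to the range of t, i.e. one full period [0, 2*pi]
basis = patsy.dmatrix("cc(t, df=8) - 1", train)
B = np.asarray(basis)

# The basis rows at the two ends of the period coincide
ends_match = np.allclose(B[0], B[-1])

# Points one period apart map to the same basis row when the
# memorized state is reused through build_design_matrices
shifted = {"t": np.array([1.0, 1.0 + 2 * np.pi])}
S = np.asarray(patsy.build_design_matrices([basis.design_info], shifted)[0])
wraps = np.allclose(S[0], S[1])

print(ends_match, wraps)
```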

### Tensor Product Smooths

Multi-dimensional smooth terms as tensor products of univariate smooths, for modeling interactions between smooth functions.

```python { .api }
def te(*args, constraints=None):
    """
    Generate tensor product smooth of several covariates.

    Parameters:
    - *args: Multiple smooth terms (s1, s2, ..., sn) as marginal univariate smooths
    - constraints (str, array-like, or None): 'center' or a 2-D array of linear constraints applied to the tensor product basis

    Returns:
    2D array with tensor product basis functions

    Note: Marginal smooths must transform data into basis function arrays.
    The resulting basis dimension is the product of marginal basis dimensions.
    """
```

#### Usage Examples

```python
import patsy
import numpy as np

# Two-dimensional smooth surface
x1 = np.random.uniform(-2, 2, 100)
x2 = np.random.uniform(-2, 2, 100)
y = x1**2 + x2**2 + x1*x2 + np.random.normal(0, 0.5, 100)
data = {'x1': x1, 'x2': x2, 'y': y}

# Tensor product of cubic regression splines
# Note: each argument must itself be a smooth term such as cr(...) or cc(...)
design = patsy.dmatrix("te(cr(x1, df=5), cr(x2, df=5))", data)

# Three-dimensional tensor product
x3 = np.random.uniform(-1, 1, 100)
data['x3'] = x3
design = patsy.dmatrix("te(cr(x1, df=4), cr(x2, df=4), cr(x3, df=3))", data)

# Complete model with tensor product smooth
y_matrix, X_matrix = patsy.dmatrices("y ~ te(cr(x1, df=5), cr(x2, df=5))", data)
```
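
Because the basis dimension is the product of the marginal dimensions, tensor product terms grow quickly. The sketch below counts columns for two marginals of different size. It assumes that a `'center'` constraint on `te` absorbs exactly one column, as it does for the univariate smooths.

```python
import numpy as np
import patsy

rng = np.random.default_rng(0)
data = {"x1": rng.uniform(-2, 2, 100), "x2": rng.uniform(-2, 2, 100)}

m1 = patsy.dmatrix("cr(x1, df=5) - 1", data)  # 5 marginal columns
m2 = patsy.dmatrix("cr(x2, df=4) - 1", data)  # 4 marginal columns

# Tensor product of the two marginals, without the model intercept
tensor = patsy.dmatrix("te(cr(x1, df=5), cr(x2, df=4)) - 1", data)

# Same term with a centering constraint on the whole tensor product
centered = patsy.dmatrix(
    "te(cr(x1, df=5), cr(x2, df=4), constraints='center') - 1", data
)

print(m1.shape[1], m2.shape[1], tensor.shape[1], centered.shape[1])
```

With 100 observations, a three-way product such as 4 x 4 x 3 already uses 48 columns, so the marginal `df` values are worth keeping small.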

## Spline Usage Patterns

### Choosing Spline Types

| Spline Type | Best For | Characteristics |
|-------------|----------|-----------------|
| B-splines (`bs`) | General smooth curves | Flexible, local support, compatible with R |
| Cubic regression (`cr`) | Natural smooth curves | Natural boundary conditions, MGCV compatible |
| Cyclic cubic (`cc`) | Periodic/seasonal data | Cyclic boundary conditions |
| Tensor products (`te`) | Multi-dimensional smooths | Interaction of smooth terms |

### Integration with Linear Models

```python
import patsy
import numpy as np
from sklearn.linear_model import LinearRegression

# Generate sample data
np.random.seed(42)
x = np.linspace(0, 10, 100)
y = 2*np.sin(x) + 0.5*x + np.random.normal(0, 0.3, 100)
data = {'x': x, 'y': y}

# Create spline design matrix
y_matrix, X_matrix = patsy.dmatrices("y ~ bs(x, df=6)", data)

# Fit with scikit-learn
model = LinearRegression(fit_intercept=False)  # Intercept already in design matrix
model.fit(X_matrix, np.asarray(y_matrix).ravel())

# Predict on new data, reusing the knots memorized from the training data
# (calling dmatrix again would recompute the knots from x_new)
x_new = np.linspace(0, 10, 50)
data_new = {'x': x_new}
X_new = patsy.build_design_matrices([X_matrix.design_info], data_new)[0]
y_pred = model.predict(X_new)
```

### Combining Splines with Other Terms

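```python
import patsy
import numpy as np

# Illustrative data: two numeric predictors and a categorical grouping variable
rng = np.random.default_rng(0)
data = {'x1': rng.uniform(0, 1, 60), 'x2': rng.uniform(0, 1, 60),
        'group': np.repeat(['a', 'b', 'c'], 20), 'y': rng.normal(size=60)}

# Splines combined with linear and categorical terms
y, X = patsy.dmatrices("y ~ x1 + bs(x2, df=4) + C(group)", data)

# Multiple spline terms
y, X = patsy.dmatrices("y ~ bs(x1, df=3) + bs(x2, df=5)", data)

# Spline interactions
y, X = patsy.dmatrices("y ~ bs(x1, df=3) * bs(x2, df=3)", data)
```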

## Advanced Spline Features

### Boundary Handling

Splines handle boundaries differently:
- **B-splines** (`bs`): Accept explicit `lower_bound` and `upper_bound`; by default the bounds are taken from the range of the data
- **Natural splines** (`cr`): Linear extrapolation beyond boundaries
- **Cyclic splines** (`cc`): Periodic boundary conditions, so the basis takes the same values at `lower_bound` and `upper_bound`
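
The difference shows up when a fitted design is applied to points outside the training range. The sketch below is an illustration under two assumptions: that `cr` extrapolates linearly, which is checked through second differences, and that `bs` refuses out-of-range points rather than extrapolating. The second assumption may depend on the installed patsy version, so the code only reports what happens.

```python
import numpy as np
import patsy

train = {"x": np.linspace(0, 10, 100)}
# Equally spaced points, all above the training range
beyond = {"x": np.array([11.0, 12.0, 13.0])}

# Natural cubic spline: reuse the memorized knots on out-of-range points
cr_train = patsy.dmatrix("cr(x, df=5) - 1", train)
cr_out = np.asarray(patsy.build_design_matrices([cr_train.design_info], beyond)[0])

# For a linear function, second differences over equally spaced points vanish
second_diff = cr_out[0] - 2 * cr_out[1] + cr_out[2]
print("cr second differences beyond the boundary:", np.round(second_diff, 10))

# B-spline: the same out-of-range points are expected to be rejected
bs_train = patsy.dmatrix("bs(x, df=5) - 1", train)
try:
    patsy.build_design_matrices([bs_train.design_info], beyond)
    bs_rejected = False
except Exception as exc:  # patsy may wrap the underlying error type
    bs_rejected = True
    print("bs raised:", type(exc).__name__)
print("bs rejected out-of-range points:", bs_rejected)
```

If predictions outside the observed range are needed with `bs`, one option is to set `lower_bound` and `upper_bound` wide enough when the model is first built.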

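### Constraint Options

The `constraints` argument is available on `cr`, `cc`, and `te`:

- **Centering constraints**: `constraints='center'` removes the overall mean from the spline basis, computed from the training data
- **Custom constraints**: A 2-D array passed as `constraints` applies linear constraints to the spline coefficients
- **Integration constraints**: There is no dedicated option; integral properties have to be expressed as linear constraints on the coefficients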
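
A custom linear constraint can be sketched as follows. The example assumes that the unconstrained `cr` basis has one coefficient per knot, each tied to the spline's value at that knot, and that the constraint array needs one column per unconstrained coefficient. The name `pin_start` is made up for this illustration.

```python
import numpy as np
import patsy

x = np.linspace(0, 1, 50)
data = {"x": x}

# With df=5 and one constraint row, the unconstrained basis has 6 columns
# (one per knot), so the constraint array needs shape (1, 6).
# This row asks the coefficient tied to the first knot to be zero.
pin_start = np.array([[1.0, 0.0, 0.0, 0.0, 0.0, 0.0]])

# pin_start is not in `data`, so patsy looks it up in the calling scope
design = patsy.dmatrix("cr(x, df=5, constraints=pin_start) - 1", data)
D = np.asarray(design)

print(D.shape)
print(np.round(D[0], 10))  # basis row at x = lower bound
```

Under these assumptions every basis function vanishes at the lower bound, so any fitted curve built from the constrained columns passes through zero there.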

### Stateful Transform Nature

All spline functions are stateful transforms, meaning:
- They remember characteristics of the training data, such as knot locations and bounds
- They can be applied consistently to new data by reusing that remembered state
- They integrate with Patsy's incremental processing system
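
A short sketch of why the remembered state matters: rebuilding the design through `patsy.build_design_matrices` reuses the training knots, while calling `dmatrix` again on the new data recomputes them. The two results have the same shape but, in general, different values.

```python
import numpy as np
import patsy

train = {"x": np.linspace(0, 10, 100)}
new = {"x": np.linspace(2, 8, 20)}

fitted = patsy.dmatrix("bs(x, df=4) - 1", train)

# Reuses the knots and bounds memorized from the training data
reused = np.asarray(patsy.build_design_matrices([fitted.design_info], new)[0])

# Starts from scratch: knots and bounds are recomputed from the new data
refit = np.asarray(patsy.dmatrix("bs(x, df=4) - 1", new))

print("same shape:", reused.shape == refit.shape)
print("same values:", np.allclose(reused, refit))
```

Coefficients estimated on the training design only make sense when applied to the `reused` matrix, which is why prediction code should go through the stored `design_info`.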