# Utilities & Advanced Functions

Statistical analysis tools for evaluating experimental designs. These utilities cover design assessment, regression-matrix construction, and prediction-variance analysis.

## Capabilities

### Regression Analysis

Tools for building regression models and evaluating prediction variance from experimental designs.

#### Variance of Regression Error

Compute the variance of the regression error at specific points, useful for evaluating design quality and prediction uncertainty.

```python { .api }
def var_regression_matrix(H, x, model, sigma=1):
    """
    Compute the variance of the regression error

    Parameters:
    - H: 2d-array, regression matrix (design matrix)
    - x: 2d-array, coordinates to calculate regression error variance at
    - model: str, string of tokens defining regression model (e.g., '1 x0 x1 x0*x1')
    - sigma: scalar, estimate of error variance (default: 1)

    Returns:
    - var: scalar, variance of regression error evaluated at x
    """
```

**Key Features:**
- Evaluates prediction variance at any point in the design space
- Supports arbitrary polynomial and interaction models
- Essential for assessing design quality and prediction uncertainty
- Used in optimal design algorithms and design comparison

**Model Specification:**
The model string uses space-separated terms:
- `"1"`: intercept/constant term
- `"x0"`, `"x1"`, `"x2"`: linear terms for factors 0, 1, 2
- `"x0*x1"`: interaction between factors 0 and 1
- `"x0*x0"`: quadratic term for factor 0
- `"x0*x1*x2"`: three-way interaction

**Usage Examples:**
```python
import pyDOE3
import numpy as np

# Create a design matrix (e.g., from factorial design)
design = pyDOE3.ff2n(3)  # 2^3 factorial design

# Define evaluation points
eval_points = np.array([[0, 0, 0],        # center point
                        [1, 1, 1],        # corner point
                        [0.5, 0, -0.5]])  # arbitrary point

# Linear model: y = β₀ + β₁x₀ + β₂x₁ + β₃x₂
linear_model = "1 x0 x1 x2"

# Calculate prediction variance at each point
for i, point in enumerate(eval_points):
    var = pyDOE3.var_regression_matrix(design, point, linear_model, sigma=2.0)
    print(f"Point {i+1} prediction variance: {var:.4f}")

# Two-factor interaction model. A two-level factorial cannot estimate pure
# quadratic terms: x0*x0 equals 1 on every run and duplicates the intercept.
interaction_model = "1 x0 x1 x2 x0*x1 x0*x2 x1*x2"
var_int = pyDOE3.var_regression_matrix(design, [0, 0, 0], interaction_model)
print(f"Center point interaction model variance: {var_int:.4f}")
```
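To make the model-string convention concrete, the following sketch splits a model string into terms and records which factors each term involves. `parse_model` is a hypothetical helper written for illustration only; it is not part of pyDOE3 and may differ from how the library tokenizes models internally.

```python
from collections import Counter

def parse_model(model):
    """Hypothetical helper: map each term to {factor index: power}."""
    terms = []
    for token in model.split():
        if token == "1":
            terms.append({})  # intercept involves no factors
        else:
            # "x0*x0" -> {0: 2}; "x0*x1" -> {0: 1, 1: 1}
            powers = Counter(int(name[1:]) for name in token.split("*"))
            terms.append(dict(powers))
    return terms

terms = parse_model("1 x0 x1 x0*x1 x0*x0")
print(terms)       # one dict per model term
print(len(terms))  # number of model parameters
```

Counting the parsed terms gives the number of model parameters, which is a quick way to check that a design has at least as many runs as the model has coefficients.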

### Regression Matrix Construction

Build regression matrices from design matrices and model specifications for statistical analysis.

#### Matrix Builder

```python { .api }
def build_regression_matrix(H, model, build=None):
    """
    Build a regression matrix using a DOE matrix and list of monomials

    Parameters:
    - H: 2d-array, design matrix
    - model: str, space-separated string of model terms
    - build: bool-array, optional, which terms to include (default: all)

    Returns:
    - R: 2d-array, expanded regression matrix with model terms
    """
```

**Usage Example:**
```python
import pyDOE3

# Design matrix
design = pyDOE3.ccdesign(2)  # Central composite design

# Build regression matrix for quadratic model
model_terms = "1 x0 x1 x0*x0 x1*x1 x0*x1"
regression_matrix = pyDOE3.build_regression_matrix(design, model_terms)

print(f"Design shape: {design.shape}")
print(f"Regression matrix shape: {regression_matrix.shape}")

# Selective term inclusion
include_terms = [True, True, True, False, False, True]  # skip quadratic terms
selective_matrix = pyDOE3.build_regression_matrix(design, model_terms, include_terms)
```
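As a rough picture of what the expansion produces, the sketch below builds one column per model term with plain NumPy and applies a boolean mask in the same spirit as the `build` argument. `expand_design` is a hypothetical stand-in written for this illustration, under the assumption that each term is a product of factor columns; the library's own implementation may handle shapes and edge cases differently.

```python
import numpy as np

def expand_design(design, model, build=None):
    """Hypothetical helper: one column per model term, optionally masked."""
    design = np.asarray(design, dtype=float)
    tokens = model.split()
    if build is None:
        build = [True] * len(tokens)
    columns = []
    for token, keep in zip(tokens, build):
        if not keep:
            continue  # term excluded by the mask
        column = np.ones(design.shape[0])
        if token != "1":
            for name in token.split("*"):
                column = column * design[:, int(name[1:])]
        columns.append(column)
    return np.column_stack(columns)

# Four factorial runs plus one center run, in coded units
design = np.array([[-1, -1], [1, -1], [-1, 1], [1, 1], [0, 0]])
model = "1 x0 x1 x0*x0 x1*x1 x0*x1"
full = expand_design(design, model)
partial = expand_design(design, model, build=[True, True, True, False, False, True])
print(full.shape)     # (5, 6)
print(partial.shape)  # (5, 4)
```

Each row of the result is the original run evaluated on every model term, so the column count equals the number of included terms.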

### String Search Utility

Helper function for pattern matching in model specifications.

```python { .api }
def grep(haystack, needle):
    """
    Generator function for finding all occurrences of a substring

    Parameters:
    - haystack: str, string to search in
    - needle: str, substring to find

    Yields:
    - int, starting positions of matches
    """
```
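The behavior described in the docstring can be approximated with `str.find` in a loop. This is a minimal sketch under the assumption that overlapping matches are reported; `find_all` is a hypothetical name, and the actual `grep` in pyDOE3 may differ in details.

```python
def find_all(haystack, needle):
    """Yield the start index of every match, including overlapping ones."""
    start = haystack.find(needle)
    while start != -1:
        yield start
        start = haystack.find(needle, start + 1)

positions = list(find_all("1 x0 x1 x0*x1", "x0"))
print(positions)  # [2, 8]
```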

## Design Evaluation Workflow

### Complete Analysis Example

```python
import pyDOE3
import numpy as np

# Step 1: Create experimental design
design = pyDOE3.bbdesign(3, center=3)
print(f"Design shape: {design.shape}")

# Step 2: Define model
model = "1 x0 x1 x2 x0*x0 x1*x1 x2*x2 x0*x1 x0*x2 x1*x2"

# Step 3: Build regression matrix
reg_matrix = pyDOE3.build_regression_matrix(design, model)
print(f"Regression matrix shape: {reg_matrix.shape}")

# Step 4: Evaluate prediction variance at key points
test_points = [
    [0, 0, 0],       # center
    [1, 1, 1],       # corner
    [-1, -1, -1],    # opposite corner
    [1, 0, 0],       # face center
    [0.5, 0.5, 0.5]  # intermediate
]

print("\nPrediction Variance Analysis:")
print("Point\t\tVariance")
print("-" * 30)

for i, point in enumerate(test_points):
    var = pyDOE3.var_regression_matrix(design, point, model, sigma=1.0)
    print(f"Point {i+1:2d}\t\t{var:8.4f}")

# Step 5: Design quality assessment
# A larger det(X'X) is better (D-criterion);
# a smaller trace(inv(X'X)) is better (A-criterion).
XtX = reg_matrix.T @ reg_matrix
det_XtX = np.linalg.det(XtX)
trace_inv_XtX = np.trace(np.linalg.inv(XtX))

print("\nDesign Quality Metrics:")
print(f"Determinant(X'X): {det_XtX:.6f}")
print(f"Trace(inv(X'X)): {trace_inv_XtX:.6f}")
```

### Design Comparison Utility

```python
import numpy as np
import pyDOE3

def compare_designs(designs, names, model, eval_points):
    """
    Compare multiple designs based on prediction variance
    """
    results = {}

    for name, design in zip(names, designs):
        variances = []
        for point in eval_points:
            var = pyDOE3.var_regression_matrix(design, point, model)
            variances.append(var)

        results[name] = {
            'mean_variance': np.mean(variances),
            'max_variance': np.max(variances),
            'variances': variances
        }

    return results

# Example usage
designs = [
    pyDOE3.bbdesign(3),
    pyDOE3.ccdesign(3),
    # lhs samples lie in [0, 1]; rescale to the coded range [-1, 1]
    2 * pyDOE3.lhs(3, samples=15) - 1
]
names = ['Box-Behnken', 'Central Composite', 'Latin Hypercube']
model = "1 x0 x1 x2 x0*x0 x1*x1 x2*x2"

test_points = [[0, 0, 0], [1, 1, 1], [-1, -1, -1]]
comparison = compare_designs(designs, names, model, test_points)
```

## Integration with Other pyDOE3 Functions

These utilities accept designs produced by the other pyDOE3 generators:

### With Classical Designs
```python
import pyDOE3

# Evaluate factorial design quality
factorial = pyDOE3.ff2n(4)
model = "1 x0 x1 x2 x3 x0*x1 x0*x2 x1*x2"
center_var = pyDOE3.var_regression_matrix(factorial, [0, 0, 0, 0], model)
```

### With Response Surface Designs
```python
import numpy as np
import pyDOE3

# Assess RSM design prediction capability on the x2 = 0 plane
bb_design = pyDOE3.bbdesign(3)
rsm_model = "1 x0 x1 x2 x0*x0 x1*x1 x2*x2 x0*x1 x0*x2 x1*x2"
prediction_variance_map = []

for i in np.linspace(-1, 1, 11):
    for j in np.linspace(-1, 1, 11):
        var = pyDOE3.var_regression_matrix(bb_design, [i, j, 0], rsm_model)
        prediction_variance_map.append([i, j, var])
```

### With Optimal Designs
```python
import numpy as np
import pyDOE3

# Validate optimal design performance
candidates = pyDOE3.doe_optimal.generate_candidate_set(3, 5)
optimal_design, info = pyDOE3.doe_optimal.optimal_design(
    candidates, n_points=15, degree=2, criterion="D"
)

# Average prediction variance over a sample of the candidate points
model = "1 x0 x1 x2 x0*x0 x1*x1 x2*x2 x0*x1 x0*x2 x1*x2"
avg_var = np.mean([
    pyDOE3.var_regression_matrix(optimal_design, point, model)
    for point in candidates[::10]  # every 10th candidate point
])
```

## Error Handling and Validation

### Common Issues and Solutions

**Model-Design Compatibility:**
```python
try:
    var = pyDOE3.var_regression_matrix(design, point, model)
except ValueError as e:
    if "don't suit together" in str(e):
        print("Error: Model has more parameters than design can support")
        print("Solution: Use simpler model or larger design")
    else:
        raise  # unrelated ValueError: do not swallow it
```

**Rank Deficiency:**
- Occurs when the information matrix X'X is singular, i.e. the regression matrix lacks full column rank
- Solution: Add more design points or simplify the model
- Check: compare `np.linalg.matrix_rank(regression_matrix)` with the number of model terms

**Point Evaluation:**
- Keep evaluation points inside or near the design region; prediction variance grows under extrapolation
- Use the same coding/scaling as the design matrix
- For coded designs, use [-1, 1] ranges
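The rank check mentioned above can be illustrated with plain NumPy. This sketch expands a 2² factorial by hand for the model `"1 x0 x1 x0*x0"` (the arrays and variable names are illustrative assumptions, not library output) and shows that the quadratic column duplicates the intercept until a center run is added.

```python
import numpy as np

# 2^2 factorial in coded units, expanded by hand for "1 x0 x1 x0*x0"
runs = np.array([[-1, -1], [1, -1], [-1, 1], [1, 1]], dtype=float)
X = np.column_stack([np.ones(4), runs[:, 0], runs[:, 1], runs[:, 0] ** 2])

rank = np.linalg.matrix_rank(X)
n_terms = X.shape[1]
print(rank, n_terms)  # rank is below the term count: x0*x0 equals the intercept

# Adding a center run separates the quadratic column from the intercept
runs_c = np.vstack([runs, [0.0, 0.0]])
X_c = np.column_stack([np.ones(5), runs_c[:, 0], runs_c[:, 1], runs_c[:, 0] ** 2])
rank_c = np.linalg.matrix_rank(X_c)
print(rank_c, X_c.shape[1])  # full column rank
```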

## Types

```python { .api }
import numpy as np
from typing import List, Optional, Union, Generator

# Core types
DesignMatrix = np.ndarray
RegressionMatrix = np.ndarray
ModelString = str
EvaluationPoint = Union[List[float], np.ndarray]

# Utility types
VarianceEstimate = float
ModelTerms = List[str]
TermSelector = Optional[List[bool]]

# Generator type for grep function
PositionGenerator = Generator[int, None, None]
```

## Statistical Background

The regression error variance formula used is:

**Var(ŷ(x)) = σ² × x'(X'X)⁻¹x**

Where:
- **σ²**: error variance estimate
- **x**: evaluation point (expanded with model terms)
- **X**: design matrix (expanded with model terms)
- **(X'X)⁻¹**: inverse of information matrix

This variance quantifies the uncertainty of the predicted response at point x, which makes it useful for:
- Design quality assessment
- Optimal design algorithms
- Prediction interval construction
- Design space exploration
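
The formula can be checked numerically on a small case. The sketch below uses plain NumPy rather than the pyDOE3 functions; `prediction_variance` and its `sigma2` parameter (taken here to be σ² itself) are hypothetical names for this illustration. For a 2² factorial with the model `"1 x0 x1"`, X'X is 4 times the identity, so the variance is 1/4 at the center and 3/4 at a corner when σ² = 1.

```python
import numpy as np

def prediction_variance(X, f_x, sigma2=1.0):
    """sigma2 * f_x' (X'X)^-1 f_x for an already-expanded point f_x."""
    XtX_inv = np.linalg.inv(X.T @ X)
    return float(sigma2 * f_x @ XtX_inv @ f_x)

# 2^2 factorial expanded for the model "1 x0 x1"
runs = np.array([[-1, -1], [1, -1], [-1, 1], [1, 1]], dtype=float)
X = np.column_stack([np.ones(4), runs])

center = np.array([1.0, 0.0, 0.0])  # point (0, 0) expanded as [1, x0, x1]
corner = np.array([1.0, 1.0, 1.0])  # point (1, 1) expanded as [1, x0, x1]

var_center = prediction_variance(X, center)
var_corner = prediction_variance(X, corner)
print(var_center)  # about 0.25
print(var_corner)  # about 0.75
```

The corner value is three times the center value because all three model terms contribute there, which is the usual pattern of variance increasing toward the edge of the design region.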