
tessl/pypi-pydoe3

Design of experiments library for Python with comprehensive experimental design capabilities


Utilities & Advanced Functions

Statistical utilities for evaluating experimental designs. They expand a design into a regression matrix for a given model and compute the prediction variance at chosen points, which supports design assessment, model building, and regression analysis.

Capabilities

Regression Analysis

Tools for building regression models and evaluating prediction variance from experimental designs.

Variance of Regression Error

Compute the variance of the regression error at specific points, useful for evaluating design quality and prediction uncertainty.

def var_regression_matrix(H, x, model, sigma=1):
    """
    Compute the variance of the regression error
    
    Parameters:
    - H: 2d-array, regression matrix (design matrix)
    - x: 2d-array, coordinates to calculate regression error variance at
    - model: str, string of tokens defining regression model (e.g., '1 x1 x2 x1*x2')
    - sigma: scalar, estimate of error variance (default: 1)
    
    Returns:
    - var: scalar, variance of regression error evaluated at x
    """

Key Features:

  • Evaluates prediction variance at any point in the design space
  • Supports arbitrary polynomial and interaction models
  • Essential for assessing design quality and prediction uncertainty
  • Used in optimal design algorithms and design comparison

Model Specification: The model string uses space-separated terms:

  • "1": intercept/constant term
  • "x0", "x1", "x2": linear terms for factors 0, 1, 2
  • "x0*x1": interaction between factors 0 and 1
  • "x0*x0": quadratic term for factor 0
  • "x0*x1*x2": three-way interaction
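To make the term syntax above concrete, here is a minimal pure-Python sketch that evaluates each term of a model string at one point. The helper name `expand_point` is hypothetical and is not part of pyDOE3; it only illustrates how the tokens described above map to numbers.

```python
from math import prod

def expand_point(point, model):
    """Evaluate every term of a model string at one point (illustrative helper)."""
    values = []
    for term in model.split():
        if term == "1":
            values.append(1.0)  # intercept/constant term
            continue
        # "x0*x1" -> factors 0 and 1; "x0*x0" -> factor 0 squared
        indices = [int(factor[1:]) for factor in term.split("*")]
        values.append(float(prod(point[i] for i in indices)))
    return values

row = expand_point([2.0, 3.0], "1 x0 x1 x0*x1 x0*x0")
print(row)  # [1.0, 2.0, 3.0, 6.0, 4.0]
```

Each entry of the result lines up with one token of the model string, in order.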

Usage Examples:

import pyDOE3
import numpy as np

# Create a design matrix (e.g., from factorial design)
design = pyDOE3.ff2n(3)  # 2^3 factorial design

# Define evaluation points
eval_points = np.array([[0, 0, 0],      # center point
                       [1, 1, 1],      # corner point
                       [0.5, 0, -0.5]]) # arbitrary point

# Linear model: y = β₀ + β₁x₀ + β₂x₁ + β₃x₂
linear_model = "1 x0 x1 x2"

# Calculate prediction variance at each point
for i, point in enumerate(eval_points):
    var = pyDOE3.var_regression_matrix(design, point, linear_model, sigma=2.0)
    print(f"Point {i+1} prediction variance: {var:.4f}")

# Model with two-factor interactions. A two-level factorial cannot estimate
# pure quadratic terms: x0*x0 equals 1 in every run, so it is aliased with the intercept.
interaction_model = "1 x0 x1 x2 x0*x1 x0*x2 x1*x2"
var_int = pyDOE3.var_regression_matrix(design, [0, 0, 0], interaction_model)
print(f"Center point interaction model variance: {var_int:.4f}")

Regression Matrix Construction

Build regression matrices from design matrices and model specifications for statistical analysis.

Matrix Builder

def build_regression_matrix(H, model, build=None):
    """
    Build a regression matrix using a DOE matrix and list of monomials
    
    Parameters:
    - H: 2d-array, design matrix
    - model: str, space-separated string of model terms
    - build: bool-array, optional, which terms to include (default: all)
    
    Returns:
    - R: 2d-array, expanded regression matrix with model terms
    """

Usage Example:

import pyDOE3

# Design matrix
design = pyDOE3.ccdesign(2)  # Central composite design

# Build regression matrix for quadratic model
model_terms = "1 x0 x1 x0*x0 x1*x1 x0*x1"
regression_matrix = pyDOE3.build_regression_matrix(design, model_terms)

print(f"Design shape: {design.shape}")
print(f"Regression matrix shape: {regression_matrix.shape}")

# Selective term inclusion
include_terms = [True, True, True, False, False, True]  # skip quadratic terms
selective_matrix = pyDOE3.build_regression_matrix(design, model_terms, include_terms)
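The expansion that such a builder performs can be sketched with plain NumPy. This is an illustration only, not pyDOE3's implementation: the name `expand_design` is hypothetical, and the sketch assumes that a `False` entry in `build` simply omits the corresponding column.

```python
import numpy as np

def expand_design(design, model, build=None):
    """Stack one column per model term (illustrative, not pyDOE3's code)."""
    design = np.asarray(design, dtype=float)
    terms = model.split()
    if build is None:
        build = [True] * len(terms)  # keep every term by default
    columns = []
    for term, keep in zip(terms, build):
        if not keep:
            continue  # assumption: excluded terms are dropped entirely
        if term == "1":
            columns.append(np.ones(design.shape[0]))
        else:
            idx = [int(factor[1:]) for factor in term.split("*")]
            columns.append(np.prod(design[:, idx], axis=1))
    return np.column_stack(columns)

# 2^2 full factorial in coded units
design = np.array([[-1, -1], [1, -1], [-1, 1], [1, 1]])
full = expand_design(design, "1 x0 x1 x0*x1")
masked = expand_design(design, "1 x0 x1 x0*x1", build=[True, True, True, False])
print(full.shape, masked.shape)  # (4, 4) (4, 3)
```

The regression matrix has one row per run and one column per retained term, so its column count is what must not exceed the rank the design can support.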

String Search Utility

Helper generator that yields the position of every occurrence of a substring, for locating terms in model specifications.

def grep(haystack, needle):
    """
    Generator function for finding all occurrences of a substring
    
    Parameters:
    - haystack: str, string to search in
    - needle: str, substring to find
    
    Yields:
    - int, starting positions of matches
    """
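A generator with this contract can be sketched with `str.find`. The name `find_all` is hypothetical and the body is an assumption about the behaviour described by the docstring above (non-overlapping matches, scanned left to right), not a copy of the library code.

```python
def find_all(haystack, needle):
    """Yield the start index of each non-overlapping match of needle."""
    if not needle:
        return  # an empty needle would otherwise loop forever
    start = 0
    while True:
        start = haystack.find(needle, start)
        if start == -1:
            return
        yield start
        start += len(needle)  # resume the search after the current match

positions = list(find_all("x0 x0*x1 x1", "x0"))
print(positions)  # [0, 3]
```

Because it is a generator, callers that need all positions at once wrap it in `list(...)`.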

Design Evaluation Workflow

Complete Analysis Example

import pyDOE3
import numpy as np

# Step 1: Create experimental design
design = pyDOE3.bbdesign(3, center=3)
print(f"Design shape: {design.shape}")

# Step 2: Define model
model = "1 x0 x1 x2 x0*x0 x1*x1 x2*x2 x0*x1 x0*x2 x1*x2"

# Step 3: Build regression matrix
reg_matrix = pyDOE3.build_regression_matrix(design, model)
print(f"Regression matrix shape: {reg_matrix.shape}")

# Step 4: Evaluate prediction variance at key points
test_points = [
    [0, 0, 0],        # center
    [1, 1, 1],        # corner
    [-1, -1, -1],     # opposite corner
    [1, 0, 0],        # face center
    [0.5, 0.5, 0.5]   # intermediate
]

print("\nPrediction Variance Analysis:")
print("Point\t\tVariance")
print("-" * 30)

for i, point in enumerate(test_points):
    var = pyDOE3.var_regression_matrix(design, point, model, sigma=1.0)
    print(f"Point {i+1:2d}\t\t{var:8.4f}")

# Step 5: Design quality assessment
XtX = reg_matrix.T @ reg_matrix
det_XtX = np.linalg.det(XtX)
trace_inv_XtX = np.trace(np.linalg.inv(XtX))

print(f"\nDesign Quality Metrics:")
print(f"Determinant(X'X): {det_XtX:.6f}")
print(f"Trace(inv(X'X)): {trace_inv_XtX:.6f}")
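The raw determinant depends on the number of runs and terms, so it is awkward to compare across designs. A commonly used normalised form is the D-efficiency, |X'X|^(1/p) / N, where p is the number of model terms and N the number of runs. The sketch below computes it with `np.linalg.slogdet`, which avoids overflow for larger matrices. The regression matrix is written out by hand for a 2^2 factorial with a first-order model, and the variable names are illustrative.

```python
import numpy as np

# Regression matrix for a 2^2 factorial with model "1 x0 x1" (written out by hand)
X = np.array([[1, -1, -1],
              [1,  1, -1],
              [1, -1,  1],
              [1,  1,  1]], dtype=float)

n_runs, n_terms = X.shape
information = X.T @ X

# log-determinant is numerically safer than det() for large matrices
sign, logdet = np.linalg.slogdet(information)
d_efficiency = float(np.exp(logdet / n_terms) / n_runs) if sign > 0 else 0.0

# A-criterion: trace of the inverse information matrix (smaller is better)
a_value = float(np.trace(np.linalg.inv(information)))

print(f"D-efficiency: {d_efficiency:.3f}")
print(f"Trace(inv(X'X)): {a_value:.3f}")
```

For this orthogonal design X'X = 4·I, so the D-efficiency is 1 and the trace is 0.75.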

Design Comparison Utility

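import numpy as np
import pyDOE3

def compare_designs(designs, names, model, eval_points):
    """
    Compare multiple designs based on prediction variance.

    Returns a dict mapping each design name to its mean variance, maximum
    variance, and the per-point variances over eval_points.
    """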
    results = {}
    
    for name, design in zip(names, designs):
        variances = []
        for point in eval_points:
            var = pyDOE3.var_regression_matrix(design, point, model)
            variances.append(var)
        
        results[name] = {
            'mean_variance': np.mean(variances),
            'max_variance': np.max(variances),
            'variances': variances
        }
    
    return results

# Example usage
designs = [
    pyDOE3.bbdesign(3),
    pyDOE3.ccdesign(3),
    2 * pyDOE3.lhs(3, samples=15) - 1  # lhs samples lie in [0, 1]; rescale to coded [-1, 1]
]
names = ['Box-Behnken', 'Central Composite', 'Latin Hypercube']
model = "1 x0 x1 x2 x0*x0 x1*x1 x2*x2"

test_points = [[0,0,0], [1,1,1], [-1,-1,-1]]
comparison = compare_designs(designs, names, model, test_points)

Integration with Other pyDOE3 Functions

The utilities complement all other pyDOE3 capabilities:

With Classical Designs

# Evaluate factorial design quality
factorial = pyDOE3.ff2n(4)
model = "1 x0 x1 x2 x3 x0*x1 x0*x2 x1*x2"
center_var = pyDOE3.var_regression_matrix(factorial, [0,0,0,0], model)

With Response Surface Designs

# Assess RSM design prediction capability
bb_design = pyDOE3.bbdesign(3)
rsm_model = "1 x0 x1 x2 x0*x0 x1*x1 x2*x2 x0*x1 x0*x2 x1*x2"
prediction_variance_map = []

for i in np.linspace(-1, 1, 11):
    for j in np.linspace(-1, 1, 11):
        var = pyDOE3.var_regression_matrix(bb_design, [i, j, 0], rsm_model)
        prediction_variance_map.append([i, j, var])

With Optimal Designs

# Validate optimal design performance
candidates = pyDOE3.doe_optimal.generate_candidate_set(3, 5)
optimal_design, info = pyDOE3.doe_optimal.optimal_design(
    candidates, n_points=15, degree=2, criterion="D"
)

# Average prediction variance over a sample of the candidate points
model = "1 x0 x1 x2 x0*x0 x1*x1 x2*x2 x0*x1 x0*x2 x1*x2"
avg_var = np.mean([
    pyDOE3.var_regression_matrix(optimal_design, point, model)
    for point in candidates[::10]  # sample of candidate points
])

Error Handling and Validation

Common Issues and Solutions

Model-Design Compatibility:

try:
    var = pyDOE3.var_regression_matrix(design, point, model)
except ValueError as e:
    if "don't suit together" in str(e):
        print("Error: Model has more parameters than design can support")
        print("Solution: Use simpler model or larger design")
    else:
        raise  # unrelated ValueError: re-raise instead of silently swallowing it
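A cheap pre-check can catch the most obvious mismatch before any matrix is built: a model cannot be fitted when it has more terms than the design has runs. The helper names below are hypothetical. Note that the condition is necessary but not sufficient, since a design with enough runs can still be rank deficient.

```python
def count_terms(model):
    """Number of space-separated terms in a model string."""
    return len(model.split())

def has_enough_runs(n_runs, model):
    """Necessary (not sufficient) condition for the model to be estimable."""
    return n_runs >= count_terms(model)

full_quadratic = "1 x0 x1 x0*x0 x1*x1 x0*x1"   # 6 terms
first_order = "1 x0 x1"                        # 3 terms

# A 2^2 factorial has 4 runs
print(has_enough_runs(4, full_quadratic))  # False
print(has_enough_runs(4, first_order))     # True
```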

Rank Deficiency:

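  • Occurs when the regression matrix lacks full column rank, so X'X is singular and cannot be inverted
  • Solution: Add more design points or simplify model
  • Check: np.linalg.matrix_rank(regression_matrix) should equal the number of model terms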
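The rank check above can be illustrated with a case that often causes confusion: fitting a pure quadratic term on a two-level factorial. The regression matrix is assembled by hand with NumPy here, so the sketch does not depend on any pyDOE3 call.

```python
import numpy as np

# 2^2 full factorial in coded units
design = np.array([[-1, -1], [1, -1], [-1, 1], [1, 1]], dtype=float)

# Columns for the model "1 x0 x1 x0*x0"
X = np.column_stack([
    np.ones(len(design)),   # intercept
    design[:, 0],           # x0
    design[:, 1],           # x1
    design[:, 0] ** 2,      # x0*x0: all ones at two levels, same as the intercept
])

rank = int(np.linalg.matrix_rank(X))
n_terms = X.shape[1]
is_estimable = bool(rank == n_terms)

print(f"rank = {rank}, terms = {n_terms}, estimable = {is_estimable}")
```

The rank is 3 for 4 terms because the squared column duplicates the intercept. A design with at least three levels per factor, or added center and axial points, is needed before quadratic terms can be estimated.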

Point Evaluation:

  • Keep evaluation points within the region the design covers; prediction variance grows quickly under extrapolation
  • Use the same coding/scaling as the design matrix
  • For coded designs, express points in the [-1, 1] range
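Converting between natural units and coded units is a linear rescaling about the centre of each factor's range. The two helpers below are a hypothetical sketch of that conversion. The names `to_coded` and `to_natural` and the temperature range are illustrative and not part of pyDOE3.

```python
def to_coded(value, low, high):
    """Map a natural-unit value in [low, high] to the coded range [-1, 1]."""
    center = (high + low) / 2.0
    half_range = (high - low) / 2.0
    return (value - center) / half_range

def to_natural(coded, low, high):
    """Inverse of to_coded: map a coded value back to natural units."""
    center = (high + low) / 2.0
    half_range = (high - low) / 2.0
    return center + coded * half_range

# Illustrative factor: temperature varied between 150 and 200
print(to_coded(175, 150, 200))    # 0.0
print(to_coded(200, 150, 200))    # 1.0
print(to_natural(0.5, 150, 200))  # 187.5
```

Evaluation points should pass through the same conversion as the design itself, otherwise the computed variances refer to a different region than intended.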

Types

import numpy as np
from typing import List, Optional, Union, Generator

# Core types
DesignMatrix = np.ndarray
RegressionMatrix = np.ndarray
ModelString = str
EvaluationPoint = Union[List[float], np.ndarray]

# Utility types
VarianceEstimate = float
ModelTerms = List[str]
TermSelector = Optional[List[bool]]

# Generator type for grep function
PositionGenerator = Generator[int, None, None]

Statistical Background

The regression error variance formula used is:

Var(ŷ(x)) = σ² × x'(X'X)⁻¹x

Where:

  • σ²: error variance estimate
  • x: evaluation point (expanded with model terms)
  • X: regression matrix (design matrix expanded with model terms)
  • (X'X)⁻¹: inverse of information matrix

This variance represents the uncertainty in predictions at point x, making it essential for:

  • Design quality assessment
  • Optimal design algorithms
  • Prediction interval construction
  • Design space exploration
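The formula σ² × x'(X'X)⁻¹x can be checked by hand on a small case. The sketch below evaluates it with NumPy for a 2^2 factorial and a first-order model. The function name `prediction_variance` is hypothetical, and its `sigma` argument is treated as the error standard deviation so that `sigma**2` matches σ² in the formula.

```python
import numpy as np

def prediction_variance(X, f_x, sigma=1.0):
    """sigma^2 * f(x)' (X'X)^-1 f(x), with f(x) the point expanded to model terms."""
    X = np.asarray(X, dtype=float)
    f_x = np.asarray(f_x, dtype=float)
    # solve() applies the inverse of X'X without forming it explicitly
    return float(sigma**2 * f_x @ np.linalg.solve(X.T @ X, f_x))

# Regression matrix for a 2^2 factorial with model "1 x0 x1"
X = np.array([[1, -1, -1],
              [1,  1, -1],
              [1, -1,  1],
              [1,  1,  1]], dtype=float)

center = prediction_variance(X, [1, 0, 0])           # point (0, 0)
corner = prediction_variance(X, [1, 1, 1])           # point (1, 1)
scaled = prediction_variance(X, [1, 1, 1], sigma=2)  # same point, sigma = 2

print(center, corner, scaled)
```

Here X'X = 4·I, so the variance is (1 + x0² + x1²)/4: 0.25 at the center and 0.75 at a corner, and it scales with σ² (3.0 for sigma = 2).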

