# Stheno

Stheno is a Python library for Gaussian process (GP) modeling with support for several backend frameworks: AutoGrad, TensorFlow, PyTorch, and JAX. It provides an expressive API for constructing GP models, including multi-output regression, sparse approximations with inducing points, custom kernels and means, batched computation, and hyperparameter optimization.

## Package Information

- **Package Name**: stheno
- **Language**: Python
- **Installation**: `pip install stheno`

## Core Imports

```python
import stheno
```

For specific backend support:

```python
# Backend-specific imports (choose one)
from stheno.autograd import GP, EQ    # AutoGrad backend
from stheno.tensorflow import GP, EQ  # TensorFlow backend
from stheno.torch import GP, EQ       # PyTorch backend
from stheno.jax import GP, EQ         # JAX backend
```

Each backend provides the same API but uses a different computational framework for automatic differentiation and, where the framework supports it, GPU acceleration.

Common objects and functions:

```python
from stheno import GP, Measure, FDD, Normal, PseudoObs, Obs
from stheno import EQ, Matern52, Linear  # Kernels from mlkernels
```

## Basic Usage

```python
import stheno
import numpy as np

# Create a Gaussian process with an exponentiated quadratic (EQ) kernel
gp = stheno.GP(stheno.EQ())

# Generate noisy sample data
x = np.linspace(0, 2, 10)
noise_std = 0.1
y = np.sin(x) + noise_std * np.random.randn(len(x))

# Create the finite-dimensional distribution at the observation points,
# with a noise variance that matches the simulated observation noise
fdd = gp(x, noise_std**2)

# Condition on the observations to get the posterior (using the | operator)
posterior = gp | (fdd, y)

# Make predictions at new points
x_new = np.linspace(0, 2, 100)
pred_fdd = posterior(x_new)

# Get the marginal mean and credible bounds
mean, lower, upper = pred_fdd.marginal_credible_bounds()

# Draw five samples from the posterior
samples = pred_fdd.sample(5)
```

## Architecture

Stheno's architecture is built around several key concepts:

- **Gaussian Process (GP)**: The core probabilistic model, representing a distribution over functions
- **Measure**: Container that manages collections of GPs and their relationships
- **Finite-Dimensional Distribution (FDD)**: Represents a GP evaluated at specific input points
- **Observations**: Structured containers for conditioning data, including sparse approximations
- **Kernels and Means**: Building blocks from the mlkernels library for defining GP priors
- **Backend Support**: Multiple computational backends through the LAB abstraction layer

This design separates model construction (GPs and measures) from evaluation (FDDs) and inference (observations), so the same model code can run on any of the supported ML frameworks.

## Capabilities

### Core Gaussian Process Operations

Fundamental GP construction, evaluation, conditioning, and posterior inference. This includes creating GPs with custom kernels and means, evaluating them at input points to create finite-dimensional distributions, and conditioning on observations.

```python { .api }
class GP:
    def __init__(self, mean=None, kernel=None, *, measure=None, name=None): ...
    def __call__(self, x, noise=None) -> FDD: ...
    def condition(self, *args): ...
    def __or__(self, *args): ...  # Shorthand for condition
```

```python { .api }
class FDD(Normal):
    def __init__(self, p, x, noise=None): ...
    p: GP  # Process of FDD
    x: Any  # Inputs
    noise: Optional[Any]  # Additive noise
```

[Core GP Operations](./core-gp.md)

### GP Arithmetic and Transformations

Mathematical operations for combining and transforming Gaussian processes, including addition, multiplication, differentiation, input transformations, and dimension selection.

```python { .api }
def cross(*ps) -> GP: ...  # Cartesian product of processes

class GP:
    def __add__(self, other): ...  # Addition
    def __mul__(self, other): ...  # Multiplication
    def shift(self, shift): ...  # Input shifting
    def stretch(self, stretch): ...  # Input stretching
    def transform(self, f): ...  # Input transformation
    def select(self, *dims): ...  # Dimension selection
    def diff(self, dim=0): ...  # Differentiation
```

[GP Arithmetic and Transformations](./gp-operations.md)

### Measure and Model Management

Advanced model organization using measures to manage collections of GPs, naming, cross-referencing, and maintaining relationships between processes.

```python { .api }
class Measure:
    def __init__(self): ...
    def add_independent_gp(self, p, mean, kernel): ...
    def name(self, p, name): ...
    def __call__(self, p): ...
    def condition(self, obs): ...
    def sample(self, *args): ...
    def logpdf(self, *args): ...
```

[Measure and Model Management](./measure.md)

### Observations and Conditioning

Structured observation handling, including standard observations and sparse approximations (VFE, FITC, DTC) for scalable GP inference with inducing points.

```python { .api }
class Observations:
    def __init__(self, fdd, y): ...
    def __init__(self, *pairs): ...

class PseudoObservations:
    def __init__(self, u, fdd, y): ...
    def elbo(self, measure): ...
    method: str  # "vfe", "fitc", or "dtc"

class PseudoObservationsFITC(PseudoObservations): ...
class PseudoObservationsDTC(PseudoObservations): ...
```

[Observations and Conditioning](./observations.md)

### Multi-Output Gaussian Processes

Support for multi-output GPs using specialized kernel and mean functions that handle vector-valued processes and cross-covariances between outputs.

```python { .api }
class MultiOutputKernel:
    def __init__(self, measure, *ps): ...
    measure: Measure
    ps: Tuple[GP, ...]

class MultiOutputMean:
    def __init__(self, measure, *ps): ...
    def __call__(self, x): ...
```

[Multi-Output GPs](./multi-output.md)

### Random Variables and Distributions

Probabilistic computation with Normal distributions, including marginal calculations, sampling, entropy, KL divergence, and Wasserstein distances.

```python { .api }
class Normal(RandomVector):
    def __init__(self, mean, var): ...
    def __init__(self, var): ...
    mean: Any  # Mean vector
    var: Any  # Covariance matrix
    def marginals(self): ...
    def logpdf(self, x): ...
    def sample(self, num=1, noise=None): ...
    def entropy(self): ...
    def kl(self, other): ...
    def w2(self, other): ...
```

[Random Variables and Distributions](./random.md)

### Lazy Computation

Efficient computation with lazy evaluation for vectors and matrices that build values on demand using custom rules and caching.

```python { .api }
class LazyVector:
    def __init__(self): ...
    def add_rule(self, indices, builder): ...

class LazyMatrix:
    def __init__(self): ...
    def add_rule(self, indices, builder): ...
    def add_left_rule(self, i_left, indices, builder): ...
    def add_right_rule(self, i_right, indices, builder): ...
```

[Lazy Computation](./lazy.md)
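
To make the idea of rule-based lazy evaluation concrete, here is a self-contained toy in plain Python. It does not use Stheno's `LazyVector`; the class `RuleBasedLazyVector` and its builder signature are hypothetical and only mirror the general pattern of building entries on first access and caching them.

```python
class RuleBasedLazyVector:
    """Toy lazy vector: entries are built on first access and then cached."""

    def __init__(self):
        self._rules = []  # list of (set of indices, builder) pairs
        self._cache = {}  # index -> built value

    def add_rule(self, indices, builder):
        # `builder(i)` is called the first time entry `i` is requested.
        self._rules.append((frozenset(indices), builder))

    def __getitem__(self, i):
        if i not in self._cache:
            for indices, builder in self._rules:
                if i in indices:
                    self._cache[i] = builder(i)
                    break
            else:
                raise KeyError(f"no rule can build entry {i!r}")
        return self._cache[i]


calls = []


def expensive_square(i):
    calls.append(i)  # record how often the builder actually runs
    return i * i


v = RuleBasedLazyVector()
v.add_rule({0, 1, 2}, expensive_square)
first = v[2]   # builds and caches the entry
second = v[2]  # served from the cache, the builder is not called again
```

The payoff of this pattern is that entries which are never requested are never computed.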

## Types

```python { .api }
# Core types from stheno
class BreakingChangeWarning(UserWarning): ...

# Random object hierarchy
class Random: ...
class RandomProcess(Random): ...
class RandomVector(Random): ...

# Re-exported from dependencies
B = lab  # Backend abstraction
matrix = matrix  # Structured matrices
```

## External Dependencies

Stheno re-exports functionality from several key libraries:

- **mlkernels**: All kernel and mean function classes (EQ, RQ, Matern52, Linear, ZeroMean, OneMean, etc.)
- **lab**: Backend abstraction available as `B` for array operations and linear algebra
- **matrix**: Structured matrix operations for efficient computation