# PyMC3

*Probabilistic Programming in Python: Bayesian Modeling and Probabilistic Machine Learning with Theano*

A probabilistic programming library for Python that allows users to build Bayesian models with a simple Python API and fit them using Markov Chain Monte Carlo (MCMC) methods. PyMC3 provides a comprehensive suite of tools for Bayesian statistical modeling, including a large collection of probability distributions, advanced MCMC sampling algorithms, variational inference methods, and model checking utilities.

## Package Information

- **Package Name**: pymc3
- **Language**: Python
- **Installation**: `pip install pymc3`
- **Version**: 3.11.6

## Core Imports

```python
import pymc3 as pm
```

Common workflow imports:

```python
import pymc3 as pm
import numpy as np
import matplotlib.pyplot as plt
import arviz as az  # for plotting and diagnostics
```

## Basic Usage

```python
import pymc3 as pm
import numpy as np
import matplotlib.pyplot as plt
import arviz as az

# Generate synthetic data
np.random.seed(42)
n = 100
true_alpha = 1.0
true_beta = 2.5
true_sigma = 0.5
x = np.random.randn(n)
y = true_alpha + true_beta * x + np.random.normal(0, true_sigma, n)

# Build a Bayesian linear regression model
with pm.Model() as model:
    # Priors for the unknown model parameters
    alpha = pm.Normal('alpha', mu=0, sigma=10)
    beta = pm.Normal('beta', mu=0, sigma=10)
    sigma = pm.HalfNormal('sigma', sigma=1)

    # Expected value of the outcome
    mu = alpha + beta * x

    # Likelihood (sampling distribution) of the observations
    y_obs = pm.Normal('y_obs', mu=mu, sigma=sigma, observed=y)

    # Inference - draw posterior samples
    trace = pm.sample(1000, tune=1000, return_inferencedata=True)

# Analyze results
print(pm.summary(trace))

# Plot results
az.plot_trace(trace)
plt.show()
```

## Architecture

PyMC3 follows a hierarchical architecture that separates model specification from inference:

- **Model Container**: Central `Model` class manages all random variables, transformations, and the computational graph
- **Distributions**: Rich library of probability distributions with automatic broadcasting and shape inference
- **Variables**: Random variables (priors, observed data) and deterministic transformations
- **Inference Engines**: MCMC samplers (NUTS, Metropolis), variational inference, and optimization methods
- **Backends**: Trace storage and retrieval systems, with ArviZ integration for analysis and visualization

The library leverages Theano for automatic differentiation, enabling gradient-based inference methods and efficient computation on CPU/GPU.

## Capabilities

### Probability Distributions

Comprehensive library of 60+ probability distributions including continuous, discrete, multivariate, time series, and mixture distributions. All distributions support automatic broadcasting, shape inference, and parameter transformations.

```python { .api }
# Continuous distributions
class Normal: ...
class Beta: ...
class Gamma: ...
class StudentT: ...

# Discrete distributions
class Binomial: ...
class Poisson: ...
class Categorical: ...

# Multivariate distributions
class MvNormal: ...
class Dirichlet: ...
class LKJCorr: ...
```

[Distributions](./distributions.md)

### Model Building and Core Constructs

Essential components for building Bayesian models including the Model context manager, random variables, deterministic variables, and potential functions for custom likelihood terms.

```python { .api }
class Model:
    """Main model container class."""

def Deterministic(name, var, model=None): ...
def Potential(name, var, model=None): ...
def Data(name, value, **kwargs): ...

class ObservedRV: ...
class FreeRV: ...
```

[Model Building](./modeling.md)

### MCMC Sampling Methods

Advanced Markov Chain Monte Carlo algorithms including the No-U-Turn Sampler (NUTS), Hamiltonian Monte Carlo, and various Metropolis methods with automatic step method assignment and tuning.

```python { .api }
def sample(draws=1000, step=None, init='auto', chains=None, **kwargs):
    """Draw samples from the posterior distribution using MCMC."""

def sample_posterior_predictive(trace, samples=None, model=None, **kwargs):
    """Generate posterior predictive samples."""

class NUTS: ...
class HamiltonianMC: ...
class Metropolis: ...
```

[Sampling](./sampling.md)

### Variational Inference

Scalable approximate inference methods including Automatic Differentiation Variational Inference (ADVI), Stein Variational Gradient Descent, and normalizing flows with various approximation families.

```python { .api }
def fit(n=10000, method='advi', model=None, **kwargs):
    """Fit a model using variational inference."""

class ADVI: ...
class FullRankADVI: ...
class SVGD: ...
class MeanField: ...
class FullRank: ...
```

[Variational Inference](./variational.md)

### Gaussian Processes

Flexible Gaussian process implementation with various covariance functions, mean functions, and inference methods including marginal, latent, and sparse GP formulations.

```python { .api }
class Marginal: ...
class Latent: ...
class MarginalSparse: ...
class MarginalKron: ...

# Available in pm.gp.cov and pm.gp.mean modules
class ExpQuad: ...
class Matern52: ...
class Linear: ...
```

[Gaussian Processes](./gaussian-processes.md)

### Generalized Linear Models

High-level interface for generalized linear models with family-specific distributions and link functions, providing streamlined model specification for common regression tasks.

```python { .api }
class GLM:
    """Generalized Linear Model implementation."""

class LinearComponent:
    """Linear component for GLM construction."""

# Available in pm.glm.families module
class Normal: ...
class Binomial: ...
class Poisson: ...
```

[GLM](./glm.md)

### Statistics and Plotting (ArviZ Integration)

Comprehensive model diagnostics, convergence assessment, and visualization capabilities through tight integration with ArviZ, including posterior analysis, model comparison, and publication-ready plots.

```python { .api }
# Convergence diagnostics
def rhat(trace): ...
def ess(trace): ...
def mcse(trace): ...

# Model comparison
def compare(models, ic='waic'): ...
def waic(trace, model=None): ...
def loo(trace, model=None): ...

# Plotting functions
def plot_trace(trace): ...
def plot_posterior(trace): ...
def plot_forest(trace): ...
```

[Statistics and Plotting](./stats-plots.md)
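
The diagnostics operate on ArviZ `InferenceData` objects; for illustration, one can be built from synthetic draws without running a sampler (a sketch using ArviZ directly):

```python
import numpy as np
import arviz as az

# Build an InferenceData object from synthetic draws: 4 chains x 500 draws
rng = np.random.default_rng(0)
idata = az.from_dict(posterior={'theta': rng.normal(size=(4, 500))})

rhat_val = float(az.rhat(idata)['theta'])
ess_val = float(az.ess(idata)['theta'])
print(rhat_val, ess_val)  # rhat near 1.0 for independent, well-mixed draws
```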

### MCMC Step Methods

Comprehensive suite of MCMC step methods including Hamiltonian Monte Carlo, Metropolis variants, and specialized samplers with automatic step method selection and adaptive tuning.

```python { .api }
class NUTS: ...
class HamiltonianMC: ...
class Metropolis: ...
class Slice: ...
class EllipticalSlice: ...
class CompoundStep: ...
```

[Step Methods](./step-methods.md)

### Data Handling

Specialized data structures for observed data, minibatch processing, and generator-based data sources, enabling efficient memory usage and streaming data processing.

```python { .api }
class Data: ...
class Minibatch: ...
class GeneratorAdapter: ...

def align_minibatches(*minibatches): ...
def get_data(filename): ...
```

[Data Handling](./data-handling.md)

### Mathematical Functions

Comprehensive mathematical functions for tensor operations, link functions, and specialized computations with automatic differentiation support.

```python { .api }
def logit(p): ...
def invlogit(x): ...
def probit(p): ...
def invprobit(x): ...
def logsumexp(x): ...
def logaddexp(a, b): ...
def expand_packed_triangular(n, packed): ...
def kronecker(*Ks): ...
```

[Mathematical Functions](./math-functions.md)