The Python ensemble sampling toolkit for affine-invariant MCMC
npx @tessl/cli install tessl/pypi-emcee@3.1.00
# emcee

emcee is a stable, well-tested Python implementation of the affine-invariant ensemble sampler for Markov chain Monte Carlo (MCMC) proposed by Goodman & Weare (2010). It provides a robust framework for Bayesian parameter estimation and model comparison through efficient ensemble sampling methods, designed for scientific computing and statistical analysis.

## Package Information

- **Package Name**: emcee
- **Language**: Python
- **Installation**: `pip install emcee`

## Core Imports

```python
import emcee
```

For specific components:

```python
from emcee import EnsembleSampler, State
from emcee import moves, backends, autocorr
```

## Basic Usage

```python
import emcee
import numpy as np

# Define log probability function
def log_prob(theta):
    # Example: 2D Gaussian
    return -0.5 * np.sum(theta**2)

# Set up sampler
nwalkers = 32
ndim = 2
sampler = emcee.EnsembleSampler(nwalkers, ndim, log_prob)

# Initialize walker positions
initial_state = np.random.randn(nwalkers, ndim)

# Run MCMC
sampler.run_mcmc(initial_state, nsteps=1000)

# Get results
chain = sampler.get_chain()
log_prob_samples = sampler.get_log_prob()
```

## Architecture

emcee's architecture centers on ensemble-based MCMC sampling:

- **EnsembleSampler**: The main sampling engine that coordinates an ensemble of walkers
- **State**: Encapsulates the current state of all walkers (positions, log probabilities, blobs)
- **Moves**: Proposal algorithms that generate new walker positions (stretch, walk, differential evolution, etc.)
- **Backends**: Storage systems for persisting chains (in-memory, HDF5)
- **Autocorr**: Analysis tools for assessing chain convergence and autocorrelation

The ensemble approach enables efficient sampling by letting each walker propose moves based on the positions of the other walkers, making it particularly effective for complex, multimodal distributions.

## Capabilities

### Ensemble Sampling

Core ensemble MCMC sampling functionality with the EnsembleSampler class, supporting various initialization methods, sampling control, and result retrieval.

```python { .api }
class EnsembleSampler:
    def __init__(self, nwalkers: int, ndim: int, log_prob_fn: callable,
                 pool=None, moves=None, args=None, kwargs=None,
                 backend=None, vectorize: bool = False, blobs_dtype=None,
                 parameter_names=None): ...

    def run_mcmc(self, initial_state, nsteps: int, **kwargs): ...
    def sample(self, initial_state, iterations: int = 1, **kwargs): ...
    def get_chain(self, **kwargs): ...
    def get_log_prob(self, **kwargs): ...
    def get_autocorr_time(self, **kwargs): ...
```

[Ensemble Sampling](./ensemble-sampling.md)

### Proposal Moves

Comprehensive collection of proposal move algorithms for generating new walker positions, including stretch moves, differential evolution, kernel density estimation, and Metropolis-Hastings variants.

```python { .api }
class StretchMove:
    def __init__(self, a: float = 2.0): ...

class DEMove:
    def __init__(self, sigma: float = 1e-5, gamma0: float = None): ...

class KDEMove:
    def __init__(self, bw_method=None): ...
```

[Proposal Moves](./moves.md)

### Storage Backends

Flexible storage systems for persisting MCMC chains, supporting both in-memory and file-based backends with features like compression, chunking, and resumable sampling.

```python { .api }
class Backend:
    def __init__(self, dtype=None): ...
    def get_chain(self, **kwargs): ...
    def get_log_prob(self, **kwargs): ...

class HDFBackend(Backend):
    def __init__(self, filename: str, name: str = "mcmc", read_only: bool = False): ...
```

[Storage Backends](./backends.md)

### Convergence Analysis

Statistical tools for assessing MCMC chain convergence through autocorrelation analysis, including integrated autocorrelation time estimation and diagnostic functions.

```python { .api }
def integrated_time(x, c: int = 5, tol: int = 50, quiet: bool = False,
                    has_walkers: bool = True): ...

def function_1d(x): ...

class AutocorrError(Exception): ...
```

[Convergence Analysis](./autocorr.md)

### State Management

Walker state representation and manipulation, providing a unified interface for handling walker positions, log probabilities, metadata blobs, and random number generator states.

```python { .api }
class State:
    def __init__(self, coords, log_prob=None, blobs=None, random_state=None,
                 copy: bool = False): ...

    coords: np.ndarray
    log_prob: np.ndarray
    blobs: any
    random_state: any
```

[State Management](./state.md)

## Types

```python { .api }
# Core state representation
class State:
    coords: np.ndarray       # Walker positions [nwalkers, ndim]
    log_prob: np.ndarray     # Log probabilities [nwalkers]
    blobs: any               # Metadata blobs
    random_state: any        # Random number generator state

# Exception raised by autocorrelation analysis
class AutocorrError(Exception):
    pass

# Model representation (internal)
from collections import namedtuple
Model = namedtuple("Model", ["log_prob_fn", "compute_log_prob_fn", "map_fn", "random"])
```