# Ensemble Sampling

The core functionality of emcee is provided by the EnsembleSampler class, which implements the affine-invariant ensemble sampler for MCMC. Rather than evolving a single chain, this sampler coordinates an ensemble of walkers that explore the parameter space collectively; because its performance is invariant under affine transformations, it handles badly scaled and strongly correlated distributions without hand-tuning a proposal scale.

## Capabilities

### EnsembleSampler Class

The main sampling engine that manages an ensemble of walkers and orchestrates the MCMC sampling process.

```python { .api }
class EnsembleSampler:
    def __init__(self, nwalkers: int, ndim: int, log_prob_fn: callable,
                 pool=None, moves=None, args=None, kwargs=None,
                 backend=None, vectorize: bool = False, blobs_dtype=None,
                 parameter_names=None):
        """
        Initialize the ensemble sampler.

        Args:
            nwalkers: Number of walkers in the ensemble (must be >= 2*ndim)
            ndim: Number of dimensions in parameter space
            log_prob_fn: Function that returns log probability for given parameters
            pool: Parallel processing pool (multiprocessing, MPI, etc.)
            moves: Single move, list of moves, or weighted list of moves
            args: Extra positional arguments for log_prob_fn
            kwargs: Extra keyword arguments for log_prob_fn
            backend: Storage backend (Backend, HDFBackend, etc.)
            vectorize: If True, log_prob_fn accepts list of positions
            blobs_dtype: Data type for blob storage
            parameter_names: Names for parameters (enables dict parameter passing)
        """
```
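
As a sketch of how these options fit together, the following constructs a sampler that passes an extra argument to the log-probability function and uses a weighted mixture of moves; the DEMove/DESnookerMove combination and weights follow the pattern shown in the emcee documentation:

```python
import emcee
import numpy as np

def log_prob(theta, scale):
    # Toy log-density; `scale` arrives via the sampler's `args`
    return -0.5 * np.sum((theta / scale) ** 2)

nwalkers, ndim = 32, 3
sampler = emcee.EnsembleSampler(
    nwalkers,
    ndim,
    log_prob,
    args=(2.0,),  # called as log_prob(theta, 2.0)
    moves=[
        (emcee.moves.DEMove(), 0.8),         # differential-evolution move, 80% of proposals
        (emcee.moves.DESnookerMove(), 0.2),  # snooker variant, 20% of proposals
    ],
)
```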

### Running MCMC

Methods for executing MCMC sampling with various control options.

```python { .api }
def run_mcmc(self, initial_state, nsteps: int, **kwargs):
    """
    Run MCMC for a fixed number of steps.

    Args:
        initial_state: Starting positions (State object or array [nwalkers, ndim])
        nsteps: Number of MCMC steps to run
        **kwargs: Additional sampling options (tune, skip_initial_state_check, etc.)

    Returns:
        State: Final state of the ensemble
    """

def sample(self, initial_state, iterations: int = 1, tune: bool = False,
           skip_initial_state_check: bool = False, thin_by: int = 1,
           thin=None, store: bool = True, progress: bool = False):
    """
    Generator function for step-by-step MCMC sampling.

    Args:
        initial_state: Starting positions
        iterations: Number of iterations to yield
        tune: Whether to tune move parameters during sampling
        skip_initial_state_check: Skip walker independence check
        thin_by: Only store every thin_by steps
        thin: Deprecated, use thin_by instead
        store: Whether to store samples in backend
        progress: Show progress bar

    Yields:
        State: Current state after each iteration
    """
```
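
Because sample is a generator, it can be combined with get_autocorr_time to stop a run once the chain looks converged. A minimal sketch, following the convergence-monitoring recipe from the emcee tutorials:

```python
import emcee
import numpy as np

def log_prob(theta):
    return -0.5 * np.sum(theta ** 2)

nwalkers, ndim = 32, 2
sampler = emcee.EnsembleSampler(nwalkers, ndim, log_prob)
pos = np.random.randn(nwalkers, ndim)

# Stop once the chain is ~50x the autocorrelation time and the
# estimate itself has changed by less than 1% since the last check
old_tau = np.inf
for state in sampler.sample(pos, iterations=10000):
    if sampler.iteration % 500:
        continue  # only check every 500 steps
    tau = sampler.get_autocorr_time(tol=0)  # tol=0: always return an estimate
    converged = np.all(tau * 50 < sampler.iteration)
    converged &= np.all(np.abs(old_tau - tau) / tau < 0.01)
    if converged:
        break
    old_tau = tau
```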

### Result Retrieval

Methods for accessing sampling results and diagnostic information.

```python { .api }
def get_chain(self, flat: bool = False, thin: int = 1, discard: int = 0):
    """
    Get the stored chain of MCMC samples.

    Args:
        flat: Flatten chain across ensemble dimension
        thin: Take every thin-th step
        discard: Discard the first discard steps as burn-in

    Returns:
        ndarray: Chain with shape [steps, nwalkers, ndim], or [steps*nwalkers, ndim] if flat
    """

def get_log_prob(self, flat: bool = False, thin: int = 1, discard: int = 0):
    """
    Get log probability values for each sample.

    Returns:
        ndarray: Log probabilities with shape [steps, nwalkers], or [steps*nwalkers] if flat
    """

def get_blobs(self, flat: bool = False, thin: int = 1, discard: int = 0):
    """
    Get blob data for each sample.

    Returns:
        ndarray or None: Blob data if available
    """

def get_autocorr_time(self, tol: int = 50, c: int = 5, quiet: bool = False):
    """
    Compute integrated autocorrelation time for each parameter.

    Args:
        tol: Minimum number of autocorrelation times the chain must span for a trustworthy estimate
        c: Window factor for the autocorrelation analysis
        quiet: Warn instead of raising when the chain is too short

    Returns:
        ndarray: Autocorrelation times for each parameter
    """

def get_last_sample(self):
    """
    Get the last sample from the chain.

    Returns:
        State: Last sampled state
    """
```
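
Assuming a sampler that has already been run (as in the usage examples below), a common pattern is to derive burn-in and thinning lengths from the autocorrelation time itself; the factors of 2 and 0.5 here are the rules of thumb used in the emcee tutorials, not hard requirements:

```python
import numpy as np

tau = sampler.get_autocorr_time()      # one estimate per parameter
burnin = int(2 * np.max(tau))          # discard ~2 autocorrelation times
thin = max(1, int(0.5 * np.min(tau)))  # keep roughly independent samples

samples = sampler.get_chain(discard=burnin, thin=thin, flat=True)
log_probs = sampler.get_log_prob(discard=burnin, thin=thin, flat=True)
print(samples.shape, log_probs.shape)  # (n, ndim) and (n,)
```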

### Sampler Properties

Properties for accessing sampler state and results.

```python { .api }
@property
def chain(self):
    """Get the full chain as ndarray [nwalkers, steps, ndim] (deprecated, use get_chain())"""

@property
def lnprobability(self):
    """Get log probabilities as ndarray [nwalkers, steps] (deprecated, use get_log_prob())"""

@property
def acceptance_fraction(self):
    """Get acceptance fraction for each walker, shape [nwalkers]"""

@property
def acor(self):
    """Get autocorrelation time (deprecated, use get_autocorr_time())"""

@property
def flatchain(self):
    """Get flattened chain [steps*nwalkers, ndim]"""

@property
def flatlnprobability(self):
    """Get flattened log probabilities [steps*nwalkers]"""

@property
def backend(self):
    """Get the backend storage object"""
```
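
A quick diagnostic built on these properties, again assuming a sampler that has been run: as a rule of thumb the mean acceptance fraction should land somewhere around 0.2 to 0.5, and values far outside that range usually point to a badly scaled problem or poor initialization.

```python
import numpy as np

af = sampler.acceptance_fraction  # shape (nwalkers,)
print(f"acceptance fraction: mean={np.mean(af):.3f}, "
      f"min={af.min():.3f}, max={af.max():.3f}")
```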

### Sampler Control

Methods for controlling and resetting the sampler state.

```python { .api }
def reset(self):
    """
    Reset the sampler to its initial state.
    Clears all stored samples and resets iteration counter.
    """
```
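
reset clears the stored samples but keeps the sampler's configuration, so the same object can be reused immediately. A small sketch, assuming the sampler and pos from the usage examples below:

```python
state = sampler.run_mcmc(pos, 200)  # throwaway burn-in run
sampler.reset()                     # drop stored samples, keep settings
sampler.run_mcmc(state, 1000)       # production run from the burn-in endpoint
print(sampler.get_chain().shape)    # (1000, nwalkers, ndim): burn-in is gone
```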

### Walker Independence Validation

Function for checking whether initial walker positions are sufficiently independent.

```python { .api }
def walkers_independent(coords):
    """
    Check if walker positions are linearly independent.

    Args:
        coords: Walker positions [nwalkers, ndim]

    Returns:
        bool: True if walkers are sufficiently independent
    """
```
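
A minimal sketch of the check in action; note the import path is an assumption here, since walkers_independent lives in emcee's ensemble module rather than the top-level namespace:

```python
import numpy as np
from emcee.ensemble import walkers_independent  # assumed import path

nwalkers, ndim = 32, 2

# Degenerate initialization: every walker at the same point
bad = np.zeros((nwalkers, ndim))
print(walkers_independent(bad))   # False

# Typical initialization: a small Gaussian ball around a guess
good = 1e-4 * np.random.randn(nwalkers, ndim)
print(walkers_independent(good))  # True
```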

## Usage Examples

### Basic Sampling

```python
import emcee
import numpy as np

def log_prob(theta):
    # 2D Gaussian example
    return -0.5 * np.sum(theta**2)

# Set up ensemble
nwalkers, ndim = 32, 2
sampler = emcee.EnsembleSampler(nwalkers, ndim, log_prob)

# Initialize walkers
pos = np.random.randn(nwalkers, ndim)

# Run sampling
sampler.run_mcmc(pos, nsteps=1000)

# Access results
chain = sampler.get_chain(discard=100, flat=True)
log_prob_vals = sampler.get_log_prob(discard=100, flat=True)
```

### Progressive Sampling with Tuning

```python
# Initial burn-in with tuning
state = sampler.run_mcmc(pos, 500, tune=True)

# Drop the burn-in samples so they don't contaminate the results
sampler.reset()

# Production sampling, continuing from the burn-in endpoint
final_state = sampler.run_mcmc(state, 1000, tune=False)

# Check autocorrelation
tau = sampler.get_autocorr_time()
print(f"Autocorrelation time: {tau}")
```

### Using Generator Interface

```python
# Step-by-step sampling
pos = np.random.randn(nwalkers, ndim)

for i, state in enumerate(sampler.sample(pos, iterations=1000)):
    if i % 100 == 0:
        print(f"Step {i}, acceptance fraction: {np.mean(sampler.acceptance_fraction)}")
```

### Parallel Sampling

```python
from multiprocessing import Pool

import emcee
import numpy as np

# Must be defined at module level so worker processes can pickle it
def log_prob(theta):
    return -0.5 * np.sum(theta**2)

nwalkers, ndim = 32, 2
pos = np.random.randn(nwalkers, ndim)

# The __main__ guard is required on platforms that spawn workers
if __name__ == "__main__":
    with Pool() as pool:
        sampler = emcee.EnsembleSampler(nwalkers, ndim, log_prob, pool=pool)
        sampler.run_mcmc(pos, 1000)
```