# Model Interface

The CmdStanModel class compiles Stan programs and provides a method for each inference algorithm. It is the primary interface for working with Stan models in CmdStanPy.

## Capabilities

### CmdStanModel Class

Encapsulates a Stan model, handles its compilation, and provides the inference methods.

```python { .api }
class CmdStanModel:
    def __init__(
        self,
        model_name=None,
        stan_file=None,
        exe_file=None,
        force_compile=False,
        stanc_options=None,
        cpp_options=None,
        user_header=None,
        compile=None
    ):
        """
        Create a CmdStanModel instance.

        Parameters:
        - model_name (str, optional): Model name for output files
        - stan_file (str or PathLike, optional): Path to Stan source file
        - exe_file (str or PathLike, optional): Path to compiled executable
        - force_compile (bool): Force recompilation of model
        - stanc_options (dict, optional): Options for stanc compiler
        - cpp_options (dict, optional): Options for C++ compiler
        - user_header (str or PathLike, optional): Path to user header file
        - compile (bool, optional): Whether to compile on instantiation
        """
```

**Basic Usage:**

```python
from cmdstanpy import CmdStanModel

# Create model from Stan file
model = CmdStanModel(stan_file="my_model.stan")

# Create model with custom options
model = CmdStanModel(
    stan_file="my_model.stan",
    stanc_options={"O1": True},
    cpp_options={"STAN_THREADS": True}
)
```

### MCMC Sampling

Run Hamiltonian Monte Carlo sampling using the No-U-Turn Sampler (NUTS).

```python { .api }
def sample(
    self,
    data=None,
    chains=4,
    parallel_chains=None,
    threads_per_chain=None,
    seed=None,
    chain_ids=None,
    inits=None,
    iter_warmup=1000,
    iter_sampling=1000,
    save_warmup=False,
    thin=1,
    max_treedepth=10,
    metric=None,
    step_size=None,
    adapt_engaged=True,
    adapt_delta=0.8,
    adapt_init_phase=15,
    adapt_metric_window=25,
    adapt_step_size=50,
    fixed_param=False,
    output_dir=None,
    sig_figs=None,
    validate_csv=True,
    show_console=False,
    refresh=None,
    time_fmt=None,
    timeout=None,
    force_one_process_per_chain=None
):
    """
    Run MCMC sampling.

    Parameters:
    - data (dict, str, or PathLike, optional): Model data
    - chains (int): Number of chains to run
    - parallel_chains (int, optional): Number of chains to run in parallel
    - threads_per_chain (int, optional): Threads per chain (requires STAN_THREADS)
    - seed (int, optional): Random seed
    - chain_ids (list of int, optional): Chain identifiers
    - inits (dict, list, str, optional): Initial parameter values
    - iter_warmup (int): Number of warmup iterations per chain
    - iter_sampling (int): Number of sampling iterations per chain
    - save_warmup (bool): Save warmup draws in output
    - thin (int): Period between saved samples
    - max_treedepth (int): Maximum tree depth for NUTS
    - metric (str or array, optional): Mass matrix ("diag_e", "dense_e", or array)
    - step_size (float or array, optional): Initial step size
    - adapt_engaged (bool): Enable adaptation
    - adapt_delta (float): Target acceptance probability
    - adapt_init_phase (int): Initial adaptation phase iterations
    - adapt_metric_window (int): Metric adaptation window
    - adapt_step_size (int): Step size adaptation iterations
    - fixed_param (bool): Run with fixed parameters (no sampling)
    - output_dir (str or PathLike, optional): Directory for output files
    - sig_figs (int, optional): Significant figures in output (1-18)
    - validate_csv (bool): Validate CSV output format
    - show_console (bool): Display console output
    - refresh (int, optional): Progress update frequency
    - time_fmt (str, optional): Timestamp format for output files
    - timeout (float, optional): Timeout in seconds
    - force_one_process_per_chain (bool, optional): Force single process per chain

    Returns:
    CmdStanMCMC: MCMC results container
    """
```

**Usage Example:**

```python
# Basic sampling
fit = model.sample(data={"N": 100, "y": [1, 2, 3, ...]})

# Advanced sampling configuration
fit = model.sample(
    data=data,
    chains=4,
    parallel_chains=4,
    iter_warmup=2000,
    iter_sampling=2000,
    adapt_delta=0.95,
    max_treedepth=12,
    seed=12345
)
```
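
The data argument also accepts a path to a JSON file, which is convenient when the same data set feeds several runs. The sketch below shows one possible way to set that up. It assumes the data holds only plain Python numbers and lists; the helper names write_data_json and sample_from_file and the file name data.json are illustrative and not part of CmdStanPy. For NumPy arrays, CmdStanPy's own write_stan_json function is likely the safer choice.

```python
import json
import tempfile
from pathlib import Path


def write_data_json(data: dict, directory: str) -> Path:
    """Write a plain-Python data dict to data.json and return its path."""
    path = Path(directory) / "data.json"
    with open(path, "w") as f:
        json.dump(data, f)
    return path


def sample_from_file(stan_file: str, data_path: Path):
    """Compile the model and sample from a JSON data file (needs CmdStan)."""
    # Imported lazily so the helpers above work without CmdStan installed.
    from cmdstanpy import CmdStanModel

    model = CmdStanModel(stan_file=stan_file)
    return model.sample(data=str(data_path), chains=4, seed=12345)


# Hypothetical data set, written once and reusable across runs.
tmp_dir = tempfile.mkdtemp()
data_path = write_data_json({"N": 3, "y": [0.5, 1.2, -0.3]}, tmp_dir)
```

With CmdStan installed, sample_from_file("my_model.stan", data_path) would then run four chains against the saved file.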

### Optimization

Run optimization algorithms to find maximum likelihood or maximum a posteriori estimates.

```python { .api }
def optimize(
    self,
    data=None,
    seed=None,
    inits=None,
    algorithm=None,
    iter=2000,
    jacobian=False,
    output_dir=None,
    sig_figs=None,
    show_console=False,
    refresh=None,
    time_fmt=None,
    timeout=None
):
    """
    Run optimization for MLE/MAP estimation.

    Parameters:
    - data (dict, str, or PathLike, optional): Model data
    - seed (int, optional): Random seed
    - inits (dict, str, optional): Initial parameter values
    - algorithm (str, optional): Optimization algorithm ("lbfgs", "bfgs", "newton")
    - iter (int): Maximum number of iterations
    - jacobian (bool): Apply the Jacobian adjustment for constrained parameters; False yields the MLE, True yields the MAP estimate
    - output_dir (str or PathLike, optional): Directory for output files
    - sig_figs (int, optional): Significant figures in output
    - show_console (bool): Display console output
    - refresh (int, optional): Progress update frequency
    - time_fmt (str, optional): Timestamp format for output files
    - timeout (float, optional): Timeout in seconds

    Returns:
    CmdStanMLE: Optimization results container
    """
```

**Usage Example:**

```python
# Basic optimization
mle = model.optimize(data=data)

# L-BFGS with custom settings
mle = model.optimize(
    data=data,
    algorithm="lbfgs",
    iter=5000,
    jacobian=True,
    seed=54321
)
```

### Variational Inference

Run Automatic Differentiation Variational Inference (ADVI) for approximate posterior inference.

```python { .api }
def variational(
    self,
    data=None,
    seed=None,
    inits=None,
    algorithm=None,
    iter=10000,
    grad_samples=1,
    elbo_samples=100,
    eta=1.0,
    adapt_engaged=True,
    adapt_iter=50,
    tol_rel_obj=0.01,
    eval_elbo=100,
    draws=1000,
    output_dir=None,
    sig_figs=None,
    show_console=False,
    refresh=None,
    time_fmt=None,
    timeout=None,
    output_samples=None
):
    """
    Run variational inference.

    Parameters:
    - data (dict, str, or PathLike, optional): Model data
    - seed (int, optional): Random seed
    - inits (dict, str, optional): Initial parameter values
    - algorithm (str, optional): VI algorithm ("meanfield", "fullrank")
    - iter (int): Maximum iterations for VI algorithm
    - grad_samples (int): Samples per gradient evaluation
    - elbo_samples (int): Samples for ELBO evaluation
    - eta (float): Learning rate scaling parameter
    - adapt_engaged (bool): Enable learning rate adaptation
    - adapt_iter (int): Adaptation iterations
    - tol_rel_obj (float): Relative objective tolerance for convergence
    - eval_elbo (int): ELBO evaluation frequency
    - draws (int): Number of posterior draws to generate
    - output_dir (str or PathLike, optional): Directory for output files
    - sig_figs (int, optional): Significant figures in output
    - show_console (bool): Display console output
    - refresh (int, optional): Progress update frequency
    - time_fmt (str, optional): Timestamp format for output files
    - timeout (float, optional): Timeout in seconds
    - output_samples (int, optional): Deprecated parameter, use draws instead

    Returns:
    CmdStanVB: Variational inference results container
    """
```
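
As a rough illustration of the call above, the following sketch wraps variational() in a small helper that checks the algorithm name before CmdStan is launched. The helper name run_advi is hypothetical, and model is assumed to be an already compiled CmdStanModel; only arguments listed in the signature above are passed.

```python
def run_advi(model, data, algorithm="meanfield", draws=1000, seed=None):
    """Fit an ADVI approximation using a CmdStanModel-like object.

    Only the variational() method of `model` is used.
    """
    if algorithm not in ("meanfield", "fullrank"):
        raise ValueError(f"unknown ADVI algorithm: {algorithm!r}")
    return model.variational(
        data=data,
        algorithm=algorithm,  # "meanfield": diagonal covariance; "fullrank": dense covariance
        draws=draws,          # number of draws taken from the fitted approximation
        seed=seed,
    )
```

A call such as run_advi(model, data, algorithm="fullrank", seed=1) would then return a CmdStanVB object.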

### Pathfinder

Run the Pathfinder algorithm for fast approximate inference.

```python { .api }
def pathfinder(
    self,
    data=None,
    seed=None,
    inits=None,
    num_paths=4,
    draws=1000,
    psis_resample=True,
    calculate_lp=True,
    max_lbfgs_iters=1000,
    num_draws=None,
    save_single_paths=False,
    output_dir=None,
    sig_figs=None,
    show_console=False,
    refresh=None,
    time_fmt=None,
    timeout=None,
    init_alpha=None,
    num_threads=None
):
    """
    Run Pathfinder algorithm for variational approximation.

    Parameters:
    - data (dict, str, or PathLike, optional): Model data
    - seed (int, optional): Random seed
    - inits (dict, str, optional): Initial parameter values
    - num_paths (int): Number of pathfinder paths to run
    - draws (int): Total number of draws to return
    - psis_resample (bool): Use Pareto smoothed importance sampling resampling
    - calculate_lp (bool): Calculate log probability for draws
    - max_lbfgs_iters (int): Maximum L-BFGS iterations per path
    - num_draws (int, optional): Deprecated, use draws instead
    - save_single_paths (bool): Save individual path outputs
    - output_dir (str or PathLike, optional): Directory for output files
    - sig_figs (int, optional): Significant figures in output
    - show_console (bool): Display console output
    - refresh (int, optional): Progress update frequency
    - time_fmt (str, optional): Timestamp format for output files
    - timeout (float, optional): Timeout in seconds
    - init_alpha (float, optional): Initial step size for pathfinder
    - num_threads (int, optional): Number of threads for pathfinder

    Returns:
    CmdStanPathfinder: Pathfinder results container
    """
```

### Laplace Approximation

Run Laplace approximation around a mode for approximate posterior inference.

```python { .api }
def laplace_sample(
    self,
    data=None,
    mode=None,
    draws=1000,
    jacobian=True,
    refresh=100,
    output_dir=None,
    sig_figs=None,
    show_console=False,
    time_fmt=None,
    timeout=None,
    opt_args=None
):
    """
    Run Laplace approximation sampling.

    Parameters:
    - data (dict, str, or PathLike, optional): Model data
    - mode (CmdStanMLE, optional): Mode for approximation center (if None, runs optimization first)
    - draws (int): Number of draws from approximation
    - jacobian (bool): Apply the Jacobian adjustment for constrained parameters; must match the setting used to compute mode, if supplied
    - refresh (int): Progress update frequency
    - output_dir (str or PathLike, optional): Directory for output files
    - sig_figs (int, optional): Significant figures in output
    - show_console (bool): Display console output
    - time_fmt (str, optional): Timestamp format for output files
    - timeout (float, optional): Timeout in seconds
    - opt_args (dict, optional): Additional optimization arguments if mode is None

    Returns:
    CmdStanLaplace: Laplace approximation results container
    """
```
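
The sketch below shows one way to chain optimize() and laplace_sample() explicitly rather than relying on the implicit optimization that runs when mode is None. The helper name laplace_draws is hypothetical and model is assumed to be a compiled CmdStanModel. The same jacobian value is passed to both calls because the mode and the approximation need to be computed on the same scale.

```python
def laplace_draws(model, data, draws=1000, jacobian=True, seed=None):
    """Find a mode, then draw from the Laplace approximation around it."""
    # Step 1: locate the mode with the same Jacobian setting the
    # approximation will use.
    mode = model.optimize(data=data, jacobian=jacobian, seed=seed)
    # Step 2: draw from the normal approximation centred at that mode.
    return model.laplace_sample(
        data=data,
        mode=mode,
        draws=draws,
        jacobian=jacobian,
    )
```

Keeping the two steps separate also makes it possible to inspect the CmdStanMLE result before any draws are generated.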

### Generated Quantities

Generate additional quantities of interest from existing fit results.

```python { .api }
def generate_quantities(
    self,
    data=None,
    previous_fit=None,
    seed=None,
    parallel_chains=None,
    output_dir=None,
    sig_figs=None,
    show_console=False,
    refresh=None,
    time_fmt=None,
    timeout=None,
    mcmc_sample=None
):
    """
    Generate quantities from existing fit.

    Parameters:
    - data (dict, str, or PathLike, optional): Model data (can be different from original fit)
    - previous_fit (CmdStanMCMC, CmdStanMLE, CmdStanVB, or CmdStanPathfinder): Existing fit for parameter values
    - seed (int, optional): Random seed
    - parallel_chains (int, optional): Number of parallel chains
    - output_dir (str or PathLike, optional): Directory for output files
    - sig_figs (int, optional): Significant figures in output
    - show_console (bool): Display console output
    - refresh (int, optional): Progress update frequency
    - time_fmt (str, optional): Timestamp format for output files
    - timeout (float, optional): Timeout in seconds
    - mcmc_sample (CmdStanMCMC or list of str, optional): Deprecated alias for previous_fit

    Returns:
    CmdStanGQ: Generated quantities results container
    """
```

### Log Probability Evaluation

Calculate log probability and gradients at specific parameter values.

```python { .api }
def log_prob(self, params, data=None, jacobian=True, sig_figs=None):
    """
    Calculate log probability and gradients.

    Parameters:
    - params (dict): Parameter values to evaluate
    - data (dict, str, or PathLike, optional): Model data
    - jacobian (bool): Apply the Jacobian adjustment for constrained parameters
    - sig_figs (int, optional): Significant figures in output

    Returns:
    pd.DataFrame: Log probability and gradients
    """
```
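
A minimal sketch of how log_prob() might be used to compare the log density at one point with and without the Jacobian adjustment. The helper name log_prob_both_scales and the dictionary keys it returns are illustrative only, and model is assumed to be a compiled CmdStanModel.

```python
def log_prob_both_scales(model, params, data=None):
    """Evaluate log_prob at `params` with and without the Jacobian term."""
    return {
        "with_jacobian": model.log_prob(params=params, data=data, jacobian=True),
        "without_jacobian": model.log_prob(params=params, data=data, jacobian=False),
    }
```

For a model with a single parameter named theta (a hypothetical name), the call could look like log_prob_both_scales(model, {"theta": 0.3}, data=data), with each entry holding the DataFrame returned by log_prob().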

## Model Information Methods

### Model Code Access

```python { .api }
def code(self):
    """
    Return Stan program as string.

    Returns:
    str or None: Stan program code if available
    """
```

### Executable Information

```python { .api }
def exe_info(self):
    """
    Get executable information by running with 'info' option.

    Returns:
    dict: Executable configuration information including compiler options and flags
    """
```

### Source Information

```python { .api }
def src_info(self):
    """
    Get model information by running stanc with '--info'.

    Returns:
    dict: Model structure and parameter information including inputs, parameters, and generated quantities
    """
```

### Model Properties

Access model metadata and compilation information.

```python { .api }
# Properties
model.name           # str: Model name
model.stan_file      # str or None: Path to Stan source file
model.exe_file       # str or None: Path to compiled executable
model.stanc_options  # dict: Stanc compiler options
model.cpp_options    # dict: C++ compiler options
model.user_header    # str or None: User header file path
```
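
The properties and information methods can be combined into a short summary of a model. The sketch below is one possible approach. The helper name describe_model is hypothetical, and it assumes the dictionary returned by src_info() stores parameter declarations under a "parameters" key; if that key is absent, the list is simply empty.

```python
def describe_model(model):
    """Collect basic facts about a CmdStanModel-like object in one dict."""
    info = model.src_info()
    return {
        "name": model.name,
        "stan_file": model.stan_file,
        "compiled": model.exe_file is not None,
        # Assumed layout: parameter names are the keys under "parameters".
        "parameters": sorted(info.get("parameters", {})),
    }
```

Printing describe_model(model) before a long run is a cheap way to confirm that the expected Stan file and executable are in use.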

## Advanced Usage Patterns

### Custom Initialization

```python
# Different initialization strategies
fit1 = model.sample(data=data, inits=0)  # All parameters start at zero on the unconstrained scale
fit2 = model.sample(data=data, inits=2)  # Random values drawn uniformly from (-2, 2) on the unconstrained scale
fit3 = model.sample(data=data, inits={"theta": 1.5, "sigma": 0.5})  # Custom values
fit4 = model.sample(data=data, inits="previous_fit.json")  # From file
```
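
Because inits also accepts a list, each chain can be given its own starting point. The sketch below builds such a list by jittering a base set of values. The helper name per_chain_inits and its jitter argument are hypothetical, and it assumes scalar parameters whose jittered values stay inside their support (for example, a scale parameter that remains positive).

```python
import random


def per_chain_inits(base, chains=4, jitter=0.1, seed=0):
    """Return one inits dict per chain, each a jittered copy of `base`.

    Assumes every value in `base` is a scalar and that base +/- jitter
    is still a valid value for that parameter.
    """
    rng = random.Random(seed)
    return [
        {name: value + rng.uniform(-jitter, jitter) for name, value in base.items()}
        for _ in range(chains)
    ]


inits = per_chain_inits({"theta": 1.5, "sigma": 0.5}, chains=4)
# fit = model.sample(data=data, chains=4, inits=inits)  # model and data as defined earlier
```

The length of the list should match the chains argument, since each entry is used for one chain.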

### Multi-Stage Inference

```python
# Start with optimization for good initial values
mle = model.optimize(data=data)

# Use the optimum as the starting point for MCMC.
# stan_variables() returns a dict keyed by variable name; optimized_params_dict
# is a property (not a method) with flattened names such as "theta[1]".
fit = model.sample(
    data=data,
    inits=mle.stan_variables()
)

# Generate additional quantities
gq = model.generate_quantities(
    data=extended_data,
    previous_fit=fit
)
```

### Model Comparison Workflow

```python
models = {
    "simple": CmdStanModel(stan_file="simple_model.stan"),
    "complex": CmdStanModel(stan_file="complex_model.stan")
}

results = {}
for name, model in models.items():
    # Pathfinder for fast approximate inference
    pf = model.pathfinder(data=data, num_paths=8)

    # Use Pathfinder results to initialize MCMC
    fit = model.sample(
        data=data,
        inits=pf.create_inits(chains=4),
        chains=4
    )

    results[name] = fit

# Compare models using LOO or other criteria
```