# Simulation & Modeling

DendroPy simulates phylogenetic trees under birth-death and coalescent processes, and evolves character data along those trees under several evolutionary models.

## Capabilities

### Tree Simulation

Functions for simulating phylogenetic trees under different evolutionary processes.

```python { .api }
def birth_death_tree(birth_rate, death_rate, **kwargs):
    """
    Simulate tree under birth-death process.

    Parameters:
    - birth_rate: Per-lineage birth (speciation) rate
    - death_rate: Per-lineage death (extinction) rate
    - taxon_namespace: TaxonNamespace for tip taxa
    - num_extant_tips: Number of extant taxa to simulate
    - num_total_tips: Total number of taxa (including extinct)
    - max_time: Maximum time for simulation
    - gsa_ntax: Tip-count threshold for the General Sampling Approach (GSA)
    - is_retain_extinct_tips: Keep extinct lineages in tree
    - repeat_until_success: Retry if simulation goes extinct
    - rng: Random number generator

    Returns:
    Tree: Simulated birth-death tree
    """

def discrete_birth_death_tree(birth_rate, death_rate, **kwargs):
    """
    Simulate tree under discrete birth-death process.

    Parameters:
    - birth_rate: Birth rate per time step
    - death_rate: Death rate per time step
    - max_time: Maximum simulation time
    - num_time_steps: Number of discrete time steps
    - **kwargs: Additional birth-death parameters

    Returns:
    Tree: Simulated discrete birth-death tree
    """

def uniform_pure_birth_tree(taxon_namespace, birth_rate=1.0, **kwargs):
    """
    Simulate tree under pure birth (Yule) process.

    Parameters:
    - taxon_namespace: TaxonNamespace defining tip taxa
    - birth_rate: Speciation rate (default: 1.0)
    - is_assign_extant_taxa: Assign taxa to extant tips
    - rng: Random number generator

    Returns:
    Tree: Simulated pure birth tree
    """

def star_tree(taxon_namespace, **kwargs):
    """
    Create star tree (polytomy) with all taxa sister to each other.

    Parameters:
    - taxon_namespace: TaxonNamespace for tip taxa
    - edge_length: Length for all terminal edges

    Returns:
    Tree: Star-shaped tree
    """

def rand_trees(taxon_namespace, num_trees, **kwargs):
    """
    Generate collection of random trees.

    Parameters:
    - taxon_namespace: TaxonNamespace for trees
    - num_trees: Number of trees to generate
    - tree_factory: Function for generating individual trees
    - rng: Random number generator

    Returns:
    TreeList: Collection of random trees
    """
```
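To make the continuous-time birth-death process concrete, here is a minimal standard-library sketch. It is not DendroPy's implementation: `simulate_birth_death`, its `parent`/`extant` bookkeeping, and `max_attempts` are hypothetical names, and the run simply stops the first time the target tip count is reached instead of applying the General Sampling Approach. Retrying after extinction mirrors the idea behind `repeat_until_success`.

```python
import random


def simulate_birth_death(birth_rate, death_rate, num_extant_tips,
                         rng=None, max_attempts=1000):
    """Illustrative sketch only (hypothetical helper, not a DendroPy call).

    Returns (parent, extant, elapsed): parent maps each node id to its
    parent id (the root maps to None), extant lists the surviving tip ids,
    and elapsed is the simulated time.
    """
    rng = rng or random.Random()
    total_rate = birth_rate + death_rate
    for _attempt in range(max_attempts):
        parent = {0: None}
        extant = [0]
        next_id = 1
        elapsed = 0.0
        while 0 < len(extant) < num_extant_tips:
            # Waiting time to the next event over all extant lineages.
            elapsed += rng.expovariate(total_rate * len(extant))
            lineage = extant.pop(rng.randrange(len(extant)))
            if rng.random() < birth_rate / total_rate:
                # Speciation: the lineage is replaced by two daughters.
                for _ in range(2):
                    parent[next_id] = lineage
                    extant.append(next_id)
                    next_id += 1
            # Otherwise the event is an extinction and the lineage stays removed.
        if len(extant) == num_extant_tips:
            return parent, extant, elapsed
        # Every lineage died out: start a fresh attempt.
    raise RuntimeError("all attempts went extinct")


if __name__ == "__main__":
    parent, extant, elapsed = simulate_birth_death(1.0, 0.5, 8, rng=random.Random(42))
    print(len(extant), "extant tips after time", round(elapsed, 3))
```

With `death_rate=0` the loop reduces to a pure-birth (Yule) process, so a run that ends with n tips always contains 2n - 1 nodes.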

### Coalescent Simulation

Functions for simulating trees under coalescent processes.

```python { .api }
def contained_coalescent_tree(containing_tree, gene_to_species_map, **kwargs):
    """
    Simulate gene tree contained within species tree using coalescent.

    Parameters:
    - containing_tree: Species tree containing gene tree
    - gene_to_species_map: Mapping from gene taxa to species taxa
    - default_pop_size: Default population size for branches
    - rng: Random number generator

    Returns:
    Tree: Simulated gene tree embedded in species tree
    """

def pure_kingman_tree(taxon_namespace, pop_size=1, **kwargs):
    """
    Simulate tree under pure Kingman coalescent.

    Parameters:
    - taxon_namespace: TaxonNamespace for coalescent taxa
    - pop_size: Effective population size
    - rng: Random number generator

    Returns:
    Tree: Simulated coalescent tree
    """

def mean_kingman_tree(taxon_namespace, pop_size=1, **kwargs):
    """
    Generate tree with expected coalescent times.

    Parameters:
    - taxon_namespace: TaxonNamespace for taxa
    - pop_size: Effective population size

    Returns:
    Tree: Tree with mean coalescent branch lengths
    """

def constrained_kingman_tree(pop_tree, gene_tree_list=None, **kwargs):
    """
    Simulate constrained coalescent tree within population tree.

    Parameters:
    - pop_tree: Population/species tree providing constraints
    - gene_tree_list: List to store simulated gene trees
    - num_genes: Number of gene trees to simulate
    - pop_sizes: Population sizes for each branch
    - rng: Random number generator

    Returns:
    Tree or TreeList: Simulated constrained coalescent tree(s)
    """
```
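The difference between a pure Kingman tree (random waiting times) and a mean Kingman tree (expected waiting times) can be sketched with the standard library alone. The helpers below are hypothetical and not part of DendroPy. They assume time is measured in generations and that `pop_size` counts gene copies, so while k lineages remain the coalescence rate is k(k-1)/2 divided by `pop_size`.

```python
import random


def kingman_waiting_times(num_genes, pop_size=1.0, rng=None):
    """Draw one waiting time per coalescence, from num_genes lineages down to 2.

    Hypothetical helper for illustration; not a DendroPy function.
    """
    rng = rng or random.Random()
    times = []
    for k in range(num_genes, 1, -1):
        pairs = k * (k - 1) / 2.0
        # Exponential waiting time with rate (number of pairs) / pop_size.
        times.append(rng.expovariate(pairs / pop_size))
    return times


def mean_kingman_waiting_times(num_genes, pop_size=1.0):
    """Expected value of each waiting time drawn above."""
    return [pop_size / (k * (k - 1) / 2.0) for k in range(num_genes, 1, -1)]


if __name__ == "__main__":
    print(kingman_waiting_times(5, pop_size=100, rng=random.Random(1)))
    print(mean_kingman_waiting_times(5, pop_size=100))
```

Under these assumptions the expected waiting times for n genes sum to 2 * pop_size * (1 - 1/n), the expected time to the most recent common ancestor.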

### Population Genetics Simulation

Functions for simulating population genetic processes and genealogies.

```python { .api }
def pop_gen_tree(num_genes, pop_size=1, **kwargs):
    """
    Simulate population genetics tree (gene genealogy).

    Parameters:
    - num_genes: Number of gene copies to simulate
    - pop_size: Effective population size
    - num_gens: Number of generations to simulate
    - rng: Random number generator

    Returns:
    Tree: Simulated gene genealogy
    """

def fragmented_tree(taxon_namespace, fragment_probs, **kwargs):
    """
    Simulate tree with fragmented (missing) taxa.

    Parameters:
    - taxon_namespace: Complete set of taxa
    - fragment_probs: Probability of each taxon being present
    - base_tree: Base tree structure for fragmentation
    - rng: Random number generator

    Returns:
    Tree: Tree with some taxa removed
    """
```
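As a rough illustration of what a generation-by-generation gene genealogy involves, the sketch below traces sampled gene copies backward through a haploid Wright-Fisher population until they share one ancestor. It assumes that model and uses a hypothetical helper, `generations_to_mrca`; it does not reproduce how `pop_gen_tree` works internally.

```python
import random


def generations_to_mrca(num_genes, pop_size, rng=None, max_gens=10**6):
    """Count generations until all sampled gene copies share one ancestor.

    Hypothetical helper for illustration; assumes a haploid Wright-Fisher
    population of constant size pop_size.
    """
    if num_genes <= 1:
        return 0
    rng = rng or random.Random()
    lineages = num_genes
    for generation in range(1, max_gens + 1):
        # Each lineage picks a parent uniformly at random in the previous
        # generation; lineages that pick the same parent coalesce.
        lineages = len({rng.randrange(pop_size) for _ in range(lineages)})
        if lineages == 1:
            return generation
    raise RuntimeError("no common ancestor within max_gens generations")


if __name__ == "__main__":
    print(generations_to_mrca(10, 50, rng=random.Random(4)))
```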

### Character Evolution Simulation

Functions for simulating discrete and continuous character evolution on a tree.

```python { .api }
def simulate_discrete_char_dataset(tree, seq_len, **kwargs):
    """
    Simulate discrete character dataset on phylogenetic tree.

    Parameters:
    - tree: Tree for character evolution
    - seq_len: Length of character sequences
    - char_model: Character evolution model (JC69, HKY85, etc.)
    - mutation_rate: Overall mutation rate multiplier
    - site_rates: Rate variation across sites
    - invariant_sites_prop: Proportion of invariant sites
    - rng: Random number generator

    Returns:
    CharacterMatrix: Simulated character alignment
    """

def simulate_discrete_chars(tree, char_model, seq_len, **kwargs):
    """
    Simulate discrete characters with specified evolutionary model.

    Parameters:
    - tree: Phylogenetic tree with branch lengths
    - char_model: DiscreteCharacterEvolutionModel instance
    - seq_len: Number of characters to simulate
    - root_states: Ancestral character states
    - rng: Random number generator

    Returns:
    CharacterMatrix: Evolved discrete character data
    """

def hky85_chars(tree, seq_len, **kwargs):
    """
    Simulate DNA evolution under HKY85 substitution model.

    Parameters:
    - tree: Phylogenetic tree with branch lengths in substitutions per site
    - seq_len: DNA sequence length
    - kappa: Transition/transversion rate ratio (default: 1.0)
    - base_freqs: Equilibrium base frequencies [fA, fC, fG, fT]
    - mutation_rate: Rate multiplier for all substitutions
    - site_rates: Gamma-distributed rate variation
    - invariant_sites_prop: Proportion of invariant sites
    - rng: Random number generator

    Returns:
    DnaCharacterMatrix: Simulated DNA sequences
    """

def jc69_chars(tree, seq_len, **kwargs):
    """
    Simulate DNA evolution under Jukes-Cantor 69 model.

    Parameters:
    - tree: Phylogenetic tree with branch lengths
    - seq_len: DNA sequence length
    - mutation_rate: Substitution rate
    - rng: Random number generator

    Returns:
    DnaCharacterMatrix: Simulated DNA sequences
    """

def evolve_continuous_char(tree, char_matrix, **kwargs):
    """
    Evolve continuous characters along a tree under the selected model.

    Parameters:
    - tree: Phylogenetic tree
    - char_matrix: Starting continuous character values
    - rate: Rate of continuous character evolution
    - model: Evolution model ('brownian', 'ou', etc.)
    - rng: Random number generator

    Returns:
    ContinuousCharacterMatrix: Evolved continuous characters
    """
```
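The idea behind sequence simulation on a tree can be sketched with the Jukes-Cantor model, where a site keeps its base along a branch of length t (in expected substitutions per site) with probability 1/4 + 3/4 * exp(-4t/3). The sketch below uses only the standard library; the nested-tuple tree format and all function names are assumptions made for illustration, not DendroPy data structures.

```python
import math
import random

BASES = "ACGT"


def jc69_p_same(branch_length):
    """Probability that a site shows the same base at both ends of a branch."""
    return 0.25 + 0.75 * math.exp(-4.0 * branch_length / 3.0)


def evolve_along_branch(seq, branch_length, rng):
    """Evolve each site independently along one branch under JC69."""
    p_same = jc69_p_same(branch_length)
    evolved = []
    for base in seq:
        if rng.random() < p_same:
            evolved.append(base)
        else:
            # Under JC69 the three other bases are equally likely.
            evolved.append(rng.choice([b for b in BASES if b != base]))
    return "".join(evolved)


def simulate_jc69(tree, seq_len, rng=None):
    """tree is (name, branch_length, children); returns {tip name: sequence}."""
    rng = rng or random.Random()
    root_seq = "".join(rng.choice(BASES) for _ in range(seq_len))
    tips = {}

    def visit(node, parent_seq):
        name, branch_length, children = node
        seq = evolve_along_branch(parent_seq, branch_length, rng)
        if not children:
            tips[name] = seq
        for child in children:
            visit(child, seq)

    visit(tree, root_seq)
    return tips


if __name__ == "__main__":
    toy_tree = ("root", 0.0, [("A", 0.1, []),
                              ("inner", 0.05, [("B", 0.2, []), ("C", 0.2, [])])])
    for name, seq in simulate_jc69(toy_tree, 30, rng=random.Random(3)).items():
        print(name, seq)
```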

### Advanced Character Evolution Models

Model classes for discrete character evolution, including nucleotide substitution models, and the evolver that applies a model along a tree.

```python { .api }
class DiscreteCharacterEvolutionModel:
    """
    Base class for discrete character evolution models.

    Parameters:
    - state_alphabet: StateAlphabet defining character states
    - stationary_freqs: Equilibrium state frequencies
    - rate_matrix: Instantaneous rate matrix Q
    """

    def __init__(self, state_alphabet=None, **kwargs): ...

    def p_matrix(self, edge_length):
        """Calculate transition probability matrix P(t) = exp(Qt)."""

    def simulate_ancestral_states(self, num_chars, rng=None):
        """Simulate ancestral character states."""

    def evolve_states(self, tree, seq_len, **kwargs):
        """Evolve character states on tree."""

class Hky85(DiscreteCharacterEvolutionModel):
    """
    HKY85 nucleotide substitution model.

    Parameters:
    - kappa: Transition/transversion rate ratio
    - base_freqs: Equilibrium base frequencies [fA, fC, fG, fT]
    """

    def __init__(self, kappa=1.0, base_freqs=None): ...

class Jc69(DiscreteCharacterEvolutionModel):
    """
    Jukes-Cantor 69 nucleotide substitution model.

    Equal substitution rates and base frequencies.
    """

    def __init__(self): ...

class Gtr(DiscreteCharacterEvolutionModel):
    """
    General Time Reversible nucleotide substitution model.

    Parameters:
    - rate_params: Six substitution rate parameters
    - base_freqs: Equilibrium base frequencies
    """

    def __init__(self, rate_params=None, base_freqs=None): ...

class DiscreteCharacterEvolver:
    """
    Engine for evolving discrete characters along phylogenetic trees.

    Parameters:
    - seq_model: DiscreteCharacterEvolutionModel
    - mutation_rate: Overall rate multiplier
    - site_rates: Rate heterogeneity across sites
    """

    def __init__(self, seq_model=None, **kwargs): ...

    def evolve_states(self, tree, seq_len, **kwargs):
        """
        Simulate character evolution on tree.

        Parameters:
        - tree: Tree for simulation
        - seq_len: Number of characters
        - root_states: Starting character states
        - rng: Random number generator

        Returns:
        CharacterMatrix: Simulated character data
        """
```
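The relationship P(t) = exp(Qt) mentioned for `p_matrix` can be checked numerically. The sketch below builds an HKY85 rate matrix from `kappa` and the base frequencies, rescales it to one expected substitution per unit time, and exponentiates it with a Taylor series plus scaling and squaring. Everything here is a standard-library illustration under those assumptions; the function names are hypothetical and this is not how DendroPy computes the matrix.

```python
import math

BASES = "ACGT"
TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}


def hky85_q(kappa=1.0, base_freqs=(0.25, 0.25, 0.25, 0.25)):
    """HKY85 rate matrix Q, scaled to one expected substitution per unit time."""
    q = [[0.0] * 4 for _ in range(4)]
    for i, x in enumerate(BASES):
        for j, y in enumerate(BASES):
            if i != j:
                weight = kappa if (x, y) in TRANSITIONS else 1.0
                q[i][j] = base_freqs[j] * weight
        q[i][i] = -sum(q[i])  # rows of Q sum to zero
    mean_rate = -sum(base_freqs[i] * q[i][i] for i in range(4))
    return [[value / mean_rate for value in row] for row in q]


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def p_matrix(q, t, squarings=10, terms=12):
    """Approximate exp(Q * t) by a Taylor series with scaling and squaring."""
    n = len(q)
    scale = t / (2 ** squarings)
    m = [[value * scale for value in row] for row in q]
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms + 1):
        # term becomes m**k / k!
        term = [[value / k for value in row] for row in matmul(term, m)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    for _ in range(squarings):
        result = matmul(result, result)
    return result


if __name__ == "__main__":
    for row in p_matrix(hky85_q(kappa=2.0, base_freqs=(0.1, 0.2, 0.3, 0.4)), 0.5):
        print([round(value, 4) for value in row])
```

With `kappa=1.0` and equal base frequencies the matrix reduces to the Jukes-Cantor case, and for long branches every row approaches the equilibrium base frequencies.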

### Rate Variation Models

Models for rate variation across sites and lineages.

```python { .api }
def gamma_site_rates(alpha, num_categories=4, **kwargs):
    """
    Generate gamma-distributed rate categories for sites.

    Parameters:
    - alpha: Shape parameter for gamma distribution
    - num_categories: Number of discrete rate categories
    - rng: Random number generator

    Returns:
    list: Rate multipliers for each category
    """

def invariant_sites_model(prop_invariant, site_rates=None):
    """
    Create model with proportion of invariant sites.

    Parameters:
    - prop_invariant: Proportion of sites that never change
    - site_rates: Rate categories for variable sites

    Returns:
    list: Rate categories including invariant sites (rate=0)
    """

def codon_position_rates(rates_123):
    """
    Set different rates for codon positions.

    Parameters:
    - rates_123: Rates for 1st, 2nd, 3rd codon positions

    Returns:
    list: Site-specific rate multipliers
    """
```
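Among-site rate variation can be illustrated without any library support. The sketch below draws one rate per site from a gamma distribution with mean 1 (shape `alpha`, scale 1/`alpha`) and sets a random subset of sites to rate 0. Note the swap: this samples a continuous rate per site instead of building discrete rate categories, and both helpers are hypothetical names chosen for this example.

```python
import random


def sample_site_rates(seq_len, alpha, prop_invariant=0.0, rng=None):
    """Draw a relative rate for every site (hypothetical helper).

    Variable sites get a gamma-distributed rate with mean 1; each site is
    invariant (rate 0) with probability prop_invariant. The variable-site
    rates are not rescaled, so the overall mean is about 1 - prop_invariant.
    """
    rng = rng or random.Random()
    rates = []
    for _ in range(seq_len):
        if rng.random() < prop_invariant:
            rates.append(0.0)
        else:
            rates.append(rng.gammavariate(alpha, 1.0 / alpha))
    return rates


def expand_codon_rates(rates_123, seq_len):
    """Repeat the three codon-position rates along a sequence of seq_len sites."""
    return [rates_123[i % 3] for i in range(seq_len)]


if __name__ == "__main__":
    print(sample_site_rates(6, alpha=0.5, prop_invariant=0.2, rng=random.Random(7)))
    print(expand_codon_rates((0.5, 0.2, 2.3), 7))
```

Small values of `alpha` produce strongly skewed rates, where most sites evolve slowly and a few evolve fast; large values make the rates nearly equal.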

### Tree Modification for Simulation

Functions for modifying trees to prepare them for simulation.

```python { .api }
def scale_tree(tree, scaling_factor):
    """
    Scale all branch lengths in tree.

    Parameters:
    - tree: Tree to scale
    - scaling_factor: Multiplier for all branch lengths

    Returns:
    None (modifies tree in place)
    """

def set_uniform_branch_lengths(tree, length=1.0):
    """
    Set all branches to same length.

    Parameters:
    - tree: Tree to modify
    - length: Branch length to assign

    Returns:
    None (modifies tree in place)
    """

def add_rate_variation_to_tree(tree, rate_distribution, **kwargs):
    """
    Add rate variation across tree branches.

    Parameters:
    - tree: Tree to modify
    - rate_distribution: Distribution for sampling rates
    - rng: Random number generator

    Returns:
    None (modifies branch lengths in place)
    """

def ultrametrize_tree(tree, **kwargs):
    """
    Convert tree to ultrametric (molecular clock).

    Parameters:
    - tree: Tree to ultrametrize
    - strategy: Method ('equal', 'proportional', 'minimize_change')

    Returns:
    None (modifies tree in place)
    """
```
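The in-place behavior described above (functions return `None` and change the tree they are given) is easy to see on a toy structure. The sketch below represents a tree as nested dictionaries, which is an assumption made purely for illustration, and shows how scaling and uniform branch lengths affect whether the tree is ultrametric, meaning every tip is the same distance from the root.

```python
def make_node(length, children=()):
    """Toy node: a branch length (None for the root) plus child nodes."""
    return {"length": length, "children": list(children)}


def iter_nodes(node):
    yield node
    for child in node["children"]:
        yield from iter_nodes(child)


def scale_branch_lengths(root, scaling_factor):
    """Multiply every defined branch length in place; returns None."""
    for node in iter_nodes(root):
        if node["length"] is not None:
            node["length"] *= scaling_factor


def set_uniform_lengths(root, length=1.0):
    """Give every non-root branch the same length in place; returns None."""
    for node in iter_nodes(root):
        if node is not root:
            node["length"] = length


def root_to_tip_distances(node, so_far=0.0):
    here = so_far + (node["length"] or 0.0)
    if not node["children"]:
        return [here]
    distances = []
    for child in node["children"]:
        distances.extend(root_to_tip_distances(child, here))
    return distances


def is_ultrametric(root, tol=1e-9):
    distances = root_to_tip_distances(root)
    return max(distances) - min(distances) <= tol


if __name__ == "__main__":
    tree = make_node(None, [make_node(2.0),
                            make_node(1.0, [make_node(1.0), make_node(1.0)])])
    scale_branch_lengths(tree, 0.5)
    print(root_to_tip_distances(tree), is_ultrametric(tree))
    set_uniform_lengths(tree, 1.0)
    print(root_to_tip_distances(tree), is_ultrametric(tree))
```

Scaling preserves ultrametricity because every root-to-tip path is multiplied by the same factor, while assigning uniform branch lengths generally breaks it on unbalanced trees.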

### Simulation Utilities

Utility functions for managing and analyzing simulated data.

```python { .api }
def repeat_simulation(simulation_fn, num_replicates, **kwargs):
    """
    Repeat simulation multiple times.

    Parameters:
    - simulation_fn: Function that performs one simulation
    - num_replicates: Number of simulation replicates
    - **kwargs: Arguments passed to simulation function

    Returns:
    list: Results from all simulation replicates
    """

def simulation_summary_stats(simulated_trees, **kwargs):
    """
    Calculate summary statistics on simulated trees.

    Parameters:
    - simulated_trees: Collection of simulated Tree objects
    - stats: List of statistics to calculate

    Returns:
    dict: Summary statistics across simulations
    """

def validate_simulation_parameters(tree, char_model, **kwargs):
    """
    Validate parameters for character evolution simulation.

    Parameters:
    - tree: Tree for simulation
    - char_model: Character evolution model
    - **kwargs: Additional simulation parameters

    Returns:
    bool: True if parameters are valid

    Raises:
    ValueError: If parameters are invalid
    """
```
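A replicate-and-summarize workflow along these lines might look like the sketch below. It is a standard-library illustration with hypothetical names (`run_replicates`, `summarize`, `yule_tip_count`). One design choice worth noting is that each replicate gets its own random number generator seeded from a master generator, so a whole batch is reproducible from a single seed.

```python
import random
import statistics


def yule_tip_count(birth_rate, max_time, rng):
    """Toy simulation: number of tips of a pure-birth process at max_time."""
    tips, elapsed = 1, 0.0
    while True:
        elapsed += rng.expovariate(birth_rate * tips)
        if elapsed > max_time:
            return tips
        tips += 1


def run_replicates(simulation_fn, num_replicates, seed=None, **kwargs):
    """Call simulation_fn once per replicate, each with an independent rng."""
    master = random.Random(seed)
    results = []
    for _ in range(num_replicates):
        rng = random.Random(master.getrandbits(64))
        results.append(simulation_fn(rng=rng, **kwargs))
    return results


def summarize(values):
    """Basic summary statistics across replicates."""
    return {
        "mean": statistics.mean(values),
        "stdev": statistics.stdev(values),
        "min": min(values),
        "max": max(values),
    }


if __name__ == "__main__":
    counts = run_replicates(yule_tip_count, 1000, seed=11,
                            birth_rate=1.0, max_time=1.0)
    print(summarize(counts))
```

For this toy process the expected tip count at time t is exp(birth_rate * t), which gives a quick sanity check on the replicate mean.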