
# Phylogenetic Analysis

Advanced phylogenetic tree analysis capabilities, including species tree operations, monophyly testing, evolutionary analysis, and specialized phylogenetic methods. These features extend core tree functionality with domain-specific phylogenetic tools.

## Capabilities

### Phylogenetic Tree Classes

Enhanced tree classes with phylogenetic-specific features and methods.

```python { .api }
class PhyloTree(Tree):
    """
    Phylogenetic tree with species-aware operations.
    Inherits all Tree functionality plus phylogenetic methods.
    """

    def __init__(self, newick=None, alignment=None, alg_format="fasta",
                 sp_naming_function=None, format=0):
        """
        Initialize phylogenetic tree.

        Parameters:
        - newick (str): Newick format string or file
        - alignment (str): Sequence alignment file or string
        - alg_format (str): Alignment format ("fasta", "phylip", "iphylip")
        - sp_naming_function (function): Function to extract species from node names
        - format (int): Newick format specification
        """

class PhyloNode(PhyloTree):
    """Alias for PhyloTree - same functionality."""
    pass
```

### Species Naming and Annotation

Configure how species names are extracted from node names and manage species-level operations.

```python { .api }
def set_species_naming_function(self, fn):
    """
    Set function to extract species name from node name.

    Parameters:
    - fn (function): Function that takes node name, returns species name
      Example: lambda x: x.split('_')[0]
    """

species: str  # Species name property (read-only)

def get_species(self):
    """
    Get set of all species in tree.

    Returns:
    set: Species names present in tree
    """

def annotate_gtdb_taxa(self, taxid_attr="name"):
    """
    Annotate tree with GTDB (Genome Taxonomy Database) taxonomic information.

    Parameters:
    - taxid_attr (str): Node attribute containing taxonomic IDs
    """
```
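
A naming function is simply a callable that maps a node name to a species label. The sketch below is plain Python and does not depend on the library; the helper `species_from_name` and the `<species>_<gene>` naming scheme are assumptions made for illustration.

```python
def species_from_name(node_name, sep="_"):
    """Return the species prefix of a name such as 'human_gene1'.

    Hypothetical helper: assumes leaf names look like '<species><sep><gene>'.
    A name without the separator is returned unchanged.
    """
    species, _, _ = node_name.partition(sep)
    return species


# Collect the distinct species represented by a list of leaf names.
leaf_names = ["human_gene1", "chimp_gene1", "chimp_gene2"]
species = {species_from_name(name) for name in leaf_names}
print(sorted(species))  # ['chimp', 'human']
```

A function of this shape could be passed as `fn` to `set_species_naming_function`.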

### Monophyly Testing

Test and analyze monophyletic groups in phylogenetic trees.

```python { .api }
def check_monophyly(self, values, target_attr, ignore_missing=False):
    """
    Check if specified values form monophyletic group.

    Parameters:
    - values (list): List of values to test for monophyly
    - target_attr (str): Node attribute to check ("species", "name", etc.)
    - ignore_missing (bool): Ignore nodes without target attribute

    Returns:
    tuple: (is_monophyletic: bool, clade_type: str, broken_branches: list)
           clade_type can be "monophyletic", "paraphyletic", or "polyphyletic"
    """

def get_monophyletic(self, values, target_attr):
    """
    Get node that represents monophyletic group of specified values.

    Parameters:
    - values (list): Values that should form monophyletic group
    - target_attr (str): Node attribute to match against

    Returns:
    TreeNode: Node representing the monophyletic group, or None if not monophyletic
    """
```
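
To make the idea of monophyly concrete, here is a minimal standalone sketch: a group is monophyletic when some clade of the tree contains exactly those leaves and no others. The nested-tuple tree representation and the helper names are assumptions for illustration, not the library's implementation, and the sketch does not distinguish paraphyletic from polyphyletic groups.

```python
def leaves(clade):
    """Return the set of leaf labels under a nested-tuple clade."""
    if isinstance(clade, str):
        return {clade}
    return set().union(*(leaves(child) for child in clade))


def clades(clade):
    """Yield every clade in the tree, including single leaves."""
    yield clade
    if not isinstance(clade, str):
        for child in clade:
            yield from clades(child)


def is_monophyletic(tree, values):
    """True if some clade contains exactly the requested leaves."""
    target = set(values)
    return any(leaves(c) == target for c in clades(tree))


# Same topology as (human,(chimp,bonobo));
tree = ("human", ("chimp", "bonobo"))
print(is_monophyletic(tree, ["chimp", "bonobo"]))  # True
print(is_monophyletic(tree, ["human", "chimp"]))   # False
```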

### Distance and Divergence Analysis

Calculate evolutionary distances and analyze tree metrics.

```python { .api }
def get_age(self, species2age):
    """
    Get age of node based on species age information.

    Parameters:
    - species2age (dict): Mapping from species names to ages

    Returns:
    float: Estimated age of node
    """

def get_closest_leaf(self, topology_only=False):
    """
    Find closest leaf node with phylogenetic distance.

    Parameters:
    - topology_only (bool): Use only topology, ignore branch lengths

    Returns:
    tuple: (closest_leaf_node, distance)
    """

def get_farthest_leaf(self, topology_only=False):
    """
    Find most distant leaf node.

    Parameters:
    - topology_only (bool): Use only topology, ignore branch lengths

    Returns:
    tuple: (farthest_leaf_node, distance)
    """

def get_farthest_node(self, topology_only=False):
    """
    Find most distant node (leaf or internal).

    Parameters:
    - topology_only (bool): Use only topology, ignore branch lengths

    Returns:
    tuple: (farthest_node, distance)
    """

def get_midpoint_outgroup(self):
    """
    Find optimal outgroup for midpoint rooting.

    Returns:
    TreeNode: Node that serves as midpoint outgroup
    """
```
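
The effect of `topology_only` can be illustrated without the library. The sketch below finds the farthest leaf below a node, either summing branch lengths or counting every branch as one step. The dictionary-based tree and the function `farthest_leaf` are assumptions for illustration; the sketch searches only downward from the given node.

```python
def farthest_leaf(tree, node, topology_only=False):
    """Return (leaf, distance) for the leaf farthest below `node`.

    `tree` maps each internal node to a list of (child, branch_length)
    pairs; leaves do not appear as keys. With topology_only=True every
    branch counts as 1 regardless of its length.
    """
    children = tree.get(node, [])
    if not children:
        return node, 0.0
    best_leaf, best_dist = None, float("-inf")
    for child, length in children:
        leaf, dist = farthest_leaf(tree, child, topology_only)
        dist += 1.0 if topology_only else length
        if dist > best_dist:
            best_leaf, best_dist = leaf, dist
    return best_leaf, best_dist


# Equivalent to (human:0.1,(chimp:0.05,bonobo:0.05)pan:0.02)root;
tree = {
    "root": [("human", 0.1), ("pan", 0.02)],
    "pan": [("chimp", 0.05), ("bonobo", 0.05)],
}
print(farthest_leaf(tree, "root"))                      # ('human', 0.1)
print(farthest_leaf(tree, "root", topology_only=True))  # ('chimp', 2.0)
```

With branch lengths, `human` is farthest from the root; counting branches only, the leaves under `pan` are two steps away and win instead.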

### Sequence Integration

Link phylogenetic trees with molecular sequence data.

```python { .api }
def link_to_alignment(self, alignment, alg_format="fasta", **kwargs):
    """
    Associate sequence alignment with tree nodes.

    Parameters:
    - alignment (str): Alignment file path or sequence string
    - alg_format (str): Format ("fasta", "phylip", "iphylip", "paml")
    - kwargs: Additional format-specific parameters
    """

sequence: str  # Associated sequence data (when linked to alignment)
```
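
Conceptually, linking an alignment means matching each leaf name to a sequence record with the same name. The following standalone sketch shows that matching for FASTA text; `parse_fasta` and the sample data are hypothetical and are not how the library parses alignments internally.

```python
def parse_fasta(text):
    """Parse FASTA text into a {name: sequence} dict.

    The record name is the first whitespace-delimited word of the header.
    """
    sequences = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            parts = line[1:].split()
            name = parts[0] if parts else ""
            sequences[name] = ""
        elif name is not None:
            sequences[name] += line  # join wrapped sequence lines
    return sequences


fasta = """>seq1
ACGT
ACGA
>seq2 some description
ACGTACGC
"""

# Match leaf names against alignment records; unmatched leaves get None.
leaf_names = ["seq1", "seq2", "seq3"]
records = parse_fasta(fasta)
linked = {leaf: records.get(leaf) for leaf in leaf_names}
print(linked)  # {'seq1': 'ACGTACGA', 'seq2': 'ACGTACGC', 'seq3': None}
```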

### NCBI Taxonomy Comparison

Compare phylogenetic trees with NCBI taxonomic relationships.

```python { .api }
def ncbi_compare(self, autodetect_duplications=True):
    """
    Compare tree topology with NCBI taxonomy.

    Parameters:
    - autodetect_duplications (bool): Automatically detect gene duplications

    Returns:
    dict: Comparison results including conflicts and agreements
    """
```

### Tree Reconciliation

Reconcile gene trees with species trees to infer evolutionary events.

```python { .api }
def reconcile(self, species_tree, inplace=True):
    """
    Reconcile gene tree with species tree.

    Parameters:
    - species_tree (PhyloTree): Reference species tree
    - inplace (bool): Modify current tree or return new one

    Returns:
    PhyloTree: Reconciled tree with duplication/speciation events annotated
    """

# Properties set by reconciliation
evoltype: str  # Event type: "S" (speciation), "D" (duplication)
```
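
Full reconciliation needs the species tree topology. A simpler, related heuristic is the species-overlap rule: an internal node is labeled a duplication ("D") when its two child clades share at least one species, and a speciation ("S") otherwise. The sketch below implements that rule, not reconciliation itself, on a nested-tuple gene tree; the representation and helper names are assumptions for illustration.

```python
def leaf_names(clade):
    """Return the leaf names under a nested-tuple clade, left to right."""
    if isinstance(clade, str):
        return [clade]
    return [name for child in clade for name in leaf_names(child)]


def classify_events(clade, naming_fn):
    """Return [(leaf_names, event)] for each internal node, children first.

    Assumes a strictly bifurcating tree. `naming_fn` maps a leaf name
    to its species label.
    """
    if isinstance(clade, str):
        return []
    events = []
    for child in clade:
        events.extend(classify_events(child, naming_fn))
    left, right = clade
    left_species = {naming_fn(name) for name in leaf_names(left)}
    right_species = {naming_fn(name) for name in leaf_names(right)}
    event = "D" if left_species & right_species else "S"
    events.append((tuple(leaf_names(clade)), event))
    return events


gene_tree = ("human_gene1", ("chimp_gene1", "chimp_gene2"))
events = classify_events(gene_tree, lambda name: name.split("_")[0])
for names, event in events:
    print(event, names)
```

Here the two chimp genes descend from a node labeled "D", while the root, which separates human from chimp, is labeled "S".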

### Phylogenetic Tree Statistics

Calculate various phylogenetic tree statistics and metrics.

```python { .api }
def get_cached_content(self, store_attr=None):
    """
    Cache tree content for efficient repeated access.

    Parameters:
    - store_attr (str): Specific attribute to cache

    Returns:
    dict: Cached tree statistics and content
    """

def robinson_foulds(self, ref_tree, attr_t1="name", attr_t2="name",
                    expand_polytomies=False, polytomy_size_limit=5,
                    skip_large_polytomies=True):
    """
    Calculate Robinson-Foulds distance between trees.

    Parameters:
    - ref_tree (Tree): Reference tree for comparison
    - attr_t1 (str): Attribute for leaf matching in self
    - attr_t2 (str): Attribute for leaf matching in ref_tree
    - expand_polytomies (bool): Resolve polytomies before comparison
    - polytomy_size_limit (int): Max size for polytomy expansion
    - skip_large_polytomies (bool): Skip large polytomies

    Returns:
    tuple: (RF_distance, max_RF, common_leaves, parts_t1, parts_t2,
            discard_t1, discard_t2)
    """
```
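
The Robinson-Foulds distance counts the groupings found in one tree but not the other. The sketch below computes a rooted, clade-based variant for two trees over the same leaf set, returning the distance together with its maximum possible value. It is an illustration under those assumptions, with hypothetical helper names; the library method additionally handles differing leaf sets and polytomies.

```python
def clade_sets(tree):
    """Return the non-trivial clades of a nested-tuple tree as frozensets."""
    found = set()

    def visit(node):
        if isinstance(node, str):
            return frozenset([node])
        members = frozenset().union(*(visit(child) for child in node))
        found.add(members)
        return members

    all_leaves = visit(tree)
    found.discard(all_leaves)  # the root clade is shared by construction
    return found


def rooted_rf(tree_a, tree_b):
    """Return (rf, max_rf) for two rooted trees on the same leaves."""
    a, b = clade_sets(tree_a), clade_sets(tree_b)
    return len(a ^ b), len(a) + len(b)


t1 = (("human", "chimp"), ("mouse", "rat"))
t2 = (("human", "mouse"), ("chimp", "rat"))
print(rooted_rf(t1, t1))  # (0, 4)
print(rooted_rf(t1, t2))  # (4, 4)
```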

## Evolution-Specific Tree Classes

### EvolTree for Evolutionary Analysis

Specialized tree class for evolutionary model analysis and molecular evolution studies.

```python { .api }
class EvolTree(PhyloTree):
    """
    Tree specialized for evolutionary analysis and molecular evolution models.
    """

    def get_evol_model(self, model_name):
        """
        Get evolutionary model associated with tree.

        Parameters:
        - model_name (str): Name of evolutionary model

        Returns:
        EvolModel: Evolutionary model object
        """

    def link_to_evol_model(self, model_file, workdir=None):
        """
        Link tree to evolutionary analysis results.

        Parameters:
        - model_file (str): Path to model results file
        - workdir (str): Working directory for analysis files
        """

    def run_model(self, model_name_or_fname):
        """
        Run evolutionary model analysis.

        Parameters:
        - model_name_or_fname (str): Model name or file path

        Returns:
        dict: Model analysis results
        """

class EvolNode(EvolTree):
    """Alias for EvolTree - same functionality."""
    pass
```

## Utility Functions

### Species Tree Analysis

```python { .api }
def get_subtrees(tree, full_copy=False, features=None, newick_only=False):
    """
    Calculate all possible species trees within a gene tree.

    Parameters:
    - tree (PhyloTree): Input gene tree
    - full_copy (bool): Create full copies of subtrees
    - features (list): Features to preserve in subtrees
    - newick_only (bool): Return only Newick strings

    Returns:
    tuple: (num_trees, num_duplications, tree_iterator)
    """

def is_dup(node):
    """
    Check if node represents a duplication event.

    Parameters:
    - node (TreeNode): Node to test

    Returns:
    bool: True if node is duplication
    """
```

## Usage Examples

### Basic Phylogenetic Operations

```python
from ete3 import PhyloTree

# Create phylogenetic tree with species naming
tree = PhyloTree("(human_gene1:0.1,(chimp_gene1:0.05,bonobo_gene1:0.05):0.02);")
tree.set_species_naming_function(lambda x: x.split('_')[0])

# Check species representation
species = tree.get_species()
print(f"Species in tree: {species}")

# Test monophyly
is_mono, clade_type, broken = tree.check_monophyly(['human', 'chimp'], 'species')
print(f"Human-Chimp monophyly: {is_mono} ({clade_type})")
```

### Sequence Integration

```python
from ete3 import PhyloTree

# Create tree and link to alignment
tree = PhyloTree("(seq1:0.1,seq2:0.2,seq3:0.15);")
tree.link_to_alignment("alignment.fasta")

# Access sequence data
for leaf in tree.get_leaves():
    print(f"{leaf.name}: {leaf.sequence}")
```

### Tree Reconciliation

```python
from ete3 import PhyloTree

# Gene tree and species tree
gene_tree = PhyloTree("(human_gene1:0.1,(chimp_gene1:0.05,chimp_gene2:0.05):0.02);")
species_tree = PhyloTree("(human:0.1,chimp:0.1);")

# Set species naming
gene_tree.set_species_naming_function(lambda x: x.split('_')[0])

# Reconcile trees
result = gene_tree.reconcile(species_tree)
# ete3 returns a (reconciled_tree, events) tuple; unpack it if so
reconciled = result[0] if isinstance(result, tuple) else result

# Check event types
for node in reconciled.traverse():
    if hasattr(node, 'evoltype'):
        print(f"Node {node.name}: {node.evoltype}")
```

### NCBI Taxonomy Integration

```python
from ete3 import PhyloTree, NCBITaxa

ncbi = NCBITaxa()

# Create tree from NCBI taxonomy
tree = ncbi.get_topology([9606, 9598, 9597])  # Human, chimp, bonobo

# Compare with gene tree
gene_tree = PhyloTree("(human:0.1,(chimp:0.05,bonobo:0.05):0.02);")
comparison = gene_tree.ncbi_compare()
print(f"Topology conflicts: {comparison}")
```