Cov phylogenetic tree quality by monophylicity
Robert Edgar edited this page Jul 4, 2020 · 3 revisions
The goal is to assess agreement between a tree and Cov taxonomy by measuring the degree of monophylicity. In an optimal tree, all species would be monophyletic. Given a number of candidate trees, we would pick the tree with the most monophyletic species, or possibly a tree in which the more important species (e.g., SARS) are monophyletic.
TaxId    Seqs  Taxa  Mono  Names
28295    159   1     mono  Porcine epidemic diarrhea virus
694014   283   1     mono  Avian coronavirus
...
694007   4     2     POLY  Tylonycteris bat coronavirus HKU4, Tylonycteris pachypus bat coronavirus HKU4-related
1335626  12    3     POLY  Middle East respiratory syndrome-related coronavirus, Bat coronavirus, Hypsugo bat coronavirus HKU25
11137    8     2     POLY  Human coronavirus 229E, Rousettus aegyptiacus bat coronavirus 229E-related

51 taxa, 42 mono, 9 polyphyletic
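One way to read the report above is sketched here. This is not the script used for the analysis. It assumes that the Taxa column counts the distinct taxa found in the smallest subtree containing every sequence of a given taxon, and that a taxon is reported as mono when that count is 1. The toy tree, the sequence labels and the helper names (`leaves`, `smallest_clade`, `monophyly_report`) are all hypothetical.

```python
from collections import defaultdict

# Toy rooted tree as nested tuples; leaves are sequence labels.
# Illustrative data only, not taken from the Serratus trees.
tree = ((("s1", "s2"), "s3"), (("s4", "s5"), ("s6", "s7")))

# Hypothetical mapping from sequence label to taxon.
taxon_of = {"s1": "A", "s2": "A", "s3": "B",
            "s4": "C", "s5": "C", "s6": "C", "s7": "B"}


def leaves(node):
    """Return the leaf labels under a node, left to right."""
    if isinstance(node, str):
        return [node]
    return [leaf for child in node for leaf in leaves(child)]


def smallest_clade(node, targets):
    """Return the smallest subtree whose leaves include all targets."""
    while not isinstance(node, str):
        for child in node:
            if targets <= set(leaves(child)):
                node = child  # one child holds every target: descend
                break
        else:
            return node  # targets are split across children: stop here
    return node


def monophyly_report(tree, taxon_of):
    """Map each taxon to (sequence count, taxa in its clade, status)."""
    seqs_by_taxon = defaultdict(set)
    for seq in leaves(tree):
        seqs_by_taxon[taxon_of[seq]].add(seq)

    report = {}
    for taxon, seqs in seqs_by_taxon.items():
        clade = smallest_clade(tree, seqs)
        taxa_in_clade = {taxon_of[s] for s in leaves(clade)}
        # Assumption: monophyletic means the clade holds no other taxon.
        status = "mono" if len(taxa_in_clade) == 1 else "POLY"
        report[taxon] = (len(seqs), len(taxa_in_clade), status)
    return report


report = monophyly_report(tree, taxon_of)
for taxon, (n_seqs, n_taxa, status) in sorted(report.items()):
    print(taxon, n_seqs, n_taxa, status, sep="\t")

n_mono = sum(1 for _, _, status in report.values() if status == "mono")
print(f"{len(report)} taxa, {n_mono} mono, {len(report) - n_mono} polyphyletic")
```

In this toy tree, taxon A is monophyletic, while B and C are not because their smallest enclosing subtrees also contain sequences from other taxa. Recomputing the leaf set at every step keeps the sketch short but is slow on large trees; a real implementation would probably cache leaf sets per node.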
s3://serratus-public/rce/monophy/
See runme.bash for an example.
To run the analysis, you need a rooted tree in usearch tabbed format.
Root placement is not important; if you have an unrooted tree, you can root it by any convenient method. For example, with raxml:
raxml -f I -m GTRCAT -t $intree -n rooted
To convert a rooted Newick tree to usearch tabbed format:
usearch -tree_cvt tree.newick -tabbedout tree.tsv