dist.dendlist seems to give wrong results as compared with ape #97

Open
talgalili opened this issue Nov 23, 2019 · 0 comments

@talgalili
Owner

Reported by Cooley, Nicholas P:

According to the R package ape, those two trees have a Robinson-Foulds distance of 4.

The code for getting there is included below:

library(ape)
library(DECIPHER)
library(dendextend)

# Build a dendrogram on six points, then a copy with the last three labels permuted
x <- 1:6 %>% dist %>% hclust %>% as.dendrogram
y <- set(x, "labels", c(1:3,6,4,5))

dend_diff(x,y)
dist.dendlist(dendlist(x,y))
distinct_edges(x,y)
distinct_edges(y,x)
length(distinct_edges(x,y))+length(distinct_edges(y,x)) # dist.dendlist

z <- set(x, "labels", as.character(1:6))
w <- set(y, "labels", as.character(c(1:3,6,4,5)))

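# Round-trip each dendrogram through a Newick file (DECIPHER writes, ape reads), then unroot
TempTree <- tempfile()
WriteDendrogram(x = z,
                file = TempTree,
                quoteLabels = FALSE,
                append = FALSE)
v <- unroot(read.tree(TempTree))
unlink(TempTree)
TempTree <- tempfile()
WriteDendrogram(x = w,
                file = TempTree,
                quoteLabels = FALSE,
                append = FALSE)
u <- unroot(read.tree(TempTree))
unlink(TempTree)
# Topological distance between the two unrooted trees, as computed by ape
dist.topo(x = v,
          y = u)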

Hope this helps clarify this for you! If you need anything else just let me know.

As I understand it, it’s a little more than just the unrooting. Your measure of unique branch paths looks at tips that have different parent nodes, while the RF distance looks at whole data partitions. So a partition in your left dendrogram, ((1,2)(3,4)), is not repeated in the tree on the right, while the partition (1,2) is repeated in both trees, and so on?
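
To make the "whole data partitions" idea concrete, here is a minimal base-R sketch of a Robinson-Foulds style count on unrooted trees. It is not how ape or dendextend implement anything; the helper names (clades, splits, rf) and the two toy trees t1 and t2 are assumptions made up for illustration, with trees written as nested lists of tip labels. Each internal node defines a split of the tips into two sides, trivial splits are dropped, and the distance is the number of splits found in only one of the two trees.

```r
# Hypothetical helpers: trees are nested lists of character tip labels.

# Collect the set of tips under every internal node.
clades <- function(tree) {
  out <- list()
  walk <- function(node) {
    if (!is.list(node)) return(as.character(node))
    tips <- unlist(lapply(node, walk))
    out[[length(out) + 1]] <<- sort(tips)
    tips
  }
  walk(tree)
  out
}

# Turn clades into non-trivial unrooted splits, keyed by the side
# that does not contain a fixed reference tip.
splits <- function(tree) {
  cl <- clades(tree)
  all_tips <- sort(unique(unlist(cl)))
  ref <- all_tips[1]
  keys <- vapply(cl, function(s) {
    side <- if (ref %in% s) setdiff(all_tips, s) else s
    paste(sort(side), collapse = ",")
  }, character(1))
  sizes <- vapply(cl, function(s) {
    min(length(s), length(all_tips) - length(s))
  }, integer(1))
  unique(keys[sizes >= 2])  # drop the root and single-tip splits
}

# Count the splits present in exactly one of the two trees.
rf <- function(a, b) {
  sa <- splits(a)
  sb <- splits(b)
  length(setdiff(sa, sb)) + length(setdiff(sb, sa))
}

# Toy trees (assumed for illustration, not read from the dendrograms above)
t1 <- list(list(list("1", "2"), list("3", "4")), list("5", "6"))
t2 <- list(list(list("1", "2"), list("3", "6")), list("4", "5"))

rf(t1, t2)  # 4
```

In this toy case both trees share the split that separates (1,2) from the rest, while the splits involving (3,4) and (5,6) in t1 and (3,6) and (4,5) in t2 each appear in only one tree.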

I don’t know whether you specifically need to account for the unrooting. But if you can clearly argue that your branch history measure is just as valid as an RF distance, that might be pretty cool. I’ve looked at a couple of data sets where I collected both your measure and RF distances, and the differences between the two seem pretty uniform, though I never really dug into it.

@talgalili talgalili added the bug label Nov 23, 2019