Dynamic programming matrix was not filled properly #5

Open: wants to merge 1 commit into base: master

milot-mirdita

While evaluating the tool, I found that the DP matrix for the RMQ was not filled properly, so many queries were assigned the root as their LCA. For example, mouse + human resolves to the root node.
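For readers unfamiliar with the technique, here is a minimal sketch of how an RMQ sparse table (the "DP matrix") is usually filled for LCA queries over an Euler tour. It is not the code from this PR or from the tool itself. The tree, the node names, and the function names (`euler_tour`, `build_sparse_table`, `lca`) are all assumptions made for illustration, and the toy taxonomy is heavily simplified.

```python
from typing import Dict, List, Tuple


def euler_tour(children: Dict[str, List[str]], root: str) -> Tuple[List[str], List[int], Dict[str, int]]:
    """Walk the tree and record every visit: node, depth, and first-visit index."""
    tour: List[str] = []
    depths: List[int] = []
    first: Dict[str, int] = {}

    def visit(node: str, depth: int) -> None:
        first.setdefault(node, len(tour))
        tour.append(node)
        depths.append(depth)
        for child in children.get(node, []):
            visit(child, depth + 1)
            # Returning from a child counts as another visit of the parent.
            tour.append(node)
            depths.append(depth)

    visit(root, 0)
    return tour, depths, first


def build_sparse_table(depths: List[int]) -> List[List[int]]:
    """table[j][i] = index of the minimum depth in the window [i, i + 2**j)."""
    n = len(depths)
    table = [list(range(n))]  # level 0: each window of length 1 is its own minimum
    j = 1
    while (1 << j) <= n:
        prev, half = table[j - 1], 1 << (j - 1)
        row = []
        # Every level must be filled for ALL valid start positions;
        # leaving cells unfilled is the kind of bug that yields wrong minima.
        for i in range(n - (1 << j) + 1):
            left, right = prev[i], prev[i + half]
            row.append(left if depths[left] <= depths[right] else right)
        table.append(row)
        j += 1
    return table


def lca(u: str, v: str, tour: List[str], depths: List[int],
        first: Dict[str, int], table: List[List[int]]) -> str:
    """Answer an LCA query as a range-minimum query over the Euler tour depths."""
    lo, hi = sorted((first[u], first[v]))
    k = (hi - lo + 1).bit_length() - 1  # largest power of two fitting in the range
    left = table[k][lo]
    right = table[k][hi - (1 << k) + 1]
    return tour[left if depths[left] <= depths[right] else right]


# Hypothetical, heavily simplified taxonomy used only for this example.
children = {"root": ["Bacteria", "Mammalia"], "Mammalia": ["mouse", "human"]}
tour, depths, first = euler_tour(children, "root")
table = build_sparse_table(depths)
```

With a correctly filled table, `lca("mouse", "human", tour, depths, first, table)` returns `"Mammalia"` rather than `"root"`, while `lca("mouse", "Bacteria", ...)` legitimately returns `"root"`.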

Thank you for the great implementation in all other regards. I have ported the code to C++ and integrated it into our homology search, clustering and metagenomics suite MMseqs2.

@emepyc
Owner

emepyc commented Oct 25, 2017

Thanks for the PR!
You mention that you have ported the code. Are you still relying on GI numbers? If not, what approach are you following for the acc => taxid conversion?

@milot-mirdita
Author

milot-mirdita commented Oct 25, 2017

We work only with UniProt, which is sufficiently well annotated with NCBI taxons.

MMseqs2 handles the annotation to UniProt accessions, which are then mapped to NCBI taxons that the LCA tool can read.
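The mapping step described above could look roughly like the sketch below. It does not reflect how MMseqs2 actually implements this. The function name `taxids_per_query`, the lookup table, and the accession strings are hypothetical placeholders; only the NCBI taxon IDs 9606 (human) and 10090 (mouse) are real.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Hypothetical accession -> NCBI taxon ID table. The accession strings are
# placeholders; 9606 and 10090 are the real taxon IDs for human and mouse.
ACC_TO_TAXID: Dict[str, int] = {
    "ACC_HUMAN_1": 9606,
    "ACC_MOUSE_1": 10090,
}


def taxids_per_query(hits: Iterable[Tuple[str, str]],
                     acc_to_taxid: Dict[str, int]) -> Dict[str, List[int]]:
    """Group search hits by query and translate each accession to a taxon ID."""
    result: Dict[str, List[int]] = defaultdict(list)
    for query, accession in hits:
        taxid = acc_to_taxid.get(accession)
        if taxid is None:
            continue  # unmapped accession: skip it rather than guess a taxon
        if taxid not in result[query]:
            result[query].append(taxid)  # keep each taxon once, in hit order
    return dict(result)


# Hypothetical search hits as (query, accession) pairs.
hits = [
    ("q1", "ACC_HUMAN_1"),
    ("q1", "ACC_MOUSE_1"),
    ("q1", "ACC_UNKNOWN"),
    ("q2", "ACC_HUMAN_1"),
    ("q1", "ACC_HUMAN_1"),
]
mapped = taxids_per_query(hits, ACC_TO_TAXID)
```

Each resulting list of taxon IDs is what an LCA step would then reduce to a single taxon per query.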

We also implement a 2bLCA-like approach to get more reliable LCAs.

@emepyc
Owner

emepyc commented Oct 25, 2017

Ah, ok. Thanks

@milot-mirdita
Author

By the way, do you have any manuscript that we could cite?

@emepyc
Owner

emepyc commented Oct 25, 2017

No, not really. I never tried to publish this tool. But thanks for asking :-)
