ambiguous nucleotides? #31
Comments
Hi there, Implementing ambiguous nucleotides is possible. However, weighting the comparisons is not trivial. You suggest that d(W,A) = 0 and d(W,T) = 0, but d(A,T) = 1. The distance is thus no longer a true metric. So I am unsure how the weighting should be implemented to satisfy the needs of all snp-dists users. Best, Fabian |
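To spell out the objection: with W read as "A or T", the proposed values cannot satisfy the triangle inequality, which for the triple (A, W, T) would require

```latex
\[
d(A,T) \;\le\; d(A,W) + d(W,T) \;=\; 0 + 0 \;=\; 0 ,
\]
```

whereas the proposal sets d(A,T) = 1. The rule also assigns distance 0 to two distinct symbols (W and A), so the identity of indiscernibles fails as well.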
Hei Fabian,
thanks for the quick reply. For my application it does not matter if distances are truly metric but I can see what the problem is... this is indeed non-trivial...
Anyway, thanks very much for your help! :-)
Best wishes!
Alex
Alexander Brandt, MSc
Georg-August-University Göttingen
J.-F.-Blumenbach-Institute of Zoology and Anthropology
Dept. of Animal Ecology
Berliner Straße 28
D-37073 Göttingen
|
I think supporting IUPAC codes in some manner would be a good option to include, but it is complicated. What about d(W,W) and d(W, B/D/G/V) etc?
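One possible answer to the d(W,W) and d(W,B) question is a weighted distance: treat each ambiguity code as a set of bases, assume each base in the set is equally likely, and take the probability that the two positions differ, 1 - |X∩Y| / (|X|·|Y|). The sketch below is only an illustration of that convention. The names `iupac_mask`, `popcount4` and `weighted_dist` and the bit encoding are assumptions made for this example; they are not taken from snp-dists or from any branch mentioned in this thread.

```c
/* IUPAC codes as 4-bit sets: A=1, C=2, G=4, T=8 (encoding chosen for this sketch). */
unsigned iupac_mask(char c)
{
    switch (c) {
    case 'A': return 1;
    case 'C': return 2;
    case 'G': return 4;
    case 'T': return 8;
    case 'R': return 1 | 4;      /* A or G */
    case 'Y': return 2 | 8;      /* C or T */
    case 'S': return 2 | 4;      /* C or G */
    case 'W': return 1 | 8;      /* A or T */
    case 'K': return 4 | 8;      /* G or T */
    case 'M': return 1 | 2;      /* A or C */
    case 'B': return 2 | 4 | 8;  /* not A */
    case 'D': return 1 | 4 | 8;  /* not C */
    case 'H': return 1 | 2 | 8;  /* not G */
    case 'V': return 1 | 2 | 4;  /* not T */
    case 'N': return 15;         /* any base */
    default:  return 0;          /* unknown symbol */
    }
}

/* Number of set bits in a 4-bit mask, i.e. how many bases the code allows. */
int popcount4(unsigned m)
{
    return (int)((m & 1) + ((m >> 1) & 1) + ((m >> 2) & 1) + ((m >> 3) & 1));
}

/* Probability that x and y differ if each is drawn uniformly from its base set.
   Returns -1.0 when either symbol is not a recognised IUPAC code. */
double weighted_dist(char x, char y)
{
    unsigned mx = iupac_mask(x), my = iupac_mask(y);
    if (mx == 0 || my == 0)
        return -1.0;
    return 1.0 - (double)popcount4(mx & my) / (popcount4(mx) * popcount4(my));
}
```

Under this convention `weighted_dist('W', 'W')` is 0.5, `weighted_dist('W', 'B')` is 5/6, and `weighted_dist('A', 'T')` is 1. Since d(W,W) is not 0, this weighting is not a metric either; it only trades one violated axiom for another.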
|
I came up with an implementation (https://github.com/kloetzl/snp-dists/tree/ambiguous) that works with ambiguous nucleotides. @trommleralex Note that you have to fill the table in main.c. Also, you have to compile using make. |
Hi Fabian,
this is awesome! Thanks so much for your effort, it will make things a lot easier now!
Made my day :-)
All the best!
Alex
|
Dear @tseemann, we once met in Hinxton, UK, during an ENA meeting. I took the opportunity and cloned your nice snp-dists repo. I have added a so-called basic literal distance, which handles IUPAC codes. The original code was not touched, so one can still calculate the standard snp-dists distance. |
Dear Torsten,
I want to calculate genetic distances between sequences that contain ambiguous bases, i.e. W, S, Y and so on. If I am not mistaken, snp-dists can either ignore these positions or count them as a SNP. However, I would like to use the ambiguous information, e.g.:
W vs. A or T -> print distance 0
W vs. G or C -> print distance 1
I would also love to stick to the Unix command line, because I have thousands of sequences and could easily loop the command in a shell.
Would you consider implementing support for ambiguous bases in snp-dists, or could you recommend another program that can deal with them?
Thanks a lot and best wishes!
Alex
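The rule requested above (W vs. A or T gives 0, W vs. G or C gives 1) amounts to counting a mismatch only when the two symbols share no possible base. A minimal sketch of that rule over two aligned sequences follows. The table `IUPAC` and the function `compat_dist` are hypothetical names for this illustration and do not describe the actual snp-dists code; skipping characters outside the uppercase IUPAC alphabet (gaps, lowercase) is likewise an assumption of the sketch.

```c
#include <stddef.h>

/* Base sets as bits: A=1, C=2, G=4, T=8. Characters not listed map to 0. */
static const unsigned char IUPAC[256] = {
    ['A'] = 1, ['C'] = 2, ['G'] = 4, ['T'] = 8,
    ['R'] = 1 | 4, ['Y'] = 2 | 8, ['S'] = 2 | 4, ['W'] = 1 | 8,
    ['K'] = 4 | 8, ['M'] = 1 | 2,
    ['B'] = 2 | 4 | 8, ['D'] = 1 | 4 | 8, ['H'] = 1 | 2 | 8, ['V'] = 1 | 2 | 4,
    ['N'] = 15
};

/* Count positions where the two symbols have no base in common.
   Positions where either symbol is unknown (mask 0) are skipped. */
size_t compat_dist(const char *a, const char *b, size_t len)
{
    size_t d = 0;
    for (size_t i = 0; i < len; i++) {
        unsigned char ma = IUPAC[(unsigned char)a[i]];
        unsigned char mb = IUPAC[(unsigned char)b[i]];
        if (ma && mb && (ma & mb) == 0)
            d++;
    }
    return d;
}
```

For example, `compat_dist("AWGT", "TACT", 4)` counts the A/T and G/C columns as mismatches and the W/A and T/T columns as matches. A single table lookup per character keeps the inner loop cheap, which matters when thousands of sequences are compared pairwise.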