Issue to Register Singlet Recombinants that shall be not be removed by usher ( Until 2024-7-2 ) [Part I] #991

Closed
xz-keg opened this issue Oct 20, 2023 · 420 comments
Labels
Multiple lineages recombinant Usher Issues with usher related problems

Comments

@xz-keg
Contributor

xz-keg commented Oct 20, 2023

#991 Part I
#1675 Part II
#2086 Part III

I'd like to post this issue to help everyone propose any recombinant sequence that may be added to usher.

Anything with more than 5 reversions (relative to the last checkpoint, usually a designation) or more than 20 private mutations (distance to the last node) may be removed by usher. However, some of these are real recombinants.
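
The rule of thumb above can be written down as a small check. This is only a sketch: the function name `may_be_pruned` and its parameter names are hypothetical, the thresholds are the ones quoted in this comment, and the real usher build may apply them differently.

```python
def may_be_pruned(reversions: int, private_mutations: int,
                  max_reversions: int = 5, max_private: int = 20) -> bool:
    """Return True if a sequence exceeds either threshold quoted above."""
    # Both comparisons are strict: exactly 5 reversions or 20 private
    # mutations does not trigger the rule as it is worded here.
    return reversions > max_reversions or private_mutations > max_private


# A singlet with 7 reversions is at risk even with few private mutations.
print(may_be_pruned(reversions=7, private_mutations=3))   # True
print(may_be_pruned(reversions=5, private_mutations=20))  # False
```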

Quite a few recombinants meet these criteria. Whenever you happen to see one, please register it here so that these recombinant singlets do not become invisible; then, when a 2nd seq appears, we can easily recognize the cluster.

Even if you are unsure whether it is a recomb or an artefact, please submit it. It may be an artefact, but having 1 more artefact seq on the tree doesn't harm as much as missing a real recomb.

XCM and XCT were both discovered this way, so this should be very worthwhile.

Registered recombinants will be added to recombinants.tsv so that they won't be dropped by usher.

@FedeGueli @AngieHinrichs @ryhisner @NkRMnZr @JosetteSchoenma

@xz-keg xz-keg changed the title Issue for Recombinants that shall be added to usher Issue to Register Singlet Recombinants that shall be added to usher Oct 20, 2023
@xz-keg xz-keg changed the title Issue to Register Singlet Recombinants that shall be added to usher Issue to Register Singlet Recombinants that shall be not be removed by usher Oct 20, 2023
@JosetteSchoenma

JosetteSchoenma commented Oct 20, 2023

@aviczhl2 You say removed by Usher. But what will I see in Usher if I run a fasta of such samples through it?
Do you mean that it will be put on the wrong branch while showing reversions?

@FedeGueli FedeGueli added recombinant Multiple lineages Usher Issues with usher related problems labels Oct 20, 2023
@FedeGueli

FedeGueli commented Oct 20, 2023

@aviczhl2 You say removed by Usher. But what will I see in Usher if I run a fasta of such samples through it? Do you mean that it will be put on the wrong branch while showing reversions?

XZ is referring to the fact that unless you upload the fasta, Usher won't show it, because it excludes sequences with too many reversions by default.

@JosetteSchoenma

@aviczhl2 You say removed by Usher. But what will I see in Usher if I run a fasta of such samples through it? Do you mean that it will be put on the wrong branch while showing reversions?

XZ is referring to the fact that unless you upload the fasta, Usher won't show it, because it excludes sequences with too many reversions by default.

OK. So, if I upload one and it has more than 5 reversions or more than 20 private mutations, I report it here. Is that it? Won't that put a lot of chronics here as well?

@FedeGueli

@aviczhl2 You say removed by Usher. But what will I see in Usher if I run a fasta of such samples through it? Do you mean that it will be put on the wrong branch while showing reversions?

XZ is referring to the fact that unless you upload the fasta, Usher won't show it, because it excludes sequences with too many reversions by default.

OK. So, if I upload one and it has more than 5 reversions or more than 20 private mutations, I report it here. Is that it? Won't that put a lot of chronics here as well?

You should have hints of recombination. If the reversions are sparse and don't make you think of a recombinant, there is no need to put it on the list.

@FedeGueli FedeGueli pinned this issue Oct 20, 2023
@FedeGueli

I've pinned this on the homepage so it is easy to find.

@AngieHinrichs

From my point of view, it would be ideal if you could add them to the new file that @corneliusroemer just added:

https://github.com/sars-cov-2-variants/lineage-proposals/blob/main/recombinants.tsv

I have set up my build to automatically spare (from the too-many-reversions filter) any sequences that are listed there (name must include the ID, for example I parse "EPI_ISL_18076084" out of the name "France/GES-IPP16300/2023|EPI_ISL_18076084|2023-07-10"). That is a bit more work than commenting in an issue, but it means that there is no waiting until I have time to add them to my local file! The tab-separated format has the nice table-formatted display in github too, which is nicer for tracking than searching through issue comments.

You would need write access to the sars-cov-2-variants/lineage-proposals repo, but I think everybody on this thread can be trusted with that. :)

The singletons with more than 20 private mutations will still be pruned, there is not currently a way to exempt particular sequences from that filter operation. I can raise it to 25 -- the more time that passes, the more divergence is to be expected.
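
As a rough illustration of the ID matching described above, the sketch below pulls an EPI_ISL accession out of a sequence name and checks it against IDs collected from the TSV text. The helper names, and the assumption that an ID may appear in any column of the file, are illustrative and not taken from the actual build scripts. The exemption modelled here applies to the too-many-reversions filter only, not to the private-mutations filter.

```python
import re
from typing import Optional, Set

# An accession such as "EPI_ISL_18076084" anywhere inside a longer name.
EPI_ID = re.compile(r"EPI_ISL_\d+")


def extract_epi_id(name: str) -> Optional[str]:
    """Return the first EPI_ISL accession found in a sequence name, if any."""
    match = EPI_ID.search(name)
    return match.group(0) if match else None


def load_spared_ids(tsv_text: str) -> Set[str]:
    """Collect every EPI_ISL accession in the TSV text (column layout assumed unknown)."""
    return set(EPI_ID.findall(tsv_text))


def is_spared_from_reversion_filter(name: str, spared: Set[str]) -> bool:
    """True if the name carries an ID that is listed in the TSV."""
    epi_id = extract_epi_id(name)
    return epi_id is not None and epi_id in spared


# Hypothetical one-line TSV content, for demonstration only.
spared = load_spared_ids("EPI_ISL_18076084\texample note\n")
name = "France/GES-IPP16300/2023|EPI_ISL_18076084|2023-07-10"
print(extract_epi_id(name))                           # EPI_ISL_18076084
print(is_spared_from_reversion_filter(name, spared))  # True
```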

@xz-keg
Contributor Author

xz-keg commented Oct 22, 2023

Register one here.
EPI_ISL_18416021, Looks like an XCT(or its parent in #811)/#972(or its DV.7.1-donor) recomb. The XCT donor has additional S:151R.

Breakpoint between 24074 and 27506.

image

usher

image

@FedeGueli

GK.1/XBB.1.16 https://t.co/CqYJWeNLlr
https://nextstrain.org/fetch/genome-test.gi.ucsc.edu/trash/ct/subtreeAuspice1_genome_test_6380b_b81730.json

EPI_ISL_18421965 singlet

@ryhisner

ryhisner commented Oct 27, 2023

There may already be an issue for this, but there's a JY.1/XBB.1.16.24 recombinant branch with 11 sequences. It's XBB.1.16* from ORF7a to the 3' end, and the rest is JY.1. It also has S:N641S and ORF1a:S2535L, which seem to be exclusive to this recombinant.
Query: C5173T, T2552C, C29386T

image

@FedeGueli

There may already be an issue for this, but there's a JY.1/XBB.1.16.24 recombinant branch with 11 sequences. It's XBB.1.16* from ORF7a to the 3' end, and the rest is JY.1. It also has S:N641S and ORF1a:S2535L, which seem to be exclusive to this recombinant. Query: C5173T, T2552C, C29386T

image

There is no proposal for it; please open one.

@JosetteSchoenma

Singlet EPI_ISL_18440692 HK.3/JN.1 Spike from JN.1
image

#1024

@JosetteSchoenma

Not completely sure if this belongs here, but EG.5.1/XBB.1.16. Close to XCV in the tree. 4 samples: 3 from Switzerland (not on GISAID) and one from the Netherlands on GISAID, EPI_ISL_18441654.
image

@FedeGueli

Dubious recombinant between JC.2/HV.1 : EPI_ISL_18438637 : https://nextstrain.org/fetch/genome-test.gi.ucsc.edu/trash/ct/subtreeAuspice1_genome_test_135a5_c17110.json?c=gt-S_339&gmax=25384&gmin=21563

@xz-keg
Contributor Author

xz-keg commented Oct 30, 2023

EPI_ISL_18443676, looks like a recombinant of #1016 (HK.3+L452M) and GK.2.1

image

@FedeGueli

18443676

It also has S:L517F. If another one pops up, please propose it.

@xz-keg
Contributor Author

xz-keg commented Nov 1, 2023

image

EPI_ISL_18450958, looks like a recomb of XBB.1.16 and GK.3.1

@JosetteSchoenma

JosetteSchoenma commented Nov 1, 2023

HF.1.1/BA.2.86.1 recombinant
Singlet England EPI_ISL_18450393
#1040

@FedeGueli

Member

It seems to be a sandwich: XBB.1.16/GK.3.1/XBB.1.16.
But be aware that German ENV samples showing recombinants are sometimes contamination instead.

@xz-keg
Contributor Author

xz-keg commented Nov 2, 2023

EPI_ISL_18455278

EG.5.1(likely EG.5.1.4)/XCT recomb

image

@xz-keg
Contributor Author

xz-keg commented Nov 4, 2023

EPI_ISL_18443676, looks like a recombinant of #1016 (HK.3+L452M) and GK.2.1

EPI_ISL_18462697 belongs to the same branch

Try to make a query: C15755G, C22916A

@xz-keg
Contributor Author

xz-keg commented Nov 4, 2023

It also has S:L517F. If another one pops up, please propose it.

Another one has popped up, but it does not have S:L517F.

@FedeGueli

It also has S:L517F. If another one pops up, please propose it.

Another one has popped up, but it does not have S:L517F.

Propose it anyway if they are the same recombinant.

@FedeGueli

FedeGueli commented Nov 4, 2023

EPI_ISL_18378307 is a putative JG.3.1/XBB.1.16.15/JG.3.1/XBB.1.16.15/JG.3 recombinant; to me it is most likely contamination/coinfection:
https://cov-spectrum.org/explore/World/AllSamples/Past6M/variants?nucMutations=T12730A%2CC29625T%2C2059G%2C14856A&nextcladePangoLineage=XBB.1.16.15*&nextcladePangoLineage1=JG.3.1*&nextcladePangoLineage2=XBB.1.16.15*&analysisMode=CompareEquals&

@FedeGueli

This JN.1.4 FLiRT branch could instead be one or more recombinants with JN.1.16.1:
Screenshot 2024-06-20 alle 01 54 16
https://nextstrain.org/fetch/genome-test.gi.ucsc.edu/trash/ct/subtreeAuspice13_genome_test_59a3e_335390.json?label=id:node_7107236

@xz-keg
Contributor Author

xz-keg commented Jun 20, 2024

This JN.1.4 FLiRT branch could instead be one or more recombinants with JN.1.16.1: Screenshot 2024-06-20 alle 01 54 16 https://nextstrain.org/fetch/genome-test.gi.ucsc.edu/trash/ct/subtreeAuspice13_genome_test_59a3e_335390.json?label=id:node_7107236

I think that's just JN.1.4-FLiRT, JN.1.4+C5284T+FLiRT and JN.1.4.5+FLiRT misplaced together.

@FedeGueli

I think that's just JN.1.4-FLiRT, JN.1.4+C5284T+FLiRT and JN.1.4.5+FLiRT misplaced together.

It also has S:31del.

@FedeGueli

I think that's just JN.1.4-FLiRT, JN.1.4+C5284T+FLiRT and JN.1.4.5+FLiRT misplaced together.

It also has S:31del.

More samples of this:
Screenshot 2024-06-21 alle 18 46 24
https://nextstrain.org/fetch/genome-test.gi.ucsc.edu/trash/ct/subtreeAuspice25_genome_test_4216a_564680.json?c=userOrOld&label=id:node_12207460

The three samples with 1263L and 31del are, to me, very likely recombinants between JN.1.4 and LF.1.1.1.

@xz-keg
Contributor Author

xz-keg commented Jun 22, 2024

The three samples with 1263L and 31del are, to me, very likely recombinants between JN.1.4 and LF.1.1.1.

Yeah, those 3 look quite likely to be recombs.
The others are JN.1.4-FLiRT, JN.1.4+C5284T+FLiRT, JN.1.4.5+FLiRT and JN.1.4.5+C28948T+FLiRT misplaced together.

@FedeGueli

The three samples with 1263L and 31del are, to me, very likely recombinants between JN.1.4 and LF.1.1.1.

Yeah, those 3 look quite likely to be recombs. The others are JN.1.4-FLiRT, JN.1.4+C5284T+FLiRT, JN.1.4.5+FLiRT and JN.1.4.5+C28948T+FLiRT misplaced together.

Yeah, I agree. Sorry I wasn't clear in the first tweet. I've looked at those sequences, but as usual the usher tree can place other possible recombinants around recombinants; that is what I meant.

@xz-keg
Contributor Author

xz-keg commented Jun 23, 2024

EPI_ISL_19203296, EG.5.1/JN.1+S:F456L. It seems to have some other interesting mutations, S:E132Q, H146K, D178H, A688V, S704P, H1058Y and K1073N

@FedeGueli

EPI_ISL_19203296, EG.5.1/JN.1+S:F456L. It seems to have some other interesting mutations, S:E132Q, H146K, D178H, A688V, S704P, H1058Y and K1073N

Great catch, this is a super interesting one. Better to describe it in detail:
Screenshot 2024-06-24 alle 09 45 19

It also has S:P9L, the mutation analyzed by @ryhisner multiple times, but it also got the NTD from EG.5.1 with further mutations, the RBD from JN.1+456L, and the region from the end of the RBD to the 3' end from EG.5.1 with multiple N mutations.

Screenshot 2024-06-24 alle 09 57 25

https://nextstrain.org/fetch/genome-test.gi.ucsc.edu/trash/ct/subtreeAuspice1_genome_test_4ba9a_807780.json?f_userOrOld=uploaded%20sample

ping @corneliusroemer @ryhisner

Suggested query: C14980T, C21588T, C8860T
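
For readers unfamiliar with the query notation, each token is a reference base, a position in the reference genome, and the observed base. A minimal sketch of parsing such a list follows; `parse_nuc_query` is a hypothetical helper, not part of usher or cov-spectrum, and it accepts only plain A/C/G/T substitutions (ambiguity codes are rejected).

```python
import re
from typing import List, Tuple

# One substitution such as "C14980T": reference base, position, new base.
NUC_MUTATION = re.compile(r"^([ACGT])(\d+)([ACGT])$")


def parse_nuc_query(query: str) -> List[Tuple[str, int, str]]:
    """Split a comma-separated mutation list into (ref, position, alt) tuples."""
    parsed = []
    for token in query.split(","):
        token = token.strip().upper()
        if not token:
            continue  # tolerate stray commas
        match = NUC_MUTATION.match(token)
        if match is None:
            raise ValueError(f"not a simple nucleotide substitution: {token!r}")
        ref, pos, alt = match.groups()
        parsed.append((ref, int(pos), alt))
    return parsed


print(parse_nuc_query("C14980T, C21588T,C8860T"))
# [('C', 14980, 'T'), ('C', 21588, 'T'), ('C', 8860, 'T')]
```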

@FedeGueli

FedeGueli commented Jun 25, 2024

Potential recombinant between KP.2.3 and LB.1+31del? EPI_ISL_19214475, with S:N405N (T22777C)

Screenshot 2024-06-25 alle 11 56 05

https://nextstrain.org/fetch/genome-test.gi.ucsc.edu/trash/ct/subtreeAuspice2_genome_test_24d0e_a90570.json?c=userOrOld

Not sure about it.

@FedeGueli

A side branch of XDV has appeared; I think it is either unrelated or a recombinant of XDV. @aviczhl2 @JosetteSchoenma could you take a look at it:
Screenshot 2024-06-28 alle 19 13 57
https://nextstrain.org/fetch/genome-test.gi.ucsc.edu/trash/ct/subtreeAuspice15_genome_test_4d139_eed9e0.json?f_userOrOld=uploaded%20sample&label=id:node_11503430
Screenshot 2024-06-28 alle 19 14 36

EPI_ISL_19221357,EPI_ISL_19221366

@JosetteSchoenma

A side branch of XDV appeared, i think it is unrelated or a recombinant of XDV. @aviczhl2 @JosetteSchoenma could you take a look at it: Screenshot 2024-06-28 alle 19 13 57 https://nextstrain.org/fetch/genome-test.gi.ucsc.edu/trash/ct/subtreeAuspice15_genome_test_4d139_eed9e0.json?f_userOrOld=uploaded%20sample&label=id:node_11503430 Screenshot 2024-06-28 alle 19 14 36

EPI_ISL_19221357,EPI_ISL_19221366

EPI_ISL_19221357 is the red line here and EPI_ISL_19221366 from China is the one just above it.
I think the red one is just rubbish and the Chinese one is just on a branch with a few extra mutations.
I searched for C9996R, C1884T and A21137G, which besides these 2 also found the XBB.1.5 at the bottom, which does not seem related.
image

@xz-keg
Contributor Author

xz-keg commented Jun 28, 2024

A side branch of XDV appeared, i think it is unrelated or a recombinant of XDV. @aviczhl2 @JosetteSchoenma could you take a look at it: Screenshot 2024-06-28 alle 19 13 57 https://nextstrain.org/fetch/genome-test.gi.ucsc.edu/trash/ct/subtreeAuspice15_genome_test_4d139_eed9e0.json?f_userOrOld=uploaded%20sample&label=id:node_11503430 Screenshot 2024-06-28 alle 19 14 36
EPI_ISL_19221357,EPI_ISL_19221366

EPI_ISL_19221357 is the red line here and EPI_ISL_19221366 from China is the one just above it. I think the red one is just rubbish and the Chinese one is just on a branch with a few extra mutations. I searched for C9996R, C1884T and A21137G, which besides these 2 also found the XBB.1.5 at the bottom, which does not seem related. image

19221366 is XDV.1 with Orf1a:L3829F and Orf1b:K2557R, two famous beneficial non-spike mutations.
19221357 seems to be a GBW pooled sample involving XDV.1.
@ryhisner what do you think?

@JosetteSchoenma

I think it is likely that the 357 is pooled, both because of what it looks like and because there is no originating country given.

@xz-keg
Contributor Author

xz-keg commented Jun 28, 2024

I think it is likely that the 357 is pooled, both because of what it looks like and because there is no originating country given.

AAA is pooled (like H10) and BBB is personal (like H20)

@JosetteSchoenma

I think it is likely that the 357 is pooled, both because of what it looks like and because there is no originating country given.

AAA is pooled (like H10) and BBB is personal (like H20)

OK. It's AAA, so pooled. So why did you say that it seemed pooled and ask for Ryan's opinion, if you knew that?

@ryhisner

I agree with everyone's thoughts.

It's really frustrating how GBW's pooled samples end up screwing up the Usher Tree. I've been criticizing them for doing pooled samples at every opportunity, but apparently no one cares. GBW has been paid over $100 million now to do this sequencing, and they can't even be bothered to follow the most basic rules. What a pathetic state of affairs.

@xz-keg
Contributor Author

xz-keg commented Jun 29, 2024

OK. It's AAA, so pooled. So why did you say that it seemed pooled and ask for Ryan's opinion, if you knew that?

Pooled samples are not always artifacts, though. Most of them are actually the same as the BBB seq. It is only troublesome when 2 travelers infected by different variants are mixed in one pool.

@FedeGueli

OK, thanks. I looked mostly at the non-pooled one, and that row of reversions made me think it could have been a further recombinant. Thank you for your time.

@JosetteSchoenma

OK, thanks. I looked mostly at the non-pooled one, and that row of reversions made me think it could have been a further recombinant. Thank you for your time.

I didn't see any reversions.

@xz-keg
Contributor Author

xz-keg commented Jun 29, 2024

OK, thanks. I looked mostly at the non-pooled one, and that row of reversions made me think it could have been a further recombinant. Thank you for your time.

That's an usher bug on recombinants. @AngieHinrichs

Is it possible to remove those fixed positions on recombinant branches (at least for designated recombs)?

@FedeGueli

FedeGueli commented Jun 29, 2024 via email

@xz-keg
Contributor Author

xz-keg commented Jun 29, 2024

EPI_ISL_19216679, BA.2.12.1/?

@FedeGueli

EPI_ISL_19224177 LA.1/KP.3 recomb

@xz-keg
Contributor Author

xz-keg commented Jul 2, 2024

EPI_ISL_19224596, XDQ.1 recombines with other XBB-3' variants and loses its 28846.

@xz-keg
Contributor Author

xz-keg commented Jul 3, 2024

GitHub seems to display only 60 comments at a time. I think it's better to refresh this long-term issue every 60 comments. I'll close this one and create a new one.

@xz-keg xz-keg closed this as completed Jul 3, 2024
@xz-keg xz-keg unpinned this issue Jul 3, 2024
@FedeGueli FedeGueli changed the title Issue to Register Singlet Recombinants that shall be not be removed by usher Issue to Register Singlet Recombinants that shall be not be removed by usher ( Until 2024-7-2 ) [Part I] Sep 24, 2024