BA.5.2 + ORF1b:1050N + C29762T sublineage with S:444R and ORF1a:S2797F circulating in Sweden (67 sequences), with a saltation cluster in USA #1328

Closed
FedeGueli opened this issue Nov 13, 2022 · 5 comments
@FedeGueli
Contributor

FedeGueli commented Nov 13, 2022

Defining mutations:
BA.5.2 > C27513T > C27012T > G12310A > C16616A (ORF1b:T1050N) > C29762T
then ORF1a:S2797F (C8655T)
then S:K444R (A22893G)
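
A quick sanity check of the amino-acid labels above: a minimal sketch that maps a nucleotide position to the codon that contains it. It assumes the usual Wuhan-Hu-1 (NC_045512.2) start coordinates (ORF1a at 266, ORF1b numbered from 13468, S at 21563); `GENE_START` and `codon_number` are illustrative names, not taken from any tool mentioned in this thread.

```python
# Assumed 1-based start coordinates on Wuhan-Hu-1 (NC_045512.2).
# ORF1b follows the common convention of numbering from 13468.
GENE_START = {"ORF1a": 266, "ORF1b": 13468, "S": 21563}


def codon_number(gene: str, nuc_pos: int) -> int:
    """Return the 1-based codon index in `gene` that contains `nuc_pos`."""
    offset = nuc_pos - GENE_START[gene]
    if offset < 0:
        raise ValueError(f"{nuc_pos} lies upstream of {gene}")
    return offset // 3 + 1


for gene, mutation in [("ORF1b", "C16616A"), ("ORF1a", "C8655T"), ("S", "A22893G")]:
    pos = int(mutation[1:-1])  # strip the reference and alternate bases
    print(gene, mutation, "-> codon", codon_number(gene, pos))
```

Under these assumptions C16616A falls in ORF1b codon 1050, C8655T in ORF1a codon 2797 and A22893G in S codon 444.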

Tree:
[Screenshot 2022-11-13 at 09:59:29]

GISAID query: NSP13_T127N,NSP4_S34F,Spike_K444R finds 67 sequences:
EPI_ISL_14419355, EPI_ISL_14706383, EPI_ISL_14824185,
EPI_ISL_14851820, EPI_ISL_14925484, EPI_ISL_15014426,
EPI_ISL_15014485, EPI_ISL_15014519, EPI_ISL_15106097,
EPI_ISL_15108932, EPI_ISL_15108969, EPI_ISL_15108981,
EPI_ISL_15168364, EPI_ISL_15208871, EPI_ISL_15208910,
EPI_ISL_15208931, EPI_ISL_15208967, EPI_ISL_15209006,
EPI_ISL_15237445, EPI_ISL_15266344, EPI_ISL_15312116,
EPI_ISL_15312118, EPI_ISL_15312130, EPI_ISL_15312138,
EPI_ISL_15312149, EPI_ISL_15312155, EPI_ISL_15312179,
EPI_ISL_15312181, EPI_ISL_15312183, EPI_ISL_15312187,
EPI_ISL_15312189, EPI_ISL_15312204, EPI_ISL_15312216,
EPI_ISL_15312236, EPI_ISL_15312240, EPI_ISL_15373866,
EPI_ISL_15379736, EPI_ISL_15384750, EPI_ISL_15384867-15384874,
EPI_ISL_15398745, EPI_ISL_15414013, EPI_ISL_15479464,
EPI_ISL_15479490-15479492, EPI_ISL_15497618, EPI_ISL_15573966,
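
The list above mixes single accessions with ranges such as `EPI_ISL_15384867-15384874`. Here is a minimal sketch for expanding such a list into individual IDs, assuming a range means every consecutive accession number from start to end inclusive; `expand_accessions` is a hypothetical helper name.

```python
import re


def expand_accessions(text: str) -> list[str]:
    """Expand comma-separated EPI_ISL IDs, including 'start-end' ranges."""
    ids = []
    for token in re.split(r"[,\s]+", text.strip()):
        if not token:
            continue  # skip the empty piece left by a trailing comma
        match = re.fullmatch(r"EPI_ISL_(\d+)(?:-(\d+))?", token)
        if match is None:
            raise ValueError(f"unrecognised accession: {token!r}")
        start = int(match.group(1))
        end = int(match.group(2)) if match.group(2) else start
        ids.extend(f"EPI_ISL_{n}" for n in range(start, end + 1))
    return ids


sample = "EPI_ISL_15384750, EPI_ISL_15384867-15384874, EPI_ISL_15479490-15479492,"
expanded = expand_accessions(sample)
print(len(expanded), expanded[:3])
```

The sample string expands to 12 accessions (1 + 8 + 3).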

Countries: mainly Sweden, but also Germany, Denmark and the USA.

In the USA it appears to have made a saltation jump outside the spike (@shay671, @ryhisner):

ORF1a:I238V (A977G), T1001I (C3267T), E1801Q (G5666C)
ORF1b:S2324G (A20437G)
ORF3a:Q213K (C26029A)
ORF10:Y26H (T29633C)
Nuc: T724G, C21306T, T24073C, C16092T, C16329T
[Screenshot 2022-11-13 at 10:00:21]
https://nextstrain.org/fetch/genome.ucsc.edu/trash/ct/subtreeAuspice1_genome_2347d_b1340.json?c=pango_lineage_usher&label=nuc%20mutations:T724G,A977G,C3267T,G5666C,C16329T,A20437G,C21306T,T24073C,C26029A,T29633C
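
The `label` parameter of the link above appears to carry the nucleotide mutations of the highlighted branch. Here is a minimal sketch for pulling them out with the standard library so they can be compared against a hand-typed list; `branch_mutations` is a hypothetical helper, and the "prefix:comma-separated list" layout is inferred from this one URL only.

```python
from urllib.parse import parse_qs, urlsplit

url = (
    "https://nextstrain.org/fetch/genome.ucsc.edu/trash/ct/"
    "subtreeAuspice1_genome_2347d_b1340.json"
    "?c=pango_lineage_usher"
    "&label=nuc%20mutations:T724G,A977G,C3267T,G5666C,C16329T,"
    "A20437G,C21306T,T24073C,C26029A,T29633C"
)


def branch_mutations(view_url: str) -> list[str]:
    """Return the mutations listed after the ':' in the 'label' query parameter."""
    query = parse_qs(urlsplit(view_url).query)  # also decodes %20 to a space
    label = query["label"][0]
    _, _, mutations = label.partition(":")
    return mutations.split(",")


print(branch_mutations(url))
```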

GISAID query for the saltation cluster: NSP13_T127N,NSP4_S34F,Spike_K444R,NS3_Q213K,NSP15_S273G finds 6 sequences:
EPI_ISL_15373866, EPI_ISL_15379736, EPI_ISL_15414013,
EPI_ISL_15497618, EPI_ISL_15610848, EPI_ISL_15611748,

@Mydtlwn
Contributor

Mydtlwn commented Nov 13, 2022

If there is no dispute, please designate it as soon as possible!

@InfrPopGen InfrPopGen self-assigned this Nov 13, 2022
InfrPopGen added a commit that referenced this issue Nov 13, 2022
Added new lineage BA.5.2.41 from #1328 with 30 new sequence designations, and 0 updated designations
@InfrPopGen InfrPopGen added this to the BA.5.2.41 milestone Nov 13, 2022
@InfrPopGen
Contributor

Thanks for submitting. We've added lineage BA.5.2.41 with 30 newly designated sequences, and 0 updated designations. Defining mutation A22893G (S:K444R) (following C8655T (ORF1a:S2797F)).

@corneliusroemer
Contributor

corneliusroemer commented Nov 14, 2022

Nextclade's consensus sequence generation algorithm sees this as just a BA.5.2 + ORF1b:1050N + S:444R (=BA.5.2.18) with an extra C29762T

29762T is not a good mutation to base a lineage on, as it is close to the end of the sequence (within ~140 nt) in an area that is often not sequenced. It looks like UShER pulled these sequences together based on that mutation, which appears all over the BA.2 subtree (see 29762T sequences with collection dates from the last month, as placed by Nextclade):
[image: 29762T sequences as placed by Nextclade]

@AngieHinrichs you should probably mask this so that it doesn't cause things with and without to be pulled together based on a jumpy site.

The reason Nextclade only identified 29762T as lineage defining is that some misdesignated sequences ended up in the designations. I'm cleaning them up now.

This is a covSpectrum query: https://cov-spectrum.org/explore/World/AllSamples/Past6M/variants?aaMutations=ORF1b%3A1050N&nucMutations=C27012T%2CC27513T%2C8655T&nextcladePangoLineage=BA.5.2*&
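
For anyone who wants to vary the query, here is a minimal sketch that rebuilds the covSpectrum link above from its parts. The parameter names (`aaMutations`, `nucMutations`, `nextcladePangoLineage`) are copied from that URL; `cov_spectrum_url` is a hypothetical helper, and nothing here is claimed about other parameters covSpectrum may accept.

```python
from urllib.parse import urlencode

BASE = "https://cov-spectrum.org/explore/World/AllSamples/Past6M/variants"


def cov_spectrum_url(aa_mutations, nuc_mutations, lineage):
    """Build an explore URL using the query parameters seen in the link above."""
    params = {
        "aaMutations": ",".join(aa_mutations),
        "nucMutations": ",".join(nuc_mutations),
        "nextcladePangoLineage": lineage,
    }
    # safe="*" keeps the lineage wildcard unescaped, matching the original link.
    return f"{BASE}?{urlencode(params, safe='*')}"


query_url = cov_spectrum_url(["ORF1b:1050N"], ["C27012T", "C27513T", "8655T"], "BA.5.2*")
print(query_url)
```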

It does look like 8655T occurred before S:K444R here.

Here's the parent lineage with just 8655T before S:K444R. 29762T has pulled these two subtrees apart:

https://next.nextstrain.org/fetch/genome-test.gi.ucsc.edu/trash/ct/subtreeAuspice2_genome_test_654be_2b3d30.json?branchLabel=Spike%20mutations

@AngieHinrichs
Member

Yes, pangolin masks bases 1-265 and 29674-29903 after aligning them to reference, so 29762 is invisible to pangolin!

And I see what you mean about how the BA.5.2 > (4 muts including ORF1b:T1050N) > C29762T > C8655T is pulling sequences away from BA.5.2 > (4 muts including ORF1b:T1050N) > C8655T --

but the S:444 change on the C29762T > C8655T branch is A22893G (S:K444R), while the S:444 change on the C8655T branch is G22894T (S:K444N). Different mutations.
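
To make the "different mutations" point concrete, here is a small sketch that applies each substitution to spike codon 444. The reference codon `AAG` is inferred from this thread (K at 444, reference A at 22893 and G at 22894), the S start of 21563 is the usual Wuhan-Hu-1 coordinate, and the codon table holds only the three codons needed here; all names are illustrative.

```python
S_START = 21563          # assumed 1-based start of S on Wuhan-Hu-1
REF_CODON_444 = "AAG"    # inferred: K444 with ref A at 22893 and G at 22894

# Only the codons needed for this example, not a full genetic code table.
CODON_TO_AA = {"AAG": "K", "AGG": "R", "AAT": "N"}


def s444_change(mutation: str) -> str:
    """Translate a nucleotide substitution within S codon 444 into an S:K444x label."""
    ref, pos, alt = mutation[0], int(mutation[1:-1]), mutation[-1]
    codon_start = S_START + 3 * (444 - 1)  # first base of codon 444
    index = pos - codon_start
    if not 0 <= index < 3:
        raise ValueError(f"{mutation} is outside S codon 444")
    if REF_CODON_444[index] != ref:
        raise ValueError(f"{mutation} does not match the assumed reference codon")
    codon = REF_CODON_444[:index] + alt + REF_CODON_444[index + 1:]
    return f"S:K444{CODON_TO_AA[codon]}"


for mutation in ("A22893G", "G22894T"):
    print(mutation, "->", s444_change(mutation))
```

A22893G hits the second base of the codon and G22894T the third, so the two branches reach S:444 changes through independent substitutions.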

I think you're right that I should mask it, though. Currently I'm only masking the last 100 bases (29804-29903 from the Problematic Sites set) -- ah, and 29766 in BA.2 onward because it was found pretty much exclusively by Luxembourg and was causing a mini-BA.2 Luxembourg branch back in the day.
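
The masking ranges mentioned in these comments can be checked with a short sketch. The ranges are copied from the comments above (pangolin: 1-265 and 29674-29903; the UShER build: 29804-29903 plus site 29766, the latter only for BA.2 onward); `is_masked` and the constant names are hypothetical, and neither tool's real implementation is shown here.

```python
GENOME_LENGTH = 29903  # last reference position, per the 29674-29903 range above

PANGOLIN_MASK = [(1, 265), (29674, 29903)]
# 29766 is masked only in BA.2 onward according to the comment above.
USHER_MASK = [(29804, 29903), (29766, 29766)]


def is_masked(pos: int, ranges: list[tuple[int, int]]) -> bool:
    """Return True if the 1-based position falls in any inclusive range."""
    return any(start <= pos <= end for start, end in ranges)


pos = 29762
print("distance to 3' end:", GENOME_LENGTH - pos)
print("masked by pangolin:", is_masked(pos, PANGOLIN_MASK))
print("masked in UShER tree:", is_masked(pos, USHER_MASK))
```

With these ranges, 29762 sits 141 nt from the end, is masked by pangolin, and is not masked in the UShER tree, which is consistent with the site being able to pull sequences together during placement.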

@AngieHinrichs
Member

OK, I see the BA.5.2 > 4-mut-special > C8655T branch does also have a few scattered S:K444R's in addition to the branch with 34 sequences with S:K444N. There are even a few S:K444M's:

https://next.nextstrain.org/fetch/genome-test.gi.ucsc.edu/trash/ct/subtreeAuspice2_genome_test_654be_2b3d30.json?branchLabel=Spike%20mutations&c=gt-S_444&label=id:node_6707796
