Sublineage of BA.5.2.1 with S:N450D, ORF1a:T1552N, & ORF1b:G662S (113 seq, 75 in NYC) #1250

Closed
ryhisner opened this issue Oct 24, 2022 · 3 comments

@ryhisner

Description

Sub-lineage of: BA.5.2.1
Earliest sequence: 2022-8-6, USA, New York City (EPI_ISL_14461365)
Most recent sequence: 2022-10-16, USA, New York City (EPI_ISL_15470791, EPI_ISL_15470800, EPI_ISL_15470821, EPI_ISL_15470856)
Countries circulating: USA (110, of which 75 are from New York City), Australia (2), Portugal (1)
Number of Sequences: 113
GISAID Query: NSP3_T734N, NSP12_G671S, Spike_N450D
CovSpectrum Query: Nextcladepangolineage:BA.5.2.1* & S:N450D & ORF1a:T1552N & ORF1b:G662S
Substitutions on top of BA.5.2.1:
Spike: N450D
ORF1a: T1552N
ORF1b: G662S
Nucleotide: C4920A, A22910G, G15451A
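
The nucleotide and amino-acid labels above can be cross-checked with a little arithmetic. The sketch below is illustrative only: the gene start coordinates are assumed to be those of the Wuhan-Hu-1 reference (MN908947.3), and the nsp offsets are simply the ones implied by comparing the GISAID query (NSP3_T734N, NSP12_G671S) with the ORF1a/ORF1b labels. The names `GENE_START`, `NSP_OFFSET`, `codon_number`, and `to_nsp` are hypothetical helpers, not part of any tool mentioned here.

```python
# Assumed 1-based start coordinates on the Wuhan-Hu-1 reference (MN908947.3).
# These values are an assumption of this sketch, not stated in the proposal.
GENE_START = {"ORF1a": 266, "ORF1b": 13468, "S": 21563}

# Offsets from ORF numbering to mature-peptide (nsp) numbering, as implied by
# ORF1a:T1552N <-> NSP3_T734N and ORF1b:G662S <-> NSP12_G671S.
NSP_OFFSET = {"NSP3": -818, "NSP12": 9}


def codon_number(gene: str, nt_pos: int) -> int:
    """Return the 1-based codon number within `gene` that contains nt_pos.

    No upper-bound check is done, so the caller must pass a position
    that actually lies inside the gene.
    """
    offset = nt_pos - GENE_START[gene]
    if offset < 0:
        raise ValueError(f"{nt_pos} lies before the start of {gene}")
    return offset // 3 + 1


def to_nsp(nsp: str, orf_codon: int) -> int:
    """Convert an ORF1a/ORF1b codon number to the numbering of one nsp."""
    return orf_codon + NSP_OFFSET[nsp]


for gene, pos in [("ORF1a", 4920), ("ORF1b", 15451), ("S", 22910)]:
    print(gene, pos, "-> codon", codon_number(gene, pos))
print("NSP3", to_nsp("NSP3", 1552), "NSP12", to_nsp("NSP12", 662))
```

Under these assumptions, C4920A falls in ORF1a codon 1552, G15451A in ORF1b codon 662, and A22910G in Spike codon 450, which matches the substitutions listed.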

UShER Tree
https://nextstrain.org/fetch/raw.githubusercontent.com/ryhisner/jsons/main/BA.5.2.1%2BN450D%2BORF1aT1552N%2BORF1bG662S_subtreeAuspice1_genome_288a9_5f41c0.json

Evidence
ORF1b:G662S is one of the most convergent ORF1ab mutations. It’s been in Delta, BA.2.75, BJ.1, XBB, and has evolved independently in countless other undesignated lineages and chronic-infection sequences. I have no idea what its functional influence is, but it seems clear that it is advantageous, at least in most contexts.

Nextclade also says this lineage has N:S33F, ORF9b:A29I, and ORF9b:V30L. N:S33F, however, doesn’t show up on GISAID or on UShER. The areas next to deletions are often fuzzy and hard to interpret, so I haven’t included these in the proposal above. If these mutations do exist, I’m unsure whether they are unique to this lineage or shared more widely by sequences not in this lineage.

Genomes

Genomes EPI_ISL_14461365, EPI_ISL_14574395, EPI_ISL_14685547, EPI_ISL_14685562, EPI_ISL_14685595, EPI_ISL_14808097, EPI_ISL_14808774, EPI_ISL_14808950, EPI_ISL_14849195, EPI_ISL_14893188, EPI_ISL_14900805, EPI_ISL_14914897, EPI_ISL_14920521, EPI_ISL_14954474, EPI_ISL_15015839, EPI_ISL_15015927, EPI_ISL_15105927, EPI_ISL_15106053, EPI_ISL_15106539, EPI_ISL_15240473, EPI_ISL_15241933, EPI_ISL_15297092, EPI_ISL_15297748, EPI_ISL_15297753, EPI_ISL_15297847, EPI_ISL_15297968, EPI_ISL_15298009, EPI_ISL_15298013, EPI_ISL_15298045, EPI_ISL_15298177, EPI_ISL_15298191, EPI_ISL_15298198, EPI_ISL_15298267, EPI_ISL_15313429, EPI_ISL_15313470, EPI_ISL_15313518, EPI_ISL_15313546, EPI_ISL_15341383, EPI_ISL_15341713, EPI_ISL_15341728, EPI_ISL_15341771, EPI_ISL_15341816, EPI_ISL_15342012, EPI_ISL_15348807, EPI_ISL_15352978, EPI_ISL_15353032, EPI_ISL_15380522, EPI_ISL_15380525, EPI_ISL_15380573, EPI_ISL_15380673, EPI_ISL_15380683, EPI_ISL_15380816, EPI_ISL_15380897, EPI_ISL_15380902, EPI_ISL_15380915, EPI_ISL_15380927, EPI_ISL_15380931, EPI_ISL_15381023, EPI_ISL_15381043, EPI_ISL_15381115, EPI_ISL_15381126, EPI_ISL_15381127, EPI_ISL_15381165, EPI_ISL_15381195, EPI_ISL_15381259, EPI_ISL_15390598, EPI_ISL_15391643, EPI_ISL_15391660, EPI_ISL_15395708, EPI_ISL_15395725, EPI_ISL_15395765, EPI_ISL_15410837, EPI_ISL_15413421, EPI_ISL_15413762, EPI_ISL_15425951, EPI_ISL_15427652, EPI_ISL_15427712, EPI_ISL_15427715, EPI_ISL_15440285, EPI_ISL_15440286, EPI_ISL_15458035, EPI_ISL_15458232, EPI_ISL_15465287, EPI_ISL_15465343, EPI_ISL_15465348, EPI_ISL_15465805, EPI_ISL_15466842, EPI_ISL_15470019, EPI_ISL_15470032, EPI_ISL_15470047, EPI_ISL_15470062, EPI_ISL_15470109, EPI_ISL_15470137, EPI_ISL_15470178, EPI_ISL_15470190, EPI_ISL_15470317, EPI_ISL_15470347, EPI_ISL_15470388, EPI_ISL_15470421, EPI_ISL_15470432, EPI_ISL_15470437, EPI_ISL_15470474, EPI_ISL_15470506, EPI_ISL_15470507, EPI_ISL_15470554, EPI_ISL_15470595, EPI_ISL_15470641, EPI_ISL_15470791, EPI_ISL_15470800, EPI_ISL_15470821, 
EPI_ISL_15470856, EPI_ISL_15470868, EPI_ISL_15470883
@InfrPopGen InfrPopGen self-assigned this Nov 1, 2022
InfrPopGen added a commit that referenced this issue Nov 1, 2022
Added new lineage BF.32 from #1250 with 138 new sequence designations, and 0 updated designations
@InfrPopGen InfrPopGen added this to the BF.32 milestone Nov 1, 2022
@InfrPopGen
Contributor

Thanks for submitting. We've added lineage BF.32 with 138 newly designated sequences, and 0 updated designations. Defining mutation: G15451A (ORF1b:G662S) (following C4920A (ORF1a:T1552N), A22910G (S:N450D)).

@shay671

shay671 commented Nov 29, 2022

What about G28361T and G28371T? Are they part of this variant? Should they be?
Those 2 positions seem to be masked in UShER, but I found them in ~80% of the BF.32 sequences.
When I look at their percentage over time in BF.32 samples, it does not seem to increase (meaning they probably do not mark a sublineage of BF.32), and they are not present in BA.5.2.1.
But when I look at samples that have the 3 mutations between BA.5.2.1 and G15451A (which defines BF.32) but lack G15451A itself, most of them also have those 2 mutations. So they are probably part of the intermediate branch. Does that change the picture of where the branch point should be?
@ryhisner
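
For anyone who wants to repeat the "percentage over time" check described above, here is a minimal sketch. It assumes each sequence is available as a zero-padded ISO collection date plus a set of nucleotide mutations; the function name `monthly_fraction` and the records are made up for illustration and are not real BF.32 data.

```python
from collections import defaultdict


def monthly_fraction(records, mutation):
    """Fraction of sequences per month that carry `mutation`.

    records: iterable of (collection_date, mutations), where collection_date
    is a zero-padded 'YYYY-MM-DD' string and mutations is a set of strings.
    """
    totals = defaultdict(int)
    carriers = defaultdict(int)
    for date, mutations in records:
        month = date[:7]  # 'YYYY-MM'
        totals[month] += 1
        if mutation in mutations:
            carriers[month] += 1
    return {month: carriers[month] / totals[month] for month in sorted(totals)}


# Made-up example records (NOT real data).
BF32 = {"C4920A", "A22910G", "G15451A"}
records = [
    ("2022-08-06", BF32 | {"G28361T"}),
    ("2022-08-20", BF32),
    ("2022-09-03", BF32 | {"G28361T"}),
    ("2022-09-15", BF32 | {"G28361T"}),
    ("2022-10-01", BF32),
    ("2022-10-16", BF32 | {"G28361T"}),
]

print(monthly_fraction(records, "G28361T"))
```

A fraction that stays roughly flat from month to month, as in this toy input, is the pattern described in the comment; a steadily rising fraction would instead hint at a growing sublineage.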

@AngieHinrichs
Member

They are masked in UShER (in the BA.2 branch that also includes BA.4 and BA.5) because they are part of / adjacent to (depending on alignment boundaries) a deletion in BA.2.

Unfortunately, with many pipelines used for assembly of SARS-CoV-2 consensus genomes, deletions and insertions are very inconsistently detected and there are often spurious "substitutions" adjacent to the deleted region. In order to prevent big messes in the tree I mask deleted regions and sometimes adjacent bases as well. Regions masked after placement in Delta and Omicron variants are in a script called maskDelta.sh.
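
As a rough illustration of the idea (not the contents of maskDelta.sh, which are not reproduced here), masking a region can be sketched as replacing reference-coordinate intervals with `N` in a sequence that is already aligned to the reference. The helper `mask_regions` and the interval in `ILLUSTRATIVE_REGIONS` are hypothetical; the interval is chosen only to cover the two positions discussed in this thread.

```python
def mask_regions(seq, regions, mask_char="N"):
    """Replace 1-based inclusive intervals of `seq` with `mask_char`.

    Assumes `seq` is aligned to the reference, so that index i - 1 in the
    string corresponds to reference position i.
    """
    chars = list(seq)
    for start, end in regions:
        if start < 1 or end < start:
            raise ValueError(f"bad interval: {start}-{end}")
        # Clamp to the sequence length so short inputs do not raise.
        for i in range(start - 1, min(end, len(chars))):
            chars[i] = mask_char
    return "".join(chars)


# Hypothetical interval spanning the two positions mentioned in this thread
# (28361 and 28371); the real masked regions may differ.
ILLUSTRATIVE_REGIONS = [(28361, 28371)]

print(mask_regions("ACGTACGTAC", [(3, 5)]))
```

Masked bases carry no information during placement, so spurious "substitutions" called next to a deletion cannot pull a sample onto the wrong branch.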

The inconsistent detection of indels makes it difficult to determine exactly when a deletion happened, even when looking at the genome sequences. I sure hope there is never another pandemic, but if there is, I hope the world can at least solve the problem of correctly assembling genomes with indels relative to the reference!
