Dealing with structural variants #386

Open
mdediegofuertes opened this issue Aug 8, 2024 · 3 comments

@mdediegofuertes

Hello @jodyphelan,

We are currently adding the analysis of IS6110 insertions to our structural variant workflow, but we are running into some unexpected TBProfiler outputs.

The attached VCF (fixed_joint.delly.vcf.gz) contains a number of structural variants, detected by either Delly or ISMapper, but it seems that only one of them (pncA_c.-2148_*747del) is being processed by TBProfiler. At first we thought this could be an issue with the format of the ISMapper variants, but we've noticed that some of the Delly variants are also not being reported. This is expected for variants in non-DR regions, which are simply added to the total_variants count, but we have found at least one variant within a DR region (a 1359 bp deletion in Rv2477c, a Tier 2 gene for STM, MFX, EMB, LFX, AMK, RIF & KAN) that is also missing from the JSON outputs.

Do you think this issue is related to the format of our input VCF, or are these variants perhaps not being interpreted correctly by SnpEff?

Thanks,
Miguel de Diego Fuertes

fixed_joint.delly.vcf.gz
tbprofiler.variants.csv

@jodyphelan
Owner

Hi @mdediegofuertes

I'll take a look at this and get back to you asap!

@jodyphelan
Owner

jodyphelan commented Aug 20, 2024

It looks like it could be an issue to do with snpEff not annotating insertions with the ALT set to <INS>.

Is there any way you could get the actual sequence that is inserted?
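To see how many records are affected by this, one option is to count the symbolic ALT alleles (such as `<INS>` or `<DEL>`) in the VCF. The following is only a minimal sketch in plain Python; the helper names `open_vcf` and `symbolic_alt_counts` are made up for illustration and are not part of TBProfiler, snpEff, Delly or ISMapper.

```python
import gzip
from collections import Counter


def open_vcf(path):
    """Open a plain or gzip-compressed VCF as text (hypothetical helper)."""
    return gzip.open(path, "rt") if path.endswith(".gz") else open(path)


def symbolic_alt_counts(lines):
    """Count symbolic ALT alleles (e.g. <INS>, <DEL>) in an iterable of VCF lines."""
    counts = Counter()
    for line in lines:
        # Skip header lines and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        # Column 5 (index 4) is ALT; multiple alleles are comma-separated.
        for alt in fields[4].split(","):
            if alt.startswith("<") and alt.endswith(">"):
                counts[alt] += 1
    return counts
```

For the file attached to this issue, this could be called as `symbolic_alt_counts(open_vcf("fixed_joint.delly.vcf.gz"))`; any allele it reports is a candidate for the annotation problem described above.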

@mdediegofuertes
Author

Hi Jody,

Apologies for the delay in replying, and thanks for taking a look at this. The inserted sequence, in the case of IS6110 insertions, is attached here (in txt format).
While this might solve the IS6110 insertions, the structural variants detected by Delly still pose an issue, since the alt sequences are variable and can span several thousand bp. Do you have any thoughts on how these could be dealt with?

Thanks again,
Miguel de Diego Fuertes

@vrennie

IS_6110.txt
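One possible way to handle both cases is to rewrite the symbolic alleles as explicit sequences before annotation. The sketch below is an assumption-laden illustration, not something TBProfiler or snpEff is confirmed to require or support: `make_explicit` and `parse_info` are hypothetical helpers, `<INS>` records are assumed to carry the single anchoring base in REF, `<DEL>` records are assumed to carry an `END` value in INFO, and the reference is assumed to be loaded as one string. Insertion orientation and target-site duplications are not modelled, and INFO fields such as SVTYPE are left unchanged.

```python
def parse_info(info):
    """Parse a VCF INFO string into a dict (flags map to True)."""
    out = {}
    for item in info.split(";"):
        if not item or item == ".":
            continue
        key, sep, value = item.partition("=")
        out[key] = value if sep else True
    return out


def make_explicit(line, reference, inserted_seq):
    """Return a VCF data line with <INS>/<DEL> replaced by explicit alleles.

    reference    -- full reference sequence as a string (assumption)
    inserted_seq -- sequence to use for <INS>, e.g. the IS6110 element (assumption)
    """
    fields = line.rstrip("\n").split("\t")
    pos, ref, alt = int(fields[1]), fields[3], fields[4]
    if alt == "<INS>":
        # Anchoring base followed by the inserted sequence.
        fields[4] = ref + inserted_seq
    elif alt == "<DEL>":
        end = int(parse_info(fields[7])["END"])
        # POS is the base before the deletion; END is the last deleted base.
        fields[3] = reference[pos - 1:end]
        fields[4] = reference[pos - 1]
    # Any other record is returned unchanged.
    return "\t".join(fields)
```

Since the deleted bases come straight from the reference, the Delly deletions would need no attached sequence, only the reference FASTA. Whether snpEff then annotates alleles of several thousand bp as expected would still need to be checked.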
