Support multi-allelic VCF #424

Closed
xiamaz opened this issue Mar 29, 2024 · 2 comments
Labels
enhancement New feature or request

Comments

@xiamaz
Contributor

xiamaz commented Mar 29, 2024

Is your feature request related to a problem? Please describe.
Currently, VCFs containing multi-allelic sites must be decomposed before mehari can process them, whereas VEP handles them directly. This makes direct benchmark comparisons between the two tools harder.

Describe the solution you'd like
mehari should support multi-allelic sites and simply decompose them while processing. Since the VCF parsing library in use already fully supports parsing multi-allelic records, only mehari itself should need changes.
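To make the proposed behavior concrete, here is a minimal sketch of on-the-fly decomposition. It is written in Python for illustration only and is not mehari's code. `Record` and `decompose` are hypothetical names, and the record is deliberately simplified: genotypes and INFO/FORMAT fields with per-allele cardinality (`Number=A`, `R`, `G`) would also need to be re-indexed per ALT allele, which this sketch does not attempt.

```python
from dataclasses import dataclass, replace
from typing import Iterator, Tuple


@dataclass(frozen=True)
class Record:
    """Hypothetical, simplified VCF record (no INFO/FORMAT handling)."""
    chrom: str
    pos: int                # 1-based position, as in VCF
    ref: str
    alts: Tuple[str, ...]   # one entry per ALT allele


def decompose(record: Record) -> Iterator[Record]:
    """Yield one biallelic record per ALT allele."""
    # Fast path: an already biallelic record is handed back untouched,
    # so normalized input pays only for this length check.
    if len(record.alts) <= 1:
        yield record
        return
    for alt in record.alts:
        # Copy the record, keeping exactly one ALT allele.
        yield replace(record, alts=(alt,))


multi = Record("chr1", 100, "A", ("C", "AT"))
for rec in decompose(multi):
    print(rec.chrom, rec.pos, rec.ref, ",".join(rec.alts))
```

The early return for biallelic records is the design choice that matters for already-normalized input: no copy is made and nothing beyond the ALT count is inspected.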

Describe alternatives you've considered
Otherwise, preprocessing with bcftools, e.g. bcftools norm -m- -a, is necessary. Since mehari also does not support reading directly from stdin, the preprocessed file always has to be written to disk. This has a significant impact on overall performance for non-normalized VCFs.
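For reference, the current workaround could be scripted roughly as follows. This is a sketch that assumes bcftools is on `PATH`. The function names and any file paths passed in are hypothetical, and the `-O z` / `-o` output options are an addition to the command quoted above so that the result lands in a file.

```python
import subprocess
from typing import List


def bcftools_norm_cmd(src: str, dst: str) -> List[str]:
    """Build the argv that splits and atomizes `src` into `dst`."""
    return [
        "bcftools", "norm",
        "-m-",       # split multi-allelic sites into biallelic records
        "-a",        # atomize: decompose MNVs/complex alleles
        "-O", "z",   # write bgzip-compressed VCF
        "-o", dst,   # intermediate file that has to be written to disk
        src,
    ]


def preprocess(src: str, dst: str) -> None:
    """Run the preprocessing step; raises CalledProcessError on failure."""
    subprocess.run(bcftools_norm_cmd(src, dst), check=True)
```

The intermediate file `dst` is exactly the extra write to disk described above.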

Additional context
If possible, the performance penalty for already-normalized VCFs should be kept close to zero. This needs to be measured when making changes.

xiamaz added the enhancement label Mar 29, 2024
@holtgrewe
Contributor

This will need a reference FASTA and an implementation of variant normalization (https://academic.oup.com/bioinformatics/article/31/13/2202/196142). Since normalization can shift variant positions, it will also need local re-sorting of variants, potentially line-wise external-memory sorting.
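As a rough illustration of what normalization involves, the sketch below follows the trim-and-left-align procedure described in the linked paper: drop a shared rightmost base, extend all alleles to the left whenever one becomes empty, then trim shared leftmost bases. It is written in Python for illustration only. `normalize` is a hypothetical name, and `ref_seq` stands in for a lookup into the reference FASTA (here simply an in-memory string). The case where extension would run past the start of the chromosome is rejected rather than handled.

```python
from typing import List, Tuple


def normalize(ref_seq: str, pos: int, alleles: List[str]) -> Tuple[int, List[str]]:
    """Left-align and trim alleles; alleles[0] is REF, pos is 1-based.

    ref_seq is a hypothetical in-memory chromosome sequence; a real
    implementation would fetch bases from an indexed FASTA instead.
    """
    alleles = list(alleles)
    if len(set(alleles)) < 2:
        raise ValueError("need at least two distinct alleles")
    changed = True
    while changed:
        changed = False
        # Drop the rightmost base if every allele ends with the same one.
        if all(alleles) and len({a[-1] for a in alleles}) == 1:
            alleles = [a[:-1] for a in alleles]
            changed = True
        # An allele became empty: extend all alleles one base to the left.
        if any(len(a) == 0 for a in alleles):
            if pos <= 1:
                raise ValueError("cannot extend past chromosome start")
            pos -= 1
            base = ref_seq[pos - 1]  # 1-based position -> 0-based index
            alleles = [base + a for a in alleles]
            changed = True
    # Drop shared leftmost bases while every allele keeps at least one base.
    while all(len(a) >= 2 for a in alleles) and len({a[0] for a in alleles}) == 1:
        alleles = [a[1:] for a in alleles]
        pos += 1
    return pos, alleles


# A CA deletion written at position 5 of a CA repeat moves to position 1.
print(normalize("GCACACAT", 5, ["ACA", "A"]))  # (1, ['GCA', 'G'])
```

In this example the record moves four bases to the left. That kind of shift is why output order can no longer be assumed to match input order and why some form of local re-sorting becomes necessary.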

@xiamaz
Contributor Author

xiamaz commented May 29, 2024

Closing for #447

xiamaz closed this as completed May 29, 2024