Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Funcotator outputs fields in wrong order in some cases when writing vcf #6173

Closed
droazen opened this issue Sep 23, 2019 · 1 comment · Fixed by #6178
Closed

Funcotator outputs fields in wrong order in some cases when writing vcf #6173

droazen opened this issue Sep 23, 2019 · 1 comment · Fixed by #6178
Assignees

Comments

@droazen
Copy link
Collaborator

droazen commented Sep 23, 2019

There's an issue in VcfFuncotationFactory.createFuncotationsOnVariant where certain cites that need multiple functotations merged produce incorrect output. The cause is an accidental conversion of LinkedHashMap -> HashMap which scrambles the iteration order of the map. The order is used when writing the fields so the names of the fields end up on different values.

@droazen droazen added this to the GATK-Triaged-Issues milestone Sep 23, 2019
@lbergelson lbergelson changed the title Funcotator output order sometimes wrong with VCF Funcotator outputs fields in wrong order in some cases when writing vcf Sep 24, 2019
lbergelson added a commit that referenced this issue Sep 24, 2019
Fixes #6173

I added a single line test that shows the problem, and regenerated the other test files.  I didn't write a specific unit test that proves it's solved.  Feel free to do so if you think it's necessary.
I validated the output by adding the field name to the output value and checking it by eye against the header lines.  This could fairly simply made into a unit test if desired.

I'm not sure if there are other large files that need to be regenerated.

I had initially said this bug only affects vcf but I think it happens to Maf output as well.
lbergelson added a commit that referenced this issue Sep 24, 2019
Fixes #6173

I added a single line test that shows the problem, and regenerated the other test files.  I didn't write a specific unit test that proves it's solved.  Feel free to do so if you think it's necessary.
I validated the output by adding the field name to the output value and checking it by eye against the header lines.  This could fairly simply made into a unit test if desired.

I'm not sure if there are other large files that need to be regenerated.

I had initially said this bug only affects vcf but I think it happens to Maf output as well.
lbergelson added a commit that referenced this issue Oct 3, 2019
Fixes #6173

I added a single line test that shows the problem, and regenerated the other test files.  I didn't write a specific unit test that proves it's solved.  Feel free to do so if you think it's necessary.
I validated the output by adding the field name to the output value and checking it by eye against the header lines.  This could fairly simply made into a unit test if desired.

I'm not sure if there are other large files that need to be regenerated.

I had initially said this bug only affects vcf but I think it happens to Maf output as well.
@lbergelson
Copy link
Member

The issue seems to happen when you have multiple records in the datasource which exactly match the position and alleles of a record in the file being annotated. The output fields for that datasource will be scrambled at those sites. This affects both vcf and maf output.

lbergelson added a commit that referenced this issue Oct 4, 2019
* Fix VCFFuncotation output order bug
    This bug caused funcotation fields produced from a VCFFuncotation source to be scrambled when there were multiple entries with identical positions/alleles in the datasource vcf.
  This was caused by accidental use Collectors.toMap creating a HashMap instead of a LinkedHashMap
* Removed a method which allowed passing in a Map instead of LinkedHashMap in order to harden against future bugs of this source
* Fixes #6173
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants