-
Notifications
You must be signed in to change notification settings - Fork 594
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
max alt allele bug fix #7655
max alt allele bug fix #7655
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Back to you with my comments @ldgauthier
src/main/java/org/broadinstitute/hellbender/tools/genomicsdb/GenomicsDBArgumentCollection.java
Outdated
Show resolved
Hide resolved
src/main/java/org/broadinstitute/hellbender/tools/genomicsdb/GenomicsDBOptions.java
Outdated
Show resolved
Hide resolved
if (genotypeCalcArgs != null) { | ||
this.maxDiploidAltAllelesThatCanBeGenotyped = genotypeCalcArgs.maxAlternateAlleles; | ||
this.maxGenotypeCount = genotypeCalcArgs.maxGenotypeCount; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did we determine whether we also need a separate max genotype count arg for GenomicsDB? Does the GenomicsDB value also need to be greater than the value for genotyping? @mlathara opinions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm probably just looking in the wrong place, but I couldn't see where GenotypeGVCFs is limiting anything based on the max-genotype-count. There's one spot where it warns about PL field length but that is checking against a constant, not max-genotype-count.
So, if GenotypeGVCFs doesn't actually need to use this, then we're fine with a single argument.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using IntelliJ it looks like it's just HaplotypeCaller.
...broadinstitute/hellbender/tools/walkers/genotyper/GenotypeCalculationArgumentCollection.java
Outdated
Show resolved
Hide resolved
...broadinstitute/hellbender/tools/walkers/genotyper/GenotypeCalculationArgumentCollection.java
Outdated
Show resolved
Hide resolved
public static boolean genotypeIsUsableForAFCalculation(Genotype g) { | ||
return g.hasLikelihoods() || g.hasGQ() || g.getAlleles().stream().anyMatch(a -> a.isCalled() && a.isNonReference() && !a.isSymbolic()); | ||
return g.hasLikelihoods() || (g.isHomRef() && g.hasGQ() && 2 == g.getPloidy()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This condition has changed quite a bit in this PR! Can you briefly explain?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The only scenario where we can (somewhat confidently) reconstruct PLs from GQ is hom-ref, but it has to be diploid (probably we could do haploid and higher ploidies too, but the ROI is low)
The second case isn't useful because if we get there because there are no likelihoods and no GQ, then there's no useful data.
src/test/java/org/broadinstitute/hellbender/tools/walkers/GenotypeGVCFsIntegrationTest.java
Outdated
Show resolved
Hide resolved
src/test/java/org/broadinstitute/hellbender/tools/walkers/GenotypeGVCFsIntegrationTest.java
Outdated
Show resolved
Hide resolved
} | ||
|
||
@Test | ||
public void testMaxAltsToCombine() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
testMaxAltsToCombine -> testMaxAltsForGenomicsDB
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
compromise: testMaxAltsToCombineInGenomicsDB ?
src/test/java/org/broadinstitute/hellbender/tools/walkers/GenotypeGVCFsIntegrationTest.java
Show resolved
Hide resolved
@mlathara Would you mind reviewing as well when you get a chance? Thanks! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added my comments
src/main/java/org/broadinstitute/hellbender/tools/genomicsdb/GenomicsDBArgumentCollection.java
Outdated
Show resolved
Hide resolved
src/main/java/org/broadinstitute/hellbender/tools/walkers/genotyper/GenotypingEngine.java
Outdated
Show resolved
Hide resolved
if (genotypeCalcArgs != null) { | ||
this.maxDiploidAltAllelesThatCanBeGenotyped = genotypeCalcArgs.maxAlternateAlleles; | ||
this.maxGenotypeCount = genotypeCalcArgs.maxGenotypeCount; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm probably just looking in the wrong place, but I couldn't see where GenotypeGVCFs is limiting anything based on the max-genotype-count. There's one spot where it warns about PL field length but that is checking against a constant, not max-genotype-count.
So, if GenotypeGVCFs doesn't actually need to use this, then we're fine with a single argument.
@droazen back to you. Tests fixed locally and I will rebase and merge AlleleSubsettingUtils now. |
94b9c4b
to
462c6f4
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ldgauthier Looks good -- just one minor typo in a comment, then go ahead and merge when tests pass
args.add("--"+GenomicsDBArgumentCollection.MAX_ALTS_LONG_NAME); | ||
args.add("5"); | ||
args.add("--"+GenotypeCalculationArgumentCollection.MAX_ALTERNATE_ALLELES_LONG_NAME); | ||
args.add("5"); // GenotypeGVCFs value needs to be at least one more, should throw |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"GenotypeGVCFs value needs to be at least one more" should be "GenomicsDB value needs to be at least one more"
max alts + 1 (for non-ref) Properly handle genotypes returned from GenomicsDB with no likelihoods
2ba3a3b
to
81a078b
Compare
Add new argument for GenomicsDB max alts, which must be >= GGVCFs max alts + 1
Fix exception arising from GDB output with too many alts and no likelihoods