You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Output VCF file contains entries like below when '--alleles' is used:
chr5 125894866 . C T Infinity . AC=0;AF=0.00;AN=2;DP=31;ExcessHet=0.0000;FS=0.000;MLEAC=0;MLEAF=0.00;MQ=60.00;SOR=0.188 GT:AD:DP:GP:GQ:PG:PL 0/0:30,0:30:0,124.77,167.78:99:0,34.77,37.78:0,90,130
chr5 125895079 . A . 94.78 . AN=2;DP=37;MQ=60.00 GT:AD:DP 0/0:37:37
chr5 125895106 . G . 73.78 . AN=2;DP=28;MQ=60.00 GT:AD:DP 0/0:28:28
chr5 125895113 . G . 68.78 . AN=2;DP=25;MQ=60.00 GT:AD:DP 0/0:25:25
Position 125894866 is present in the --alleles file, and is genotyped correctrly as homozygous reference in the current sample. The following three positions are not present in the --alleles file, and do not contain an ALT allele in the output file ( dot for ALT). without '--alleles', these positions are not outputted.
Is this expected behaviour, and if so, why are they emitted ?
notes :
- same output is observed without threading ,
- same output is observed without dragen mode.
- --alleles is taken from a normal HC run.
- roughly the same heterozygous calls & hom.ALT calls are made with/without --alleles (which is expected behaviour)
22:06:39.332 WARN GATKAnnotationPluginDescriptor - Redundant enabled annotation group (StandardAnnotation) is enabled for this tool by default
22:06:39.337 WARN GATKAnnotationPluginDescriptor - Redundant enabled annotation group (StandardHCAnnotation) is enabled for this tool by default
22:06:39.383 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/home/gvandeweyer/miniconda3/envs/ELPREP/share/gatk4-4.2.5.0-0/gatk-package-4.2.5.0-local.jar!/com/intel/gkl/native/libgkl_compression.so
Mar 12, 2022 10:06:39 PM shaded.cloud_nio.com.google.auth.oauth2.ComputeEngineCredentials runningOnComputeEngine
INFO: Failed to detect whether we are running on Google Compute Engine.
22:06:39.543 INFO HaplotypeCaller - ------------------------------------------------------------
22:06:39.543 INFO HaplotypeCaller - The Genome Analysis Toolkit (GATK) v4.2.5.0
22:06:39.543 INFO HaplotypeCaller - Executing as [email protected] on Linux v4.4.0-210-generic amd64
22:06:39.543 INFO HaplotypeCaller - Java runtime: OpenJDK 64-Bit Server VM v1.8.0_312-b07
22:06:39.544 INFO HaplotypeCaller - Start Date/Time: March 12, 2022 10:06:39 PM CET
22:06:39.544 INFO HaplotypeCaller - ------------------------------------------------------------
22:06:39.544 INFO HaplotypeCaller - ------------------------------------------------------------
22:06:39.544 INFO HaplotypeCaller - HTSJDK Version: 2.24.1
22:06:39.544 INFO HaplotypeCaller - Picard Version: 2.25.4
22:06:39.544 INFO HaplotypeCaller - Built for Spark Version: 2.4.5
22:06:39.544 INFO HaplotypeCaller - HTSJDK Defaults.COMPRESSION_LEVEL : 2
22:06:39.545 INFO HaplotypeCaller - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
22:06:39.545 INFO HaplotypeCaller - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
22:06:39.545 INFO HaplotypeCaller - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
22:06:39.545 INFO HaplotypeCaller - Deflater: IntelDeflater
22:06:39.545 INFO HaplotypeCaller - Inflater: IntelInflater
22:06:39.545 INFO HaplotypeCaller - GCS max retries/reopens: 20
22:06:39.545 INFO HaplotypeCaller - Requester pays: disabled
22:06:39.545 INFO HaplotypeCaller - Initializing engine
22:06:40.018 INFO FeatureManager - Using codec VCFCodec to read file file:///home/gvandeweyer/elprep_streaming/VariantCalling_Test/scattered/../wesid-226998-m.haplotypecaller.final.vcf.gz
22:06:40.091 INFO FeatureManager - Using codec IntervalListCodec to read file file:///home/gvandeweyer/elprep_streaming/VariantCalling_Test/scattered/0005-scattered.interval_list
22:06:40.375 INFO IntervalArgumentCollection - Processing 3884963 bp from intervals
22:06:40.411 INFO HaplotypeCaller - Done initializing engine
22:06:40.437 INFO HaplotypeCallerEngine - Disabling physical phasing, which is supported only for reference-model confidence output
22:06:40.484 INFO NativeLibraryLoader - Loading libgkl_utils.so from jar:file:/home/gvandeweyer/miniconda3/envs/ELPREP/share/gatk4-4.2.5.0-0/gatk-package-4.2.5.0-local.jar!/com/intel/gkl/native/libgkl_utils.so
22:06:40.485 INFO NativeLibraryLoader - Loading libgkl_pairhmm_omp.so from jar:file:/home/gvandeweyer/miniconda3/envs/ELPREP/share/gatk4-4.2.5.0-0/gatk-package-4.2.5.0-local.jar!/com/intel/gkl/native/libgkl_pairhmm_omp.so
22:06:40.515 INFO IntelPairHmm - Flush-to-zero (FTZ) is enabled when running PairHMM
22:06:40.516 INFO IntelPairHmm - Available threads: 4
22:06:40.516 INFO IntelPairHmm - Requested threads: 4
22:06:40.517 INFO PairHMM - Using the OpenMP multi-threaded AVX-accelerated native PairHMM implementation
22:06:40.545 INFO ProgressMeter - Starting traversal
22:06:40.545 INFO ProgressMeter - Current Locus Elapsed Minutes Regions Processed Regions/Minute
22:06:41.344 WARN InbreedingCoeff - InbreedingCoeff will not be calculated at position chr4:57843320 and possibly subsequent; at least 10 samples must have called genotypes
22:06:50.557 INFO ProgressMeter - chr4:69816964 0.2 570 3415.9
22:07:00.633 INFO ProgressMeter - chr4:74352584 0.3 1340 4002.4
22:07:10.827 INFO ProgressMeter - chr4:79856475 0.5 2370 4695.9
22:07:20.846 INFO ProgressMeter - chr4:88243684 0.7 3370 5017.2
22:07:32.170 INFO ProgressMeter - chr4:88577717 0.9 3470 4032.9
22:07:42.172 INFO ProgressMeter - chr4:102816662 1.0 4680 4556.4
22:07:52.185 INFO ProgressMeter - chr4:111397771 1.2 5780 4840.9
22:08:02.254 INFO ProgressMeter - chr4:122599470 1.4 6650 4883.2
22:08:12.357 INFO ProgressMeter - chr4:138450656 1.5 7590 4960.1
22:08:22.425 INFO ProgressMeter - chr4:147561485 1.7 8520 5017.7
22:08:32.475 INFO ProgressMeter - chr4:155719877 1.9 9500 5092.5
22:08:42.564 INFO ProgressMeter - chr4:169393558 2.0 10500 5163.1
22:08:52.656 INFO ProgressMeter - chr4:184628140 2.2 11430 5191.1
22:09:17.262 INFO ProgressMeter - chr4:190878453 2.6 12190 4667.0
22:09:31.627 INFO ProgressMeter - chr5:679327 2.9 12500 4383.9
22:09:41.640 INFO ProgressMeter - chr5:5303312 3.0 13080 4333.6
22:09:51.755 INFO ProgressMeter - chr5:15936459 3.2 14060 4411.9
22:10:01.826 INFO ProgressMeter - chr5:32403384 3.4 14710 4384.9
22:10:11.906 INFO ProgressMeter - chr5:38528906 3.5 15860 4502.2
22:10:22.512 INFO ProgressMeter - chr5:53815458 3.7 16970 4587.2
22:10:32.552 INFO ProgressMeter - chr5:64023887 3.9 17960 4644.7
22:10:42.616 INFO ProgressMeter - chr5:70888653 4.0 18930 4692.0
22:10:52.692 INFO ProgressMeter - chr5:78319994 4.2 20130 4790.1
22:11:02.777 INFO ProgressMeter - chr5:89939520 4.4 21130 4834.7
22:11:12.950 INFO ProgressMeter - chr5:96339058 4.5 21980 4841.3
22:11:23.401 INFO ProgressMeter - chr5:115188609 4.7 23170 4914.9
22:11:33.498 INFO ProgressMeter - chr5:127089898 4.9 24050 4925.7
22:11:33.815 INFO HaplotypeCaller - 69572 read(s) filtered by: MappingQualityReadFilter
James has a branch already that seems to solve this edge case.
This request was created from a contribution made by Geert Vandeweyer on March 13, 2022 08:49 UTC.
Link: https://gatk.broadinstitute.org/hc/en-us/community/posts/4664460343195-Haplotypecaller-4-5-2-0-alleles-unexpected-output
--
chr5 125894866 . C T Infinity . AC=0;AF=0.00;AN=2;DP=31;ExcessHet=0.0000;FS=0.000;MLEAC=0;MLEAF=0.00;MQ=60.00;SOR=0.188 GT:AD:DP:GP:GQ:PG:PL 0/0:30,0:30:0,124.77,167.78:99:0,34.77,37.78:0,90,130
chr5 125895079 . A . 94.78 . AN=2;DP=37;MQ=60.00 GT:AD:DP 0/0:37:37
chr5 125895106 . G . 73.78 . AN=2;DP=28;MQ=60.00 GT:AD:DP 0/0:28:28
chr5 125895113 . G . 68.78 . AN=2;DP=25;MQ=60.00 GT:AD:DP 0/0:25:25
Position 125894866 is present in the --alleles file, and is genotyped correctrly as homozygous reference in the current sample. The following three positions are not present in the --alleles file, and do not contain an ALT allele in the output file ( dot for ALT). without '--alleles', these positions are not outputted.
Is this expected behaviour, and if so, why are they emitted ?
notes :
- same output is observed without threading ,
- same output is observed without dragen mode.
- --alleles is taken from a normal HC run.
- roughly the same heterozygous calls & hom.ALT calls are made with/without --alleles (which is expected behaviour)
=======================
REQUIRED for all errors and issues:
a) GATK version used: 4.2.5.0
b) Exact command used:
gatk --java-options "-Djava.io.tmpdir=/tmp -Xmx3g" HaplotypeCaller \
-R /home/gvandeweyer/elprep_streaming/reference/hg19.fasta \
-I /home/gvandeweyer/elprep_streaming/results/wesep-229191-f.bam \
-O results/wesep-229191-f.vcf \
--alleles affected_alleles.vcf \
-L 0005-scattered.interval_list \
-bamout results/wesep-229191-f.variants.bam \
-G StandardAnnotation -G StandardHCAnnotation \
--dragen-mode \
--dragstr-params-path /home/gvandeweyer/elprep_streaming/results/wesep-229191-f.bam.params \
--native-pair-hmm-threads 2
c) Entire program log:
(ELPREP) gvandeweyer@ngsvm-pipelines:~/elprep_streaming/VariantCalling_Test/scattered$ gatk --java-options "-Djava.io.tmpdir=/tmp -Xmx3g" HaplotypeCaller -R /home/gvandeweyer/elprep_streaming/reference/hg19.fasta -I /home/gvandeweyer/elprep_streaming/results/wesep-
229191-f.bam -O results/wesep-229191-f.vcf --alleles ../wesid-226998-m.haplotypecaller.final.vcf.gz -L 0005-scattered.interval_list -bamout results/wesep-229191-f.variants.bam -G StandardAnnotation -G StandardHCAnnotation --dragen-mode --dragstr-params-
path /home/gvandeweyer/elprep_streaming/results/wesep-229191-f.bam.params 2>&1 | tee Runtime.log.txt
Using GATK jar /home/gvandeweyer/miniconda3/envs/ELPREP/share/gatk4-4.2.5.0-0/gatk-package-4.2.5.0-local.jar
Running:
java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -Djava.io.tmpdir=/tmp -Xmx3g -jar /home/gvandeweyer/miniconda3/envs/ELPREP/share/gatk4-4.2.5.0-0/gatk-packa
ge-4.2.5.0-local.jar HaplotypeCaller -R /home/gvandeweyer/elprep_streaming/reference/hg19.fasta -I /home/gvandeweyer/elprep_streaming/results/wesep-229191-f.bam -O results/wesep-229191-f.vcf --alleles ../wesid-226998-m.haplotypecaller.final.vcf.gz -L 0005-scattered.inter
val_list -bamout results/wesep-229191-f.variants.bam -G StandardAnnotation -G StandardHCAnnotation --dragen-mode --dragstr-params-path /home/gvandeweyer/elprep_streaming/results/wesep-229191-f.bam.params
22:06:39.332 WARN GATKAnnotationPluginDescriptor - Redundant enabled annotation group (StandardAnnotation) is enabled for this tool by default
22:06:39.337 WARN GATKAnnotationPluginDescriptor - Redundant enabled annotation group (StandardHCAnnotation) is enabled for this tool by default
22:06:39.383 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/home/gvandeweyer/miniconda3/envs/ELPREP/share/gatk4-4.2.5.0-0/gatk-package-4.2.5.0-local.jar!/com/intel/gkl/native/libgkl_compression.so
Mar 12, 2022 10:06:39 PM shaded.cloud_nio.com.google.auth.oauth2.ComputeEngineCredentials runningOnComputeEngine
INFO: Failed to detect whether we are running on Google Compute Engine.
22:06:39.543 INFO HaplotypeCaller - ------------------------------------------------------------
22:06:39.543 INFO HaplotypeCaller - The Genome Analysis Toolkit (GATK) v4.2.5.0
22:06:39.543 INFO HaplotypeCaller - For support and documentation go to https://software.broadinstitute.org/gatk/
22:06:39.543 INFO HaplotypeCaller - Executing as [email protected] on Linux v4.4.0-210-generic amd64
22:06:39.543 INFO HaplotypeCaller - Java runtime: OpenJDK 64-Bit Server VM v1.8.0_312-b07
22:06:39.544 INFO HaplotypeCaller - Start Date/Time: March 12, 2022 10:06:39 PM CET
22:06:39.544 INFO HaplotypeCaller - ------------------------------------------------------------
22:06:39.544 INFO HaplotypeCaller - ------------------------------------------------------------
22:06:39.544 INFO HaplotypeCaller - HTSJDK Version: 2.24.1
22:06:39.544 INFO HaplotypeCaller - Picard Version: 2.25.4
22:06:39.544 INFO HaplotypeCaller - Built for Spark Version: 2.4.5
22:06:39.544 INFO HaplotypeCaller - HTSJDK Defaults.COMPRESSION_LEVEL : 2
22:06:39.545 INFO HaplotypeCaller - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
22:06:39.545 INFO HaplotypeCaller - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
22:06:39.545 INFO HaplotypeCaller - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
22:06:39.545 INFO HaplotypeCaller - Deflater: IntelDeflater
22:06:39.545 INFO HaplotypeCaller - Inflater: IntelInflater
22:06:39.545 INFO HaplotypeCaller - GCS max retries/reopens: 20
22:06:39.545 INFO HaplotypeCaller - Requester pays: disabled
22:06:39.545 INFO HaplotypeCaller - Initializing engine
22:06:40.018 INFO FeatureManager - Using codec VCFCodec to read file file:///home/gvandeweyer/elprep_streaming/VariantCalling_Test/scattered/../wesid-226998-m.haplotypecaller.final.vcf.gz
22:06:40.091 INFO FeatureManager - Using codec IntervalListCodec to read file file:///home/gvandeweyer/elprep_streaming/VariantCalling_Test/scattered/0005-scattered.interval_list
22:06:40.375 INFO IntervalArgumentCollection - Processing 3884963 bp from intervals
22:06:40.411 INFO HaplotypeCaller - Done initializing engine
22:06:40.411 WARN HaplotypeCaller - *************************************************************************
22:06:40.411 WARN HaplotypeCaller - * DRAGEN-GATK mode enabled *
22:06:40.412 WARN HaplotypeCaller - * The following arguments have had their inputs overwritten: *
22:06:40.412 WARN HaplotypeCaller - * --apply-frd *
22:06:40.412 WARN HaplotypeCaller - * --apply-bqd *
22:06:40.412 WARN HaplotypeCaller - * --transform-dragen-mapping-quality *
22:06:40.412 WARN HaplotypeCaller - * --soft-clip-low-quality-ends *
22:06:40.413 WARN HaplotypeCaller - * --mapping-quality-threshold-for-genotyping 1 *
22:06:40.413 WARN HaplotypeCaller - * --minimum-mapping-quality 1 *
22:06:40.413 WARN HaplotypeCaller - * --allele-informative-reads-overlap-margin 1 *
22:06:40.413 WARN HaplotypeCaller - * --disable-cap-base-qualities-to-map-quality *
22:06:40.413 WARN HaplotypeCaller - * --enable-dynamic-read-disqualification-for-genotyping *
22:06:40.413 WARN HaplotypeCaller - * --expected-mismatch-rate-for-read-disqualification 0.03 *
22:06:40.413 WARN HaplotypeCaller - * --genotype-assignment-method USE_POSTERIOR_PROBABILITIES *
22:06:40.414 WARN HaplotypeCaller - * --padding-around-indels 150 *
22:06:40.414 WARN HaplotypeCaller - * --standard-min-confidence-threshold-for-calling 3.0 *
22:06:40.414 WARN HaplotypeCaller - * --use-posteriors-to-calculate-qual *
22:06:40.414 WARN HaplotypeCaller - * --allele-informative-reads-overlap-margin 1 *
22:06:40.414 WARN HaplotypeCaller - * *
22:06:40.414 WARN HaplotypeCaller - * If you would like to run DRAGEN-GATK with different inputs for any *
22:06:40.415 WARN HaplotypeCaller - * of the above arguments please manually construct the command. *
22:06:40.415 WARN HaplotypeCaller - *************************************************************************
22:06:40.437 INFO HaplotypeCallerEngine - Disabling physical phasing, which is supported only for reference-model confidence output
22:06:40.484 INFO NativeLibraryLoader - Loading libgkl_utils.so from jar:file:/home/gvandeweyer/miniconda3/envs/ELPREP/share/gatk4-4.2.5.0-0/gatk-package-4.2.5.0-local.jar!/com/intel/gkl/native/libgkl_utils.so
22:06:40.485 INFO NativeLibraryLoader - Loading libgkl_pairhmm_omp.so from jar:file:/home/gvandeweyer/miniconda3/envs/ELPREP/share/gatk4-4.2.5.0-0/gatk-package-4.2.5.0-local.jar!/com/intel/gkl/native/libgkl_pairhmm_omp.so
22:06:40.515 INFO IntelPairHmm - Flush-to-zero (FTZ) is enabled when running PairHMM
22:06:40.516 INFO IntelPairHmm - Available threads: 4
22:06:40.516 INFO IntelPairHmm - Requested threads: 4
22:06:40.517 INFO PairHMM - Using the OpenMP multi-threaded AVX-accelerated native PairHMM implementation
22:06:40.545 INFO ProgressMeter - Starting traversal
22:06:40.545 INFO ProgressMeter - Current Locus Elapsed Minutes Regions Processed Regions/Minute
22:06:41.344 WARN InbreedingCoeff - InbreedingCoeff will not be calculated at position chr4:57843320 and possibly subsequent; at least 10 samples must have called genotypes
22:06:50.557 INFO ProgressMeter - chr4:69816964 0.2 570 3415.9
22:07:00.633 INFO ProgressMeter - chr4:74352584 0.3 1340 4002.4
22:07:10.827 INFO ProgressMeter - chr4:79856475 0.5 2370 4695.9
22:07:20.846 INFO ProgressMeter - chr4:88243684 0.7 3370 5017.2
22:07:32.170 INFO ProgressMeter - chr4:88577717 0.9 3470 4032.9
22:07:42.172 INFO ProgressMeter - chr4:102816662 1.0 4680 4556.4
22:07:52.185 INFO ProgressMeter - chr4:111397771 1.2 5780 4840.9
22:08:02.254 INFO ProgressMeter - chr4:122599470 1.4 6650 4883.2
22:08:12.357 INFO ProgressMeter - chr4:138450656 1.5 7590 4960.1
22:08:22.425 INFO ProgressMeter - chr4:147561485 1.7 8520 5017.7
22:08:32.475 INFO ProgressMeter - chr4:155719877 1.9 9500 5092.5
22:08:42.564 INFO ProgressMeter - chr4:169393558 2.0 10500 5163.1
22:08:52.656 INFO ProgressMeter - chr4:184628140 2.2 11430 5191.1
22:09:17.262 INFO ProgressMeter - chr4:190878453 2.6 12190 4667.0
22:09:31.627 INFO ProgressMeter - chr5:679327 2.9 12500 4383.9
22:09:41.640 INFO ProgressMeter - chr5:5303312 3.0 13080 4333.6
22:09:51.755 INFO ProgressMeter - chr5:15936459 3.2 14060 4411.9
22:10:01.826 INFO ProgressMeter - chr5:32403384 3.4 14710 4384.9
22:10:11.906 INFO ProgressMeter - chr5:38528906 3.5 15860 4502.2
22:10:22.512 INFO ProgressMeter - chr5:53815458 3.7 16970 4587.2
22:10:32.552 INFO ProgressMeter - chr5:64023887 3.9 17960 4644.7
22:10:42.616 INFO ProgressMeter - chr5:70888653 4.0 18930 4692.0
22:10:52.692 INFO ProgressMeter - chr5:78319994 4.2 20130 4790.1
22:11:02.777 INFO ProgressMeter - chr5:89939520 4.4 21130 4834.7
22:11:12.950 INFO ProgressMeter - chr5:96339058 4.5 21980 4841.3
22:11:23.401 INFO ProgressMeter - chr5:115188609 4.7 23170 4914.9
22:11:33.498 INFO ProgressMeter - chr5:127089898 4.9 24050 4925.7
22:11:33.815 INFO HaplotypeCaller - 69572 read(s) filtered by: MappingQualityReadFilter
0 read(s) filtered by: MappingQualityAvailableReadFilter
0 read(s) filtered by: MappedReadFilter
380 read(s) filtered by: NotSecondaryAlignmentReadFilter
0 read(s) filtered by: NotDuplicateReadFilter
0 read(s) filtered by: PassesVendorQualityCheckReadFilter
0 read(s) filtered by: NonZeroReferenceLengthAlignmentReadFilter
0 read(s) filtered by: GoodCigarReadFilter
0 read(s) filtered by: WellformedReadFilter
69952 total reads filtered
22:11:33.816 INFO ProgressMeter - chr5:127488298 4.9 24105 4931.6
22:11:33.816 INFO ProgressMeter - Traversal complete. Processed 24105 total regions in 4.9 minutes.
22:11:33.891 INFO VectorLoglessPairHMM - Time spent in setup for JNI call : 2.883281574
22:11:33.891 INFO PairHMM - Total compute time in PairHMM computeLogLikelihoods() : 66.287158269
22:11:33.891 INFO SmithWatermanAligner - Total compute time in java Smith-Waterman : 81.56 sec
22:11:35.558 INFO HaplotypeCaller - Shutting down engine
[March 12, 2022 10:11:35 PM CET] org.broadinstitute.hellbender.tools.walkers.haplotypecaller.HaplotypeCaller done. Elapsed time: 4.94 minutes.
Runtime.totalMemory()=1998061568
===========
See forum topic details at forum guidelines page: https://gatk.broadinstitute.org/hc/en-us/articles/360053845952-Forum-Guidelines
(created from Zendesk ticket #276551)
gz#276551
The text was updated successfully, but these errors were encountered: