Skip faulty CIGARs instead of crashing #535

Closed
Pranav-Garg opened this issue Mar 7, 2024 · 5 comments

Comments

@Pranav-Garg

There is a known bug in minimap2 (lh3/minimap2#1090) that occasionally produces faulty CIGARs. Currently, when run on BAMs containing even one such string, COBALT and GRIDSS crash. It would be better if these programs skipped reads with such CIGARs instead of crashing.

Example of a faulty CIGAR: 21M46I46D2I81M
Error message produced:


20:37:42.500 [INFO ] calculating read depths from dupsFlagged.bam
21:13:46.879 [ERROR] error: java.util.concurrent.ExecutionException: htsjdk.samtools.SAMFormatException: SAM validation error: WARNING::ADJACENT_INDEL_IN_CIGAR:Read name NOVASEQ1_131:4:1407:29143:34961, No M or N operator between pair of D operators in CIGAR
java.util.concurrent.ExecutionException: htsjdk.samtools.SAMFormatException: SAM validation error: WARNING::ADJACENT_INDEL_IN_CIGAR:Read name NOVASEQ1_131:4:1407:29143:34961, No M or N operator between pair of D operators in CIGAR
	at java.base/java.util.concurrent.FutureTask.report(FutureTask.java:122)
	at java.base/java.util.concurrent.FutureTask.get(FutureTask.java:191)
	at com.hartwig.hmftools.cobalt.count.BamReadCounter.generateDepths(BamReadCounter.java:98)
	at com.hartwig.hmftools.cobalt.CobaltApplication.run(CobaltApplication.java:86)
	at com.hartwig.hmftools.cobalt.CobaltApplication.main(CobaltApplication.java:65)
Caused by: htsjdk.samtools.SAMFormatException: SAM validation error: WARNING::ADJACENT_INDEL_IN_CIGAR:Read name NOVASEQ1_131:4:1407:29143:34961, No M or N operator between pair of D operators in CIGAR
	at htsjdk.samtools.SAMUtils.processValidationErrors(SAMUtils.java:457)
	at htsjdk.samtools.BAMRecord.getCigar(BAMRecord.java:284)
	at htsjdk.samtools.SAMRecord.isValid(SAMRecord.java:2102)
	at htsjdk.samtools.BAMFileReader$BAMFileIterator.advance(BAMFileReader.java:848)
	at htsjdk.samtools.BAMFileReader$BAMFileIterator.next(BAMFileReader.java:834)
	at htsjdk.samtools.BAMFileReader$BAMFileIterator.next(BAMFileReader.java:802)
	at htsjdk.samtools.BAMFileReader$BAMQueryFilteringIterator.advance(BAMFileReader.java:1079)
	at htsjdk.samtools.BAMFileReader$BAMQueryFilteringIterator.next(BAMFileReader.java:1069)
	at htsjdk.samtools.BAMFileReader$BAMQueryFilteringIterator.next(BAMFileReader.java:1033)
	at htsjdk.samtools.SamReader$AssertingIterator.next(SamReader.java:574)
	at htsjdk.samtools.SamReader$AssertingIterator.next(SamReader.java:553)
	at com.hartwig.hmftools.common.samtools.BamSlicer.slice(BamSlicer.java:68)
	at com.hartwig.hmftools.common.samtools.BamSlicer.slice(BamSlicer.java:52)
	at com.hartwig.hmftools.cobalt.count.BamReadCounter.sliceRegionTask(BamReadCounter.java:154)
	at com.hartwig.hmftools.cobalt.count.BamReadCounter.lambda$createFutures$1(BamReadCounter.java:141)
	at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539)
	at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
	at java.base/java.lang.Thread.run(Thread.java:833)

@Pranav-Garg
Author

(I realize that GRIDSS is not developed by HMF, but I thought I should document it anyway.)

@p-priestley
Contributor

Thank you for the info. Can you give an example of a faulty CIGAR that caused this issue?

FYI: we are working on a new SV caller that will replace GRIDSS in our pipeline, so we will not be updating GRIDSS further.

@Pranav-Garg
Author

Here is such a string: 21M46I46D2I81M (I mentioned it above as well)
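
For anyone who wants to pre-screen CIGARs before running the tools, here is a minimal sketch of a check for this pattern. It is only an approximation of what the validation error describes (two identical indel operators with no M or N between them), not htsjdk's actual code; the function names `parse_cigar` and `has_repeated_indel` are hypothetical, and treating `=` and `X` like `M` is an assumption.

```python
import re

# One CIGAR element: a length followed by a single SAM operator character.
CIGAR_TOKEN = re.compile(r"(\d+)([MIDNSHP=X])")


def parse_cigar(cigar):
    """Split a CIGAR string into (length, operator) pairs."""
    tokens = CIGAR_TOKEN.findall(cigar)
    # Reject strings with characters the pattern did not consume (e.g. "*").
    if "".join(length + op for length, op in tokens) != cigar:
        raise ValueError(f"unparseable CIGAR: {cigar!r}")
    return [(int(length), op) for length, op in tokens]


def has_repeated_indel(cigar):
    """Return True if the same indel operator (I or D) occurs twice
    with no alignment (M, =, X) or skip (N) operator in between.

    Assumption: this mirrors the wording of the validation message,
    not the exact rule implemented by any particular library.
    """
    seen = set()
    for _, op in parse_cigar(cigar):
        if op in "M=XN":
            seen.clear()          # an M/N-like operator resets the run
        elif op in "ID":
            if op in seen:
                return True       # second I (or D) in the same run
            seen.add(op)
    return False


if __name__ == "__main__":
    for cigar in ("21M46I46D2I81M", "21M46I81M"):
        print(cigar, has_repeated_indel(cigar))
```

The example string from this issue is flagged because it contains two I operators (46I and 2I) separated only by a D.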

@charlesshale
Contributor

To turn off this strict validation in Cobalt, pass this argument:
-bam_validation SILENT

@Pranav-Garg
Author

For any future readers, I also note that GRIDSS has a similar option:
--picardoptions "VALIDATION_STRINGENCY=SILENT"
And LENIENT can be used in place of SILENT.
