-
Notifications
You must be signed in to change notification settings - Fork 242
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make VCFHeader not throw exception if contig header lines lack length field #1418
Make VCFHeader not throw exception if contig header lines lack length field #1418
Conversation
Codecov Report
@@ Coverage Diff @@
## master #1418 +/- ##
===============================================
+ Coverage 68.122% 68.134% +0.012%
- Complexity 8382 8384 +2
===============================================
Files 573 573
Lines 33989 33992 +3
Branches 5668 5668
===============================================
+ Hits 23154 23160 +6
+ Misses 8645 8644 -1
+ Partials 2190 2188 -2
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alternative solutions suggested. Back to @cwhelan .
public SAMSequenceRecord getSAMSequenceRecord() { | ||
final String lengthString = this.getGenericFieldValue("length"); | ||
if (lengthString == null) throw new TribbleException("Contig " + this.getID() + " does not have a length field."); | ||
if (lengthString == null) return null; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SAMSequenceRecord has an UNKNOWN_SEQUENCE_LENGTH
value specifically for the case where the length is missing. It would be preferable to use that here and return a SAMSequenceRecord
, rather than returning null in cases where it previously would have thrown.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It might be worth adding a comment noting the SAMSequenceRecord.isSameSequence treats a record created this way (with a length of UNKNOWN_SEQUENCE_LENGTH
) as matching another sequence record with the same name but a valid length.
Thanks for pointing me to |
@@ -270,7 +270,9 @@ public SAMSequenceDictionary getSequenceDictionary() { | |||
|
|||
final List<SAMSequenceRecord> sequenceRecords = new ArrayList<SAMSequenceRecord>(contigHeaderLines.size()); | |||
for (final VCFContigHeaderLine contigHeaderLine : contigHeaderLines) { | |||
sequenceRecords.add(contigHeaderLine.getSAMSequenceRecord()); | |||
final SAMSequenceRecord samSequenceRecord = contigHeaderLine.getSAMSequenceRecord(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How do you end up with nulls here? Shouldn't it be returning lines with UNKNOWN_SEQUENCE_LENGTH?
@@ -261,7 +261,7 @@ public void addMetaDataLine(final VCFHeaderLine headerLine) { | |||
|
|||
/** | |||
* Returns the contigs in this VCF file as a SAMSequenceDictionary. Returns null if contigs lines are | |||
* not present in the header. Throws SAMException if one or more contig lines do not have length | |||
* not present in the header. Returns null if one or more contig lines do not have length |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this comment is true.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@cwhelan I think you forgot to update VCFHeader when you changed the behavior to return UNKNOWN_LENGTH_SEQUENCE
Looks good otherwise. |
… SAMSequenceRecord.UNKNOWN_SEQUENCE_LENGTH
…e.getSAMSequenceRecord will no longer return null
59679d8
to
ef6f244
Compare
Sorry for the oversight @lbergelson. Pushed a new commit to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good.
Description
VCFHeader.getSequenceDictionary
currently throws an exception if one or more of theContig
header lines does not have a length field, despite the fact that:This generally has the effect of making htsjdk tools unable to process files missing length fields without re-headering, which is not always feasible, for example in the case of large multi-sample files in cloud storage.
Fixes #389
Things to think about before submitting: