-
Notifications
You must be signed in to change notification settings - Fork 588
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Write CRAM on Spark. #1488
Write CRAM on Spark. #1488
Conversation
b926e6b
to
42af603
Compare
@@ -83,24 +88,58 @@ public SparkHeaderlessBAMOutputFormat() { | |||
} | |||
} | |||
|
|||
// Output format class for writing CRAM files through saveAsNewAPIHadoopFile. Must be public. | |||
public static class SparkCRAMOutputFormat extends KeyIgnoringCRAMOutputFormat<NullWritable> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This class is identical to SparkBAMOutputFormat
-- is it only a separate class because setHeader()
is static and we might want a different header for the two output formats, or is there some deeper reason?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Its only because the two classes have different base classes that live in Hadoop-BAM. I don't see any useful way to consolidate them here, though going forward we may want to change AnySAMOutputFormat in Hadoop-BAM to delegate based on file extension similar to the way AnySAMInputFormat does.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Created HadoopGenomics/Hadoop-BAM#72.
Review complete -- back to @cmnbroad. Merge after addressing comments. |
bb87c81
to
3573d65
Compare
3573d65
to
55faeb8
Compare
#1270. Depends on #1469, which should be reviewed first and then I can remove the first commit from this branch.