Maintain reference md5s from bam to gvcf #5746

Open
EvanTheB opened this issue Mar 4, 2019 · 1 comment

EvanTheB commented Mar 4, 2019

GATK at some point started printing the reference contig md5s in the bam header. This is great.

@SQ     SN:chr1 LN:248956422    AS:38   M5:6aef897c3d6ff0c78aff06ac189178dd     UR:/seq/references/Homo_sapiens_assembly38/v0/Homo_sapiens_assembly38.fasta     SP:Homo sapiens

However, when I use HaplotypeCaller to create my gvcf, only the reference name and length are shown; the md5 is dropped. I don't know if this is for a technical reason, but it would be great to add the md5s to the gvcf header.

##contig=<ID=chr1,length=248956422,assembly=38>

If they cannot be added to the contig line, maybe they could be added as a comment line in the gvcf header.
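To make the request concrete, here is a minimal Python sketch of the mapping being asked for: it takes one tab-separated `@SQ` line and builds a `##contig` line that carries the `M5` value through as an `md5` key. The function name `sq_to_contig` is hypothetical, and this is not how GATK or htsjdk build the header; it only illustrates the desired output. It assumes `md5` is acceptable as a contig key, which the example header in the VCF specification suggests.

```python
def sq_to_contig(sq_line: str) -> str:
    """Convert one SAM @SQ header line into a VCF ##contig header line.

    Hypothetical helper for illustration only; not part of GATK or htsjdk.
    """
    fields = sq_line.rstrip("\n").split("\t")
    if fields[0] != "@SQ":
        raise ValueError("not an @SQ line")

    # Each remaining field is TAG:VALUE; split on the first colon only,
    # since values such as UR paths may contain further colons.
    tags = dict(field.split(":", 1) for field in fields[1:])

    parts = [f"ID={tags['SN']}", f"length={tags['LN']}"]
    if "AS" in tags:
        parts.append(f"assembly={tags['AS']}")
    if "M5" in tags:
        # The piece that is currently dropped on the way to the gvcf.
        parts.append(f"md5={tags['M5']}")
    return "##contig=<" + ",".join(parts) + ">"


sq = "\t".join([
    "@SQ",
    "SN:chr1",
    "LN:248956422",
    "AS:38",
    "M5:6aef897c3d6ff0c78aff06ac189178dd",
])
print(sq_to_contig(sq))
# ##contig=<ID=chr1,length=248956422,assembly=38,md5=6aef897c3d6ff0c78aff06ac189178dd>
```

When the `@SQ` line has no `M5` tag, the sketch simply omits the `md5` key, so headers from older bams would be unchanged.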

cmnbroad (Collaborator) commented Mar 4, 2019

Needs verification, but very likely this is due to samtools/htsjdk#730. A fix was proposed in samtools/htsjdk#835, which was never merged, but the problem could be fixed independently.
