Large contigs missing from AlignmentHeader.lengths #741
Comments
Could it be related to this bug? #732
ah, well, doing this with 2147483647 is fine, so that should realistically be large enough. Thanks!
There are quite a few organisms with single chromosomes longer than this value…
There is not a simple fix because BAM binary format defines pos as an int32_t (https://github.com/samtools/hts-specs/blob/master/SAMv1.pdf). We did consider a BAMv2 but this would take some time for the ecosystem to adopt and thus people were resistant (samtools/hts-specs#240). HTSlib currently uses the BAM format as an interface and thus is also limited. Whilst I know the HTSlib team have been conversing about this I'm not sure if we've actually reached a solution.
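To make the limit concrete, here is a minimal Python sketch of what a signed 32-bit field can hold. It is not part of pysam or HTSlib; the helper name `fits_int32` is hypothetical, and it only assumes that the field in question is a little-endian int32_t, as described above for pos.

```python
import struct

# Largest value a signed 32-bit integer can hold (the 2147483647 mentioned above).
INT32_MAX = 2**31 - 1


def fits_int32(value: int) -> bool:
    """Return True if `value` can be packed as a little-endian int32_t."""
    try:
        struct.pack("<i", value)
        return True
    except struct.error:
        # struct refuses values outside [-2**31, 2**31 - 1]
        return False
```

Under this assumption, `fits_int32(2147483647)` succeeds while any longer coordinate does not, which is why a contig longer than that value cannot be addressed through an int32_t position.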
This happens for instance for the following file:
If I load this up in pysam I only see the second, short contig in f.header.lengths, while f.header.to_dict() preserves the large contig. I suspect this might be a htslib issue, as target_len is defined as uint32_t. I just tried changing all those instances to uint64_t, but I don't know enough to figure out if/what I missed (fea8c81).