Support for CSI indexes #284

Closed
nathanhaigh opened this issue Mar 7, 2017 · 6 comments

Comments

@nathanhaigh
Contributor

What are the chances of Sambamba supporting .csi index files?

I work with cereal crops where this is starting to be an issue. Many plant species have chromosomes longer than the 512 Mbp (2^29 bp) limit of the .bai index.
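
For context, the ceiling comes from the fixed binning scheme: a .bai index uses 16 kbp leaf bins (a shift of 14 bits) and five levels of 8-way subdivision, so the largest addressable coordinate is 2^(14 + 3*5) = 2^29. The .csi format stores both values (`min_shift` and `depth`) in its header instead of hard-coding them. Below is a minimal Python sketch of that arithmetic, with `reg2bin` ported from the C routine in the CSI struct definition document. The helper names `max_coordinate` and `min_depth_for` are illustrative and do not come from Sambamba or htslib.

```python
def max_coordinate(min_shift=14, depth=5):
    """Largest reference length (bp) a binning index with these parameters covers."""
    return 1 << (min_shift + 3 * depth)


def min_depth_for(length, min_shift=14):
    """Smallest depth whose index covers a reference of `length` bp (hypothetical helper)."""
    depth = 0
    while max_coordinate(min_shift, depth) < length:
        depth += 1
    return depth


def reg2bin(beg, end, min_shift=14, depth=5):
    """Bin number for the zero-based half-open interval [beg, end)."""
    end -= 1
    s = min_shift                        # shift for the current level, leaves first
    t = ((1 << (depth * 3)) - 1) // 7    # offset of the first leaf-level bin
    level = depth
    while level > 0:
        if beg >> s == end >> s:         # interval fits in one bin at this level
            return t + (beg >> s)
        level -= 1
        s += 3
        t -= 1 << (level * 3)
    return 0                             # falls through to the root bin


print(max_coordinate())              # BAI defaults: 536870912 (2^29)
print(min_depth_for(800_000_000))    # an 800 Mbp chromosome needs depth 6
print(reg2bin(0, 1))                 # first leaf bin with BAI defaults: 4681
```

With the BAI defaults, anything beyond 536,870,912 bp cannot be binned, which is why chromosomes of that size need an index with a larger `depth` or `min_shift`.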

@lomereiter
Contributor

It should be straightforward to add.

<rant>
It's just very annoying to deal with hts-specs where half of the specification is in the PDF, and another half is to be found somewhere in htslib source code (samtools/hts-specs#70). Not sure if it's better with CRAM but that's one of the reasons I gave up and used htslib to read/write it.
</rant>
Fortunately, @kortschak went through this pain already, so at least the biogo implementation is readable.

@nathanhaigh
Contributor Author

Nice one @kortschak! Any chance you could provide some help/input here if need be? We could meet up for a coffee if you are willing - I'll buy! :)

@kortschak

@nathanhaigh There is not much to add here. I have implemented what is described in the struct definition document and looked through the code to clarify things. In the absence of any input from the hts-specs authors I don't see anything else to do, but they are too busy Actually Doing Science to provide a spec that could be worked from. Sorry.

I'd be happy to discuss though if you are in town.

@pjotrp
Member

pjotrp commented Mar 12, 2017

I think we need to appreciate that writing specs and source code is Actually Doing Science.

Toward effective software solutions for big biology, by Pjotr Prins, Joep de Ligt, Artem Tarasov, Ritsert C Jansen, Edwin Cuppen and Philip E Bourne, Nature Biotechnology, 2015; 33, 686-687 (highly accessed) http://www.nature.com/nbt/journal/v33/n7/full/nbt.3240.html

@kortschak

@pjotrp Yes, that phrase is a quote that was used to explain why the specs are the way they are. I agree entirely that any kind of science should be reproducible and reviewable, including the code that is used.

@pjotrp
Member

pjotrp commented Nov 8, 2017

Feel free to pick this up.
