Support CSI as well as BAI indexes #447

Closed
peterjc opened this issue Jan 22, 2016 · 9 comments

Comments

@peterjc

peterjc commented Jan 22, 2016

The original BAM index format (BAI) uses a fixed binning scheme and is limited to a maximum reference sequence length of 512 Mbp (2^29 bp), which makes it unsuitable for large chromosomes (e.g. many plants, some marsupials). The newer CSI index format supports longer references as well as different bin sizes, which ought to help with ultra-high coverage in RNA sequencing or viral samples.
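To make the 512 Mbp limit concrete, here is a minimal sketch of the binning arithmetic, assuming the parameterisation described in the CSIv1 spec: a smallest bin of 2^min_shift bp and `depth` levels that each widen bins by a factor of 8. BAI corresponds to the fixed choice min_shift=14, depth=5. The function names `max_ref_len` and `min_depth_for` are made up for illustration, and `reg2bin` is a Python port of the C snippet in the spec as I read it; none of this is htsjdk or htslib API.

```python
def max_ref_len(min_shift: int = 14, depth: int = 5) -> int:
    """Longest reference (in bp) addressable by a binning index."""
    return 1 << (min_shift + 3 * depth)


def min_depth_for(ref_len: int, min_shift: int = 14) -> int:
    """Smallest depth whose root bin covers a reference of ref_len bp."""
    depth = 0
    while max_ref_len(min_shift, depth) < ref_len:
        depth += 1
    return depth


def reg2bin(beg: int, end: int, min_shift: int = 14, depth: int = 5) -> int:
    """Bin number for the zero-based, half-open interval [beg, end)."""
    end -= 1
    shift = min_shift
    # Id of the first bin on the finest level (4681 for BAI's depth of 5).
    offset = ((1 << (depth * 3)) - 1) // 7
    level = depth
    while level > 0:
        # The interval fits inside a single bin on this level.
        if beg >> shift == end >> shift:
            return offset + (beg >> shift)
        level -= 1
        shift += 3
        offset -= 1 << (level * 3)
    return 0  # Interval only fits in the root bin.


if __name__ == "__main__":
    print(max_ref_len())                 # 536870912 (2^29, the BAI limit)
    print(min_depth_for(1_000_000_000))  # 6
    print(reg2bin(0, 16384))             # 4681
```

With the defaults this reproduces the BAI scheme, so a ~1 Gbp chromosome does not fit; keeping min_shift=14 and going to depth 6 raises the ceiling to 2^32 bp, which is the kind of headroom CSI allows.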

On the C side, htslib and samtools now support both BAI and CSI indexes.

Currently htsjdk appears to support only BAI indexes.

The CSI specification lives at https://github.com/samtools/hts-specs/blob/master/CSIv1.tex, http://samtools.github.io/hts-specs/CSIv1.pdf, and http://samtools.github.io/hts-specs/CSIv2.pdf

@jmarshall
Member

Ignore CSIv2.pdf for the time being. It is a draft from an experimental branch, and CSI v2 may well be different from what that document describes when it finally lands.

@nathanhaigh

Any progress on this? It would be good to be able to load large (~1Gbp) plant pseudomolecules into IGV using CSI indexes.

@nathanhaigh

What is required to add CSI support? I may be able to get a colleague to work on this, but we will need some pointers on where to start.

@jrobinso
Contributor

Hey all, any updates? I might work on this myself, but would like to know the status, if any, first. As this is still open, I assume that CSI is not yet supported.

@nathanhaigh

Not as far as I'm aware. I'm trying to get CSI support in other tools like Sambamba: biod/sambamba#284

@nathanhaigh

Have you begun working on this yet, @jrobinso?

@jrobinso
Contributor

@nathanhaigh Not yet. IGV is taking all my time, leaving very little to contribute to htsjdk.

@homonecloco

Are there any updates on this?

@lbergelson
Member

Htsjdk supports reading CSI indexes as of 2.19.0.
