covtobed

A tool to generate BED coverage tracks from BAM files.

Reads one or more alignment files (sorted BAM) and prints the coverage in BED format. Consecutive bases with the same coverage are joined into a single interval, and the output can be restricted to regions whose coverage falls within a given range.
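The joining step can be pictured as run-length encoding of a per-base depth vector. The following is a minimal Python sketch of that idea only; it is not how covtobed is implemented, and the function name `coverage_to_bed` and the depth values are made up for illustration (they were chosen to reproduce the first rows shown in the Example section).

```python
from itertools import groupby


def coverage_to_bed(chrom, depths):
    """Collapse a per-base depth list into (chrom, start, end, depth) tuples.

    Coordinates are 0-based and half-open, as in BED.
    """
    intervals = []
    start = 0
    # groupby yields one group per run of identical consecutive depths
    for depth, run in groupby(depths):
        length = sum(1 for _ in run)
        intervals.append((chrom, start, start + length, depth))
        start += length
    return intervals


# Illustrative depths for the first 12 bases of a reference (hypothetical data)
depths = [0, 0, 1, 1, 1, 1, 2, 3, 3, 3, 3, 3]
for chrom, start, end, depth in coverage_to_bed("NC_001416.1", depths):
    print(chrom, start, end, depth, sep="\t")
```

Twelve per-base values collapse into four BED rows, which is why the output stays compact even for long references.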

📖 Read more in the wiki - this is the main documentation source

Features:

  • Can read (sorted) BAMs from stream (like bwa mem .. | samtools view -b | samtools sort - | covtobed)
  • Can print strand specific coverage to check for strand imbalance
  • Can print the physical coverage (with paired-end or mate-paired libraries)
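To illustrate the difference between sequence coverage and physical coverage, here is a small Python sketch. It assumes the usual definition of physical coverage, where every base spanned by a read pair counts, including the unsequenced insert between the two mates. The coordinates and the helper `depth_profile` are hypothetical, and the sketch makes no claim about how covtobed computes either value internally.

```python
def depth_profile(length, spans):
    """Per-base depth over a reference of the given length.

    spans are 0-based, half-open (start, end) intervals.
    Uses a difference array: +1 where a span starts, -1 where it ends.
    """
    diff = [0] * (length + 1)
    for start, end in spans:
        diff[start] += 1
        diff[end] -= 1
    depths, running = [], 0
    for delta in diff[:length]:
        running += delta
        depths.append(running)
    return depths


# One hypothetical read pair on a 10 bp reference: mates at 0-3 and 7-10
pairs = [((0, 3), (7, 10))]

# Sequence coverage counts only the bases covered by the reads themselves
sequence_spans = [read for pair in pairs for read in pair]
# Physical coverage counts the whole fragment, from the leftmost to the rightmost base
physical_spans = [(min(r1[0], r2[0]), max(r1[1], r2[1])) for r1, r2 in pairs]

print(depth_profile(10, sequence_spans))  # [1, 1, 1, 0, 0, 0, 0, 1, 1, 1]
print(depth_profile(10, physical_spans))  # [1, 1, 1, 1, 1, 1, 1, 1, 1, 1]
```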

ℹ️ For more features, check the BamToCov suite.

Usage

📖 The complete documentation is available in the GitHub wiki.

Synopsis:

Usage: covtobed [options] [BAM]...

Computes coverage from alignments

Options:
  -h, --help            show this help message and exit
  --version             show program's version number and exit
  --physical-coverage   compute physical coverage (needs paired alignments in input)
  -q MINQ, --min-mapq=MINQ
                        skip alignments whose mapping quality is less than MINQ
                        (default: 0)
  -m MINCOV, --min-cov=MINCOV
                        print BED feature only if the coverage is bigger than
                        (or equal to) MINCOV (default: 0)
  -x MAXCOV, --max-cov=MAXCOV
                        print BED feature only if the coverage is lower than
                        MAXCOV (default: 100000)
  -l MINLEN, --min-len=MINLEN
                        print BED feature only if its length is bigger than (or
                        equal to) MINLEN (default: 1)
  -z MINCTG, --min-ctg-len=MINCTG
                        skip reference sequences having size less or equal to MINCTG
  -d, --discard-invalid-alignments
                        skip duplicates, failed QC, and non primary alignment,
                        minq>0 (or user-defined if higher) (default: 0)
  --output-strands      output coverage and stats separately for each strand
  --format=CHOICE       output format

Example

Command:

covtobed -m 0 -x 5 test/demo.bam

Output:

[...]
NC_001416.1     0       2       0
NC_001416.1     2       6       1
NC_001416.1     6       7       2
NC_001416.1     7       12      3
NC_001416.1     12      18      4
NC_001416.1     169     170     4
NC_001416.1     201     206     4
[...]
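The gaps in the output above (for example between positions 18 and 169) correspond to regions excluded by `-x 5`. The Python sketch below mimics the filter semantics as described in the help text: `--min-cov` and `--min-len` are inclusive bounds, while `--max-cov` is exclusive. The function `keep_interval` and the coverage value of the middle row are hypothetical; the sketch only reflects the documented behaviour, not the program's source code.

```python
def keep_interval(start, end, cov, min_cov=0, max_cov=100000, min_len=1):
    """Return True if a BED interval passes the coverage and length filters.

    min_cov and min_len are inclusive; max_cov is exclusive,
    following the wording of the option descriptions.
    """
    return min_cov <= cov < max_cov and (end - start) >= min_len


rows = [
    ("NC_001416.1", 12, 18, 4),
    ("NC_001416.1", 18, 169, 7),   # hypothetical coverage for the filtered-out gap
    ("NC_001416.1", 169, 170, 4),
]

# Equivalent of -m 0 -x 5: keep coverage values 0, 1, 2, 3 and 4
kept = [row for row in rows if keep_interval(row[1], row[2], row[3], min_cov=0, max_cov=5)]
for row in kept:
    print(*row, sep="\t")
```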

See the full example output from different tools 📂 here

Install

  • To install with Miniconda:
conda install -c bioconda covtobed
  • Both covtobed and the legacy program coverage are available in a single Docker container from Docker Hub:
sudo docker pull andreatelatin/covtobed
sudo docker run --rm -ti andreatelatin/covtobed coverage -h
  • Download the Singularity image with singularity pull docker://andreatelatin/covtobed, then:
singularity exec covtobed.simg coverage -h

Startup message

When invoked without arguments, covtobed will print a message to inform the user that it is waiting for input from STDIN. To suppress this message, set the environment variable COVTOBED_QUIET to 1.
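When covtobed is driven from a script that feeds it a BAM stream, the variable can be set for that process alone. The following Python sketch shows one way to do this; the helper names `quiet_env` and `run_covtobed` are hypothetical, and the code assumes covtobed is available on the PATH.

```python
import os
import shutil
import subprocess


def quiet_env(base=None):
    """Return a copy of the environment with COVTOBED_QUIET set to 1."""
    env = dict(os.environ if base is None else base)
    env["COVTOBED_QUIET"] = "1"
    return env


def run_covtobed(bam_bytes, extra_args=()):
    """Pipe a sorted BAM (as bytes) into covtobed via STDIN and return its BED output."""
    exe = shutil.which("covtobed")
    if exe is None:
        raise FileNotFoundError("covtobed not found on PATH")
    result = subprocess.run(
        [exe, *extra_args],
        input=bam_bytes,
        env=quiet_env(),
        stdout=subprocess.PIPE,
        check=True,
    )
    return result.stdout.decode()


# Example use (requires covtobed and a sorted BAM file):
# with open("test/demo.bam", "rb") as handle:
#     print(run_covtobed(handle.read(), ["-m", "5"]))
```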

Performance

covtobed is generally faster than bedtools. More details are in the benchmark page.

Requirements and compiling

This tool requires libbamtools and zlib.

To compile manually:

c++ -std=c++11 *.cpp -I/path/to/bamtools/ -L${HOME}/path/to/lib/ -lbamtools -o covtobed

Issues, Limitations and how to contribute

  • This program will read the coverage from sorted BAM files. The CRAM format is not supported at the moment.
  • If you find a problem, feel free to raise an issue. We will try to address it as soon as possible.
  • Contributions are welcome via PR.

Acknowledgements

This tool uses libbamtools by Derek Barnett, Erik Garrison, Gabor Marth and Michael Stromberg, and cpp-optparse by Johannes Weißl. Both libraries and this program are released under the MIT license.

Authors

Giovanni Birolo (@gbirolo), University of Turin, and Andrea Telatin (@telatin), Quadram Institute Bioscience.

This program was finalized with a Flexible Talent Mobility Award funded by BBSRC through the Quadram Institute.

Citation

If you use this tool, we would appreciate it if you cited the corresponding paper:

Releases from 1.3 onwards:

Giovanni Birolo, Andrea Telatin, BamToCov: an efficient toolkit for sequence coverage calculations, Bioinformatics, 2022

Releases up to 1.2:

Birolo et al., (2020). covtobed: a simple and fast tool to extract coverage tracks from BAM files. Journal of Open Source Software, 5(47), 2119, https://doi.org/10.21105/joss.02119