##RNASequel
RNASequel runs as a post-processing step on top of an RNA-seq aligner and systematically detects and corrects many common alignment artifacts. Its key innovations are a two-pass splice junction detection system that combines de novo splice junction prediction with a splice junction database, and an empirically determined estimate of the fragment size distribution that is used to resolve read pairs.
##RNASequel Dependencies
- Boost (www.boost.org)
- Samtools (http://sourceforge.net/projects/samtools/files/samtools/0.1.19/). Note: Newer versions of samtools are currently not supported.
- GCC Version 4.7+ (for -std=c++11 support)
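The compiler requirement above can be checked before building. The following is a minimal sketch, not part of RNASequel: `version_at_least` is a hypothetical helper that compares a dotted version string against a required major and minor number.

```shell
# Hypothetical helper (not shipped with RNASequel).
# Usage: version_at_least <version> <required_major> <required_minor>
# Succeeds if <version> (e.g. "4.8.2") is at least <required_major>.<required_minor>.
version_at_least() {
    major=${1%%.*}
    case $1 in
        *.*) rest=${1#*.}; minor=${rest%%.*} ;;
        *)   minor=0 ;;   # a bare major number such as "9" is treated as 9.0
    esac
    [ "$major" -gt "$2" ] || { [ "$major" -eq "$2" ] && [ "$minor" -ge "$3" ]; }
}
```

For example, `version_at_least "$(gcc -dumpversion)" 4 7 || echo "GCC 4.7 or newer is required" >&2` prints a warning when the installed compiler is too old.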
##Installation
The easiest way to obtain the RNASequel source code is to download the latest release. You can also clone the latest version of the repository using:
git clone https://github.com/GWW/RNASequel.git
RNASequel can be built by typing:
cd <RNASequel Directory>
make
cp src/rnasequel <install directory>
The dependencies for RNASequel can be specified using the following variables:
BOOST_ROOT -- The path to the boost library root (default: /usr)
LIBBAM_ROOT -- The path to the libbam.a and the samtools header files (default: /usr)
BOOST_SUFFIX -- The version / compiler suffix used on boost library includes (Not usually necessary)
LDADD -- Extra libraries, for example, on some systems -lrt needs to be included
STATIC -- Set to 1 to compile a statically linked binary (default: 0)
For example:
make BOOST_ROOT=/usr/local LIBBAM_ROOT=/usr/local
##Usage: rnasequel [command] options
###Commands:
- index Reference genome fasta file indexing
- transcriptome Transcriptome index generation
- merge Reference / Transcriptome alignment merging
Additional command line options can be viewed by using the -h flag, for example:
rnasequel merge -h
##Notes: For merging the alignments, RNASequel requires that the BAM files be sorted lexicographically using the same sort scheme as samtools sort -n. This is most easily accomplished by renaming the reads 1..N prior to the alignment. The BAM files can also be sorted using samtools sort -n prior to running the rnasequel merge command.
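Renaming the reads 1..N, as the note above suggests, could be done with a short awk wrapper. This is a sketch under stated assumptions: `rename_reads` is a hypothetical helper, the input is assumed to be uncompressed FASTQ with exactly four lines per record, and both mate files are assumed to list their reads in the same order so that mates receive the same number.

```shell
# Hypothetical helper (not shipped with RNASequel).
# Replace each FASTQ header line with "@<index>", counting records from 1.
# Assumes strict 4-line FASTQ records (header, sequence, "+", qualities).
rename_reads() {
    awk 'NR % 4 == 1 { printf "@%d\n", ++n; next } { print }' "$1"
}
```

For example, `rename_reads reads_1.fastq > renamed_1.fastq` (file names are placeholders), repeated for the second mate file.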
##Example Usage with BWA-mem
# Index the genome fasta file (this only has to be done once)
rnasequel index genome.fa
bwa index genome.fa
# Generate a transcriptome using a genes.gtf file and a de novo alignment (denovo_alignment.bam) produced by STAR or another spliced read aligner
rnasequel transcriptome -g genes.gtf -r genome.fa -n 76 -b denovo_alignment.bam -o tx
# Index the transcriptome using BWA
bwa index tx.fa
# Map read 1 and 2 individually to the reference genome
bwa mem -L 2,2 -k 15 -a -t 8 -B 2 genome.fa {reads 1 or 2} | samtools view -bS - > {ref 1 or 2.bam}
# Map read 1 and 2 individually to the transcriptome
bwa mem -L 2,2 -c 20000 -M -k 15 -a -t 8 -B 2 tx.fa {reads 1 or 2} | samtools view -bS -F 4 - > {juncs 1 or 2.bam}
# Merge the alignments and resolve the pairs
rnasequel merge -r genome.fa -g genes.gtf -f tx.txt -o align.bam ref1.bam juncs1.bam ref2.bam juncs2.bam
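The four mapping commands above differ only in the mate number, so they can be generated in a loop. The sketch below prints the commands for review rather than running them. `print_mapping_commands` is a hypothetical helper, and the input names `reads_1.fastq` and `reads_2.fastq` are assumptions; the bwa and samtools options are copied from the example above.

```shell
# Hypothetical helper (not shipped with RNASequel).
# Print the genome and transcriptome mapping commands for mates 1 and 2.
# Input names reads_1.fastq / reads_2.fastq are assumed placeholders.
print_mapping_commands() {
    for mate in 1 2; do
        # Map this mate to the reference genome
        echo "bwa mem -L 2,2 -k 15 -a -t 8 -B 2 genome.fa reads_${mate}.fastq | samtools view -bS - > ref${mate}.bam"
        # Map this mate to the transcriptome, dropping unmapped reads (-F 4)
        echo "bwa mem -L 2,2 -c 20000 -M -k 15 -a -t 8 -B 2 tx.fa reads_${mate}.fastq | samtools view -bS -F 4 - > juncs${mate}.bam"
    done
}
```

Piping the output to `sh` would run the commands, assuming bwa, samtools 0.1.19, and the indexes from the earlier steps are in place. The resulting file names match the inputs of the rnasequel merge command above.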