Intergenic selection paper

This repository contains the code for running the analysis in the paper:

##The large majority of intergenic sites in bacteria are selectively constrained, even when known regulatory elements are excluded

Preprint available here:

http://biorxiv.org/content/early/2016/08/15/069708

##Dependencies

Linux system
Perl >= 5.8.3
GNU Parallel

##Instructions

###File structure

Clone the project into a local directory:

cd /somewhere/on/your/system
git clone https://github.com/harry-thorpe/Intergenic_selection_paper

Move into the project base directory:

cd Intergenic_selection_paper

Move into the Data directory and create input file directories:

cd Data
mkdir Reference_files
mkdir Alignment_files
mkdir GFF_files
mkdir Promoter_files
mkdir Terminator_files
cd ..

###Input files

To run the full analysis, you must transfer appropriate input files to the appropriate directories. The analysis can be run on multiple species, but the following instructions assume you are using My_species.

Reference files - These should be in fasta format, with upper-case characters only (ATGCN), and the whole sequence on a single line. They should be called My_species_reference.fasta.
Alignment files - These should be alignments in fasta format, aligned to the reference file, with upper-case characters only (ATGCN), and each sequence on a single line. The first sequence in the alignment should be the reference sequence. They should be called My_species_alignment.fasta.
GFF files - These should be GFF files produced by Prokka. They should be called My_species.gff.
Promoter files - These should be promoter annotation files for the reference genome used downloaded from http://pepper.molgenrug.nl/index.php/genome2d. They should be called My_species_promoters.tab.
Terminator files - These should be terminator annotation files for the reference genome used downloaded from http://pepper.molgenrug.nl/index.php/genome2d. They should be called My_species_terminators.tab.

###Running the analysis

To run the analysis, you will have to edit two lines in the script Intergenic_selection_paper/Analysis_script.sh:

Line 4 - Change the variable $base_dir to the directory where you have cloned the repository to.
Line 7 - Change the species within $species_array to those which you want to analyse. These must be compatible with the input files.

Run the analysis:

bash Analysis_script.sh

Name		Name	Last commit message	Last commit date
Latest commit History 86 Commits
Analysis		Analysis
Data		Data
Figures		Figures
Analysis_script.sh		Analysis_script.sh
LICENSE		LICENSE
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Intergenic selection paper

About

Releases

Packages

Languages

License

harry-thorpe/Intergenic_selection_paper

Folders and files

Latest commit

History

Repository files navigation

Intergenic selection paper

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages