SajadMirzaei/RentPlus
# RentPlus v0.5

INTRODUCTION :
RentPlus is a tool for reconstructing local tree topologies for a set of population SNP haplotypes undergoing recombination. Because of recombination, tree topologies change as one moves across the genome. RentPlus extends the previous program RENT, which uses a novel search method to infer the local trees, one for each genomic region near a SNP site. RentPlus takes a matrix of binary haplotypes as input and outputs a list of tree topologies and the TMRCA for each SNP site.

CAUTION :
RentPlus assumes the infinite sites model of mutation. If your data contain many recurrent mutations, RentPlus may not work well. In our experience, RentPlus also works better when the SNP sites are closely spaced, so it may perform poorly on very sparse data. Finally, the trees output by RentPlus are rooted. Although the current rooting is improved, users should still be careful about where the root is placed in the tree.

RentPlus is written in Java.

---------------------------------------------------------------------------

SYNOPSIS :

    java -jar RentPlus.jar [<options>] <data-filename> [<ms-file-name>]

OPTIONS : (Optional)

    -t                      estimate branch lengths for local trees
    -h, --help              show hints
    -l <sequence length>    give the sequence length when positions are
                            proportional (between 0 and 1)

DATA FILE :
The first line of the data file contains the SNP site positions, separated by spaces. Positions can be either doubles between 0 and 1 or integers (exact site positions).
Note: exact site positions are required for more accurate branch length estimates.
If no positions line is provided, RentPlus divides the sequence length equally (not recommended).
See the example data file for what RentPlus expects.
The data must consist of the characters 0 and 1 only. Each sequence goes in its own row, with NO white space between columns.
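The DATA FILE rules above can be checked before running RentPlus. The following is a minimal sketch, not part of RentPlus: the class `HaplotypeFileCheck` and its method `check` are hypothetical names introduced here for illustration. It assumes the positions line is present, and it assumes every haplotype row should have exactly one 0/1 character per listed position, which the text above implies but does not state outright.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not shipped with RentPlus): sanity-checks a data file. */
final class HaplotypeFileCheck {

    /** Returns the problems found; an empty list means the lines look well-formed. */
    static List<String> check(List<String> lines) {
        List<String> problems = new ArrayList<>();
        if (lines.size() < 2) {
            problems.add("expected a positions line followed by at least one haplotype row");
            return problems;
        }

        // Assumption: line 1 holds the SNP positions, separated by white space.
        String[] positions = lines.get(0).trim().split("\\s+");
        for (String p : positions) {
            try {
                Double.parseDouble(p); // accepts both integers and doubles
            } catch (NumberFormatException e) {
                problems.add("line 1: position is not a number: '" + p + "'");
            }
        }

        // Every remaining line is one haplotype: only 0/1, no white space.
        for (int i = 1; i < lines.size(); i++) {
            String row = lines.get(i).trim();
            if (!row.matches("[01]+")) {
                problems.add("line " + (i + 1) + ": row must contain only 0 and 1, with no white space");
            } else if (row.length() != positions.length) {
                // Assumption: one column per SNP position.
                problems.add("line " + (i + 1) + ": " + row.length()
                        + " columns, but " + positions.length + " positions");
            }
        }
        return problems;
    }

    public static void main(String[] args) throws IOException {
        List<String> problems = check(Files.readAllLines(Paths.get(args[0])));
        if (problems.isEmpty()) {
            System.out.println("data file looks well-formed");
        } else {
            problems.forEach(System.out::println);
        }
    }
}
```

Once compiled, `java HaplotypeFileCheck <data-filename>` prints each problem on its own line. Files that omit the positions line would need a different check, since this sketch always treats the first line as positions.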

MS FILE : (Optional)
Test the accuracy of the inferred trees by comparing them against a list of true trees (as output by Hudson's program MS with the -T option turned on). This allows each inferred tree to be compared against the TRUE tree.

OUTPUT :
Trees are output in Newick format. The trees (one for each site) are stored in a file named after the input file (<data-filename>.trees).
TMRCAs are reported in <data-filename>.tmrcas if the -t option is used.

EXAMPLE :
An example data set is included. The example file was generated by the program MS with theta=15 and rho=10. For reference, the complete ms file is also included (with the true trees output by MS with -T turned on).

CONTACT :
Please send bug reports and technical questions to Sajad Mirzaei at <[email protected]>.