Make picopili portable #56
… Stephan for my .pl changes; misc bugfixes
This has been the de facto main branch for a few months and has performed well over that time. There have been a number of bug fixes during that period (as is evident from the commit history on this pull request), and it should be fairly stable at this point. At minimum, it is at least as stable as the current As such, merging this upward to |
Update install info to reflect move to portable version (#56). Add contact info.
Fully removes dependency on ricopili (https://github.com/Nealelab/ricopili), and attempts to make picopili more easily portable to other cluster environments.
Primary structural changes
Accomplishing the separation involves some major reworks:
- `~/picopili.conf` separate from `ricopili.conf` (see "Create separate config file for picopili", #35). The general format is kept very consistent with ricopili, but includes new dependencies (e.g. Admixture, Primus).
- `config_pico.pl`. This is directly derived from `rp_config`, including providing default locations where available from previous ricopili installations, but with some streamlining to hopefully require fewer restarts to complete configuration (see below about changing perl dependency structure).
- `get_refs.sh` has been added to manage either linking to local copies (for environments with existing ricopili installs) or downloading them from a hosted copy.
- `blueprint.py` replacing `blueprint_pico.pl` previously migrated from ricopili. A partial command line interface for ricopili-style job submission is maintained for the sake of the `imp_prep.pl` scripts, but most submission now operates through the python function call. Compared to `blueprint_pico.pl`, the code is substantially streamlined to avoid any hard-coding of job submission mechanics. Instead, cluster configuration is managed through a config file and a template job script placed in `./cluster_templates/`. Substantial attention has been given to being a good citizen of clusters that dispatch parallel jobs per-machine rather than per-CPU (e.g. Lisa). Example configurations are provided for Broad and Lisa. More documentation of this system for adding additional clusters is pending.
Features
Unrelated to these portability/structural changes, there is one noteworthy feature update:
- `admix_rel.py` for Admixture 1.3's more principled projection of admixture solutions from the unrelated subset to the full sample, rather than selecting population "exemplars" from the unrelateds to run a supervised solution. This behavior is now the default, but the old approach is still accessible with `--use-exemplars`.
- `admix_rel.py` also now allows starting from an existing admixture solution for unrelated individuals (supplied with `--admix-p` specifying the output `.P` file). This should benefit resubmission of jobs that crash downstream, or cases where an initial admixture solution has already been run in some other context.
Minor changes
Additional minor changes accompanying the structural changes include:
- `Utils.pm` from ricopili, and moving perl dependencies from an environment variable (previously `rp_perlpackages`) to the config.
- `./docs/`
- `.pickle`d submission info) and it's at least a bit more responsive than needing to manually edit the scripts involved.
- `picopili.conf` (implemented as a naive check for an `@`), and is intentionally caught so that the mail system doesn't get invoked with the bad email.
- `imp_prep.pl`, the primary legacy code kept from ricopili, to use local log files rather than interacting with ricopili logs.
- `.py` scripts, to hopefully reduce update bugs caused by omitting required imports/variable initialization.
- `.pl` files) or argparse defaults.
Final note on ricopili
There's no desire to abandon ricopili here, and the connection between the two programs remains clearly documented in `./docs/RICOPILI.md` to recognize that we're building very directly on that previous work.
Removing the formal code dependence should make it easier to share and maintain this project, especially since that was a prerequisite to making picopili portable to environments that don't currently have fully supported ricopili installations.
The separation will probably make some tasks more difficult (e.g. incorporating future updates to ricopili, see #52) but it should be a net improvement, especially for usability.
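As a rough illustration of the template-driven job submission described under "Primary structural changes", the sketch below fills a job-script template from per-cluster settings instead of hard-coding submission mechanics. It is not the actual `blueprint.py` code: the template text, the scheduler directives, the `CLUSTERS` settings, the `render_job` function, and the `run_chunk.py` command are all hypothetical names chosen for the example. The `tasks_per_job` setting is one assumed way to pack work onto clusters that dispatch jobs per-machine rather than per-CPU.

```python
import math
from string import Template

# Hypothetical template text; a real one would be a file under
# ./cluster_templates/. "$$" is string.Template's escape for a literal "$".
JOB_TEMPLATE = Template(
    "#!/bin/bash\n"
    "#$$ -N ${job_name}\n"
    "#$$ -t 1-${n_jobs}\n"
    "${command}\n"
)

# Hypothetical per-cluster settings, standing in for a cluster config file.
CLUSTERS = {
    "per_cpu": {"tasks_per_job": 1},    # scheduler hands out single CPUs
    "per_node": {"tasks_per_job": 16},  # scheduler hands out whole machines
}


def render_job(cluster, job_name, command, n_tasks):
    """Fill the template, packing tasks into jobs as the cluster requires."""
    per_job = CLUSTERS[cluster]["tasks_per_job"]
    n_jobs = math.ceil(n_tasks / per_job)
    return JOB_TEMPLATE.substitute(
        job_name=job_name, n_jobs=n_jobs, command=command
    )


script = render_job("per_node", "imp_chunks", "python run_chunk.py", n_tasks=40)
print(script)
```

With this layout, supporting another cluster would only need a new template and a new settings entry, which is the kind of separation the description above is aiming for.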