
TORQUE_PBS SetUp


Working on HPC systems

Below are directions for running ipyrad on a TORQUE-based system (one that uses qsub to connect to compute nodes). There are two main ways of running jobs: interactively or with a submission script. I show both ways below. Currently, connecting to multiple nodes with ipyparallel is complicated, but once ipyparallel 5.0 is available on conda (it has already been released) it will be a cinch. So for now, instead of connecting to multiple nodes, just request one node with many CPUs.
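For example, a single-node interactive request might look like the sketch below; the queue name is a placeholder, and ppn=16 and the walltime are assumptions you should adjust for your cluster:

## request one node with many processors instead of several nodes
qsub -I -q [queue-name] -l nodes=1:ppn=16 -l walltime=24:00:00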

First, install ipyrad locally

## download miniconda
wget https://repo.continuum.io/miniconda/Miniconda-latest-Linux-x86_64.sh

## Install miniconda. Follow the directions; by default it will propose installing
## to your home directory, which should be fine, e.g., `/home/user/miniconda2`.
## When asked yes/no whether to append the miniconda directory to $PATH, say yes.
bash Miniconda-latest-Linux-x86_64.sh

## You could now log out and reconnect, or just run the following command,
## which reloads .bashrc so that miniconda is in your PATH. This way the
## conda program can be found and run by calling conda.
source ~/.bashrc

## upgrade conda
conda upgrade conda

## install ipyrad
conda install -c ipyrad ipyrad

Loading modules

Because ipyrad is installed with conda, all required software is installed locally, including Python 2.7, all required Python modules, and other dependencies such as MPI and HDF5. Because these are local, you don't need to run any `module load` commands to load software installed by the system administrators.
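To confirm that the conda-installed copy is the one that will run, you can check its location; the path shown below assumes the default `/home/user/miniconda2` install directory mentioned above:

## check which ipyrad is on your PATH; it should point into your miniconda install
which ipyrad
## e.g., /home/user/miniconda2/bin/ipyrad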

Connect interactively

Personally, I prefer working interactively (with the qsub arg -I). The example below uses the CLI, but one could similarly open an interactive IPython session and use the API. Because assembling a data set in ipyrad is split into several steps, you can log in to a node with fast access but short wall times to do testing, or to run the fast steps (not step 3) on an average-sized data set; see the example after the code block below.

## connect to the head node
ssh user@cluster

## run the unix command 'screen' so you're able to disconnect & reconnect 
## from compute nodes. This way you don't have to stay connected.
screen

## from the head node connect to compute nodes interactively
## in this example we connect to 3 nodes with 8 processors each
## for a total of 24 processors. You could also use additional -l 
## args to designate wall-time or memory allocation.
qsub -I -q [queue-name] -l nodes=3:ppn=8 -l walltime=24:00:00

## start a long-running ipyrad job. You need to use the --MPI arg
## for ipyrad to connect to processors split across multiple nodes.
## By default it will use all available processors (here, 24); if you
## want to use fewer, designate the number with the -c arg
ipyrad -p params.txt --MPI

## press ctrl-a then d to detach from the screen session while the job is running,
## or stay connected and wait for the job to finish.
## If you detach, you can reconnect later by logging into the
## head node and running the screen re-attach command.
screen -r 
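Because the assembly steps can be run separately, a short-walltime interactive session is a reasonable place to run only the quick steps and save the slow clustering step (step 3) for a long batch job. A minimal sketch, assuming the same params.txt as above:

## run only steps 1-2 interactively; submit the slower steps as a batch job later
ipyrad -p params.txt -s 12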

Submit jobs to run remotely

Create a qsub script using a text editor:

#!/bin/sh 

### Set the job name
#PBS -N ipyrad_test

### Combine standard error and standard out to one file.
#PBS -j oe

### Have PBS mail you results
#PBS -m ae
#PBS -M [your-email]

### Specify the queue name (replace fas_high with a queue on your cluster).
#PBS -q fas_high

### Specify the number of nodes and processors per node.
#PBS -l nodes=2:ppn=8

### Specify wall-time
#PBS -l walltime=72:00:00 

### Run the ipyrad job
ipyrad -p params.txt -s 123 --MPI

Now submit the script with qsub:

qsub qscript.sh
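After submitting, you can monitor the job with standard TORQUE commands; the output file name below assumes the job name set with #PBS -N above:

## list your queued and running jobs
qstat -u $USER

## because of the -j oe directive, stdout and stderr are combined into a single
## file named after the job, e.g., ipyrad_test.o<jobid>, written when the job ends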