When you say 'jump', jobservant says 'how many cores do you want?'

cwant/jobservant

jobservant

This package submits jobs to HPC clusters and monitors their progress from Python.

TL;DR

Check out the Jupyter notebook demos in the demo directory. You should run your notebooks inside an SSH agent with keys loaded, and you'll need to have the python packages paramiko and python-i18n[YAML] installed. There are two environment variables you might want to set before running your notebooks:

  • JOBSERVANT_CLUSTER: the network name of the cluster you are connecting to.
  • JOBSERVANT_ACCOUNTING_GROUP: the Slurm accounting group you will be using to submit jobs.

The individual notebooks will outline any additional requirements that may be needed.

My setup

jobservant relies on a package called paramiko to handle SSH communication between the remote cluster and your computer. For this to work, we need to set up some SSH keys and make them available through an SSH agent. Here are pretty much all of the steps that I would take (using Linux/Mac) to get the whole thing working. This is not the only way to get things working, but it's the way I would set things up.

SSH setup

  • Create an SSH key if you don't already have one. This is done with the command ssh-keygen -t rsa. By default, this will create a private key in the file ~/.ssh/id_rsa and a public key in ~/.ssh/id_rsa.pub. For the sake of security, please give your private key a passphrase when prompted to do so (otherwise, if your computer is compromised, intruders may also get access to the remote cluster).
  • Put the public key in the file ~/.ssh/authorized_keys on the remote cluster. That is, put the contents of id_rsa.pub in that file on the remote machine -- DO NOT put your private key (id_rsa) there!
  • Start an SSH agent on your computer. An SSH agent needs a (usually interactive) program to run in. My preference is to run it in tmux, but you could also run it in bash if you like. So we do either ssh-agent tmux or ssh-agent bash at the command line.
  • Add your keys to the agent by running ssh-add in the program your agent is running in (tmux or bash). You will be prompted to supply the passphrase you used when creating your SSH key. Your SSH key is now ready to use, and you can SSH to the remote cluster without a password from within that tmux or bash session.
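Before opening a notebook, it can help to confirm that the current process really is running inside an agent. The following is a minimal sketch, not part of jobservant: the helper names agent_socket_path and loaded_key_count are hypothetical. It relies on the SSH_AUTH_SOCK variable that ssh-agent exports to the program it wraps, and on paramiko's Agent class for listing keys.

```python
import os
import stat


def agent_socket_path(environ=os.environ):
    """Return the SSH agent socket path if one is advertised and exists, else None."""
    path = environ.get("SSH_AUTH_SOCK")
    if not path:
        return None  # no agent advertised in this environment
    try:
        mode = os.stat(path).st_mode
    except OSError:
        return None  # variable is set but the socket is gone (stale agent)
    return path if stat.S_ISSOCK(mode) else None


def loaded_key_count():
    """Count the keys held by the agent; requires paramiko to be installed."""
    import paramiko  # imported lazily so the socket check works without it

    return len(paramiko.Agent().get_keys())
```

If agent_socket_path() returns None, the notebook server was probably started outside ssh-agent tmux or ssh-agent bash. If loaded_key_count() returns 0, ssh-add has not been run in that session yet.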

Python virtual environment

  • I currently use python 3.7, but python 3.5 or 3.6 will probably do.
  • Create a python virtual environment on your computer using virtualenv --no-download ~/virtualenv-jobservant
  • Activate the virtual environment with source ~/virtualenv-jobservant/bin/activate. (Your command line prompt should change.) All packages you install will now be local to the virtual environment (not installed globally), and the virtual environment will stay active until you issue the command deactivate (don't do that now though).
  • Install some packages: pip3 install jupyter paramiko
  • The higher-level Jupyter code has internationalization (i18n) support, and depends on the python-i18n package with YAML support, so install that with pip3 install 'python-i18n[YAML]' (quoted so that shells such as zsh do not interpret the square brackets).
  • For the K-Means clustering demo, you might want to install a few additional packages: pip3 install numpy pandas matplotlib scikit-learn.
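To check that the installs above landed in the active virtual environment, a short script can look for each package without importing it. This is a sketch under assumptions: missing_packages and REQUIRED are hypothetical names, and the list uses import names rather than pip names (python-i18n is imported as i18n, PyYAML as yaml, and the notebook package is what pip3 install jupyter pulls in for the notebook server).

```python
import importlib.util

# Import names, not pip names (assumed mapping: python-i18n -> i18n, PyYAML -> yaml).
REQUIRED = ["notebook", "paramiko", "i18n", "yaml"]


def missing_packages(import_names):
    """Return the names from import_names that cannot be found in this environment."""
    return [name for name in import_names if importlib.util.find_spec(name) is None]
```

Running missing_packages(REQUIRED) inside the activated environment should return an empty list; any name it returns still needs a pip3 install.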

Running Jupyter

  • I don't like to hardcode server names or accounting groups in my notebooks, so instead I put them in environment variables (this makes it easier to switch clusters without modifying my notebooks). As mentioned above, the demos use the environment variables JOBSERVANT_CLUSTER and JOBSERVANT_ACCOUNTING_GROUP to set these (or if you really want, you can hardcode these in the notebooks instead -- it's up to you). With this in mind, here is how I start my jupyter notebook server: JOBSERVANT_CLUSTER='foo.whatever.org' JOBSERVANT_ACCOUNTING_GROUP='gabba-gabba-hey' jupyter notebook
  • A browser window should open for you. Click on the demo directory and check things out.
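Inside a notebook, the two variables can be read back with the standard library. The sketch below is illustrative only and may differ from what the demo notebooks actually do; read_cluster_settings is a hypothetical helper that fails early with a readable message when a variable is unset.

```python
import os


def read_cluster_settings(environ=os.environ):
    """Return (cluster, accounting_group) from the environment, or raise if unset."""
    settings = []
    for name in ("JOBSERVANT_CLUSTER", "JOBSERVANT_ACCOUNTING_GROUP"):
        value = environ.get(name, "").strip()
        if not value:
            # Fail before any SSH connection is attempted.
            raise RuntimeError("Set {} before starting the notebook server".format(name))
        settings.append(value)
    return tuple(settings)
```

A notebook cell could then start with cluster, accounting_group = read_cluster_settings(), so switching clusters only means restarting the server with different values.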

Code documentation

Expanded documentation coming soon, but currently there are three main classes:

  • ClusterAccount: represents the user's account on an HPC cluster. Depends only on paramiko. From module jobservant.cluster_account.
  • ClusterJob: represents a computational job to be run on an HPC cluster. Owned by a user's account. Depends only on paramiko. From module jobservant.cluster_job.
  • JobPresenter: a class to help interface with job information in a Jupyter notebook. Depends on jupyter and python-i18n[YAML]. From module jobservant.jupyter.job_presenter.
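The module paths listed above can be collected in one place. The sketch below only covers where each class is imported from; it makes no assumptions about constructor arguments or methods, which are not documented here. CLASS_MODULES and load_class are hypothetical names, not part of jobservant.

```python
import importlib

# Class name -> module path, as listed in this README.
CLASS_MODULES = {
    "ClusterAccount": "jobservant.cluster_account",
    "ClusterJob": "jobservant.cluster_job",
    "JobPresenter": "jobservant.jupyter.job_presenter",
}


def load_class(class_name):
    """Import and return the named class, or None if it cannot be imported."""
    try:
        module = importlib.import_module(CLASS_MODULES[class_name])
    except ImportError:
        return None  # jobservant or one of its dependencies is not installed
    return getattr(module, class_name, None)
```

Because JobPresenter pulls in jupyter and python-i18n[YAML], load_class("JobPresenter") may return None in an environment where the two paramiko-only classes load fine.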
