Usage
Since Pyxis is a SPANK plugin, the command-line arguments it introduces are added directly to srun.
$ srun --help
...
      --container-image=[USER@][REGISTRY#]IMAGE[:TAG]|PATH
                              [pyxis] the image to use for the container
                              filesystem. Can be either a docker image given as
                              an enroot URI, or a path to a squashfs file on the
                              remote host filesystem.
      --container-mounts=SRC:DST[:FLAGS][,SRC:DST...]
                              [pyxis] bind mount[s] inside the container. Mount
                              flags are separated with "+", e.g. "ro+rprivate"
      --container-workdir=PATH
                              [pyxis] working directory inside the container
      --container-name=NAME   [pyxis] name to use for saving and loading the
                              container on the host. Unnamed containers are
                              removed after the slurm task is complete; named
                              containers are not. If a container with this name
                              already exists, the existing container is used and
                              the import is skipped.
      --container-save=PATH   [pyxis] Save the container state to a squashfs
                              file on the remote host filesystem.
      --container-mount-home  [pyxis] bind mount the user's home directory.
                              System-level enroot settings might cause this
                              directory to be already-mounted.
      --no-container-mount-home
                              [pyxis] do not bind mount the user's home
                              directory
      --container-remap-root  [pyxis] ask to be remapped to root inside the
                              container. Does not grant elevated system
                              permissions, despite appearances.
      --no-container-remap-root
                              [pyxis] do not remap to root inside the container
      --container-entrypoint  [pyxis] execute the entrypoint from the container
                              image
      --no-container-entrypoint
                              [pyxis] do not execute the entrypoint from the
                              container image
      --container-writable    [pyxis] make the container filesystem writable
      --container-readonly    [pyxis] make the container filesystem read-only
      --container-env=NAME[,NAME...]
                              [pyxis] names of environment variables to preserve
                              from the host environment
--container-image

This argument activates the Pyxis plugin and containerizes the submitted job. If no container registry is specified, the image will be pulled from Docker Hub:
$ srun --container-image=centos grep PRETTY /etc/os-release
PRETTY_NAME="CentOS Linux 8 (Core)"
You can pull the container image from any container registry, as you would with the docker CLI:
$ srun --container-image nvcr.io/nvidia/pytorch:20.03-py3
You can use a squashfs file (from --container-save or enroot export) by passing its path as the argument:
$ srun --container-image ~/ubuntu.sqsh
If this file is on a shared filesystem, this is useful for avoiding pulling the same image on every node of your cluster.
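As a rough sketch, such a file can also be created ahead of time with enroot import (the output path ~/ubuntu.sqsh is just an example):
$ enroot import --output ~/ubuntu.sqsh docker://ubuntu
$ srun --container-image ~/ubuntu.sqsh grep PRETTY /etc/os-release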
--container-mounts

This argument can be used to expose folders or files from the host system to the container. It is similar to the -v (or --mount type=bind) argument of docker run.
For instance, to bind-mount the /mnt folder from the host as /data inside the container:
$ srun --container-image ubuntu --container-mounts /mnt:/data ls /data
Using the same syntax, you can also mount files:
$ srun --container-image ubuntu --container-mounts /etc/os-release:/host/os-release cat /host/os-release
If the source and destination are identical, you can use the short-form with a single path:
$ srun --container-image ubuntu --container-mounts /mnt ls /mnt
You can also use relative paths (using the job's current working directory):
$ srun --container-image ubuntu --container-mounts ./config:/root/config cat /root/config
Finally, you can use additional mount flags such as ro (read-only) to prevent the container from unintentionally modifying content on the host:
$ srun --container-image ubuntu --container-mounts /tmp/config:/root/config:ro sh -c 'echo oops > /root/config'
/usr/bin/sh: 1: cannot create /root/config: Read-only file system
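As noted in the help output, multiple mount flags can be combined with "+"; for example (a sketch, reusing the flags shown in the help text):
$ srun --container-image ubuntu --container-mounts /mnt:/data:ro+rprivate findmnt /data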
--container-name

This argument is used to save the state of the container filesystem in order to reuse it across srun commands. This is similar to docker run --name, and it can be used to run or install additional tools required by the application.
# The file utility is not installed by default.
$ srun --container-image=ubuntu:20.04 which file
srun: error: luna-0173: task 0: Exited with exit code 1
# The following command creates a named container with the name "myubuntu", starting from the ubuntu 20.04 image.
$ srun --container-image=ubuntu:20.04 --container-name=myubuntu sh -c 'apt-get update && apt-get install -y file'
# Use the container filesystem created above; you no longer need to specify --container-image.
$ srun --container-name=myubuntu which file
/usr/bin/file
If you don't need to add anything to the container, you can also combine this argument with a no-op command like true, similar to docker pull and docker create, to prepare the container on all nodes before launching the application.
$ srun --container-image=ubuntu:20.04 --container-name=myubuntu true
If the container is running, --container-name will behave like docker exec. This is particularly useful on the login node of the cluster, combined with --jobid, to join a running container without having to ssh to the compute node:
# From a compute node, or inside a sbatch script
$ srun --container-name=myapp --container-mounts /mnt:/data ./myapp
# From the login node
$ export SLURM_OVERLAP=1 # when using Slurm 20.11
$ srun --jobid=432788 --container-name=myapp findmnt /data
TARGET SOURCE FSTYPE OPTIONS
/data /dev/nvme2n1p2[/mnt] ext4 rw,relatime,errors=remount-ro
As you will land in the same container, this approach can be used to debug or profile your app with gdb, perf_events, strace, etc.
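For instance, a sketch of attaching gdb from the login node (this assumes gdb and pgrep are available inside the container image, and that the application process is named myapp):
$ srun --jobid=432788 --pty --container-name=myapp sh -c 'gdb -p "$(pgrep -x myapp)"'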
--container-save

This argument exports the container filesystem to a squashfs file after the job completes. This file can then be passed to --container-image.
This option is useful to avoid storming a container registry with requests when running a large distributed job, since all the nodes would otherwise pull the image simultaneously (unless the layers are already cached on some nodes).
Instead, you can have a single job pull the container image and save it to a parallel filesystem; all the nodes can then use the squashfs file from this shared filesystem.
$ srun --ntasks=1 --container-image nvcr.io#nvidia/pytorch:20.03-py3 --container-save /lustre/felix/pytorch.sqsh true
$ srun --nodes=128 --container-image /lustre/felix/pytorch.sqsh python train.py
This argument can also be useful to save the state of a container across jobs, which is not possible with --container-name alone.
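For example, a sketch of exporting a named container at the end of a job and reusing it later (assuming --container-save can be combined with --container-name; the path reuses the example path above):
# First job: export the state of the named container once the work is done
$ srun --container-name=myubuntu --container-save /lustre/felix/myubuntu.sqsh true
# Later job: start from the saved state
$ srun --container-image /lustre/felix/myubuntu.sqsh which file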
--container-workdir

By default, the working directory of the job is taken from the container image (WORKDIR in the Dockerfile). This argument is equivalent to docker run --workdir and allows you to override this path.
$ srun --container-image nvcr.io#nvidia/pytorch:20.03-py3 pwd
/workspace
$ srun --container-image nvcr.io/nvidia/pytorch:20.03-py3 --container-workdir /root pwd
/root
--container-remap-root, --no-container-remap-root

These arguments control whether the user will see themselves as UID 0 (root) or as their usual UID inside the container. This feature relies on user namespaces in the Linux kernel, so the container is never granted additional privileges.
$ whoami
fabecassis
$ srun --container-image ubuntu:20.04 --container-remap-root whoami
root
$ srun --container-image ubuntu:20.04 --no-container-remap-root whoami
fabecassis
Being root inside the container is useful to install packages, or in general for any application that expects to be root (e.g. checks that the UID is 0).
Being yourself inside the container is useful for applications that refuse to run as root (like OpenMPI), or scripts that get confused by the sudden change of UID and home directory.
There is a positive and negative form for this option, as the default behavior (when none of these arguments are used) depends on the pyxis configuration. The default behavior is to remap root, hence --container-remap-root is a no-op in this situation.
--container-mount-home, --no-container-mount-home

These arguments control whether the user's home directory should be mounted inside the container. This can also be achieved with --container-mounts.
If you are root inside the container (see the section above), the directory will be mounted at /root:
$ srun --container-image ubuntu --container-mount-home --container-remap-root sh -c 'echo $HOME ; ls $HOME'
/root
...
If your UID is unchanged inside the container, the directory will be mounted at the same location as on the host:
$ srun --container-image ubuntu --container-mount-home --no-container-remap-root sh -c 'echo $HOME ; ls $HOME'
/home/fabecassis
...
Mounting your home directory inside the container is useful if you need to use code or configuration files (such as your .bashrc) stored outside of the container image.
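For example, to check that configuration files from your host home directory are visible inside the container (assuming a .bashrc exists in your home directory):
$ srun --container-image ubuntu --container-mount-home --no-container-remap-root sh -c 'ls -l $HOME/.bashrc'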
There is a positive and negative form for this option, as the default behavior (when none of these arguments are used) depends on the enroot setting ENROOT_MOUNT_HOME. Mounting the home directory by default can create problems (such as the user's .bashrc overriding environment variables from the container), so it is not recommended.
--container-entrypoint, --no-container-entrypoint

These arguments control whether the entrypoint defined in the container image is executed (ENTRYPOINT in the Dockerfile).
--no-container-entrypoint is useful when the container image has an entrypoint that appends arguments to an existing binary. This is the pattern described in the Docker documentation as "allowing that image to be run as though it was that command (and then use CMD as the default flags)". When that's the case, pyxis will fail when starting the container and suggest trying --no-container-entrypoint:
$ srun --container-entrypoint --container-image myimage true
pyxis: importing docker image ...
pyxis: creating container filesystem ...
pyxis: starting container ...
slurmstepd: error: pyxis: container start failed with error code: 1
slurmstepd: error: pyxis: printing contents of log file ...
...
slurmstepd: error: pyxis: couldn't start container
slurmstepd: error: pyxis: if the image has an unusual entrypoint, try using --no-container-entrypoint
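If this happens, the same command can be retried with the entrypoint disabled (reusing the hypothetical image name from the example above):
$ srun --no-container-entrypoint --container-image myimage true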
Entrypoints that follow Docker Hub's consistency recommendation (only wrapping the execution of a command) work fine, so --container-entrypoint can be used to have the entrypoint executed. Some entrypoints serve a useful purpose, such as injecting environment variables or preparing the container filesystem.
There is a positive and negative form for this option, as the default behavior (when none of these arguments are used) depends on the pyxis configuration. The default behavior is to not execute the entrypoint, hence --no-container-entrypoint is a no-op in this situation.