JupyterLab image with VisualStudio Code server integrated, based on the jupyter/docker-stacks scipy image, with additional packages and kernels installed for data science and knowledge graphs.
List of features for the images available running on CPU
This is the base image with useful interfaces and libraries for data science preinstalled:
📋️ VisualStudio Code server is installed, and accessible from the JupyterLab Launcher
🐍 Python 3.8 with notebook kernel supporting autocomplete and suggestions (jupyterlab-lsp)
☕️ Java OpenJDK 11 with IJava notebook kernel
🐍 Conda and mamba are installed, each conda environment created will add a new option to create a notebook using this environment in the JupyterLab Launcher (with nb_conda_kernels
). You can create environments using different version of Python if necessary.
🧑💻 ZSH is used by default for the JupyterLab and VisualStudio Code terminals
The following JupyterLab extensions are also installed: jupyterlab-git, jupyterlab-system-monitor, jupyter_bokeh, plotly, jupyterlab-spreadsheet, jupyterlab-drawio.
Extended from ghcr.io/maastrichtu-ids/jupyterlab:latest
, it contains
✨️ SPARQL kernel to query RDF knowledge graphs
✨️ Apache Spark and PySpark are installed for distributed data processing
💎 OpenRefine is installed, and accessible from the JupyterLab Launcher
🦀 Oxigraph SPARQL database
⚡️ Blazegraph SPARQL database
☕️ Java .jar
programs for knowledge graph processing are pre-downloaded in the /opt
folder, such as RDF4J, Apache Jena, OWLAPI, RML mapper.
📈 R kernel
With those docker images, you can optionally provide the URL to a git repository to be automatically cloned in the workspace at the start of the container using the environment variable GIT_URL
The following files will be automatically installed if they are present at the root of the provided Git repository:
- The conda environment described in
environment.yml
will be installed, make sure you addedipykernel
andnb_conda_kernels
to theenvironment.yml
to be able to easily start notebooks using this environment from the JupyterLab Launcher page. See this repository as example. - The python packages in
requirements.txt
will be installed withpip
- The debian packages in
packages.txt
will be installed withapt-get
- The JupyterLab extensions in
extensions.txt
will be installed withjupyter labextension
You can also create a conda environment from a file in a running JupyterLab (we use mamba
which is like conda
but faster):
mamba env create -f environment.yml
You'll need to wait a minute before the new conda environment becomes available on the JupyterLab Launcher page.
The easiest way to build a custom image is to extend the existing images.
For notebooks running on CPU, we use images from the official jupyter/docker-stacks, which run as non root user. So you will need to make sure the folders permissions are properly set for the notebook user.
Here is an example Dockerfile
to extend ghcr.io/maastrichtu-ids/jupyterlab:latest
:
FROM ghcr.io/maastrichtu-ids/jupyterlab:latest
# Change to root user to install packages requiring admin privileges:
USER root
RUN apt-get update && \
apt-get install -y vim
RUN fix-permissions /home/$NB_USER
# Switch back to the notebook user for other packages:
USER ${NB_UID}
RUN mamba install -c defaults -y rstudio
RUN pip install jupyter-rsession-proxy
For docker image that are not based on the jupyter/docker-stack, such as the GPU images, you the root user is used by default. See at the further in this README for more information on how to extend GPU images.
For the ghcr.io/maastrichtu-ids/jupyterlab:latest
image volumes should be mounted into /home/jovyan/work
folder.
This command will start JupyterLab as jovyan
user with sudo
privileges, use JUPYTER_TOKEN
to define your password:
docker run --rm -it --user root -p 8888:8888 -e GRANT_SUDO=yes -e JUPYTER_TOKEN=password -v $(pwd)/data:/home/jovyan/work ghcr.io/maastrichtu-ids/jupyterlab
You should now be able to install anything in the JupyterLab container, try:
sudo apt-get update
You can check the docker-compose.yml
file to run it easily with Docker Compose.
Run with a restricted jovyan
user, without sudo
privileges:
docker run --rm -it --user $(id -u) -p 8888:8888 -e CHOWN_HOME=yes -e CHOWN_HOME_OPTS='-R' -e JUPYTER_TOKEN=password -v $(pwd)/data:/home/jovyan/work ghcr.io/maastrichtu-ids/jupyterlab:latest
⚠️ Potential permission issue when running locally. The official jupyter/docker-stacks images use thejovyan
user by default which does not grant admin rights (sudo
). This can cause issues when writing to the shared volumes, to fix it you can change the owner of the folder, or start JupyterLab as root user.To create the folder with the right permissions, replace
1000:100
by your username:group if necessary and run:mkdir -p data/ sudo chown -R 1000:100 data/
Instructions to build the various image aiming to run on CPU.
This repository contains multiple folders with Dockerfile
to build various flavor of JupyterLab for Data Science.
With Python 3.8, conda integration, VisualStudio Code, Java and SPARQL kernels
Build:
docker build -t ghcr.io/maastrichtu-ids/jupyterlab .
Run:
docker run --rm -it --user root -p 8888:8888 -e JUPYTER_TOKEN=password -v $(pwd)/data:/home/jovyan/work ghcr.io/maastrichtu-ids/jupyterlab
Push:
docker push ghcr.io/maastrichtu-ids/jupyterlab
With Oxigraph and Blazegraph SPARQL database, and additional python/java library for RDF processing:
docker build -f knowledge-graph/Dockerfile -t ghcr.io/maastrichtu-ids/jupyterlab:knowledge-graph .
docker run --rm -it -p 8888:8888 -e JUPYTER_TOKEN=password ghcr.io/maastrichtu-ids/jupyterlab:knowledge-graph
With a python2.7 kernel only (python3 not installed). Build and run (workdir is /root
):
docker build -t ghcr.io/maastrichtu-ids/jupyterlab:python2.7 ./python2.7
docker run --rm -it -p 8888:8888 -e JUPYTER_TOKEN=password ghcr.io/maastrichtu-ids/jupyterlab:python2.7
Based on https://github.com/bruggerk/ricopili_docker. Build and run (workdir is /root
):
docker build -t ghcr.io/maastrichtu-ids/jupyterlab:ricopili ./ricopili
docker run --rm -it -p 8888:8888 -v $(pwd)/data:/root -e JUPYTER_TOKEN=password ghcr.io/maastrichtu-ids/jupyterlab:ricopili
Built with https://github.com/ReproNim/neurodocker. Build and run (workdir is /root
):
docker build -t ghcr.io/maastrichtu-ids/jupyterlab:fsl ./fsl
docker run --rm -it -p 8888:8888 -v $(pwd)/data:/root -e JUPYTER_TOKEN=password ghcr.io/maastrichtu-ids/jupyterlab:fsl
To deploy JupyterLab on GPU we use the official Nvidia images, we defined the same gpu.dockerfile
to install additional dependencies, such as JupyterLab and VisualStudio Code, with different images from Nvidia:
🗜️ TensorFlow with nvcr.io/nvidia/tensorflow
:
ghcr.io/maastrichtu-ids/jupyterlab:tensorflow
🔥 PyTorch with nvcr.io/nvidia/pytorch
:
ghcr.io/maastrichtu-ids/jupyterlab:pytorch
👁️ CUDA with nvcr.io/nvidia/cuda
:
ghcr.io/maastrichtu-ids/jupyterlab:cuda
Volumes should be mounted into the /workspace/persistent
or /workspace
folder.
The easiest way to build a custom image is to extend the existing images.
Here is an example Dockerfile
to extend ghcr.io/maastrichtu-ids/jupyterlab:tensorflow
based on nvcr.io/nvidia/tensorflow
:
FROM ghcr.io/maastrichtu-ids/jupyterlab:tensorflow
RUN apt-get update && \
apt-get install -y vim
RUN pip install jupyter-tensorboard
You will find here the commands to use to build our different GPU docker images, most of them are using the gpu.dockerfile
Change the build-arg
and run from the root folder of this repository:
docker build --build-arg NVIDIA_IMAGE=nvcr.io/nvidia/cuda:11.4.2-devel-ubuntu20.04 -f gpu.dockerfile -t ghcr.io/maastrichtu-ids/jupyterlab:cuda .
Run an image on http://localhost:8888
docker run --rm -it -p 8888:8888 -e JUPYTER_TOKEN=password -v $(pwd)/data:/workspace/persistent ghcr.io/maastrichtu-ids/jupyterlab:cuda
Change the build-arg
and run from the root folder of this repository:
docker build --build-arg NVIDIA_IMAGE=nvcr.io/nvidia/pytorch:23.03-py3 -f gpu.dockerfile -t ghcr.io/maastrichtu-ids/jupyterlab:pytorch .
Run an image on http://localhost:8888
docker run --rm -it -p 8888:8888 -e JUPYTER_TOKEN=password -v $(pwd)/data:/workspace/persistent ghcr.io/maastrichtu-ids/jupyterlab:pytorch
Change the build-arg
and run from the root folder of this repository:
docker build --build-arg NVIDIA_IMAGE=nvcr.io/nvidia/tensorflow:21.11-tf2-py3 -f gpu.dockerfile -t ghcr.io/maastrichtu-ids/jupyterlab:tensorflow .
Run an image on http://localhost:8888
docker run --rm -it -p 8888:8888 -e JUPYTER_TOKEN=password -v $(pwd)/data:/workspace/persistent ghcr.io/maastrichtu-ids/jupyterlab:tensorflow
This build use a different image, go to the fsl-gpu
folder. And check the README.md
for more details.
Build:
docker build -t ghcr.io/maastrichtu-ids/jupyterlab:fsl-gpu ./fsl-gpu
Run (workdir is /workspace
):
docker run --rm -it -p 8888:8888 -e JUPYTER_TOKEN=password ghcr.io/maastrichtu-ids/jupyterlab:fsl-gpu
This image is compatible with OpenShift and OKD security constraints to run as non root user.
We recommend to use this Helm chart to deploy these JupyterLab images on Kubernetes or OpenShift: https://artifacthub.io/packages/helm/dsri-helm-charts/jupyterlab
If you are working or studying at Maastricht University, you can easily deploy this notebook on the Data Science Research Infrastructure (DSRI) 🌉
Choose which image fits your need: latest, tensorflow, cuda, pytorch, freesurfer, python2.7...
-
Fork this repository.
-
Clone the forked repository
-
Edit the
Dockerfile
for the image you want to improve. Preferably usemamba
orconda
to install new packages, you can also install withapt-get
(need to run as root or withsudo
) andpip
-
Go to the folder and rebuild the
Dockerfile
:
docker build -t jupyterlab -f Dockerfile .
- Run the docker image built on http://localhost:8888 to test it
docker run -it --rm -p 8888:8888 -e JUPYTER_TOKEN=yourpassword jupyterlab
If the built Docker image works well feel free to send a pull request to get your changes merged to the main repository and integrated in the corresponding published Docker image.
You can check the size of the image built in MB:
expr $(docker image inspect ghcr.io/maastrichtu-ids/jupyterlab:latest --format='{{.Size}}') / 1000000