Update docs
Signed-off-by: Simon Zhao <[email protected]>
SimonYansenZhao committed Nov 13, 2024
1 parent 06accbf commit 3d509e0
Showing 3 changed files with 110 additions and 65 deletions.
23 changes: 20 additions & 3 deletions tests/README.md
@@ -63,9 +63,26 @@ GitHub workflows `azureml-unit-tests.yml`, `azureml-cpu-nightly.yml`, `azureml-g

There are three scripts used with each workflow, all of them are located in [ci/azureml_tests](./ci/azureml_tests/):

* `submit_groupwise_azureml_pytest.py`: this script uses parameters in the workflow yml to set up the AzureML environment for testing using the AzureML SDK.
* `run_groupwise_pytest.py`: this script uses pytest to run the tests of the libraries and notebooks. This script runs in an AzureML workspace with the environment created by the script above.
* `test_groups.py`: this script defines the groups of tests. If the tests are part of the unit tests, the total compute time of each group should be less than 15min. If the tests are part of the nightly builds, the total time of each group should be less than 35min.
* [`submit_groupwise_azureml_pytest.py`](./ci/azureml_tests/submit_groupwise_azureml_pytest.py):
this script uses parameters in the workflow yml to set up the
AzureML environment for testing using the AzureML SDK.
* [`run_groupwise_pytest.py`](./ci/azureml_tests/run_groupwise_pytest.py):
this script uses pytest to run the tests of the libraries and
notebooks. This script runs in an AzureML workspace with the
environment created by the script above.
* [`aml_utils.py`](./ci/azureml_tests/aml_utils.py): this script
defines several utility functions using
[the AzureML Python SDK v2](https://learn.microsoft.com/en-us/azure/machine-learning/concept-v2?view=azureml-api-2).
These functions are used by the scripts above to set up the compute and
the environment for the tests on AzureML. For example, the
environment with all dependencies of Recommenders is created by the
function `get_or_create_environment` via the [Dockerfile](../tools/docker/Dockerfile).
More details on Docker support can be found at [tools/docker/README.md](../tools/docker/README.md).
* [`test_groups.py`](./ci/azureml_tests/test_groups.py): this script
defines the groups of tests. If the tests are part of the unit
tests, the total compute time of each group should be less than
15min. If the tests are part of the nightly builds, the total time
of each group should be less than 35min.
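
The time budgets above can be checked with simple arithmetic. The following is a minimal sketch only: `group_within_budget` is a hypothetical helper that does not exist in the repository, and the per-test durations are made-up numbers. It assumes durations are given in whole seconds, so the 15min unit-test budget becomes 900 seconds and the 35min nightly budget becomes 2100 seconds.

```bash
# Hypothetical helper (not part of the repository): succeed only if the
# summed test durations stay strictly below the group's budget.
# Usage: group_within_budget BUDGET_SECONDS DURATION_SECONDS...
group_within_budget() (
    budget_seconds="$1"
    shift
    total=0
    for seconds in "$@"; do
        total=$((total + seconds))
    done
    # "Less than" the budget, as the guideline above states
    [ "${total}" -lt "${budget_seconds}" ]
)

# Illustrative durations only: 120s + 300s + 240s = 660s, below 900s
if group_within_budget 900 120 300 240; then
    echo "unit-test group fits the 15min budget"
fi
```

A group that fails this kind of check would need to be split into smaller groups in `test_groups.py`.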

## How to contribute tests to the repository

9 changes: 7 additions & 2 deletions tools/docker/Dockerfile
@@ -89,20 +89,22 @@ RUN ${CONDA_PREFIX}/bin/conda create -n Recommenders -c conda-forge -y \
FROM deps AS final

ARG RECO_LOCAL_DIR="./"
ARG RECO_DIR="/tmp/recommenders"
ARG RECO_DIR="/root/Recommenders"
ARG RECO_GIT_URL="git+https://github.com/recommenders-team/recommenders.git"

# Copy Recommenders into the image
COPY ${RECO_LOCAL_DIR} ${RECO_DIR}

# Install Recommenders and its dependencies
RUN source ${CONDA_PREFIX}/bin/activate && \
conda activate ${ENV_NAME} && \
conda activate Recommenders && \
if [ -z "${GIT_REF}" ]; then \
pip install ${RECO_DIR}${EXTRAS}; \
else \
pip install recommenders${EXTRAS}@${RECO_GIT_URL}@${GIT_REF}; \
fi && \
jupyter notebook --generate-config && \
echo "c.MultiKernelManager.default_kernel_name = 'Recommenders'" >> /root/.jupyter/jupyter_notebook_config.py && \
python -m ipykernel install --user --name Recommenders --display-name "Python (Recommenders)"

# Activate Recommenders Conda environment
@@ -113,3 +115,6 @@ ENV JAVA_LD_LIBRARY_PATH="${JAVA_HOME}/lib/server"
ENV PATH="${CONDA_PREFIX}/envs/Recommenders/bin:${CONDA_PREFIX}/condabin:${PATH}"
ENV CONDA_PREFIX="${CONDA_PREFIX}/envs/Recommenders"
ENV PS1='(Recommenders) \[\]\[\e]0;\u@\h: \w\a\]${debian_chroot:+($debian_chroot)}\u@\h:\w\$ \[\]'

EXPOSE 8888
CMD ["jupyter", "notebook", "--ip=0.0.0.0", "--port=8888", "--no-browser", "--allow-root", "--ServerApp.allow_origin='*'", "--IdentityProvider.token=''"]
143 changes: 83 additions & 60 deletions tools/docker/README.md
@@ -1,92 +1,115 @@
Docker Support
==============
The Dockerfile in this directory will build Docker images with all the dependencies and code needed to run example notebooks or unit tests included in this repository.
The Dockerfile in this directory will build Docker images with all
the dependencies and code needed to run example notebooks or unit
tests included in this repository. It is also used by
* [.devcontainer/devcontainer.json](../../.devcontainer/devcontainer.json)
to build
[VS Code Dev Containers](https://code.visualstudio.com/docs/devcontainers/containers)
that can facilitate the development of Recommenders,
* and [tests/ci/azureml_tests/aml_utils.py](../../tests/ci/azureml_tests/aml_utils.py)
to create the environment in [the testing workflows of Recommenders](../../.github/workflows/).

Multiple environments are supported by using [multistage builds](https://docs.docker.com/develop/develop-images/multistage-build/). In order to efficiently build the Docker images in this way, [Docker BuildKit](https://docs.docker.com/develop/develop-images/build_enhancements/) is necessary.
The following examples show how to build and run the Docker image for CPU, PySpark, and GPU environments.
Multiple environments are supported by using
[multistage builds](https://docs.docker.com/build/building/multi-stage/).
The following examples show how to build and run the Docker image for
CPU, PySpark, and GPU environments.

<i>Note:</i> On some platforms, one needs to manually specify the environment variable for `DOCKER_BUILDKIT`to make sure the build runs well. For example, on a Windows machine, this can be done by the powershell command as below, before building the image
```
$env:DOCKER_BUILDKIT=1
```
Once the container is running you can access Jupyter notebooks at
http://localhost:8888.

<i>Warning:</i> On some platforms using Docker Buildkit interferes with Anaconda environment installation. If you find that the docker build is hanging during Anaconda environment setup stage try building the container without Buildkit enabled.

Once the container is running you can access Jupyter notebooks at http://localhost:8888.

Building and Running with Docker
--------------------------------

See examples below for the case of conda. If you use venv or virtualenv instead, replace `--build-arg VIRTUAL_ENV=conda` with `--build-arg VIRTUAL_ENV=venv` or `--build-arg VIRTUAL_ENV=virtualenv`, respectively.
<details>
<summary><strong><em>CPU environment</em></strong></summary>

```
DOCKER_BUILDKIT=1 docker build -t recommenders:cpu --build-arg ENV=cpu --build-arg VIRTUAL_ENV=conda .
docker run -p 8888:8888 -d recommenders:cpu
```
* **CPU environment**

</details>
```bash
docker build -t recommenders:cpu .
docker run -v ../../examples:/root/examples -p 8888:8888 -d recommenders:cpu
```

<details>
<summary><strong><em>PySpark environment</em></strong></summary>

```
DOCKER_BUILDKIT=1 docker build -t recommenders:pyspark --build-arg ENV=pyspark --build-arg VIRTUAL_ENV=conda .
docker run -p 8888:8888 -d recommenders:pyspark
```
* **PySpark environment**

</details>
```bash
docker build -t recommenders:pyspark --build-arg EXTRAS=[spark] .
docker run -v ../../examples:/root/examples -p 8888:8888 -d recommenders:pyspark
```

<details>
<summary><strong><em>GPU environment</em></strong></summary>
* **GPU environment**

```
DOCKER_BUILDKIT=1 docker build -t recommenders:gpu --build-arg ENV=gpu --build-arg VIRTUAL_ENV=conda .
docker run --runtime=nvidia -p 8888:8888 -d recommenders:gpu
```
```bash
docker build -t recommenders:gpu --build-arg COMPUTE=gpu .
docker run --runtime=nvidia -v ../../examples:/root/examples -p 8888:8888 -d recommenders:gpu
```

</details>

<details>
<summary><strong><em>GPU + PySpark environment</em></strong></summary>
* **GPU + PySpark environment**

```
DOCKER_BUILDKIT=1 docker build -t recommenders:full --build-arg ENV=full --build-arg VIRTUAL_ENV=conda .
docker run --runtime=nvidia -p 8888:8888 -d recommenders:full
```
```bash
docker build -t recommenders:gpu-pyspark --build-arg COMPUTE=gpu --build-arg EXTRAS=[gpu,spark] .
docker run --runtime=nvidia -v ../../examples:/root/examples -p 8888:8888 -d recommenders:gpu-pyspark
```

</details>

Build Arguments
---------------

There are several build arguments which can change how the image is built. Similar to the `ENV` build argument these are specified during the docker build command.
There are several build arguments which can change how the image is
built. They are specified with `--build-arg` during the `docker build`
command.

Build Arg|Description|
---------|-----------|
ENV|Environment to use, options: cpu, pyspark, gpu, full (defaults to cpu)|
VIRTUAL_ENV|Virtual environment to use; mandatory argument, must be one of "conda", "venv", "virtualenv"|
ANACONDA|Anaconda installation script (defaults to miniconda3 4.6.14)|
`COMPUTE`|Compute to use, options: `cpu`, `gpu` (defaults to `cpu`)|
`EXTRAS`|Extra dependencies to use, options: `dev`, `gpu`, `spark` (defaults to none ("")); For example, `[gpu,spark]`|
`GIT_REF`|Git ref of Recommenders to install, options: `main`, `staging`, etc. (defaults to `main`); an empty value installs the local clone copied into the image|
`JDK_VERSION`|OpenJDK version to use (defaults to `21`)|
`PYTHON_VERSION`|Python version to use (defaults to `3.11`)|
`RECO_DIR`|Path to the copy of Recommenders in the container when `GIT_REF` is empty (defaults to `/root/Recommenders`)|
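
How `GIT_REF`, `EXTRAS`, and `RECO_DIR` combine into the argument passed to `pip install` can be sketched in shell. This is an illustration only: `pip_target` is a hypothetical function name, and the logic is a restatement of the install step in the Dockerfile, assuming the default values listed in the table.

```bash
# Hypothetical helper (illustration only) mirroring the Dockerfile's
# choice of pip install target.
# Usage: pip_target GIT_REF EXTRAS [RECO_DIR]
pip_target() (
    git_ref="$1"
    extras="$2"
    reco_dir="${3:-/root/Recommenders}"
    git_url="git+https://github.com/recommenders-team/recommenders.git"
    if [ -z "${git_ref}" ]; then
        # Empty GIT_REF: install from the copy of the repository in the image
        printf '%s%s\n' "${reco_dir}" "${extras}"
    else
        # Non-empty GIT_REF: install from GitHub at that ref
        printf 'recommenders%s@%s@%s\n' "${extras}" "${git_url}" "${git_ref}"
    fi
)

pip_target staging "[spark]"
# prints recommenders[spark]@git+https://github.com/recommenders-team/recommenders.git@staging
pip_target "" "[dev]"
# prints /root/Recommenders[dev]
```

The extras are quoted in the calls because some shells, zsh for example, treat unquoted square brackets as a glob pattern.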

Examples:
* Install Python 3.10 and the Recommenders package from the staging branch.

```bash
docker build -t recommenders:staging --build-arg GIT_REF=staging --build-arg PYTHON_VERSION=3.10 .
```

* Install the current local clone of Recommenders and its extra 'dev' dependencies.

Example:
```bash
# Go to the root directory of Recommenders to copy the local clone into the Docker image
cd ../../
docker build -t recommenders:dev --build-arg GIT_REF= --build-arg EXTRAS=[dev] -f tools/docker/Dockerfile .
```

```
DOCKER_BUILDKIT=1 docker build -t recommenders:cpu --build-arg ENV=cpu --build-arg VIRTUAL_ENV=conda .
```
To see detailed progress, pass the `--progress=plain` flag to the
build command.

In order to see detailed progress with BuildKit you can provide a flag during the build command: ```--progress=plain```

Running tests with docker
Running tests with Docker
-------------------------

To run the tests using e.g. the CPU image, do the following:
```
docker run -it recommenders:cpu bash -c 'pip install pytest; \
pip install pytest-cov; \
pip install pytest-mock; \
apt-get install -y git; \
git clone https://github.com/recommenders-team/recommenders.git; \
cd recommenders; \
pytest tests/unit -m "not spark and not gpu and not notebooks and not experimental"'
```
* Run the tests using the `recommenders:cpu` image built above.
NOTE: The `recommenders:cpu` image only installs the Recommenders
package under [../../recommenders/](../../recommenders/).

```bash
docker run -it recommenders:cpu bash -c 'pip install pytest; \
pip install pytest-cov; \
pip install pytest-mock; \
apt-get install -y git; \
git clone https://github.com/recommenders-team/recommenders.git; \
cd recommenders; \
pytest tests/unit -m "not spark and not gpu and not notebooks and not experimental"'
```

* Run the tests using the `recommenders:dev` image built above.
NOTE: The `recommenders:dev` image has a full copy of your local
Recommenders repository.

```bash
docker run -it recommenders:dev bash -c 'cd Recommenders; \
pytest tests/unit -m "not spark and not gpu and not notebooks and not experimental"'
```
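
The `-m` expression in the commands above excludes tests by marker. As a rough sketch, the expression can be assembled from a list of marker names. `marker_expr` is a hypothetical helper, not something the repository provides, and which markers to exclude for images other than `recommenders:cpu` is left as an assumption for the reader to decide.

```bash
# Hypothetical helper (illustration only): join marker names into a
# pytest -m expression that excludes each of them.
# Usage: marker_expr MARKER...
marker_expr() (
    result=""
    for marker in "$@"; do
        if [ -z "${result}" ]; then
            result="not ${marker}"
        else
            result="${result} and not ${marker}"
        fi
    done
    printf '%s\n' "${result}"
)

marker_expr spark gpu notebooks experimental
# prints not spark and not gpu and not notebooks and not experimental
```

The result can be passed to pytest as `pytest tests/unit -m "$(marker_expr spark gpu notebooks experimental)"`.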
