This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

Fix nightly CD for python docker image releases #19772

Merged
merged 29 commits into from
Feb 27, 2021
Commits
b67e201  install wget  (mseth10, Jan 21, 2021)
c7ccd0b  test cd docker in ci  (mseth10, Jan 21, 2021)
ec2c30a  install docker  (mseth10, Jan 22, 2021)
5f5a16c  install python3-dev and gcc  (mseth10, Jan 22, 2021)
9c39d5d  remove docker testing from ci  (mseth10, Jan 22, 2021)
20031c5  remove python3-dev  (mseth10, Feb 23, 2021)
9f189a4  ecr target  (mseth10, Feb 23, 2021)
182c855  skip build test  (mseth10, Feb 23, 2021)
8fb412e  adding back python3-dev for make  (mseth10, Feb 23, 2021)
333f50a  remove dynamic and pypi stages for testing  (mseth10, Feb 24, 2021)
21d62ac  install build-essential  (mseth10, Feb 24, 2021)
21a6c5d  install zlib  (mseth10, Feb 24, 2021)
e82f830  update python version  (mseth10, Feb 24, 2021)
20c5c95  update ld library path  (mseth10, Feb 24, 2021)
ae0c8e8  install openssl  (mseth10, Feb 24, 2021)
cf2a505  update test packages for python3.7  (mseth10, Feb 24, 2021)
d093c69  remove call to deleted safe_docker_run.py  (mseth10, Feb 24, 2021)
6cef78e  hardcode region for public ecr repo  (mseth10, Feb 24, 2021)
a89afcf  use deadsnakes to install python  (mseth10, Feb 24, 2021)
b2ef4d1  revert dependency change  (mseth10, Feb 24, 2021)
69e2afb  refactor ecr login  (mseth10, Feb 24, 2021)
580514c  update ecr repo jenkins global var  (mseth10, Feb 24, 2021)
7a43857  cleanup  (mseth10, Feb 24, 2021)
2cbb8a7  update docker authentication  (mseth10, Feb 25, 2021)
8241871  add ecr repo  (mseth10, Feb 25, 2021)
6f18c06  add back pypi and tests  (mseth10, Feb 25, 2021)
8568f87  remove unused libmxnet pipeline  (mseth10, Feb 25, 2021)
a4c11e1  update cu112 base docker  (mseth10, Feb 26, 2021)
ba63c59  update base docker images to ub18  (mseth10, Feb 26, 2021)

40 changes: 14 additions & 26 deletions cd/Jenkinsfile_cd_pipeline
@@ -54,33 +54,21 @@ pipeline {
     stage("MXNet Release") {
       steps {
         script {
-          cd_utils.error_checked_parallel([
-
-            "Static libmxnet based release": {
-              stage("Build") {
-                cd_utils.trigger_release_job(params.CD_RELEASE_JOB_NAME, "Build static libmxnet", "mxnet_lib/static", params.MXNET_VARIANTS)
-              }
-              stage("Releases") {
-                cd_utils.error_checked_parallel([
-                  "PyPI Release": {
-                    echo "Building PyPI Release"
-                    cd_utils.trigger_release_job(params.CD_RELEASE_JOB_NAME, "Release PyPI Packages", "python/pypi", params.MXNET_VARIANTS)
-                  },
-                  "Python Docker Release": {
-                    echo "Building Python Docker Release"
-                    cd_utils.trigger_release_job(params.CD_RELEASE_JOB_NAME, "Release Python Docker Images", "python/docker", params.MXNET_VARIANTS)
-                  }
-                ])
+          stage("Build libmxnet") {
+            cd_utils.trigger_release_job(params.CD_RELEASE_JOB_NAME, "Build libmxnet", "mxnet_lib", params.MXNET_VARIANTS)
+          }
+          stage("Releases") {
+            cd_utils.error_checked_parallel([
+              "PyPI Release": {
+                echo "Building PyPI Release"
+                cd_utils.trigger_release_job(params.CD_RELEASE_JOB_NAME, "Release PyPI Packages", "python/pypi", params.MXNET_VARIANTS)
+              },
+              "Python Docker Release": {
+                echo "Building Python Docker Release"
+                cd_utils.trigger_release_job(params.CD_RELEASE_JOB_NAME, "Release Python Docker Images", "python/docker", params.MXNET_VARIANTS)
               }
-            },
-
-            "Dynamic libmxnet based release": {
-              stage("Build") {
-                cd_utils.trigger_release_job(params.CD_RELEASE_JOB_NAME, "Build dynamic libmxnet", "mxnet_lib/dynamic", params.MXNET_VARIANTS)
-              }
-            }
-
-          ])
+            ])
+          }
         }
       }
     }
5 changes: 2 additions & 3 deletions cd/Jenkinsfile_release_job
@@ -42,7 +42,7 @@ pipeline {
     // Using string instead of choice parameter to keep the changes to the parameters minimal to avoid
     // any disruption caused by different COMMIT_ID values chaning the job parameter configuration on
     // Jenkins.
-    string(defaultValue: "mxnet_lib/static", description: "Pipeline to build", name: "RELEASE_JOB_TYPE")
+    string(defaultValue: "mxnet_lib", description: "Pipeline to build", name: "RELEASE_JOB_TYPE")
     string(defaultValue: "cpu,native,cu101,cu102,cu110,cu112", description: "Comma separated list of variants", name: "MXNET_VARIANTS")
     booleanParam(defaultValue: false, description: 'Whether this is a release build or not', name: "RELEASE_BUILD")
   }
@@ -90,8 +90,7 @@ pipeline {

     // Add new job types here
     def valid_job_types = [
-      "mxnet_lib/static",
-      "mxnet_lib/dynamic",
+      "mxnet_lib",
       "python/pypi",
       "python/docker"
     ]
4 changes: 2 additions & 2 deletions cd/README.md
@@ -59,7 +59,7 @@ The [release job](Jenkinsfile_release_job) takes five parameters:
 * **RELEASE\_JOB\_TYPE**: Defines the release pipeline you want to execute.
 * **COMMIT_ID**: The commit id to build

-The release job executes, in parallel, the release pipeline for each of the variants (**MXNET_VARIANTS**) for the job type (**RELEASE\_JOB\_TYPE**). The job type the path to a directory (relative to the `cd` directory) that includes a `Jenkins_pipeline.groovy` file ([e.g.](mxnet_lib/static/Jenkins_pipeline.groovy)).
+The release job executes, in parallel, the release pipeline for each of the variants (**MXNET_VARIANTS**) for the job type (**RELEASE\_JOB\_TYPE**). The job type the path to a directory (relative to the `cd` directory) that includes a `Jenkins_pipeline.groovy` file ([e.g.](mxnet_lib/Jenkins_pipeline.groovy)).

 NOTE: The **COMMIT_ID** is a little tricky and we must be very careful with it. It is necessary to ensure that the same commit is built through out the pipeline, but at the same time, it has the potential to change the current state of the release job configuration - specifically the parameter configuration. Any changes to this configuration will require a "dry-run" of the release job to ensure Jenkins has the current (master) version. This is acceptable as there will be few changes to the parameter configuration for the job, if any at all. But, it's something to keep in mind.
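
Reviewer sketch (not part of the PR): the README text in this hunk says the job type is a path under `cd/` that contains a `Jenkins_pipeline.groovy` file, and that the variants arrive as a comma-separated list. A minimal shell illustration of that relationship might look like the following. The helper names `pipeline_file_for` and `split_variants` are hypothetical and do not exist in the repository; the real job resolves these in Groovy inside Jenkins.

```sh
#!/bin/sh
# Hypothetical helper: map a RELEASE_JOB_TYPE value to the pipeline file it names.
pipeline_file_for() {
    printf 'cd/%s/Jenkins_pipeline.groovy\n' "$1"
}

# Hypothetical helper: print one variant per line from a comma-separated list.
split_variants() {
    old_ifs=$IFS
    IFS=,
    for variant in $1; do
        printf '%s\n' "$variant"
    done
    IFS=$old_ifs
}

# Example: list the (pipeline file, variant) pairs one release job would cover.
job_type="python/docker"
for variant in $(split_variants "cpu,native,cu112"); do
    echo "$(pipeline_file_for "$job_type") -> ${variant}"
done
```

With the job types left after this PR (`mxnet_lib`, `python/pypi`, `python/docker`), the same helper would resolve `mxnet_lib` to `cd/mxnet_lib/Jenkins_pipeline.groovy`, which is the path the updated README link points at.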

@@ -191,4 +191,4 @@ def test(mxnet_variant) {

 Examples:

-Both the [statically linked libmxnet](mxnet_lib/static/Jenkins_pipeline.groovy) and [dynamically linked libmxnet](mxnet_lib/dynamic/Jenkins_pipeline.groovy) pipelines have long running compilation and testing stages that **do not** require specialized/expensive hardware (e.g. GPUs). Therefore, as much as possible, it is important to run each stage in on its own node, and design the pipeline to spend the least amount of time possible on expensive hardware. E.g. for GPU builds, only run GPU tests on GPU instances, all other stages can be executed on CPU nodes.
+The [libmxnet](mxnet_lib/Jenkins_pipeline.groovy) pipeline has long running compilation and testing stages that **do not** require specialized/expensive hardware (e.g. GPUs). Therefore, as much as possible, it is important to run each stage in on its own node, and design the pipeline to spend the least amount of time possible on expensive hardware. E.g. for GPU builds, only run GPU tests on GPU instances, all other stages can be executed on CPU nodes.
58 changes: 0 additions & 58 deletions cd/mxnet_lib/dynamic/Jenkins_pipeline.groovy

This file was deleted.

19 changes: 7 additions & 12 deletions cd/python/docker/Dockerfile
@@ -23,19 +23,14 @@
 ARG BASE_IMAGE
 FROM ${BASE_IMAGE}

-ARG PYTHON=python3
-ARG PIP=pip3
-ARG PYTHON_VERSION=3.7.9
 RUN apt-get update && \
-    wget -q https://www.python.org/ftp/python/$PYTHON_VERSION/Python-$PYTHON_VERSION.tgz && \
-    tar -xzf Python-$PYTHON_VERSION.tgz && \
-    cd Python-$PYTHON_VERSION && \
-    ./configure --enable-shared --prefix=/usr/local && \
-    make -j $(nproc) && make install && \
-    cd .. && rm -rf ../Python-$PYTHON_VERSION* && \
-    ln -s /usr/local/bin/pip3 /usr/bin/pip && \
-    ln -s /usr/local/bin/$PYTHON /usr/local/bin/python && \
-    ${PIP} --no-cache-dir install --upgrade pip setuptools
+    apt-get install -y software-properties-common && \
+    add-apt-repository -y ppa:deadsnakes/ppa && \
+    apt-get update && \
+    apt-get install -y python3.7-dev python3.7-distutils virtualenv wget && \
+    ln -sf /usr/bin/python3.7 /usr/local/bin/python3 && \
+    wget -nv https://bootstrap.pypa.io/get-pip.py && \
+    python3 get-pip.py

 ARG MXNET_COMMIT_ID
 ENV MXNET_COMMIT_ID=${MXNET_COMMIT_ID}
3 changes: 0 additions & 3 deletions cd/python/docker/Dockerfile.test
@@ -23,9 +23,6 @@
 ARG BASE_IMAGE
 FROM ${BASE_IMAGE}

-# Install test dependencies
-RUN pip install pytest
-
 ARG USER_ID=1001
 ARG GROUP_ID=1001

18 changes: 8 additions & 10 deletions cd/python/docker/python_images.sh
@@ -23,7 +23,7 @@

 set -xe

-usage="Usage: python_images.sh <build|test|publish> MXNET-VARIANT"
+usage="Usage: python_images.sh <build|test|push> MXNET-VARIANT"

 command=${1:?$usage}
 mxnet_variant=${2:?$usage}
@@ -39,8 +39,8 @@ image_name="${repository}:${main_tag}"

 resources_path='cd/python/docker'

-if [ ! -z "${RELEASE_DOCKERHUB_REPOSITORY}" ]; then
-    image_name="${RELEASE_DOCKERHUB_REPOSITORY}/${image_name}"
+if [ ! -z "${RELEASE_PUBLIC_ECR_REPOSITORY}" ]; then
+    image_name="${RELEASE_PUBLIC_ECR_REPOSITORY}/${image_name}"
 fi

 build() {
@@ -57,26 +57,24 @@ test() {

     # Ensure the correct context root is passed in when building - Dockerfile.test expects ci directory
     docker build -t "${test_image_name}" --build-arg USER_ID=`id -u` --build-arg GROUP_ID=`id -g` --build-arg BASE_IMAGE="${image_name}" -f ${resources_path}/Dockerfile.test ./ci
-    python3 ci/safe_docker_run.py ${runtime_param} --cap-add "SYS_PTRACE" -u `id -u`:`id -g` -v `pwd`:/work/mxnet "${test_image_name}" ${resources_path}/test_python_image.sh "${mxnet_variant}"
 }

 push() {
-    if [ -z "${RELEASE_DOCKERHUB_REPOSITORY}" ]; then
-        echo "Cannot publish image without RELEASE_DOCKERHUB_REPOSITORY environment variable being set."
+    if [ -z "${RELEASE_PUBLIC_ECR_REPOSITORY}" ]; then
+        echo "Cannot publish image without RELEASE_PUBLIC_ECR_REPOSITORY environment variable being set."
         exit 1
     fi

-    # The secret name env var is set in the Jenkins configuration
-    # Manage Jenkins -> Configure System
-    python3 ${ci_utils}/docker_login.py --secret-name "${RELEASE_DOCKERHUB_SECRET_NAME}"
+    # Retrieve an authentication token and authenticate Docker client to registry
+    aws ecr-public get-login-password --region us-east-1 | docker login --username AWS --password-stdin public.ecr.aws/w6z5f7h2

     # Push image
     docker push "${image_name}"

     # Iterate over remaining tags, if any
     for ((i=1;i<${#docker_tags[@]};i++)); do
         local docker_tag="${docker_tags[${i}]}"
-        local latest_image_name="${RELEASE_DOCKERHUB_REPOSITORY}/${repository}:${docker_tag}_py3"
+        local latest_image_name="${RELEASE_PUBLIC_ECR_REPOSITORY}/${repository}:${docker_tag}_py3"

         docker tag "${image_name}" "${latest_image_name}"
         docker push "${latest_image_name}"
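
Reviewer sketch (not part of the PR): the tagging logic in `push()` can be previewed without touching a registry. The dry run below only prints the image references that would be pushed. `planned_pushes` is a hypothetical helper, and the registry prefix, repository name and tags in the example call are placeholders rather than values taken from this change. It assumes the main tag is passed in fully formed and that, as in the loop above, each additional tag receives the `_py3` suffix.

```sh
#!/bin/sh
# Dry-run sketch: print the image references a push would publish.
# Arguments: registry prefix, repository, main tag, then zero or more extra tags.
planned_pushes() {
    registry=$1
    repository=$2
    main_tag=$3
    shift 3

    # The primary image, mirroring "${RELEASE_PUBLIC_ECR_REPOSITORY}/${repository}:${main_tag}".
    echo "${registry}/${repository}:${main_tag}"

    # Remaining tags, mirroring the "${docker_tag}_py3" naming used in push().
    for docker_tag in "$@"; do
        echo "${registry}/${repository}:${docker_tag}_py3"
    done
}

# Placeholder values for illustration only.
planned_pushes "public.ecr.aws/example" "python" "1.0.0_cpu_py3" "latest_cpu"
```

Printing the references first makes it easier to confirm that switching the prefix from the Docker Hub variable to `RELEASE_PUBLIC_ECR_REPOSITORY` produces the intended names before any `docker tag` or `docker push` runs.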
12 changes: 6 additions & 6 deletions cd/utils/mxnet_base_image.sh
@@ -22,22 +22,22 @@ mxnet_variant=${1:?"Please specify the mxnet variant as the first parameter"}

 case ${mxnet_variant} in
     cu101*)
-        echo "nvidia/cuda:10.1-cudnn7-runtime-ubuntu16.04"
+        echo "nvidia/cuda:10.1-cudnn7-runtime-ubuntu18.04"
         ;;
     cu102*)
-        echo "nvidia/cuda:10.2-cudnn7-runtime-ubuntu16.04"
+        echo "nvidia/cuda:10.2-cudnn7-runtime-ubuntu18.04"
         ;;
     cu110*)
-        echo "nvidia/cuda:11.0-cudnn8-runtime-ubuntu16.04"
+        echo "nvidia/cuda:11.0-cudnn8-runtime-ubuntu18.04"
         ;;
     cu112*)
-        echo "nvidia/cuda:11.2-cudnn8-runtime-ubuntu16.04"
+        echo "nvidia/cuda:11.2.1-cudnn8-runtime-ubuntu18.04"
         ;;
     cpu)
-        echo "ubuntu:16.04"
+        echo "ubuntu:18.04"
         ;;
     native)
-        echo "ubuntu:16.04"
+        echo "ubuntu:18.04"
         ;;
     *)
         echo "Error: Unrecognized mxnet-variant: '${mxnet_variant}'"
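
Reviewer sketch (not part of the PR): a quick sanity check for the mapping after this change might look like the following. `base_image_for` is a hypothetical function that mirrors the updated case statement; it is not part of `cd/utils/mxnet_base_image.sh`, and it reports unknown variants on stderr, which is a choice of this sketch rather than the script's behaviour. The loop confirms that each variant in the default `MXNET_VARIANTS` list resolves to an Ubuntu 18.04 base.

```sh
#!/bin/sh
# Hypothetical mirror of the post-change variant -> base image mapping.
base_image_for() {
    case "$1" in
        cu101*) echo "nvidia/cuda:10.1-cudnn7-runtime-ubuntu18.04" ;;
        cu102*) echo "nvidia/cuda:10.2-cudnn7-runtime-ubuntu18.04" ;;
        cu110*) echo "nvidia/cuda:11.0-cudnn8-runtime-ubuntu18.04" ;;
        cu112*) echo "nvidia/cuda:11.2.1-cudnn8-runtime-ubuntu18.04" ;;
        cpu|native) echo "ubuntu:18.04" ;;
        *)
            echo "Error: Unrecognized mxnet-variant: '$1'" >&2
            return 1
            ;;
    esac
}

# Check every default variant and stop at the first one not based on 18.04.
for variant in cpu native cu101 cu102 cu110 cu112; do
    image=$(base_image_for "$variant") || exit 1
    case "$image" in
        *18.04) echo "ok   ${variant} -> ${image}" ;;
        *)      echo "FAIL ${variant} -> ${image}"; exit 1 ;;
    esac
done
```

A check of this kind would also have flagged the one asymmetric entry in the hunk: `cu112` moves to the `11.2.1` image tag while the other CUDA variants keep their existing version strings.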