This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

Improve CI caching of docker multi-stage builds #19605

Open
leezu opened this issue Nov 30, 2020 · 0 comments


leezu commented Nov 30, 2020

As per moby/moby#34715 (comment), we should enable Docker BuildKit on CI to improve the caching behavior of multi-stage builds.

On the MXNet side, the following patch may suffice:

diff --git a/ci/Jenkinsfile_docker_cache b/ci/Jenkinsfile_docker_cache
index 5f378b5d6..edad8aacd 100644
--- a/ci/Jenkinsfile_docker_cache
+++ b/ci/Jenkinsfile_docker_cache
@@ -37,7 +37,7 @@ core_logic: {
       ws('workspace/docker_cache') {
         timeout(time: total_timeout, unit: 'MINUTES') {
           utils.init_git()
-          sh "cd ci && python3 ./docker_login.py --secret-name ${env.DOCKERHUB_SECRET_NAME} && docker-compose -f docker/docker-compose.yml pull && docker-compose -f docker/docker-compose.yml build --parallel && COMPOSE_HTTP_TIMEOUT=600 docker-compose -f docker/docker-compose.yml push && docker logout"
+          sh "cd ci && python3 ./docker_login.py --secret-name ${env.DOCKERHUB_SECRET_NAME} && COMPOSE_DOCKER_CLI_BUILD=1 DOCKER_BUILDKIT=1 docker-compose -f docker/docker-compose.yml pull && COMPOSE_DOCKER_CLI_BUILD=1 DOCKER_BUILDKIT=1 docker-compose -f docker/docker-compose.yml build --parallel && COMPOSE_HTTP_TIMEOUT=600 COMPOSE_DOCKER_CLI_BUILD=1 DOCKER_BUILDKIT=1 docker-compose -f docker/docker-compose.yml push && docker logout"
         }
       }
     }
diff --git a/ci/build.py b/ci/build.py
index 1e9e23fad..57d1fa5b2 100755
--- a/ci/build.py
+++ b/ci/build.py
@@ -72,6 +72,8 @@ def build_docker(platform: str, registry: str, num_retries: int, no_cache: bool,
 
     env = os.environ.copy()
     env["DOCKER_CACHE_REGISTRY"] = registry
+    env["COMPOSE_DOCKER_CLI_BUILD"] = "1"
+    env["DOCKER_BUILDKIT"] = "1"
 
     @retry(subprocess.CalledProcessError, tries=num_retries)
     def run_cmd(env=None):
@@ -204,6 +206,8 @@ def load_docker_cache(platform, tag, docker_registry) -> None:
     if docker_registry:
         env = os.environ.copy()
         env["DOCKER_CACHE_REGISTRY"] = docker_registry
+        env["COMPOSE_DOCKER_CLI_BUILD"] = "1"
+        env["DOCKER_BUILDKIT"] = "1"
         cmd = ['docker-compose', '-f', 'docker/docker-compose.yml', 'pull', platform]
         logging.info("Running command: 'DOCKER_CACHE_REGISTRY=%s %s'", docker_registry, ' '.join(cmd))
         check_call(cmd, env=env)
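
The patch sets the same two variables in both `build_docker` and `load_docker_cache`. As a rough sketch only (the helper name `buildkit_env` and its `base_env` parameter are hypothetical and not part of `ci/build.py`), the shared setup could be factored out like this:

```python
import os
from typing import Dict, Mapping, Optional


def buildkit_env(registry: str,
                 base_env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Return a copy of the environment with BuildKit enabled for docker-compose.

    Hypothetical helper: `base_env` exists only so the function can be
    exercised without touching the real process environment.
    """
    env = dict(os.environ if base_env is None else base_env)
    # Same variable the existing code already sets.
    env["DOCKER_CACHE_REGISTRY"] = registry
    # Make docker-compose delegate builds to the docker CLI ...
    env["COMPOSE_DOCKER_CLI_BUILD"] = "1"
    # ... and make the docker CLI build with BuildKit.
    env["DOCKER_BUILDKIT"] = "1"
    return env
```

Both call sites could then use `env = buildkit_env(registry)` in place of the repeated assignments, assuming nothing else in those functions depends on the variables being set inline.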

Before applying the patch on the MXNet side, https://github.com/apache/incubator-mxnet-ci/blob/master/tools/jenkins-slave-creation-unix/conf-ubuntu-cpu-c5/install.sh and https://github.com/apache/incubator-mxnet-ci/blob/master/tools/jenkins-slave-creation-unix/conf-ubuntu-gpu/install.sh need to be modified to install BuildKit from https://github.com/moby/buildkit/releases, and the CI AMI needs to be regenerated.
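
Once the AMI is regenerated, a quick pre-flight check on a CI worker could confirm that the install step took effect. The sketch below assumes the modified `install.sh` puts the `buildkitd` and `buildctl` binaries from the release archive on `PATH`. The function name `missing_buildkit_binaries` is hypothetical, and the `which` parameter is injected only so the check can be tested without a real installation.

```python
import shutil
from typing import Callable, List, Optional, Sequence


def missing_buildkit_binaries(
    names: Sequence[str] = ("buildkitd", "buildctl"),
    which: Callable[[str], Optional[str]] = shutil.which,
) -> List[str]:
    """Return the BuildKit binaries that cannot be found on PATH.

    Assumption: the install script places these binaries on PATH.
    """
    return [name for name in names if which(name) is None]


if __name__ == "__main__":
    missing = missing_buildkit_binaries()
    if missing:
        print("BuildKit binaries not found on PATH: " + ", ".join(missing))
    else:
        print("BuildKit binaries found on PATH")
```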

The main issue with the existing CI cache setup is that on local developer machines, ci/build.py [...] --cache-intermediate won't work for multi-stage builds. In addition, CI does not reuse cached builds for the later stages of a multi-stage build. Switching to BuildKit may therefore speed up CI pipelines that rely on multi-stage builds by a few minutes.
