Add Ubuntu 24.04 support for Docker builds #35386
Conversation
Code Review
The pull request successfully adds support for Ubuntu 24.04 across Docker builds, including new build arguments, parameterized base images, and updated release pipeline steps. The fix for the EXTERNALLY-MANAGED pip issue is crucial for compatibility with newer Ubuntu versions. However, there's an inconsistency in the .buildkite/release-pipeline.yaml regarding the BUILD_BASE_IMAGE for CUDA 13.0 Ubuntu 24.04 builds.
docker/Dockerfile
&& rm -f /usr/lib/python${PYTHON_VERSION}/EXTERNALLY-MANAGED \
&& rm -rf /usr/lib/python3/dist-packages/pip /usr/lib/python3/dist-packages/pip-*.dist-info \
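The deleted marker is the PEP 668 "externally managed environment" flag: while it exists under the interpreter's lib directory, `pip install` into the system Python refuses to run on Ubuntu 23.04+. A minimal sketch of what the `rm -f` accomplishes, simulated in a scratch directory rather than the real system path:

```shell
# Simulate the Dockerfile fix in a temp dir (the real path is
# /usr/lib/python${PYTHON_VERSION}/EXTERNALLY-MANAGED).
sysroot="$(mktemp -d)"
touch "$sysroot/EXTERNALLY-MANAGED"   # Ubuntu 24.04 ships this PEP 668 marker

rm -f "$sysroot/EXTERNALLY-MANAGED"   # the fix: delete the marker

if [ -e "$sysroot/EXTERNALLY-MANAGED" ]; then
  echo "system pip installs blocked"
else
  echo "system pip installs allowed"
fi
```

`rm -f` also makes the step a no-op on older Ubuntu releases where the marker never existed.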
.buildkite/release-pipeline.yaml
queue: arm64_cpu_queue_postmerge
commands:
- "aws ecr-public get-login-password --region us-east-1 | docker login --username AWS --password-stdin public.ecr.aws/q9t5s3a7"
- "DOCKER_BUILDKIT=1 docker build --build-arg max_jobs=16 --build-arg USE_SCCACHE=1 --build-arg GIT_REPO_CHECK=1 --build-arg CUDA_VERSION=13.0.1 --build-arg UBUNTU_VERSION=24.04 --build-arg GDRCOPY_OS_VERSION=Ubuntu24_04 --build-arg torch_cuda_arch_list='8.7 8.9 9.0 10.0+PTX 12.0 12.1' --build-arg INSTALL_KV_CONNECTORS=true --build-arg BUILD_BASE_IMAGE=nvidia/cuda:13.0.1-devel-ubuntu22.04 --tag public.ecr.aws/q9t5s3a7/vllm-release-repo:$BUILDKITE_COMMIT-$(uname -m)-cu130-ubuntu2404 --target vllm-openai --progress plain -f docker/Dockerfile ."
Similar to the x86_64 build, the BUILD_BASE_IMAGE for the aarch64 CUDA 13.0 Ubuntu 24.04 build is explicitly set to nvidia/cuda:13.0.1-devel-ubuntu22.04. It should be updated to the ubuntu24.04 variant for consistency with the UBUNTU_VERSION=24.04 argument, or adjusted to match the intended base-image strategy.
- "DOCKER_BUILDKIT=1 docker build --build-arg max_jobs=16 --build-arg USE_SCCACHE=1 --build-arg GIT_REPO_CHECK=1 --build-arg CUDA_VERSION=13.0.1 --build-arg UBUNTU_VERSION=24.04 --build-arg GDRCOPY_OS_VERSION=Ubuntu24_04 --build-arg torch_cuda_arch_list='8.7 8.9 9.0 10.0+PTX 12.0 12.1' --build-arg INSTALL_KV_CONNECTORS=true --build-arg BUILD_BASE_IMAGE=nvidia/cuda:13.0.1-devel-ubuntu24.04 --tag public.ecr.aws/q9t5s3a7/vllm-release-repo:$BUILDKITE_COMMIT-$(uname -m)-cu130-ubuntu2404 --target vllm-openai --progress plain -f docker/Dockerfile ."
- Add UBUNTU_VERSION build arg to Dockerfile, defaulting to 22.04
- Parameterize FINAL_BASE_IMAGE to use UBUNTU_VERSION
- Fix pip EXTERNALLY-MANAGED issue for newer Ubuntu versions
- Add Ubuntu 24.04 build targets in docker-bake.hcl
- Add Ubuntu 24.04 release pipeline steps for x86_64 and aarch64 (CUDA 12.9 and 13.0) with multi-arch manifests

Signed-off-by: aasgaonkar <aasgaonkar@nvidia.com>
Update BUILD_BASE_IMAGE from ubuntu22.04 to ubuntu24.04 for the CUDA 13.0 + Ubuntu 24.04 release pipeline steps to be consistent with the UBUNTU_VERSION=24.04 build arg. Signed-off-by: aasgaonkar <aasgaonkar@nvidia.com>
Three bug fixes found during local build testing of Ubuntu 24.04 images:
1. docker/Dockerfile (base stage): Install python${PYTHON_VERSION}-dev before
apt cache cleanup. Ubuntu 24.04 ships cmake 3.28 which requires the
Development.SABIModule component; without Python headers the csrc-build
stage fails with "Could NOT find Python (missing: Python_INCLUDE_DIRS
Development.SABIModule)". The install is best-effort (|| true) so it
silently no-ops on Ubuntu 20.04/22.04 where the package is not in the
default repos.
2. docker/Dockerfile (vllm-base stage): Remove python3-pip from apt deps.
On Ubuntu 24.04, apt installs pip 24.0 without a RECORD file, causing
get-pip.py to fail with "Cannot uninstall pip 24.0: no RECORD file was
found". Removing python3-pip from apt lets get-pip.py install pip fresh
with no conflict.
3. .buildkite/release-pipeline.yaml: Add FLASHINFER_AOT_COMPILE=true to
the CUDA 13.0 + Ubuntu 24.04 build steps (x86_64 and aarch64). It was
already set on the CUDA 12.9 + Ubuntu 24.04 steps; without it the
CUDA 13.0 Ubuntu 24.04 images silently fall back to slow JIT compilation
at runtime.
Signed-off-by: aasgaonkar <aasgaonkar@nvidia.com>
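The best-effort install described in fix 1 can be sketched in plain shell (the version value below is only an example; in the Dockerfile `PYTHON_VERSION` is a build arg):

```shell
# Best-effort install: `|| true` turns a failed install (package absent from
# the release's default repos, as on Ubuntu 20.04/22.04) into a no-op, so the
# subsequent apt cache cleanup step still runs.
PYTHON_VERSION=3.12   # example value; a build arg in the real Dockerfile
apt-get install -y "python${PYTHON_VERSION}-dev" || true
install_rc=$?         # always 0 thanks to the || true guard
echo "apt cache cleanup can proceed"
```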
- Add torch_cuda_arch_list='8.7 8.9 9.0 10.0+PTX 12.0' to x86_64 CUDA 12.9 Ubuntu 24.04 release build, matching aarch64 and existing Ubuntu 22.04 builds
- Add torch_cuda_arch_list='8.7 8.9 9.0 10.0+PTX 12.0 12.1' to x86_64 CUDA 13.0 Ubuntu 24.04 release build, matching aarch64 counterpart
- Add FLASHINFER_AOT_COMPILE=true to test-ubuntu2404 and openai-ubuntu2404 docker-bake.hcl targets to match CI pipeline and avoid silent JIT fallback

Signed-off-by: aasgaonkar <aasgaonkar@nvidia.com>
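The docker-bake.hcl change above could look roughly like the following sketch (target name and arg keys are taken from this PR; the surrounding target fields such as `inherits` are omitted and the overall shape is an assumption, not the actual file contents):

```hcl
# Hypothetical docker-bake.hcl fragment for one of the new targets.
target "openai-ubuntu2404" {
  # inherits/contexts/tags omitted; only args named in this PR are shown.
  args = {
    UBUNTU_VERSION         = "24.04"
    GDRCOPY_OS_VERSION     = "Ubuntu24_04"
    FLASHINFER_AOT_COMPILE = "true"   # avoids silent JIT fallback at runtime
  }
}
```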
Purpose
Adds Ubuntu 24.04 as an opt-in build target for vLLM Docker release images, closing #35118.
Changes:
- docker/Dockerfile: Add `UBUNTU_VERSION` ARG (default `22.04`) to parameterize `FINAL_BASE_IMAGE`. Install `python${PYTHON_VERSION}-dev` before apt cache cleanup (required by cmake 3.28 on Ubuntu 24.04 for `Development.SABIModule`). Remove `python3-pip` from the final stage's apt deps to avoid a conflict with `get-pip.py` on Ubuntu 24.04 (pip 24.0 ships without a RECORD file). Remove the `EXTERNALLY-MANAGED` marker to allow pip installs into the system Python.
- docker/docker-bake.hcl: Add `test-ubuntu2404` and `openai-ubuntu2404` targets with `UBUNTU_VERSION=24.04`, `GDRCOPY_OS_VERSION=Ubuntu24_04`, and `FLASHINFER_AOT_COMPILE=true`.
- docker/versions.json: Add `UBUNTU_VERSION` with default `"22.04"`.
- .buildkite/release-pipeline.yaml: Add 4 new release pipeline steps (x86_64 + aarch64 for CUDA 12.9 and 13.0, all with Ubuntu 24.04) and 2 multi-arch manifest steps. CUDA 13.0 steps explicitly pass `BUILD_BASE_IMAGE=nvidia/cuda:13.0.1-devel-ubuntu24.04` since NVIDIA does not publish CUDA 13.x devel images for Ubuntu 20.04 (the Dockerfile default).
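The final-stage pip change can be sketched as a Dockerfile fragment (the apt package list here is abridged and illustrative; only the absence of `python3-pip` and the `get-pip.py` step reflect this PR):

```dockerfile
# vllm-base sketch: python3-pip is deliberately NOT in the apt list, so the
# only pip on the system is the one get-pip.py installs. Installing the apt
# copy first would make get-pip.py fail on Ubuntu 24.04 with
# "Cannot uninstall pip 24.0: no RECORD file was found".
RUN apt-get update -y \
    && apt-get install -y python3 curl \
    && curl -sS https://bootstrap.pypa.io/get-pip.py | python3
```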
Test Plan
Built and tested all 4 Ubuntu 24.04 image variants locally:
CUDA 12.9 + Ubuntu 24.04
docker build --build-arg CUDA_VERSION=12.9.1 --build-arg UBUNTU_VERSION=24.04 \
--build-arg GDRCOPY_OS_VERSION=Ubuntu24_04 --build-arg FLASHINFER_AOT_COMPILE=true \
--target vllm-openai -f docker/Dockerfile .
CUDA 13.0 + Ubuntu 24.04
docker build --build-arg CUDA_VERSION=13.0.1 --build-arg UBUNTU_VERSION=24.04 \
--build-arg GDRCOPY_OS_VERSION=Ubuntu24_04 --build-arg FLASHINFER_AOT_COMPILE=true \
--build-arg BUILD_BASE_IMAGE=nvidia/cuda:13.0.1-devel-ubuntu24.04 \
--target vllm-openai -f docker/Dockerfile .
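For reference, the release image tag the pipeline pushes is composed from the Buildkite commit, the machine architecture, the CUDA series, and the Ubuntu version. A small sketch with placeholder values (the commit hash below is made up; in CI `BUILDKITE_COMMIT` is set by Buildkite and the architecture comes from `uname -m`):

```shell
# Compose the release tag the same way the pipeline commands do:
#   $BUILDKITE_COMMIT-$(uname -m)-cu130-ubuntu2404
BUILDKITE_COMMIT=abc1234   # placeholder value
ARCH=x86_64                # the pipeline uses $(uname -m)
TAG="public.ecr.aws/q9t5s3a7/vllm-release-repo:${BUILDKITE_COMMIT}-${ARCH}-cu130-ubuntu2404"
echo "$TAG"
```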
Test Result
Test: `vllm.entrypoints.openai.api_server` with `facebook/opt-125m`, prompt `"The capital of France is"`, `max_tokens=20`, `temperature=0`. Multi-GPU test uses `--tensor-parallel-size 4` across 4×A100. Ubuntu 22.04 rows verify no regression in existing builds.
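The smoke test described above can be driven with a request like the following against vLLM's OpenAI-compatible completions endpoint (the host/port are assumptions; the curl call is commented out because it needs a running server):

```shell
# Build the request body mirroring the parameters used in the test.
PAYLOAD='{"model": "facebook/opt-125m", "prompt": "The capital of France is", "max_tokens": 20, "temperature": 0}'
echo "$PAYLOAD"
# curl -s http://localhost:8000/v1/completions \
#   -H 'Content-Type: application/json' -d "$PAYLOAD"
```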