Merged
165 changes: 146 additions & 19 deletions docker/Dockerfile
@@ -148,12 +148,36 @@ ARG PYTORCH_CUDA_INDEX_BASE_URL

WORKDIR /workspace

# install build and runtime dependencies
# We can specify the standard or nightly build of PyTorch
ARG PYTORCH_NIGHTLY
Collaborator

Cool. How does this impact image caching?

Contributor Author

Seems to be unchanged. The non-torch-nightly builds look to be caching as expected per @amrmahdi, with rebuild times at ~12 minutes. Unfortunately the torch nightly build artifacts aren't cached at all, but that matches what we had with the separate Dockerfile (before it stopped working), so this is strictly better. I'd like to see if we can enable caching across all the Docker builds (AMD, torch nightly, etc.), but that can be separate from this.

Member

Great work! I think caching should be considered to make the nightly tests practical. A fresh build of vLLM and all its dependencies can easily take over 6 hours on our build agents, so we can hardly afford a fresh rebuild for every PR update. For example, #30908 would very much need the nightly tests.

Contributor Author

Working on caching for these is absolutely a next step. I just added some notes at #30443 (comment) around things to work on once this lands.


# Install build and runtime dependencies, including PyTorch
# Check whether to install torch nightly instead of release for this build
COPY requirements/common.txt requirements/common.txt
COPY requirements/cuda.txt requirements/cuda.txt
COPY use_existing_torch.py use_existing_torch.py
COPY pyproject.toml pyproject.toml
RUN --mount=type=cache,target=/root/.cache/uv \
if [ "${PYTORCH_NIGHTLY}" = "1" ]; then \
echo "Installing torch nightly..." \
&& uv pip install --python /opt/venv/bin/python3 torch torchaudio torchvision --pre \
--index-url ${PYTORCH_CUDA_INDEX_BASE_URL}/nightly/cu$(echo $CUDA_VERSION | cut -d. -f1,2 | tr -d '.') \
&& echo "Installing other requirements..." \
&& /opt/venv/bin/python3 use_existing_torch.py --prefix \
&& uv pip install --python /opt/venv/bin/python3 -r requirements/cuda.txt \
--extra-index-url ${PYTORCH_CUDA_INDEX_BASE_URL}/nightly/cu$(echo $CUDA_VERSION | cut -d. -f1,2 | tr -d '.'); \
else \
uv pip install --python /opt/venv/bin/python3 -r requirements/cuda.txt \
--extra-index-url ${PYTORCH_CUDA_INDEX_BASE_URL}/cu$(echo $CUDA_VERSION | cut -d. -f1,2 | tr -d '.'); \
fi
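
Every install step in this hunk derives its wheel index from `CUDA_VERSION` with `cut -d. -f1,2 | tr -d '.'`, and the nightly branch inserts `nightly/` before the suffix. A minimal Python sketch of that derivation follows; the function names and the `https://example.invalid/whl` base URL are placeholders for illustration, not values taken from the PR:

```python
def cuda_index_suffix(cuda_version: str) -> str:
    """Mirror `cut -d. -f1,2 | tr -d '.'`: keep major and minor, drop the dot."""
    major_minor = cuda_version.split(".")[:2]
    return "cu" + "".join(major_minor)


def torch_index_url(base_url: str, cuda_version: str, nightly: bool) -> str:
    """Build the index URL passed to --index-url / --extra-index-url."""
    channel = "nightly/" if nightly else ""
    return f"{base_url}/{channel}{cuda_index_suffix(cuda_version)}"


if __name__ == "__main__":
    # Placeholder base URL; the real one comes from PYTORCH_CUDA_INDEX_BASE_URL.
    print(torch_index_url("https://example.invalid/whl", "12.8.1", nightly=True))
```

The release and nightly branches differ only in the `nightly/` path segment, which is why the same shell expression is repeated in both arms of each conditional.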

Comment on lines +158 to +173
Contributor

@orionr Thanks for your patience on this. I've been reviewing the changes, and one idea that might simplify them is to install torch once per base and use a constraints file to prevent drift, instead of repeating the version checks.

The Dockerfile has two roots:

  • Build base (base)
  • Runtime base (vllm-base)

The idea is:
In base, install torch (nightly or stable) and capture the versions:

RUN if [ "${PYTORCH_NIGHTLY}" = "1" ]; then \
        uv pip install torch torchvision torchaudio --pre --index-url .../nightly/...; \
    else \
        uv pip install -r requirements/cuda.txt --extra-index-url ...; \
    fi && \
    uv pip freeze | grep -E '^(torch|torchvision|torchaudio)==' > /opt/torch_constraints.txt

In vllm-base, grab the constraints and install the same torch:

COPY --from=base /opt/torch_constraints.txt /opt/torch_constraints.txt
RUN if [ "${PYTORCH_NIGHTLY}" = "1" ]; then \
        uv pip install $(cat /opt/torch_constraints.txt | xargs) --pre --index-url .../nightly/...; \
    else \
        uv pip install $(cat /opt/torch_constraints.txt | xargs) --extra-index-url ...; \
    fi

Then everywhere else we don't need conditionals and version checks, just:
RUN uv pip install -c /opt/torch_constraints.txt -r requirements/build.txt

The constraints file basically becomes the "contract" between build and runtime — guaranteeing they use the same torch.

We could even go a step further and encapsulate the conditional in a small script, making the Dockerfile itself completely free of conditionals.
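
As a rough illustration of the "capture the versions" step, the sketch below does in Python what the proposed `grep -E '^(torch|torchvision|torchaudio)=='` does to the freeze output. The function name and the sample package versions are made up for the example; only the regular expression comes from the comment above:

```python
import re

# Same pattern as the grep -E expression in the proposal.
TORCH_PIN = re.compile(r"^(torch|torchvision|torchaudio)==")


def torch_constraints(freeze_output: str) -> list[str]:
    """Return only the exact torch, torchvision and torchaudio pins."""
    pins = []
    for raw_line in freeze_output.splitlines():
        line = raw_line.strip()
        if TORCH_PIN.match(line):
            pins.append(line)
    return pins


if __name__ == "__main__":
    # Hypothetical freeze output, not taken from a real build.
    sample = "\n".join(
        [
            "numpy==2.1.0",
            "torch==2.9.0.dev20250101+cu128",
            "torchmetrics==1.4.0",
            "torchvision==0.24.0.dev20250101+cu128",
        ]
    )
    print("\n".join(torch_constraints(sample)))
```

Because the pattern requires `==` immediately after the package name, packages that merely start with "torch" (such as the hypothetical `torchmetrics` line) are left out of the constraints file.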

Contributor Author

@orionr Jan 9, 2026

Good feedback. FYI that /opt/torch_constraints.txt is basically the same as torch_lib_versions.txt so we have it, but you are looking to leverage the constraints flag instead of a separate torch lib version check.

I'm actually wondering, given that I'm now appending torch versions to test.in (see uv pip compile requirements/test.in below) this might actually be covering the biggest issue I saw previously, and I should just remove the torch version sections to minimize changes. Let me see if that has tests passing. If so, I'll just do that. If not, I'll see if we can use -c and constraints.
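
The test.in change described here amounts to dropping any existing torch pins and appending the ones recorded in torch_lib_versions.txt before running `uv pip compile`. A small sketch of that transformation, assuming requirement lines are handled as plain strings; the function name and sample versions are hypothetical:

```python
# Prefixes that use_existing_torch.py --prefix strips from requirements files.
TORCH_LIB_PREFIXES = ("torch=", "torchvision=", "torchaudio=")


def pin_torch_in_requirements(requirement_lines, torch_pins):
    """Drop existing torch pins, then append the pins captured in the base stage."""
    kept = [
        line
        for line in requirement_lines
        if not line.lower().strip().startswith(TORCH_LIB_PREFIXES)
    ]
    return kept + list(torch_pins)


if __name__ == "__main__":
    # Hypothetical contents of test.in and torch_lib_versions.txt.
    test_in = ["pytest", "torch==2.8.0", "torchaudio==2.8.0", "torchmetrics"]
    pins = ["torch==2.9.0.dev0", "torchaudio==2.9.0.dev0"]
    for line in pin_torch_in_requirements(test_in, pins):
        print(line)
```

With the pins in the input file, the resolver has to pick the already-installed torch versions, which is the skew protection the separate version checks were providing.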

Contributor

> FYI that /opt/torch_constraints.txt is basically the same as torch_lib_versions.txt

Yup!

> I'm actually wondering, given that I'm now appending torch versions to test.in (see uv pip compile requirements/test.in below) this might actually be covering the biggest issue I saw previously, and I should just remove the torch version sections to minimize changes. Let me see if that has tests passing. If so, I'll just do that. If not, I'll see if we can use -c and constraints.

What about other steps that don't have a .in file?

Contributor Author

Might need the constraints, but let me check.

Contributor Author

I did some digging over the weekend. After making that test.in modification and looking closely at all the other install cases, I feel like a pip constraints check isn't required anywhere (surprise). I also ran the tests and didn't see any skew with the current bits.

With that, I think we're good with just removing the checks, which I've done. Everything built, and the test failures (a small number, not about version skew) are the same as they were with the checks.

Contributor

@orionr the constraints approach could still simplify things. Right now the PR has many conditionals and multiple copies of use_existing_torch.py / torch_lib_versions.txt across stages.

But stages that inherit from base already have torch installed, so they don't need conditionals. With constraints they'd just be:

RUN uv pip install -c /opt/torch_constraints.txt -r requirements/build.txt

Anything I'm missing?

# Track PyTorch lib versions used during build and match in downstream instances.
# We do this for both nightly and release so we can strip dependencies/*.txt as needed.
# Otherwise library dependencies can upgrade/downgrade torch incorrectly.
RUN --mount=type=cache,target=/root/.cache/uv \
uv pip install --python /opt/venv/bin/python3 -r requirements/cuda.txt \
--extra-index-url ${PYTORCH_CUDA_INDEX_BASE_URL}/cu$(echo $CUDA_VERSION | cut -d. -f1,2 | tr -d '.')
uv pip freeze | grep -i "^torch=\|^torchvision=\|^torchaudio=" > torch_lib_versions.txt \
&& TORCH_LIB_VERSIONS=$(cat torch_lib_versions.txt | xargs) \
&& echo "Installed torch libs: ${TORCH_LIB_VERSIONS}"

# CUDA arch list used by torch
# Explicitly set the list to avoid issues with torch 2.2
@@ -171,8 +195,13 @@ ARG PIP_INDEX_URL UV_INDEX_URL
ARG PIP_EXTRA_INDEX_URL UV_EXTRA_INDEX_URL
ARG PYTORCH_CUDA_INDEX_BASE_URL

# install build dependencies
# We can specify the standard or nightly build of PyTorch
ARG PYTORCH_NIGHTLY

# Install build dependencies
COPY requirements/build.txt requirements/build.txt
COPY use_existing_torch.py use_existing_torch.py
COPY --from=base /workspace/torch_lib_versions.txt torch_lib_versions.txt

# This timeout (in seconds) is necessary when installing some dependencies via uv since it's likely to time out
# Reference: https://github.com/astral-sh/uv/pull/1694
@@ -182,8 +211,18 @@ ENV UV_INDEX_STRATEGY="unsafe-best-match"
ENV UV_LINK_MODE=copy

RUN --mount=type=cache,target=/root/.cache/uv \
uv pip install --python /opt/venv/bin/python3 -r requirements/build.txt \
--extra-index-url ${PYTORCH_CUDA_INDEX_BASE_URL}/cu$(echo $CUDA_VERSION | cut -d. -f1,2 | tr -d '.')
if [ "${PYTORCH_NIGHTLY}" = "1" ]; then \
echo "Installing build requirements without torch..." \
&& python3 use_existing_torch.py --prefix \
&& uv pip install --python /opt/venv/bin/python3 -r requirements/build.txt \
&& echo "Installing torch nightly..." \
&& uv pip install --python /opt/venv/bin/python3 $(cat torch_lib_versions.txt | grep -i "^torch=" | xargs) --pre \
--index-url ${PYTORCH_CUDA_INDEX_BASE_URL}/nightly/cu$(echo $CUDA_VERSION | cut -d. -f1,2 | tr -d '.'); \
else \
echo "Installing build requirements..." \
&& uv pip install --python /opt/venv/bin/python3 -r requirements/build.txt \
--extra-index-url ${PYTORCH_CUDA_INDEX_BASE_URL}/cu$(echo $CUDA_VERSION | cut -d. -f1,2 | tr -d '.'); \
fi

WORKDIR /workspace

@@ -215,6 +254,13 @@ ARG VLLM_MAIN_CUDA_VERSION=""
# Use dummy version for csrc-build wheel (only .so files are extracted, version doesn't matter)
ENV SETUPTOOLS_SCM_PRETEND_VERSION="0.0.0+csrc.build"

# Use existing torch for nightly builds
RUN --mount=type=cache,target=/root/.cache/uv \
if [ "${PYTORCH_NIGHTLY}" = "1" ]; then \
python3 use_existing_torch.py --prefix; \
fi

# Build the vLLM wheel
# if USE_SCCACHE is set, use sccache to speed up compilation
RUN --mount=type=cache,target=/root/.cache/uv \
if [ "$USE_SCCACHE" = "1" ]; then \
@@ -258,6 +304,7 @@ RUN --mount=type=cache,target=/root/.cache/ccache \
export VLLM_DOCKER_BUILD_CONTEXT=1 && \
python3 setup.py bdist_wheel --dist-dir=dist --py-limited-api=cp38; \
fi

#################### CSRC BUILD IMAGE ####################

#################### EXTENSIONS BUILD IMAGE ####################
@@ -314,8 +361,13 @@ ARG PIP_INDEX_URL UV_INDEX_URL
ARG PIP_EXTRA_INDEX_URL UV_EXTRA_INDEX_URL
ARG PYTORCH_CUDA_INDEX_BASE_URL

# install build dependencies
# We can specify the standard or nightly build of PyTorch
ARG PYTORCH_NIGHTLY

# Install build dependencies
COPY requirements/build.txt requirements/build.txt
COPY use_existing_torch.py use_existing_torch.py
COPY --from=base /workspace/torch_lib_versions.txt torch_lib_versions.txt

# This timeout (in seconds) is necessary when installing some dependencies via uv since it's likely to time out
# Reference: https://github.com/astral-sh/uv/pull/1694
@@ -325,14 +377,23 @@ ENV UV_INDEX_STRATEGY="unsafe-best-match"
ENV UV_LINK_MODE=copy

RUN --mount=type=cache,target=/root/.cache/uv \
uv pip install --python /opt/venv/bin/python3 -r requirements/build.txt \
--extra-index-url ${PYTORCH_CUDA_INDEX_BASE_URL}/cu$(echo $CUDA_VERSION | cut -d. -f1,2 | tr -d '.')
if [ "${PYTORCH_NIGHTLY}" = "1" ]; then \
echo "Installing build requirements without torch..." \
&& python3 use_existing_torch.py --prefix \
&& uv pip install --python /opt/venv/bin/python3 -r requirements/build.txt \
&& echo "Installing torch nightly..." \
&& uv pip install --python /opt/venv/bin/python3 $(cat torch_lib_versions.txt | grep -i "^torch=" | xargs) --pre \
--index-url ${PYTORCH_CUDA_INDEX_BASE_URL}/nightly/cu$(echo $CUDA_VERSION | cut -d. -f1,2 | tr -d '.'); \
else \
echo "Installing build requirements..." \
&& uv pip install --python /opt/venv/bin/python3 -r requirements/build.txt \
--extra-index-url ${PYTORCH_CUDA_INDEX_BASE_URL}/cu$(echo $CUDA_VERSION | cut -d. -f1,2 | tr -d '.'); \
fi

WORKDIR /workspace

# Copy pre-built csrc wheel directly
COPY --from=csrc-build /workspace/dist /precompiled-wheels

COPY . .

ARG GIT_REPO_CHECK=0
@@ -345,6 +406,13 @@ ENV VLLM_TARGET_DEVICE=${vllm_target_device}
# Skip adding +precompiled suffix to version (preserves git-derived version)
ENV VLLM_SKIP_PRECOMPILED_VERSION_SUFFIX=1

# Use existing torch for nightly builds
RUN --mount=type=cache,target=/root/.cache/uv \
if [ "${PYTORCH_NIGHTLY}" = "1" ]; then \
python3 use_existing_torch.py --prefix; \
fi

# Build the vLLM wheel
RUN --mount=type=cache,target=/root/.cache/uv \
--mount=type=bind,source=.git,target=.git \
if [ "${vllm_target_device}" = "cuda" ]; then \
@@ -367,7 +435,8 @@ RUN if [ "$RUN_WHEEL_CHECK" = "true" ]; then \
else \
echo "Skipping wheel size check."; \
fi
#################### EXTENSION Build IMAGE ####################

#################### WHEEL BUILD IMAGE ####################

#################### DEV IMAGE ####################
FROM base AS dev
@@ -385,12 +454,34 @@ ENV UV_LINK_MODE=copy

# Install libnuma-dev, required by fastsafetensors (fixes #20384)
RUN apt-get update && apt-get install -y --no-install-recommends libnuma-dev && rm -rf /var/lib/apt/lists/*


# We can specify the standard or nightly build of PyTorch
ARG PYTORCH_NIGHTLY

# Install development dependencies
COPY requirements/lint.txt requirements/lint.txt
COPY requirements/test.in requirements/test.in
COPY requirements/test.txt requirements/test.txt
COPY requirements/dev.txt requirements/dev.txt
COPY use_existing_torch.py use_existing_torch.py
COPY --from=base /workspace/torch_lib_versions.txt torch_lib_versions.txt
RUN --mount=type=cache,target=/root/.cache/uv \
uv pip install --python /opt/venv/bin/python3 -r requirements/dev.txt \
--extra-index-url ${PYTORCH_CUDA_INDEX_BASE_URL}/cu$(echo $CUDA_VERSION | cut -d. -f1,2 | tr -d '.')
if [ "${PYTORCH_NIGHTLY}" = "1" ]; then \
echo "Installing dev requirements plus torch nightly..." \
&& python3 use_existing_torch.py --prefix \
&& cat torch_lib_versions.txt >> requirements/test.in \
&& uv pip compile requirements/test.in -o requirements/test.txt --index-strategy unsafe-best-match \
--extra-index-url ${PYTORCH_CUDA_INDEX_BASE_URL}/nightly/cu$(echo $CUDA_VERSION | cut -d. -f1,2 | tr -d '.') \
&& uv pip install --python /opt/venv/bin/python3 $(cat torch_lib_versions.txt | xargs) --pre \
-r requirements/dev.txt \
--extra-index-url ${PYTORCH_CUDA_INDEX_BASE_URL}/nightly/cu$(echo $CUDA_VERSION | cut -d. -f1,2 | tr -d '.'); \
else \
echo "Installing dev requirements..." \
&& uv pip install --python /opt/venv/bin/python3 -r requirements/dev.txt \
--extra-index-url ${PYTORCH_CUDA_INDEX_BASE_URL}/cu$(echo $CUDA_VERSION | cut -d. -f1,2 | tr -d '.'); \
fi

#################### DEV IMAGE ####################
#################### vLLM installation IMAGE ####################
# image with vLLM installed
@@ -548,11 +639,26 @@ ARG PIP_EXTRA_INDEX_URL UV_EXTRA_INDEX_URL
ARG PYTORCH_CUDA_INDEX_BASE_URL
ARG PIP_KEYRING_PROVIDER UV_KEYRING_PROVIDER

# Install vllm wheel first, so that torch etc will be installed.
# We can specify the standard or nightly build of PyTorch
ARG PYTORCH_NIGHTLY

# Install vLLM wheel first, so that torch etc will be installed.
# Check whether to install torch nightly instead of release for this build.
COPY --from=base /workspace/torch_lib_versions.txt torch_lib_versions.txt
RUN --mount=type=bind,from=build,src=/workspace/dist,target=/vllm-workspace/dist \
--mount=type=cache,target=/root/.cache/uv \
uv pip install --system dist/*.whl --verbose \
--extra-index-url ${PYTORCH_CUDA_INDEX_BASE_URL}/cu$(echo $CUDA_VERSION | cut -d. -f1,2 | tr -d '.')
if [ "${PYTORCH_NIGHTLY}" = "1" ]; then \
echo "Installing torch nightly..." \
&& uv pip install --system $(cat torch_lib_versions.txt | xargs) --pre \
--index-url ${PYTORCH_CUDA_INDEX_BASE_URL}/nightly/cu$(echo $CUDA_VERSION | cut -d. -f1,2 | tr -d '.') \
&& echo "Installing vLLM..." \
&& uv pip install --system dist/*.whl --verbose \
--extra-index-url ${PYTORCH_CUDA_INDEX_BASE_URL}/nightly/cu$(echo $CUDA_VERSION | cut -d. -f1,2 | tr -d '.'); \
else \
echo "Installing vLLM..." \
&& uv pip install --system dist/*.whl --verbose \
--extra-index-url ${PYTORCH_CUDA_INDEX_BASE_URL}/cu$(echo $CUDA_VERSION | cut -d. -f1,2 | tr -d '.'); \
fi

RUN --mount=type=cache,target=/root/.cache/uv \
. /etc/environment && \
@@ -612,12 +718,33 @@ RUN echo 'tzdata tzdata/Areas select America' | debconf-set-selections \
&& apt-get update -y \
&& apt-get install -y git

# install development dependencies (for testing)
# We can specify the standard or nightly build of PyTorch
ARG PYTORCH_NIGHTLY

# Install development dependencies (for testing)
COPY requirements/lint.txt requirements/lint.txt
COPY requirements/test.in requirements/test.in
COPY requirements/test.txt requirements/test.txt
COPY requirements/dev.txt requirements/dev.txt
COPY use_existing_torch.py use_existing_torch.py
COPY --from=base /workspace/torch_lib_versions.txt torch_lib_versions.txt
RUN --mount=type=cache,target=/root/.cache/uv \
CUDA_MAJOR="${CUDA_VERSION%%.*}"; \
if [ "$CUDA_MAJOR" -ge 12 ]; then \
uv pip install --system -r requirements/dev.txt \
--extra-index-url ${PYTORCH_CUDA_INDEX_BASE_URL}/cu$(echo $CUDA_VERSION | cut -d. -f1,2 | tr -d '.'); \
if [ "${PYTORCH_NIGHTLY}" = "1" ]; then \
echo "Installing dev requirements plus torch nightly..." \
&& python3 use_existing_torch.py --prefix \
&& cat torch_lib_versions.txt >> requirements/test.in \
&& uv pip compile requirements/test.in -o requirements/test.txt --index-strategy unsafe-best-match \
--extra-index-url ${PYTORCH_CUDA_INDEX_BASE_URL}/nightly/cu$(echo $CUDA_VERSION | cut -d. -f1,2 | tr -d '.') \
&& uv pip install --system $(cat torch_lib_versions.txt | xargs) --pre \
-r requirements/dev.txt \
--extra-index-url ${PYTORCH_CUDA_INDEX_BASE_URL}/nightly/cu$(echo $CUDA_VERSION | cut -d. -f1,2 | tr -d '.'); \
else \
echo "Installing dev requirements..." \
&& uv pip install --system -r requirements/dev.txt \
--extra-index-url ${PYTORCH_CUDA_INDEX_BASE_URL}/cu$(echo $CUDA_VERSION | cut -d. -f1,2 | tr -d '.'); \
fi \
fi

# install development dependencies (for testing)
8 changes: 8 additions & 0 deletions docker/Dockerfile.nightly_torch
@@ -1,3 +1,11 @@
#######
#
# THIS FILE IS DEPRECATED AND WILL BE REMOVED SHORTLY
#
# Please use the standard Dockerfile with PYTORCH_NIGHTLY=1 instead
#
#######

# The vLLM Dockerfile is used to construct vLLM image against torch nightly that can be directly used for testing

# for torch nightly, cuda >=12.6 is required,
Binary file modified docs/assets/contributing/dockerfile-stages-dependency.png
62 changes: 49 additions & 13 deletions use_existing_torch.py
@@ -1,18 +1,54 @@
# SPDX-License-Identifier: Apache-2.0
# SPDX-FileCopyrightText: Copyright contributors to the vLLM project

import argparse
import glob
import sys

for file in (*glob.glob("requirements/*.txt"), "pyproject.toml"):
    print(f">>> cleaning {file}")
    with open(file) as f:
        lines = f.readlines()
    if "torch" in "".join(lines).lower():
        print("removed:")
        with open(file, "w") as f:
            for line in lines:
                if "torch" not in line.lower():
                    f.write(line)
                else:
                    print(line.strip())
    print(f"<<< done cleaning {file}\n")
# Only strip targeted libraries when checking prefix
TORCH_LIB_PREFIXES = (
    # requirements/*.txt/in
    "torch=",
    "torchvision=",
    "torchaudio=",
    # pyproject.toml
    '"torch =',
    '"torchvision =',
    '"torchaudio =',
)


def main(argv):
    parser = argparse.ArgumentParser(
        description="Strip torch lib requirements to use installed version."
    )
    parser.add_argument(
        "--prefix",
        action="store_true",
        help="Strip prefix matches only (default: False)",
    )
    args = parser.parse_args(argv)

    for file in (
        *glob.glob("requirements/*.txt"),
        *glob.glob("requirements/*.in"),
        "pyproject.toml",
    ):
        with open(file) as f:
            lines = f.readlines()
        if "torch" in "".join(lines).lower():
            with open(file, "w") as f:
                for line in lines:
                    if (
                        args.prefix
                        and not line.lower().strip().startswith(TORCH_LIB_PREFIXES)
                        or not args.prefix
                        and "torch" not in line.lower()
                    ):
                        f.write(line)
                    else:
                        print(f">>> removed from {file}:", line.strip())


if __name__ == "__main__":
    main(sys.argv[1:])
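
To make the difference between the two stripping modes concrete, the sketch below pulls the keep/remove decision out into a standalone predicate. `keep_line` is a hypothetical helper written for this illustration and is not part of the PR; the prefix tuple is copied from the script above, and the sample lines are invented:

```python
TORCH_LIB_PREFIXES = (
    "torch=",
    "torchvision=",
    "torchaudio=",
    '"torch =',
    '"torchvision =',
    '"torchaudio =',
)


def keep_line(line: str, prefix_only: bool) -> bool:
    """Return True if the line would be written back to the file."""
    lowered = line.lower()
    if prefix_only:
        # --prefix: remove only lines that start with a targeted pin.
        return not lowered.strip().startswith(TORCH_LIB_PREFIXES)
    # Default: remove every line that mentions "torch" anywhere.
    return "torch" not in lowered


if __name__ == "__main__":
    # Invented requirement lines for illustration.
    for sample in ["torch==2.9.0", "torchmetrics>=1.0", "numpy"]:
        print(sample, keep_line(sample, prefix_only=True), keep_line(sample, prefix_only=False))
```

In prefix mode a line such as `torchmetrics>=1.0` survives, whereas the default mode drops it along with the torch pins, which appears to be the reason the Dockerfile calls the script with `--prefix`.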