[release 2.11] Update to torch 2.11 #34644
```diff
@@ -1,5 +1,5 @@
 group: Quantization
-depends_on:
+depends_on:
 - image-build
 steps:
 - label: Quantization
@@ -16,7 +16,7 @@ steps:
     # https://github.com/pytorch/ao/issues/2919, we'll have to skip new torchao tests for now
     # we can only upgrade after this is resolved
     # TODO(jerryzh168): resolve the above comment
-    - uv pip install --system torchao==0.14.1 --index-url https://download.pytorch.org/whl/cu129
+    - uv pip install --system torchao==0.17.0 --index-url https://download.pytorch.org/whl/cu130
```
Collaborator:
Torchao seems to be consistent in its failures between this version and our current nightly, so there is no obvious problem here for now.
```diff
     - uv pip install --system conch-triton-kernels
     - VLLM_TEST_FORCE_LOAD_FORMAT=auto pytest -v -s quantization/ --ignore quantization/test_blackwell_moe.py
```
```diff
@@ -22,7 +22,7 @@
 # docker buildx bake -f docker/docker-bake.hcl -f docker/versions.json
 # =============================================================================

-ARG CUDA_VERSION=12.9.1
+ARG CUDA_VERSION=13.0.0
 ARG PYTHON_VERSION=3.12
 ARG UBUNTU_VERSION=22.04
```
```diff
@@ -37,7 +37,7 @@ ARG UBUNTU_VERSION=22.04
 # compatibility with other Linux OSes. The main reason for this is that the
 # glibc version is baked into the distro, and binaries built with one glibc
 # version are not backwards compatible with OSes that use an earlier version.
-ARG BUILD_BASE_IMAGE=nvidia/cuda:${CUDA_VERSION}-devel-ubuntu20.04
+ARG BUILD_BASE_IMAGE=nvidia/cuda:${CUDA_VERSION}-devel-ubuntu22.04
```
Member:
Updating the build base image Ubuntu version will likely affect the glibc support, as noted in the comment above. Are there no CUDA 13 images available for 20.04? Cc @tlrmchlsmth

Contributor (author):
Hi @mgoin, no, unfortunately they start at 22.04: https://hub.docker.com/r/nvidia/cuda/tags?name=13.0.0-base-ubuntu
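The glibc point in the comment above can be illustrated with a quick version comparison. The version numbers below are illustrative (Ubuntu 22.04 ships glibc 2.35, Ubuntu 20.04 ships glibc 2.31); the snippet is a sketch, not anything run in the build itself:

```shell
#!/bin/sh
# Sketch: a binary built against a newer glibc will not load on a distro
# that ships an older one. Versions here are illustrative.
build_glibc="2.35"   # glibc in the build base image (Ubuntu 22.04)
target_glibc="2.31"  # glibc on the deployment OS (Ubuntu 20.04)
# sort -V orders version strings; if the build glibc is the lowest of the
# two, every target with at least that glibc can load the binary.
lowest=$(printf '%s\n%s\n' "$build_glibc" "$target_glibc" | sort -V | head -1)
if [ "$lowest" = "$build_glibc" ]; then
    echo "compatible"
else
    echo "incompatible: target needs glibc >= $build_glibc"
fi
```

This is why moving the build base from 20.04 to 22.04 narrows the set of older distros the resulting wheels can run on.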
```diff
 # Using cuda base image with minimal dependencies necessary for JIT compilation (FlashInfer, DeepGEMM, EP kernels)
 ARG FINAL_BASE_IMAGE=nvidia/cuda:${CUDA_VERSION}-base-ubuntu${UBUNTU_VERSION}
```
```diff
@@ -546,17 +546,21 @@ RUN apt-get update -y \
 # Install CUDA development tools for runtime JIT compilation
 # (FlashInfer, DeepGEMM, EP kernels all require compilation at runtime)
 RUN CUDA_VERSION_DASH=$(echo $CUDA_VERSION | cut -d. -f1,2 | tr '.' '-') && \
+    CUDA_VERSION_SHORT=$(echo $CUDA_VERSION | cut -d. -f1,2) && \
     apt-get update -y && \
-    apt-get install -y --no-install-recommends \
+    apt-get install -y --no-install-recommends --allow-change-held-packages \
         cuda-nvcc-${CUDA_VERSION_DASH} \
         cuda-cudart-${CUDA_VERSION_DASH} \
         cuda-nvrtc-${CUDA_VERSION_DASH} \
         cuda-cuobjdump-${CUDA_VERSION_DASH} \
         libcurand-dev-${CUDA_VERSION_DASH} \
-        libcublas-${CUDA_VERSION_DASH} \
-        # Fixes nccl_allocator requiring nccl.h at runtime
-        # https://github.com/vllm-project/vllm/blob/1336a1ea244fa8bfd7e72751cabbdb5b68a0c11a/vllm/distributed/device_communicators/pynccl_allocator.py#L22
-        libnccl-dev && \
+        libcublas-${CUDA_VERSION_DASH} && \
+    # Fixes nccl_allocator requiring nccl.h at runtime
+    # https://github.com/vllm-project/vllm/blob/1336a1ea244fa8bfd7e72751cabbdb5b68a0c11a/vllm/distributed/device_communicators/pynccl_allocator.py#L22
+    # NCCL packages don't use the cuda-MAJOR-MINOR naming convention,
+    # so we pin the version to match our CUDA version
+    NCCL_VER=$(apt-cache madison libnccl-dev | grep "+cuda${CUDA_VERSION_SHORT}" | head -1 | awk -F'|' '{gsub(/^ +| +$/, "", $2); print $2}') && \
+    apt-get install -y --no-install-recommends --allow-change-held-packages libnccl-dev=${NCCL_VER} libnccl2=${NCCL_VER} && \
     rm -rf /var/lib/apt/lists/*
```
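The `apt-cache madison` parsing in the new RUN step can be exercised in isolation. The sample line below is a made-up example of madison output (one `package | version | source` row per available version; the repository URL is a placeholder, not a real mirror):

```shell
#!/bin/sh
# Hypothetical `apt-cache madison libnccl-dev` output row.
sample=' libnccl-dev | 2.27.5-1+cuda13.0 | https://example.com/cuda-repo  amd64 Packages'
CUDA_VERSION_SHORT="13.0"
# Same pipeline as the Dockerfile: keep rows matching our CUDA version,
# take the first, split on '|', and trim spaces around field 2 (the version).
NCCL_VER=$(printf '%s\n' "$sample" \
  | grep "+cuda${CUDA_VERSION_SHORT}" \
  | head -1 \
  | awk -F'|' '{gsub(/^ +| +$/, "", $2); print $2}')
echo "$NCCL_VER"   # → 2.27.5-1+cuda13.0
```

Pinning both `libnccl-dev` and `libnccl2` to this version keeps the header and runtime library in lockstep with the image's CUDA release.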
```diff
 # Install uv for faster pip installs
```
```diff
@@ -822,7 +826,7 @@ RUN --mount=type=cache,target=/root/.cache/uv \
     uv pip install --system -r /tmp/kv_connectors.txt --no-build || ( \
         # if the above fails, install from source
         apt-get update -y && \
-        apt-get install -y --no-install-recommends ${BUILD_PKGS} && \
+        apt-get install -y --no-install-recommends --allow-change-held-packages ${BUILD_PKGS} && \
         uv pip install --system -r /tmp/kv_connectors.txt --no-build-isolation && \
         apt-get purge -y ${BUILD_PKGS} && \
         # clean up -dev packages, keep runtime libraries
```
```diff
@@ -1,10 +1,11 @@
 --extra-index-url https://download.pytorch.org/whl/cpu
 cmake>=3.26.1
 ninja
 packaging>=24.2
 setuptools==77.0.3 # this version can reuse CMake build dir
 setuptools-scm>=8
-torch==2.10.0+cpu; platform_machine == "x86_64" or platform_machine == "s390x"
-torch==2.10.0; platform_machine == "aarch64" or platform_system == "Darwin" or platform_machine == "ppc64le"
+torch==2.11.0+cpu; platform_machine == "x86_64" or platform_machine == "s390x" or platform_machine == "aarch64"
+torch==2.11.0; platform_system == "Darwin" or platform_machine == "ppc64le" or platform_machine == "riscv64"
 wheel
 jinja2>=3.1.6
 regex
```
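The `platform_machine`/`platform_system` clauses are PEP 508 environment markers: pip evaluates them against the installing machine and keeps only matching lines. A simplified sketch of that selection (pip's real resolver does full marker parsing; this just mirrors the two lines above):

```shell
#!/bin/sh
# Sketch: which torch requirement applies on a given machine/system pair,
# per the two marker lines in the diff above.
pick_torch() {
    machine="$1"
    system="$2"
    case "$machine" in
        # torch==2.11.0+cpu now also covers aarch64 (moved from the second line)
        x86_64|s390x|aarch64) echo "torch==2.11.0+cpu"; return ;;
    esac
    if [ "$system" = "Darwin" ] || [ "$machine" = "ppc64le" ] || [ "$machine" = "riscv64" ]; then
        echo "torch==2.11.0"
    fi
    # any other platform matches neither line, so nothing is pinned
}
pick_torch aarch64 Linux    # → torch==2.11.0+cpu
pick_torch arm64 Darwin     # → torch==2.11.0
```

Note that macOS reports `arm64` (not `aarch64`) as its machine, which is why Darwin is selected by `platform_system` rather than by architecture.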
```diff
@@ -1,3 +1,3 @@
 lmcache >= 0.3.9
-nixl >= 0.7.1, < 0.10.0 # Required for disaggregated prefill
+nixl[cu13] >= 0.7.1, < 0.10.0 # Required for disaggregated prefill
 mooncake-transfer-engine >= 0.3.8
```
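The `[cu13]` suffix is a pip extras marker, so `nixl[cu13]` pulls in nixl's CUDA 13 optional dependencies on top of the base package. A rough split of such a line into its parts, using plain POSIX parameter expansion (PEP 508's grammar is richer; this is only an illustration):

```shell
#!/bin/sh
# Sketch: decompose "name[extras] specifier" from the requirements line above.
line='nixl[cu13] >= 0.7.1, < 0.10.0'
name=${line%%\[*}                       # strip from the first '[' onward
extras=${line#*\[}; extras=${extras%%\]*}  # text between '[' and ']'
spec=${line#*\] }                       # text after '] '
echo "$name"    # → nixl
echo "$extras"  # → cu13
echo "$spec"    # → >= 0.7.1, < 0.10.0
```

The version window itself is unchanged by the diff; only the extras group is added, switching the installed wheel to the CUDA 13 variant.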
Reviewer:
Can we also change this version in test-amd.yaml? By the way, was there a problem with 1.5.2?