[Core] Update PyTorch to 2.12.1, torchvision to 0.27.1, triton to 3.7.1 (test channel)#45082

Draft
atalman wants to merge 5 commits into vllm-project:main from atalman:update-pytorch-2.12.1-test

@atalman atalman commented Jun 10, 2026

Contributor

Purpose

Please see the announcement: https://dev-discuss.pytorch.org/t/pytorch-release-2-12-1/3398

Update the PyTorch ecosystem to the 2.12.1 patch release, resolving wheels from the PyTorch test channel (download.pytorch.org/whl/test/...):

  • torch: 2.11.0 → 2.12.1
  • torchvision: 0.26.0 → 0.27.1
  • triton: 3.6.0 → 3.7.1
  • torchaudio: stays at 2.11.0

This is the test-channel 2.12.1 variant of #42848 (the release-channel 2.12.0 bump). 2.12.1 is currently published on download.pytorch.org/whl/test/ but not yet on the release index or PyPI, so the index URLs point at the test channel (following the approach in #40077).
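A minimal sketch of how the pins above could be checked in a built environment. The helper names (`base_version`, `installed_versions`, `find_mismatches`) are hypothetical and not part of this PR, and the sketch assumes that test-channel wheels may carry a local version suffix such as `+cu130`, which is stripped before comparing.

```python
from __future__ import annotations

from importlib.metadata import PackageNotFoundError, version

# Pins taken from the list in the PR description.
EXPECTED_PINS = {
    "torch": "2.12.1",
    "torchvision": "0.27.1",
    "triton": "3.7.1",
    "torchaudio": "2.11.0",
}


def base_version(raw: str) -> str:
    """Drop a local version suffix such as '+cu130' or '+cpu'."""
    return raw.split("+", 1)[0]


def installed_versions(names) -> dict[str, str | None]:
    """Look up each distribution; None means it is not installed."""
    found: dict[str, str | None] = {}
    for name in names:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


def find_mismatches(expected, installed) -> dict[str, tuple[str, str | None]]:
    """Return {name: (wanted, found)} for every pin that is not satisfied."""
    mismatches = {}
    for name, want in expected.items():
        have = installed.get(name)
        if have is None or base_version(have) != want:
            mismatches[name] = (want, have)
    return mismatches
```

Calling `find_mismatches(EXPECTED_PINS, installed_versions(EXPECTED_PINS))` inside an image returns an empty dict when every pin resolved as intended.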

Test-channel index URLs

Switched to whl/test/... for:

  • CUDA (cu130): requirements/cuda.txt, requirements/build/cuda.txt, requirements/test/cuda.in, docker/Dockerfile (PYTORCH_CUDA_INDEX_BASE_URL), docker/versions.json
  • CPU: requirements/cpu.txt, requirements/build/cpu.txt, docker/Dockerfile.cpu, docker/Dockerfile.s390x
  • torchao test step: .buildkite/test_areas/quantization.yaml

docker/Dockerfile.cpu seeds requirements/test/cpu.in from cuda.in, which now points at whl/test/cu130. A sed rewrite therefore redirects that URL to whl/test/cpu, and --torch-backend cpu is dropped so that the explicit test-channel index is used. ROCm keeps its existing whl/rocm7.1 index (matching #40077).
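The redirect described above is done with sed in the Dockerfile; the following is only a Python sketch of the same rewrite, for illustration. The function name `retarget_index` and the sample line are assumptions, since the exact content of cuda.in is not shown here.

```python
import re

# Matches a test-channel index URL that ends in a CUDA tag such as cu130.
_CUDA_TEST_INDEX = re.compile(r"(download\.pytorch\.org/whl/test/)cu\d+")


def retarget_index(text: str, backend: str = "cpu") -> str:
    """Point every CUDA test-channel index URL in `text` at `backend`."""
    # A callable replacement avoids any backslash handling in `backend`.
    return _CUDA_TEST_INDEX.sub(lambda m: m.group(1) + backend, text)


if __name__ == "__main__":
    # Hypothetical requirements line; the real file may be laid out differently.
    sample = "--extra-index-url https://download.pytorch.org/whl/test/cu130\n"
    print(retarget_index(sample), end="")
```

Only URLs under whl/test/ with a cu-prefixed tag are touched, so an index such as whl/rocm7.1 passes through unchanged.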

CUDA 13 transitive deps

Bumped to match torch==2.12.x+cu130 (same values as #42848):

  • nvidia-cudnn-cu13: 9.19.0.56 → 9.20.0.48
  • nvidia-cusparselt-cu13: 0.8.0 → 0.8.1
  • nvidia-nccl-cu13: 2.28.9 → 2.29.7

CPU compatibility test fix

.buildkite/scripts/hardware_ci/run-cpu-compatibility-test.sh previously set TORCH_COMPILE_DISABLE=1. On torch 2.12, that setting no longer silently turns torch.compile into a no-op when call sites pass fullgraph=True (engine init goes through vLLM's piecewise-compile path); it raises instead and crashes init. The script now uses vLLM's --enforce-eager flag, which never constructs a torch.compile wrapper. This keeps the same speedup under SDE and works on both 2.11 and 2.12. (Same fix as #42848.)
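A rough sketch of the switch, assuming the test launches vLLM through the `vllm serve` CLI; the actual invocation in the script is not shown here and may differ. `build_launch` is a hypothetical helper and the model name is a placeholder.

```python
def build_launch(model: str, base_env: dict[str, str]) -> tuple[list[str], dict[str, str]]:
    """Return (command, environment) for an eager-mode vLLM launch."""
    # Drop the old switch so torch.compile behaviour is not overridden globally.
    env = {k: v for k, v in base_env.items() if k != "TORCH_COMPILE_DISABLE"}
    # --enforce-eager is vLLM's own flag for skipping compilation.
    cmd = ["vllm", "serve", model, "--enforce-eager"]
    return cmd, env


if __name__ == "__main__":
    # Placeholder model and environment, for illustration only.
    cmd, env = build_launch("facebook/opt-125m", {"TORCH_COMPILE_DISABLE": "1"})
    print(" ".join(cmd))
    print(env)
```

The point of the design is that eager execution is requested from vLLM itself instead of through a torch-level environment variable whose behaviour changed between releases.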

Test Plan

CI sign-off via Buildkite full daily / full nightly runs on this branch:

  • CUDA build + tests (CUDA 13)
  • CPU build + tests (x86_64, aarch64, s390x)
  • ROCm build
  • CPU SDE compatibility test (Skylake / Cascade Lake / Cooper Lake)

Test Result

To be filled in once CI completes on this branch.

Notes

Related

….1 (test channel)

Update the PyTorch ecosystem to the 2.12.1 patch release, resolving wheels
from the PyTorch test channel (download.pytorch.org/whl/test/...):
- torch: 2.11.0 -> 2.12.1
- torchvision: 0.26.0 -> 0.27.1
- triton: 3.6.0 -> 3.7.1
- torchaudio: stays at 2.11.0

Index URLs switched to the test channel (CUDA cu130, CPU, torchao) since
2.12.1 is published on download.pytorch.org/whl/test/ but not yet on the
release index / PyPI. ROCm continues to use its existing index.

CUDA 13 transitive deps bumped to match torch==2.12.x+cu130:
- nvidia-cudnn-cu13: 9.19.0.56 -> 9.20.0.48
- nvidia-cusparselt-cu13: 0.8.0 -> 0.8.1
- nvidia-nccl-cu13: 2.28.9 -> 2.29.7

CPU compatibility test: switched from TORCH_COMPILE_DISABLE=1 to vLLM's
--enforce-eager flag, which torch 2.12 requires (TORCH_COMPILE_DISABLE is
no longer a silent no-op when callers pass fullgraph=True).

Mirrors vllm-project#42848 (release-channel 2.12.0 bump) and vllm-project#40077 (test-channel
wiring), targeting the 2.12.1 patch release on the test channel.
mergify bot added the ci/build, nvidia, and cpu (Related to CPU backends) labels Jun 10, 2026
atalman added 3 commits June 10, 2026 06:57
torch==2.12.1 is a pre-release that is not on PyPI yet, so the
Python-only Installation job's `pip3 install -e .` could not resolve the
build-time torch dependency (`No matching distribution found for
torch==2.12.1`; PyPI only has up to 2.12.0). Add
`--extra-index-url https://download.pytorch.org/whl/test/cu130` so it
resolves from the PyTorch test channel, matching docker/Dockerfile
(PYTORCH_CUDA_INDEX_BASE_URL) and the other CI install paths. This is a
release-only workaround to be dropped once torch 2.12.1 is on PyPI.

Test Plan: re-run the "Python-only Installation" job on the
update-pytorch-2.12.1-test branch; the build-dependency install now
finds torch 2.12.1 from the test channel instead of failing on PyPI.

Authored with the assistance of Claude Code.
test_text_content_and_prompt_embeds_match_with_audio_embeds[text-then-audio_embeds]
fails on torch 2.12: when the text/prompt_embeds part precedes the audio
part, the prompt_embeds output diverges from the raw-text output under
--enforce-eager (deterministic). This is a tracked torch-side regression,
not a vLLM bug, so mark just that parameterization xfail(strict=True) to
unblock release CI while keeping the assertion running. The
audio_embeds-then-text case is unaffected. Once the regression is fixed,
strict=True turns the unexpected pass of the marked case into a failure,
prompting removal of the marker.

Tracked at pytorch/pytorch#184431.
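A minimal sketch of marking a single parameterization as a strict xfail with pytest. The test name, the `run_both_paths` stand-in, and its simulated divergence are hypothetical; the real test issues actual requests and is not reproduced here.

```python
import pytest

# Only the failing ordering carries the marker; the other case runs normally.
AUDIO_ORDERINGS = [
    pytest.param(
        "text-then-audio_embeds",
        marks=pytest.mark.xfail(
            strict=True,
            reason="torch 2.12 regression, see pytorch/pytorch#184431",
        ),
    ),
    pytest.param("audio_embeds-then-text"),
]


def run_both_paths(ordering: str) -> tuple[str, str]:
    """Hypothetical stand-in that simulates the reported divergence."""
    if ordering == "text-then-audio_embeds":
        return "expected", "diverged"
    return "expected", "expected"


@pytest.mark.parametrize("ordering", AUDIO_ORDERINGS)
def test_outputs_match(ordering):
    text_output, embeds_output = run_both_paths(ordering)
    # The assertion still runs for the marked case; its failure is reported as XFAIL.
    assert text_output == embeds_output
```

With `strict=True`, a marked case that starts passing is reported as a failure, so the marker cannot silently outlive the regression it documents.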

Test Plan: "Entrypoints Integration (Multimodal)" job on the
update-pytorch-2.12.1-test branch - text-then-audio_embeds reports XFAIL
instead of failing the job; audio_embeds-then-text still passes.

Authored with the assistance of Claude Code.
@atalman atalman force-pushed the update-pytorch-2.12.1-test branch from 2ae680f to 8df38ff Compare June 10, 2026 22:47