
[Core] Update PyTorch to 2.12.0, torchvision to 0.27.0, triton to 3.7.0#42848

Draft
atalman wants to merge 4 commits into vllm-project:main from atalman:fix_release_212

@atalman atalman commented May 16, 2026

Contributor

Purpose

Update the PyTorch ecosystem to the released versions:

  • torch: 2.11.0 → 2.12.0
  • torchvision: 0.26.0 → 0.27.0
  • triton: 3.6.0 → 3.7.0
  • torchaudio: stays at 2.11.0

This PR supersedes #40077. Now that the torch 2.12.0 release is published on PyPI / download.pytorch.org/whl/, no temporary whl/test/ index URLs are needed — wheels resolve from the regular indexes.

CUDA 13 transitive deps

Bumped to match torch==2.12.0+cu130:

  • nvidia-cudnn-cu13: 9.19.0.56 → 9.20.0.48
  • nvidia-cusparselt-cu13: 0.8.0 → 0.8.1
  • nvidia-nccl-cu13: 2.28.9 → 2.29.7
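
One way to confirm that an environment actually resolved to the versions listed above is to compare installed distribution versions against the expected pins. The following is a minimal sketch, not part of this PR: `EXPECTED_PINS`, `strip_local`, `find_mismatches`, and `installed_versions` are hypothetical names, and the CUDA 13 entries only make sense on a cu130 install.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional

# Pins copied from the PR description; the dict itself is a hypothetical helper.
EXPECTED_PINS = {
    "torch": "2.12.0",
    "torchvision": "0.27.0",
    "triton": "3.7.0",
    "torchaudio": "2.11.0",
    "nvidia-cudnn-cu13": "9.20.0.48",
    "nvidia-cusparselt-cu13": "0.8.1",
    "nvidia-nccl-cu13": "2.29.7",
}


def strip_local(ver: str) -> str:
    """Drop a local version label, e.g. '2.12.0+cu130' -> '2.12.0'."""
    return ver.split("+", 1)[0]


def find_mismatches(installed: dict, expected: dict = EXPECTED_PINS) -> dict:
    """Return {package: (expected, found)} for every pin that does not match."""
    mismatches = {}
    for name, want in expected.items():
        found: Optional[str] = installed.get(name)
        if found is None or strip_local(found) != want:
            mismatches[name] = (want, found)
    return mismatches


def installed_versions(names) -> dict:
    """Look up installed versions; None marks a missing distribution."""
    result = {}
    for name in names:
        try:
            result[name] = version(name)
        except PackageNotFoundError:
            result[name] = None
    return result


if __name__ == "__main__":
    report = find_mismatches(installed_versions(EXPECTED_PINS))
    for pkg, (want, found) in report.items():
        print(f"{pkg}: expected {want}, found {found}")
```

The comparison ignores the local version label so that a `+cu130` build of torch still matches its pin; on a CPU-only or ROCm install the `nvidia-*` entries would be reported as missing and can be dropped from the dict.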

CPU compatibility test fix

.buildkite/scripts/hardware_ci/run-cpu-compatibility-test.sh previously set TORCH_COMPILE_DISABLE=1 to skip torch.compile (slow under SDE). On torch 2.11 this turned every torch.compile call site into a silent no-op. On torch 2.12, call sites that pass fullgraph=True now raise:

RuntimeError: Worker failed with error 'torch.compile with fullgraph=True
found no compiled frames. The frame was likely skipped (...).'

Engine init goes through vLLM's piecewise-compile path (which uses fullgraph=True), so init crashes inside determine_available_memory. Switched to vLLM's canonical --enforce-eager engine flag, which never constructs a torch.compile wrapper at all — same SDE speedup, no contract violation, works on both torch 2.11 and 2.12.

Tracked upstream as pytorch/pytorch#181247 (under umbrella pytorch/pytorch#180899).
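
The switch described above can be sketched as a small launcher helper that contrasts the old and new ways of skipping torch.compile. This is an illustration under assumptions: `build_launch` is a hypothetical name, the use of `vllm serve` and the model placeholder are not taken from the CI script, and the real script may invoke vLLM differently. Only `--enforce-eager` and `TORCH_COMPILE_DISABLE` come from the description.

```python
import os


def build_launch(model, base_env=None, use_enforce_eager=True):
    """Return (argv, env) for launching vLLM without torch.compile.

    use_enforce_eager=True  -> new approach: pass --enforce-eager.
    use_enforce_eager=False -> old approach: set TORCH_COMPILE_DISABLE=1,
                               which this PR reports as failing on torch 2.12.
    """
    env = dict(os.environ if base_env is None else base_env)
    argv = ["vllm", "serve", model]
    if use_enforce_eager:
        # Make sure a stale value from the old approach does not leak in.
        env.pop("TORCH_COMPILE_DISABLE", None)
        argv.append("--enforce-eager")
    else:
        env["TORCH_COMPILE_DISABLE"] = "1"
    return argv, env
```

The returned pair could be passed to `subprocess.run(argv, env=env, check=True)`. When using the Python API instead of the CLI, the corresponding option is the `enforce_eager=True` engine argument.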

Test Plan

CI sign-off — Buildkite full daily / full nightly runs on this branch:

  • CUDA build + tests (CUDA 13)
  • CPU build + tests (x86_64, aarch64, s390x)
  • ROCm build
  • CPU SDE compatibility test (Skylake / Cascade Lake / Cooper Lake)

Test Result

To be filled in once CI completes on this branch.

Duplicate-work check

AI-assistance disclosure

AI assistance (Claude) was used to draft the changes. Every changed line was reviewed and the test plan above was constructed and run by a human submitter.



@mergify mergify Bot added the cpu Related to CPU backends label May 16, 2026

@gemini-code-assist gemini-code-assist Bot left a comment

Contributor


Code Review

This pull request updates the project's dependencies to support PyTorch 2.12.0 across various build and environment configurations, including CUDA, ROCm, CPU, and s390x. Key changes include version bumps for torch, torchvision, triton, and several NVIDIA libraries. Additionally, the hardware CI script for CPU compatibility was updated to use the --enforce-eager flag, replacing the TORCH_COMPILE_DISABLE environment variable to prevent crashes during engine initialization with the new PyTorch version. I have no further feedback to provide.

Update PyTorch ecosystem versions:
- torch: 2.11.0 -> 2.12.0
- torchvision: 0.26.0 -> 0.27.0
- triton: 3.6.0 -> 3.7.0
- torchaudio: stays at 2.11.0

Bump CUDA 13 deps to match torch 2.12.0+cu130:
- nvidia-cudnn-cu13: 9.19.0.56 -> 9.20.0.48
- nvidia-cusparselt-cu13: 0.8.0 -> 0.8.1
- nvidia-nccl-cu13: 2.28.9 -> 2.29.7

Use --enforce-eager instead of TORCH_COMPILE_DISABLE=1 in the
CPU SDE compat test. On torch 2.11 TORCH_COMPILE_DISABLE turned
torch.compile call sites into silent no-ops; on torch 2.12 sites
that pass fullgraph=True now raise "found no compiled frames",
which crashes engine init via vLLM's piecewise-compile path.
--enforce-eager skips the wrapper entirely on both versions.

Supersedes vllm-project#40077 (release wheels are now published, so the
download.pytorch.org/whl/test/ indexes are no longer needed).

Co-authored-by: Claude <noreply@anthropic.com>
Signed-off-by: atalman <atalman@meta.com>

Labels

ci/build, cpu (Related to CPU backends), nvidia
