[CI] Add sm_110 to aarch64 CUDA 13.0 builds #31544

Open
NebulaTurnip27 wants to merge 1 commit into vllm-project:main from NebulaTurnip27:fix/add-sm110-aarch64-cuda13

Conversation

NebulaTurnip27 commented Dec 30, 2025

Purpose

This adds sm_110 (11.0) to the torch_cuda_arch_list for the aarch64 CUDA 13.0 wheel build introduced in #30341, ensuring the builds work on Jetson Thor.
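
As a rough illustration of the naming used here, the sketch below converts entries of a space-separated arch list into `sm_XY` names and reports what the new list adds over the old one. The helper names `to_sm_name` and `added_archs` are hypothetical and not part of vLLM or PyTorch; the two list strings are the ones that appear in this PR's diff.

```python
def to_sm_name(entry: str) -> str:
    """Convert an arch list entry such as '11.0' or '10.0+PTX' to 'sm_110' or 'sm_100'."""
    version = entry.removesuffix("+PTX")  # hypothetical helper; requires Python 3.9+
    major, minor = version.split(".")
    return f"sm_{major}{minor}"


def added_archs(old_list: str, new_list: str) -> list[str]:
    """Return the sm names present in new_list but not in old_list, in list order."""
    old = {to_sm_name(entry) for entry in old_list.split()}
    return [to_sm_name(entry) for entry in new_list.split() if to_sm_name(entry) not in old]


# Arch lists before and after this change, as they appear in the diff.
OLD = "8.7 8.9 9.0 10.0+PTX 12.0"
NEW = "8.7 8.9 9.0 10.0+PTX 11.0 12.0"

print(added_archs(OLD, NEW))  # ['sm_110']
```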

Test Plan

Building locally for sm_110 works, and has worked since 0.11.0.

@github-actions

👋 Hi! Thank you for contributing to the vLLM project.

💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels.

Just a reminder: PRs do not trigger a full CI run by default. Instead, only fastcheck CI runs, which covers a small and essential subset of CI tests to quickly catch errors.

You can ask your reviewers to trigger select CI tests on top of fastcheck CI.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can either add the ready label to the PR or enable auto-merge.

If you have any questions, please reach out to us on Slack at https://slack.vllm.ai.

🚀

@NebulaTurnip27
Author

@Harry-Chen @youkaichao

Contributor

gemini-code-assist Bot left a comment

Code Review

This pull request adds support for sm_110 to the aarch64 CUDA 13.0 build configuration. The change is straightforward, but I've identified a potential inconsistency in how future GPU architectures are specified in the torch_cuda_arch_list. My feedback includes a suggestion to use the +PTX suffix for all future architectures to ensure better forward-compatibility and maintain consistency within the build script.

# #NOTE: torch_cuda_arch_list is derived from upstream PyTorch build files here:
# https://github.com/pytorch/pytorch/blob/main/.ci/aarch64_linux/aarch64_ci_build.sh#L7
- "DOCKER_BUILDKIT=1 docker build --build-arg max_jobs=16 --build-arg USE_SCCACHE=1 --build-arg GIT_REPO_CHECK=1 --build-arg CUDA_VERSION=13.0.1 --build-arg torch_cuda_arch_list='8.7 8.9 9.0 10.0+PTX 12.0' --build-arg BUILD_BASE_IMAGE=nvidia/cuda:13.0.1-devel-ubuntu22.04 --tag vllm-ci:build-image --target build --progress plain -f docker/Dockerfile ."
- "DOCKER_BUILDKIT=1 docker build --build-arg max_jobs=16 --build-arg USE_SCCACHE=1 --build-arg GIT_REPO_CHECK=1 --build-arg CUDA_VERSION=13.0.1 --build-arg torch_cuda_arch_list='8.7 8.9 9.0 10.0+PTX 11.0 12.0' --build-arg BUILD_BASE_IMAGE=nvidia/cuda:13.0.1-devel-ubuntu22.04 --tag vllm-ci:build-image --target build --progress plain -f docker/Dockerfile ."

high

For consistency and forward-compatibility, it's best practice to use the +PTX suffix for future GPU architectures. This ensures that only PTX code is generated, which can be JIT-compiled by drivers on future hardware, rather than attempting to generate native SASS code which may not be possible with the current compiler version.

I notice that 10.0 is specified as 10.0+PTX, but the newly added 11.0 and the existing 12.0 lack this suffix. To maintain consistency and follow best practices, I recommend adding +PTX to both 11.0 and 12.0.

      - "DOCKER_BUILDKIT=1 docker build --build-arg max_jobs=16 --build-arg USE_SCCACHE=1 --build-arg GIT_REPO_CHECK=1 --build-arg CUDA_VERSION=13.0.1 --build-arg torch_cuda_arch_list='8.7 8.9 9.0 10.0+PTX 11.0+PTX 12.0+PTX' --build-arg BUILD_BASE_IMAGE=nvidia/cuda:13.0.1-devel-ubuntu22.04  --tag vllm-ci:build-image --target build --progress plain -f docker/Dockerfile ."
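
For context on what the suffix changes, here is a minimal sketch of how an arch list could be expanded into `nvcc` `-gencode` flags. It assumes the convention followed by PyTorch's extension builder, where a plain entry emits native code (SASS) for that architecture and a `+PTX` entry additionally emits PTX for it. The function name `gencode_flags` is hypothetical, and vLLM's actual CMake logic may handle the list differently.

```python
def gencode_flags(arch_list: str) -> list[str]:
    """Expand a space-separated arch list into nvcc -gencode flags (assumed convention)."""
    flags = []
    for entry in arch_list.split():
        with_ptx = entry.endswith("+PTX")
        num = entry.removesuffix("+PTX").replace(".", "")  # '10.0+PTX' -> '100'
        # Native code (SASS) for this architecture.
        flags.append(f"-gencode=arch=compute_{num},code=sm_{num}")
        if with_ptx:
            # Also embed PTX so newer GPUs can JIT-compile it.
            flags.append(f"-gencode=arch=compute_{num},code=compute_{num}")
    return flags


for flag in gencode_flags("10.0+PTX 11.0 12.0"):
    print(flag)
# -gencode=arch=compute_100,code=sm_100
# -gencode=arch=compute_100,code=compute_100
# -gencode=arch=compute_110,code=sm_110
# -gencode=arch=compute_120,code=sm_120
```

Under that assumption, `10.0+PTX` yields both a native and a PTX target, while `11.0` and `12.0` yield native code only.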

@github-actions

This pull request has been automatically marked as stale because it has not had any activity within 90 days. It will be automatically closed if no further activity occurs within 30 days. Leave a comment if you feel this pull request should remain open. Thank you!

github-actions Bot added the stale label (Over 90 days of inactivity) Mar 31, 2026

xs-alt commented Apr 17, 2026

Any updates~?

@Harry-Chen
Member

@xs-alt @NebulaTurnip27 I have absorbed it in #39878.

@NebulaTurnip27
Author

Thank you!!

github-actions Bot added the unstale label (Received activity after being labelled stale) and removed the stale label (Over 90 days of inactivity) Apr 18, 2026

Contributor

mergify Bot commented Apr 18, 2026

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @NebulaTurnip27.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

mergify Bot added the needs-rebase label Apr 18, 2026

Labels

ci/build, needs-rebase, nvidia, unstale

3 participants