[CI] Implement uploading to PyPI and GitHub in the release pipeline, enable release image building for CUDA 13.0 #31032

Merged
khluu merged 7 commits into vllm-project:main from Harry-Chen:release-pipeline-upload
Jan 17, 2026

Member

@Harry-Chen Harry-Chen commented Dec 19, 2025

Purpose

The current release workflow involves manual uploading to PyPI and GitHub Releases. This needs to be automated.

Test Plan

Not tested. Needs to be reviewed and tested in the next release, maybe.

Test Result

Pending...



Copilot AI review requested due to automatic review settings December 19, 2025 14:17
@mergify mergify bot added the ci/build label Dec 19, 2025
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request automates the process of uploading release artifacts to PyPI and GitHub by introducing a new step in the release pipeline and an associated script. The changes are well-intentioned and move towards better automation. However, the new release script has a few issues that need to be addressed. I've identified a critical bug that would cause the script to fail because it attempts to write to a directory before creating it. Additionally, there are a couple of high-severity issues: the source tarball is not being uploaded to the GitHub release, and the way file paths are passed to twine is not robust and could lead to failures. I have provided code suggestions to fix these problems.
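
The release script itself is not shown in this conversation, so the following is only a minimal Python sketch of the two robustness points raised above: create the target directory before anything writes into it, and hand files to `twine upload` as separate arguments rather than as one unquoted shell string. The function names `collect_artifacts` and `twine_command` and the directory layout are assumptions made for illustration; they are not taken from the PR.

```python
from pathlib import Path


def collect_artifacts(dist_dir: Path) -> list[Path]:
    """Create dist_dir if needed and return the wheels and sdists inside it."""
    # Create the directory first, so later writes into it cannot fail
    # just because it does not exist yet.
    dist_dir.mkdir(parents=True, exist_ok=True)
    wheels = sorted(dist_dir.glob("*.whl"))
    sdists = sorted(dist_dir.glob("*.tar.gz"))
    return wheels + sdists


def twine_command(artifacts: list[Path]) -> list[str]:
    """Build an argv list for `twine upload`, one argument per file."""
    if not artifacts:
        raise ValueError("no artifacts to upload")
    # An argv list (rather than an unquoted shell string) keeps paths
    # containing spaces or glob characters intact.
    return ["twine", "upload", "--non-interactive", *map(str, artifacts)]
```

The resulting list could be passed to `subprocess.run(..., check=True)`. Including the `*.tar.gz` files in the same collection is one way to make sure the source tarball is not forgotten when the same list is reused for the GitHub release.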


@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Contributor

Copilot AI left a comment

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.


mergify bot commented Dec 23, 2025

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @Harry-Chen.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

@mergify mergify bot added the needs-rebase label Dec 23, 2025
@Harry-Chen Harry-Chen force-pushed the release-pipeline-upload branch from cb3e8d8 to a0f80a0 Compare December 31, 2025 09:00
@mergify mergify bot removed the needs-rebase label Dec 31, 2025
Signed-off-by: Shengqi Chen <harry-chen@outlook.com>
Signed-off-by: Shengqi Chen <harry-chen@outlook.com>
Signed-off-by: Shengqi Chen <harry-chen@outlook.com>
@Harry-Chen Harry-Chen force-pushed the release-pipeline-upload branch from a0f80a0 to 1ed3b18 Compare January 5, 2026 12:21
@Harry-Chen Harry-Chen changed the title [CI] Implement uploading to PyPI and GitHub in the release pipeline [CI] Implement uploading to PyPI and GitHub in the release pipeline, enable release image building for CUDA 13.0 Jan 5, 2026
@mergify mergify bot added the nvidia label Jan 5, 2026
@Harry-Chen Harry-Chen force-pushed the release-pipeline-upload branch from 1ed3b18 to 4bb10ef Compare January 5, 2026 12:29
Signed-off-by: Shengqi Chen <harry-chen@outlook.com>
@Harry-Chen Harry-Chen force-pushed the release-pipeline-upload branch from 4bb10ef to b3bcaa1 Compare January 5, 2026 12:33
Signed-off-by: Shengqi Chen <harry-chen@outlook.com>
Signed-off-by: Shengqi Chen <harry-chen@outlook.com>
Signed-off-by: Shengqi Chen <harry-chen@outlook.com>
@Harry-Chen
Member Author

If #31822 gets merged first, I will remove the duplicated part in my PR.

Collaborator

khluu commented Jan 16, 2026

Launching a test build here: https://buildkite.com/vllm/release/builds/12029

@khluu khluu enabled auto-merge (squash) January 17, 2026 04:10
@github-actions github-actions bot added the ready label (label description: "ONLY add when PR is ready to merge/full CI is needed") Jan 17, 2026
@github-project-automation github-project-automation bot moved this to Ready in NVIDIA Jan 17, 2026
@khluu khluu merged commit 8e61425 into vllm-project:main Jan 17, 2026
29 of 30 checks passed
@github-project-automation github-project-automation bot moved this from Ready to Done in NVIDIA Jan 17, 2026
khluu pushed a commit that referenced this pull request Jan 17, 2026
…enable release image building for CUDA 13.0 (#31032)

(cherry picked from commit 8e61425)
Collaborator

@wangshangsam wangshangsam left a comment

@Harry-Chen if you are going to take #31822 and merge it as your own, could you at least check out the differences and consult us (NVIDIA) why those differences are there?

FYI, we have tested CUDA_VERSION=13.0.1 with vLLM extensively on (G)B(2|3)00. CUDA_VERSION=13.0.2 would likely still work out of the box, but we haven't tested it much in the context of vLLM.

queue: cpu_queue_postmerge
commands:
- "aws ecr-public get-login-password --region us-east-1 | docker login --username AWS --password-stdin public.ecr.aws/q9t5s3a7"
- "DOCKER_BUILDKIT=1 docker build --build-arg max_jobs=16 --build-arg USE_SCCACHE=1 --build-arg GIT_REPO_CHECK=1 --build-arg CUDA_VERSION=13.0.2 --build-arg FLASHINFER_AOT_COMPILE=true --build-arg INSTALL_KV_CONNECTORS=true --build-arg BUILD_BASE_IMAGE=nvidia/cuda:13.0.2-devel-ubuntu22.04 --tag public.ecr.aws/q9t5s3a7/vllm-release-repo:$BUILDKITE_COMMIT-$(uname -m)-cu130 --target vllm-openai --progress plain -f docker/Dockerfile ."
Collaborator

Suggested change
- "DOCKER_BUILDKIT=1 docker build --build-arg max_jobs=16 --build-arg USE_SCCACHE=1 --build-arg GIT_REPO_CHECK=1 --build-arg CUDA_VERSION=13.0.2 --build-arg FLASHINFER_AOT_COMPILE=true --build-arg INSTALL_KV_CONNECTORS=true --build-arg BUILD_BASE_IMAGE=nvidia/cuda:13.0.2-devel-ubuntu22.04 --tag public.ecr.aws/q9t5s3a7/vllm-release-repo:$BUILDKITE_COMMIT-$(uname -m)-cu130 --target vllm-openai --progress plain -f docker/Dockerfile ."
- "DOCKER_BUILDKIT=1 docker build --build-arg max_jobs=16 --build-arg USE_SCCACHE=1 --build-arg GIT_REPO_CHECK=1 --build-arg CUDA_VERSION=13.0.2 --build-arg INSTALL_KV_CONNECTORS=true --build-arg BUILD_BASE_IMAGE=nvidia/cuda:13.0.2-devel-ubuntu22.04 --tag public.ecr.aws/q9t5s3a7/vllm-release-repo:$BUILDKITE_COMMIT-$(uname -m)-cu130 --target vllm-openai --progress plain -f docker/Dockerfile ."

FLASHINFER_AOT_COMPILE no longer exists.
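
A stale build argument like this can be caught mechanically by comparing the `--build-arg` names in a pipeline command against the `ARG` instructions in the Dockerfile. The sketch below is a rough illustration under assumptions: the helper names are hypothetical, the Dockerfile text in the usage note is made up rather than copied from `docker/Dockerfile`, and Docker's predefined arguments (such as proxy variables) are not accounted for.

```python
import re
import shlex


def build_arg_names(command: str) -> list[str]:
    """Return the names passed via `--build-arg NAME=VALUE` in a docker build command."""
    tokens = shlex.split(command)
    names = []
    # Pair every token with its successor; the token after --build-arg
    # holds NAME=VALUE.
    for flag, value in zip(tokens, tokens[1:]):
        if flag == "--build-arg":
            names.append(value.split("=", 1)[0])
    return names


def declared_args(dockerfile_text: str) -> set[str]:
    """Return the names declared by ARG instructions in a Dockerfile."""
    pattern = re.compile(
        r"^\s*ARG\s+([A-Za-z_][A-Za-z0-9_]*)", re.MULTILINE | re.IGNORECASE
    )
    return set(pattern.findall(dockerfile_text))


def undeclared_build_args(command: str, dockerfile_text: str) -> list[str]:
    """List build args that the command passes but the Dockerfile never declares."""
    declared = declared_args(dockerfile_text)
    return [name for name in build_arg_names(command) if name not in declared]
```

For example, with a made-up Dockerfile containing only `ARG CUDA_VERSION=13.0.1` and `ARG INSTALL_KV_CONNECTORS=false`, a command passing `CUDA_VERSION`, `FLASHINFER_AOT_COMPILE`, and `INSTALL_KV_CONNECTORS` would be reported as having one undeclared argument, `FLASHINFER_AOT_COMPILE`.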

queue: arm64_cpu_queue_postmerge
commands:
- "aws ecr-public get-login-password --region us-east-1 | docker login --username AWS --password-stdin public.ecr.aws/q9t5s3a7"
- "DOCKER_BUILDKIT=1 docker build --build-arg max_jobs=16 --build-arg USE_SCCACHE=1 --build-arg GIT_REPO_CHECK=1 --build-arg CUDA_VERSION=13.0.2 --build-arg FLASHINFER_AOT_COMPILE=true --build-arg torch_cuda_arch_list='8.7 8.9 9.0 10.0+PTX 12.0' --build-arg INSTALL_KV_CONNECTORS=true --build-arg BUILD_BASE_IMAGE=nvidia/cuda:13.0.2-devel-ubuntu22.04 --tag public.ecr.aws/q9t5s3a7/vllm-release-repo:$BUILDKITE_COMMIT-$(uname -m)-cu130 --target vllm-openai --progress plain -f docker/Dockerfile ."
Collaborator

Suggested change
- "DOCKER_BUILDKIT=1 docker build --build-arg max_jobs=16 --build-arg USE_SCCACHE=1 --build-arg GIT_REPO_CHECK=1 --build-arg CUDA_VERSION=13.0.2 --build-arg FLASHINFER_AOT_COMPILE=true --build-arg torch_cuda_arch_list='8.7 8.9 9.0 10.0+PTX 12.0' --build-arg INSTALL_KV_CONNECTORS=true --build-arg BUILD_BASE_IMAGE=nvidia/cuda:13.0.2-devel-ubuntu22.04 --tag public.ecr.aws/q9t5s3a7/vllm-release-repo:$BUILDKITE_COMMIT-$(uname -m)-cu130 --target vllm-openai --progress plain -f docker/Dockerfile ."
- "DOCKER_BUILDKIT=1 docker build --build-arg max_jobs=16 --build-arg USE_SCCACHE=1 --build-arg GIT_REPO_CHECK=1 --build-arg CUDA_VERSION=13.0.2 --build-arg torch_cuda_arch_list='8.7 8.9 9.0 10.0+PTX 12.0 12.1' --build-arg INSTALL_KV_CONNECTORS=true --build-arg BUILD_BASE_IMAGE=nvidia/cuda:13.0.2-devel-ubuntu22.04 --tag public.ecr.aws/q9t5s3a7/vllm-release-repo:$BUILDKITE_COMMIT-$(uname -m)-cu130 --target vllm-openai --progress plain -f docker/Dockerfile ."

@@ -26,12 +26,12 @@ steps:
- "DOCKER_BUILDKIT=1 docker build --build-arg max_jobs=16 --build-arg USE_SCCACHE=1 --build-arg GIT_REPO_CHECK=1 --build-arg CUDA_VERSION=13.0.1 --build-arg torch_cuda_arch_list='8.7 8.9 9.0 10.0+PTX 12.0' --build-arg BUILD_BASE_IMAGE=nvidia/cuda:13.0.1-devel-ubuntu22.04 --tag vllm-ci:build-image --target build --progress plain -f docker/Dockerfile ."
Collaborator

Suggested change
- "DOCKER_BUILDKIT=1 docker build --build-arg max_jobs=16 --build-arg USE_SCCACHE=1 --build-arg GIT_REPO_CHECK=1 --build-arg CUDA_VERSION=13.0.1 --build-arg torch_cuda_arch_list='8.7 8.9 9.0 10.0+PTX 12.0' --build-arg BUILD_BASE_IMAGE=nvidia/cuda:13.0.1-devel-ubuntu22.04 --tag vllm-ci:build-image --target build --progress plain -f docker/Dockerfile ."
- "DOCKER_BUILDKIT=1 docker build --build-arg max_jobs=16 --build-arg USE_SCCACHE=1 --build-arg GIT_REPO_CHECK=1 --build-arg CUDA_VERSION=13.0.1 --build-arg torch_cuda_arch_list='8.7 8.9 9.0 10.0+PTX 12.0 12.1' --build-arg BUILD_BASE_IMAGE=nvidia/cuda:13.0.1-devel-ubuntu22.04 --tag vllm-ci:build-image --target build --progress plain -f docker/Dockerfile ."

We need 12.1 to support DGX Spark.
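
A requirement like this can be expressed as a small check on the `torch_cuda_arch_list` value. The following is a minimal sketch, assuming the list is separated by spaces or semicolons and that an entry may carry a `+PTX` suffix; the function names are hypothetical and not part of vLLM or PyTorch.

```python
def parse_arch_list(value: str) -> set[str]:
    """Split an arch list such as '9.0 10.0+PTX' into bare compute capabilities."""
    entries = value.replace(";", " ").split()
    # Drop the +PTX marker so '10.0+PTX' and '10.0' compare equal.
    return {entry.removesuffix("+PTX") for entry in entries}


def missing_archs(value: str, required: list[str]) -> list[str]:
    """Return the required compute capabilities that the arch list does not name."""
    present = parse_arch_list(value)
    return [arch for arch in required if arch not in present]
```

Applied to the two values in the suggestion above, `missing_archs("8.7 8.9 9.0 10.0+PTX 12.0", ["12.1"])` returns `["12.1"]`, while the same call on the list ending in `12.0 12.1` returns an empty list.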

@@ -26,12 +26,12 @@ steps:
- "DOCKER_BUILDKIT=1 docker build --build-arg max_jobs=16 --build-arg USE_SCCACHE=1 --build-arg GIT_REPO_CHECK=1 --build-arg CUDA_VERSION=13.0.1 --build-arg torch_cuda_arch_list='8.7 8.9 9.0 10.0+PTX 12.0' --build-arg BUILD_BASE_IMAGE=nvidia/cuda:13.0.1-devel-ubuntu22.04 --tag vllm-ci:build-image --target build --progress plain -f docker/Dockerfile ."
Collaborator

Why is CUDA_VERSION=13.0.2 below for building the image, but CUDA_VERSION=13.0.1 for building the wheel?
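
One way to surface this kind of mismatch automatically is to extract every CUDA version mentioned in each step's command and group the steps by version. This is only a sketch: the step names in the usage note are invented, and the regular expressions assume the `CUDA_VERSION=X.Y.Z` and `nvidia/cuda:X.Y.Z-...` spellings seen in the commands quoted in this thread.

```python
import re


def cuda_versions(command: str) -> set[str]:
    """Return CUDA versions named by CUDA_VERSION= or an nvidia/cuda image tag."""
    from_arg = re.findall(r"CUDA_VERSION=(\d+\.\d+\.\d+)", command)
    from_image = re.findall(r"nvidia/cuda:(\d+\.\d+\.\d+)", command)
    return set(from_arg) | set(from_image)


def steps_by_cuda_version(commands: dict[str, str]) -> dict[str, list[str]]:
    """Map each CUDA version to the pipeline steps whose command mentions it."""
    result: dict[str, list[str]] = {}
    for step, command in commands.items():
        for version in sorted(cuda_versions(command)):
            result.setdefault(version, []).append(step)
    return result
```

If the returned mapping has more than one key, the pipeline mixes CUDA versions. Fed a hypothetical `wheel` step using 13.0.1 and an `image` step using 13.0.2, it would return one entry per version, which is the situation questioned above.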

@Harry-Chen
Member Author

> @Harry-Chen if you are going to take #31822 and merge it as your own, could you at least check out the differences and consult us (NVIDIA) why those differences are there?

Please note that my PR is older than yours, and I never meant to "take" anything as my own. This request was forwarded to me by @youkaichao about one month ago. I did not notice your PR until recently, at which point I left my comment here: #31032 (comment)

That said, I think we are willing to accept any changes if they are more mature.

@Harry-Chen Harry-Chen deleted the release-pipeline-upload branch January 17, 2026 08:34
dsuhinin pushed a commit to dsuhinin/vllm that referenced this pull request Jan 21, 2026
…enable release image building for CUDA 13.0 (vllm-project#31032)

Signed-off-by: dsuhinin <suhinin.dmitriy@gmail.com>
ItzDEXX pushed a commit to ItzDEXX/vllm that referenced this pull request Feb 19, 2026