[CI] Implement uploading to PyPI and GitHub in the release pipeline, enable release image building for CUDA 13.0 by Harry-Chen · Pull Request #31032 · vllm-project/vllm

Harry-Chen · 2025-12-19T14:17:52Z

Purpose

The current release workflow involves manual uploading to PyPI and GitHub Releases. This needs to be automated.

Test Plan

Not tested. This needs to be reviewed and will probably have to be tested during the next release.

Test Result

Pending...

gemini-code-assist

Code Review

This pull request automates the process of uploading release artifacts to PyPI and GitHub by introducing a new step in the release pipeline and an associated script. The changes are well-intentioned and move towards better automation. However, the new release script has a few issues that need to be addressed. I've identified a critical bug that would cause the script to fail because it attempts to write to a directory before creating it. Additionally, there are a couple of high-severity issues: the source tarball is not being uploaded to the GitHub release, and the way file paths are passed to twine is not robust and could lead to failures. I have provided code suggestions to fix these problems.
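Two of the points in this review (creating a directory before writing into it, and passing file paths to twine robustly) can be illustrated with a minimal shell sketch. This is not the content of upload-release-wheels.sh; the helper name `run_on_wheels` and the `dist` directory are hypothetical and only serve the illustration.

```shell
# Hypothetical helper: run the command given as "$2..." with every *.whl file
# found in directory "$1" appended as separate, safely quoted arguments.
run_on_wheels() (
  dist_dir="$1"
  shift
  found=""
  # Create the directory up front so later writes into it cannot fail
  # because the path is missing.
  mkdir -p "$dist_dir"
  for f in "$dist_dir"/*.whl; do
    # With no match the glob stays literal; skip it instead of passing it on.
    [ -e "$f" ] || continue
    set -- "$@" "$f"
    found=1
  done
  if [ -z "$found" ]; then
    echo "no wheels found in $dist_dir" >&2
    exit 1
  fi
  # Each path is one argument, even if it contains spaces.
  "$@"
)
```

Under these assumptions, `run_on_wheels dist twine upload` would hand each wheel to twine as its own argument and fail early when the directory holds no wheels.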

.buildkite/scripts/upload-release-wheels.sh

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

.buildkite/scripts/upload-release-wheels.sh

Copilot

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

mergify · 2025-12-23T02:05:55Z

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @Harry-Chen.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

Signed-off-by: Shengqi Chen <harry-chen@outlook.com>

Harry-Chen · 2026-01-15T04:02:09Z

If #31822 gets merged first, I will remove the duplicated part in my PR.

khluu · 2026-01-16T12:19:13Z

Launching a test build here: https://buildkite.com/vllm/release/builds/12029

wangshangsam

@Harry-Chen if you are going to take #31822 and merge it as your own, could you at least check out the differences and consult us (NVIDIA) about why those differences are there?

FYI, we have tested CUDA_VERSION=13.0.1 with vLLM extensively on (G)B(2|3)00. CUDA_VERSION=13.0.2 would likely still work out of the box, but we haven't tested it much in the context of vLLM.

wangshangsam · 2026-01-17T08:16:53Z

.buildkite/release-pipeline.yaml

+      queue: cpu_queue_postmerge
+    commands:
+      - "aws ecr-public get-login-password --region us-east-1 | docker login --username AWS --password-stdin public.ecr.aws/q9t5s3a7"
+      - "DOCKER_BUILDKIT=1 docker build --build-arg max_jobs=16 --build-arg USE_SCCACHE=1 --build-arg GIT_REPO_CHECK=1 --build-arg CUDA_VERSION=13.0.2 --build-arg FLASHINFER_AOT_COMPILE=true --build-arg INSTALL_KV_CONNECTORS=true --build-arg BUILD_BASE_IMAGE=nvidia/cuda:13.0.2-devel-ubuntu22.04 --tag public.ecr.aws/q9t5s3a7/vllm-release-repo:$BUILDKITE_COMMIT-$(uname -m)-cu130 --target vllm-openai --progress plain -f docker/Dockerfile ."


Suggested change

-      - "DOCKER_BUILDKIT=1 docker build --build-arg max_jobs=16 --build-arg USE_SCCACHE=1 --build-arg GIT_REPO_CHECK=1 --build-arg CUDA_VERSION=13.0.2 --build-arg FLASHINFER_AOT_COMPILE=true --build-arg INSTALL_KV_CONNECTORS=true --build-arg BUILD_BASE_IMAGE=nvidia/cuda:13.0.2-devel-ubuntu22.04 --tag public.ecr.aws/q9t5s3a7/vllm-release-repo:$BUILDKITE_COMMIT-$(uname -m)-cu130 --target vllm-openai --progress plain -f docker/Dockerfile ."
+      - "DOCKER_BUILDKIT=1 docker build --build-arg max_jobs=16 --build-arg USE_SCCACHE=1 --build-arg GIT_REPO_CHECK=1 --build-arg CUDA_VERSION=13.0.2 --build-arg INSTALL_KV_CONNECTORS=true --build-arg BUILD_BASE_IMAGE=nvidia/cuda:13.0.2-devel-ubuntu22.04 --tag public.ecr.aws/q9t5s3a7/vllm-release-repo:$BUILDKITE_COMMIT-$(uname -m)-cu130 --target vllm-openai --progress plain -f docker/Dockerfile ."

FLASHINFER_AOT_COMPILE no longer exists.
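A stale build argument like this one can be caught mechanically. The following is a minimal sketch, assuming that every supported build argument appears on its own `ARG` line in the Dockerfile; the helper name `undeclared_build_args` is hypothetical and not part of the vLLM repository.

```shell
# Hypothetical helper: print each --build-arg name used in the command
# string "$2" that has no matching ARG line in the Dockerfile "$1".
undeclared_build_args() {
  printf '%s\n' "$2" |
    grep -oE -- '--build-arg +[A-Za-z_][A-Za-z0-9_]*' |
    sed -E 's/^--build-arg +//' |
    sort -u |
    while read -r name; do
      # An ARG line may carry a default ("ARG X=1") or none ("ARG X").
      grep -qE "^[[:space:]]*ARG[[:space:]]+${name}([[:space:]=]|\$)" "$1" ||
        printf '%s\n' "$name"
    done
}
```

Called as `undeclared_build_args docker/Dockerfile "<docker build command>"`, it prints nothing when every argument is declared.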

wangshangsam · 2026-01-17T08:19:07Z

.buildkite/release-pipeline.yaml

+      queue: arm64_cpu_queue_postmerge
+    commands:
+      - "aws ecr-public get-login-password --region us-east-1 | docker login --username AWS --password-stdin public.ecr.aws/q9t5s3a7"
+      - "DOCKER_BUILDKIT=1 docker build --build-arg max_jobs=16 --build-arg USE_SCCACHE=1 --build-arg GIT_REPO_CHECK=1 --build-arg CUDA_VERSION=13.0.2 --build-arg FLASHINFER_AOT_COMPILE=true --build-arg torch_cuda_arch_list='8.7 8.9 9.0 10.0+PTX 12.0' --build-arg INSTALL_KV_CONNECTORS=true --build-arg BUILD_BASE_IMAGE=nvidia/cuda:13.0.2-devel-ubuntu22.04 --tag public.ecr.aws/q9t5s3a7/vllm-release-repo:$BUILDKITE_COMMIT-$(uname -m)-cu130 --target vllm-openai --progress plain -f docker/Dockerfile ."


Suggested change

-      - "DOCKER_BUILDKIT=1 docker build --build-arg max_jobs=16 --build-arg USE_SCCACHE=1 --build-arg GIT_REPO_CHECK=1 --build-arg CUDA_VERSION=13.0.2 --build-arg FLASHINFER_AOT_COMPILE=true --build-arg torch_cuda_arch_list='8.7 8.9 9.0 10.0+PTX 12.0' --build-arg INSTALL_KV_CONNECTORS=true --build-arg BUILD_BASE_IMAGE=nvidia/cuda:13.0.2-devel-ubuntu22.04 --tag public.ecr.aws/q9t5s3a7/vllm-release-repo:$BUILDKITE_COMMIT-$(uname -m)-cu130 --target vllm-openai --progress plain -f docker/Dockerfile ."
+      - "DOCKER_BUILDKIT=1 docker build --build-arg max_jobs=16 --build-arg USE_SCCACHE=1 --build-arg GIT_REPO_CHECK=1 --build-arg CUDA_VERSION=13.0.2 --build-arg torch_cuda_arch_list='8.7 8.9 9.0 10.0+PTX 12.0 12.1' --build-arg INSTALL_KV_CONNECTORS=true --build-arg BUILD_BASE_IMAGE=nvidia/cuda:13.0.2-devel-ubuntu22.04 --tag public.ecr.aws/q9t5s3a7/vllm-release-repo:$BUILDKITE_COMMIT-$(uname -m)-cu130 --target vllm-openai --progress plain -f docker/Dockerfile ."

wangshangsam · 2026-01-17T08:20:55Z

.buildkite/release-pipeline.yaml

@@ -26,12 +26,12 @@ steps:
      - "DOCKER_BUILDKIT=1 docker build --build-arg max_jobs=16 --build-arg USE_SCCACHE=1 --build-arg GIT_REPO_CHECK=1 --build-arg CUDA_VERSION=13.0.1 --build-arg torch_cuda_arch_list='8.7 8.9 9.0 10.0+PTX 12.0' --build-arg BUILD_BASE_IMAGE=nvidia/cuda:13.0.1-devel-ubuntu22.04  --tag vllm-ci:build-image --target build --progress plain -f docker/Dockerfile ."


Suggested change

-      - "DOCKER_BUILDKIT=1 docker build --build-arg max_jobs=16 --build-arg USE_SCCACHE=1 --build-arg GIT_REPO_CHECK=1 --build-arg CUDA_VERSION=13.0.1 --build-arg torch_cuda_arch_list='8.7 8.9 9.0 10.0+PTX 12.0' --build-arg BUILD_BASE_IMAGE=nvidia/cuda:13.0.1-devel-ubuntu22.04 --tag vllm-ci:build-image --target build --progress plain -f docker/Dockerfile ."
+      - "DOCKER_BUILDKIT=1 docker build --build-arg max_jobs=16 --build-arg USE_SCCACHE=1 --build-arg GIT_REPO_CHECK=1 --build-arg CUDA_VERSION=13.0.1 --build-arg torch_cuda_arch_list='8.7 8.9 9.0 10.0+PTX 12.0 12.1' --build-arg BUILD_BASE_IMAGE=nvidia/cuda:13.0.1-devel-ubuntu22.04 --tag vllm-ci:build-image --target build --progress plain -f docker/Dockerfile ."

We need 12.1 to support DGX Spark.
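Whether a given `torch_cuda_arch_list` value covers a required architecture can be checked with a few lines of shell. This is only an illustrative sketch; the helper name `arch_list_has` is hypothetical.

```shell
# Hypothetical helper: succeed if architecture "$2" appears in the
# space-separated list "$1"; a trailing "+PTX" on an entry is ignored.
arch_list_has() {
  for arch in $1; do
    [ "${arch%+PTX}" = "$2" ] && return 0
  done
  return 1
}
```

For example, `arch_list_has '8.7 8.9 9.0 10.0+PTX 12.0' 12.1` fails, while the same call against the suggested list that ends in `12.1` succeeds.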

wangshangsam · 2026-01-17T08:22:21Z

.buildkite/release-pipeline.yaml

@@ -26,12 +26,12 @@ steps:
      - "DOCKER_BUILDKIT=1 docker build --build-arg max_jobs=16 --build-arg USE_SCCACHE=1 --build-arg GIT_REPO_CHECK=1 --build-arg CUDA_VERSION=13.0.1 --build-arg torch_cuda_arch_list='8.7 8.9 9.0 10.0+PTX 12.0' --build-arg BUILD_BASE_IMAGE=nvidia/cuda:13.0.1-devel-ubuntu22.04  --tag vllm-ci:build-image --target build --progress plain -f docker/Dockerfile ."


Why is CUDA_VERSION=13.0.2 below for building the image, but CUDA_VERSION=13.0.1 for building the wheel?
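A mismatch like the one questioned here could be flagged automatically. The sketch below assumes the versions appear as `CUDA_VERSION=<x.y.z>` build arguments in the pipeline file; the helper names `cuda_versions_for` and `cuda_versions_consistent` are hypothetical.

```shell
# Hypothetical helper: print the distinct CUDA_VERSION values in file "$1"
# whose major.minor prefix is "$2" (for example 13.0), one per line.
cuda_versions_for() {
  grep -oE 'CUDA_VERSION=[0-9]+(\.[0-9]+)*' "$1" |
    sed 's/^CUDA_VERSION=//' |
    awk -v prefix="$2." 'index($0, prefix) == 1' |
    sort -u
}

# Succeeds only when at most one distinct patch version is in use.
cuda_versions_consistent() {
  count=$(cuda_versions_for "$1" "$2" | wc -l | tr -d ' ')
  [ "$count" -le 1 ]
}
```

With one step using 13.0.1 and another using 13.0.2, `cuda_versions_consistent .buildkite/release-pipeline.yaml 13.0` would return a non-zero status.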

Harry-Chen · 2026-01-17T08:33:01Z

> @Harry-Chen if you are going to take #31822 and merge it as your own, could you at least check out the differences and consult us (NVIDIA) why those differences are there?

Please note that my PR is older than yours, and I never meant to "take" anything as my own. @youkaichao forwarded this request to me about a month ago; I did not notice the PR until recently, when I left my comment here: #31032 (comment)

That said, I think we are willing to accept any changes if they are more mature.

Copilot AI review requested due to automatic review settings December 19, 2025 14:17

mergify bot added the ci/build label Dec 19, 2025

gemini-code-assist bot reviewed Dec 19, 2025

chatgpt-codex-connector bot reviewed Dec 19, 2025

Copilot started reviewing on behalf of Harry-Chen December 19, 2025 14:29

Copilot AI reviewed Dec 19, 2025

mergify bot added the needs-rebase label Dec 23, 2025

Harry-Chen force-pushed the release-pipeline-upload branch from cb3e8d8 to a0f80a0 Compare December 31, 2025 09:00

mergify bot removed the needs-rebase label Dec 31, 2025

Harry-Chen added 3 commits January 5, 2026 20:03

ci: unify step names in release pipeline

2df00bf

Signed-off-by: Shengqi Chen <harry-chen@outlook.com>

ci: implement uploading of wheels to pypi and github in release pipeline

e760766

Signed-off-by: Shengqi Chen <harry-chen@outlook.com>

Fix comments from Gemini

0017b01

Signed-off-by: Shengqi Chen <harry-chen@outlook.com>

Harry-Chen force-pushed the release-pipeline-upload branch from a0f80a0 to 1ed3b18 Compare January 5, 2026 12:21

Harry-Chen changed the title ~~[CI] Implement uploading to PyPI and GitHub in the release pipeline~~ [CI] Implement uploading to PyPI and GitHub in the release pipeline, enable release image building for CUDA 13.0 Jan 5, 2026

mergify bot added the nvidia label Jan 5, 2026

github-project-automation bot added this to NVIDIA Jan 5, 2026

Harry-Chen force-pushed the release-pipeline-upload branch from 1ed3b18 to 4bb10ef Compare January 5, 2026 12:29

ci: build release images for CUDA 13.0 (needs manual trigger)

b3bcaa1

Signed-off-by: Shengqi Chen <harry-chen@outlook.com>

Harry-Chen force-pushed the release-pipeline-upload branch from 4bb10ef to b3bcaa1 Compare January 5, 2026 12:33

Harry-Chen added 3 commits January 5, 2026 20:41

ci: fix BUILD_BASE_IMAGE for CUDA 13 release image

584bf41

Signed-off-by: Shengqi Chen <harry-chen@outlook.com>

ci: fix nits in release pipeline

f3f678d

Signed-off-by: Shengqi Chen <harry-chen@outlook.com>

ci: add block to confirm before uploading wheels to pypi

842e58a

Signed-off-by: Shengqi Chen <harry-chen@outlook.com>

simon-mo assigned khluu Jan 13, 2026

khluu enabled auto-merge (squash) January 17, 2026 04:10

github-actions bot added the ready label (ONLY add when PR is ready to merge/full CI is needed) Jan 17, 2026

khluu approved these changes Jan 17, 2026

github-project-automation bot moved this to Ready in NVIDIA Jan 17, 2026

khluu merged commit 8e61425 into vllm-project:main Jan 17, 2026
29 of 30 checks passed

github-project-automation bot moved this from Ready to Done in NVIDIA Jan 17, 2026

khluu pushed a commit that referenced this pull request Jan 17, 2026

[CI] Implement uploading to PyPI and GitHub in the release pipeline, …

b17039b

…enable release image building for CUDA 13.0 (#31032) (cherry picked from commit 8e61425)

wangshangsam reviewed Jan 17, 2026

Harry-Chen deleted the release-pipeline-upload branch January 17, 2026 08:34

Harry-Chen mentioned this pull request Jan 17, 2026

[build] fix cu130 related release pipeline steps and publish as nightly image #32522

Merged

5 tasks

dsuhinin pushed a commit to dsuhinin/vllm that referenced this pull request Jan 21, 2026

[CI] Implement uploading to PyPI and GitHub in the release pipeline, …

ece9ae9

…enable release image building for CUDA 13.0 (vllm-project#31032) Signed-off-by: dsuhinin <suhinin.dmitriy@gmail.com>

ItzDEXX pushed a commit to ItzDEXX/vllm that referenced this pull request Feb 19, 2026

[CI] Implement uploading to PyPI and GitHub in the release pipeline, …

c3926f6

…enable release image building for CUDA 13.0 (vllm-project#31032)

dougbtv mentioned this pull request Feb 23, 2026

[Release] Include source distribution (sdist) in PyPI uploads #35136

Merged
