Conversation

@simon-mo (Collaborator) commented on Sep 15, 2025

Purpose

  • Add a new env var envs.VLLM_MAIN_CUDA_VERSION to prevent the +cu129 suffix on arm wheels (see the sketch after this list).
  • Fix blocking issues in the release pipeline.
  • Update annotate-release.sh
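
For context, a minimal sketch of how such an env var can gate the wheel suffix. This is not the exact setup.py code: the wheel_version helper is hypothetical, get_nvcc_cuda_version mirrors a setup.py helper, and the "12.8" default is assumed for illustration.

```python
# Hypothetical sketch of env-var-gated wheel suffixing (not vLLM's exact code).
import os

def wheel_version(base_version: str, cuda_version: str) -> str:
    # The "main" CUDA version is now overridable via the environment, so an
    # arm build using CUDA 12.9 can set VLLM_MAIN_CUDA_VERSION=12.9 and skip
    # the local suffix. Default of "12.8" is an assumption.
    main_cuda = os.environ.get("VLLM_MAIN_CUDA_VERSION", "12.8")
    if cuda_version != main_cuda:
        # Non-main CUDA builds get a local version suffix, e.g. "12.9" -> "+cu129".
        cuda_str = cuda_version.replace(".", "")[:3]
        return f"{base_version}+cu{cuda_str}"
    return base_version

print(wheel_version("1.0.0.dev", "12.9"))  # "1.0.0.dev+cu129" unless main is 12.9
```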

Test Plan

Will trigger a release pipeline build

Test Result

The ARM wheel was built without the suffix: https://buildkite.com/vllm/release/builds/8247/steps/canvas?sid=01994ea6-7da0-4e8b-9cb3-e2f25224d543

```sh
aws s3 cp artifacts/dist/vllm-1.0.0.dev-cp38-abi3-manylinux2014_aarch64.whl s3://vllm-wheels/1.0.0.dev/
```

Essential Elements of an Effective PR Description Checklist
  • The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
  • The test plan, such as providing the test command.
  • The test results, such as pasting a before/after results comparison or e2e results.
  • (Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.
  • (Optional) Release notes update. If your change is user facing, please update the release notes draft in the Google Doc.

Signed-off-by: simon-mo <[email protected]>
@mergify mergify bot added the ci/build label Sep 15, 2025
@simon-mo simon-mo marked this pull request as ready for review September 15, 2025 18:17
@chatgpt-codex-connector bot left a comment

Codex Review: Here are some suggestions.

Reply with "@codex fix comments" to fix any unresolved comments.

About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you open a pull request for review, mark a draft as ready, or comment "@codex review". If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex fix this CI failure" or "@codex address that feedback".

Comment on lines 506 to 510 of setup.py:

```diff
 else:
     cuda_version = str(get_nvcc_cuda_version())
-    if cuda_version != MAIN_CUDA_VERSION:
+    if cuda_version != envs.MAIN_CUDA_VERSION:
         cuda_version_str = cuda_version.replace(".", "")[:3]
         # skip this for source tarball, required for pypi
```

[P0] Access undefined envs.MAIN_CUDA_VERSION in setup

get_vllm_version() now compares the CUDA/ROCm version against envs.MAIN_CUDA_VERSION, but envs exposes the value under VLLM_MAIN_CUDA_VERSION via __getattr__. Because no attribute named MAIN_CUDA_VERSION exists, importing or running setup.py in a CUDA/ROCm environment will raise AttributeError, stopping wheel builds (including the new arm64 build) before artifacts are produced. The comparison should reference envs.VLLM_MAIN_CUDA_VERSION instead.
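
For background, vLLM's envs module resolves variables lazily through a module-level __getattr__ (PEP 562). A minimal sketch of the pattern, with the lookup table trimmed to one entry and the "12.8" default assumed for illustration:

```python
# envs_sketch.py -- simplified model of vLLM's lazy env-var lookup.
# Only names registered in the table resolve; any other attribute raises
# AttributeError, which is why envs.MAIN_CUDA_VERSION fails while
# envs.VLLM_MAIN_CUDA_VERSION succeeds.
import os

environment_variables = {
    "VLLM_MAIN_CUDA_VERSION":
        lambda: os.getenv("VLLM_MAIN_CUDA_VERSION", "12.8"),  # default assumed
}

def __getattr__(name: str):
    if name in environment_variables:
        return environment_variables[name]()
    raise AttributeError(f"module 'envs' has no attribute {name!r}")

# Usage: import envs_sketch; envs_sketch.VLLM_MAIN_CUDA_VERSION -> "12.8",
# while envs_sketch.MAIN_CUDA_VERSION raises AttributeError.
```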

Comment on lines 41 to 44
docker manifest create vllm/vllm-openai:latest vllm/vllm-openai:latest-x86_64 vllm/vllm-openai:latest-aarch64 --amend
docker manifest create vllm/vllm-openai:v${RELEASE_VERSION} vllm/vllm-openai:v${RELEASE_VERSION}-x86_64 vllm/vllm-openai:v${RELEASE_VERSION}-aarch64 --amend
docker manifest push vllm/vllm-openai:v${RELEASE_VERSION}
\`\`\`

[P1] Multi-arch image instructions never push :latest manifest

The annotated release instructions create two manifests (vllm/vllm-openai:latest and vllm/vllm-openai:v${RELEASE_VERSION}) but only push the versioned manifest. After executing this script, DockerHub will have the architecture-specific tags and the versioned multi-arch tag, yet the latest manifest remains local and docker pull vllm/vllm-openai:latest will continue to serve the previous release. A docker manifest push vllm/vllm-openai:latest is still required.
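
A sketch of the corrected tail of the script, adding the push this comment calls for (same tags as quoted above):

```sh
docker manifest create vllm/vllm-openai:latest vllm/vllm-openai:latest-x86_64 vllm/vllm-openai:latest-aarch64 --amend
docker manifest create vllm/vllm-openai:v${RELEASE_VERSION} vllm/vllm-openai:v${RELEASE_VERSION}-x86_64 vllm/vllm-openai:v${RELEASE_VERSION}-aarch64 --amend
# Push both manifests so :latest is updated on DockerHub as well.
docker manifest push vllm/vllm-openai:latest
docker manifest push vllm/vllm-openai:v${RELEASE_VERSION}
```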

Signed-off-by: simon-mo <[email protected]>
@simon-mo simon-mo merged commit fd2f105 into vllm-project:main Sep 15, 2025
16 of 17 checks passed
simon-mo added a commit that referenced this pull request Sep 15, 2025
shyeh25 pushed a commit to shyeh25/vllm that referenced this pull request Sep 23, 2025
FeiDaLI pushed a commit to FeiDaLI/vllm that referenced this pull request Sep 25, 2025
QierLi pushed a commit to QierLi/vllm that referenced this pull request Oct 5, 2025
Signed-off-by: bbartels <[email protected]>

[gpt-oss] Add IncompleteDetails to ResponsesRepsonse (vllm-project#24561)

Signed-off-by: Andrew Xia <[email protected]>

[gpt-oss][1a] create_responses stream outputs BaseModel type, api server is SSE still (vllm-project#24759)

Signed-off-by: Andrew Xia <[email protected]>

[Performance] Remove redundant clone() calls in cutlass_mla (vllm-project#24891)

[Bug] Fix Cutlass Scaled MM Compilation Error (vllm-project#24887)

Signed-off-by: yewentao256 <[email protected]>

[ci] fix wheel names for arm wheels (vllm-project#24898)

Signed-off-by: simon-mo <[email protected]>

[Tests] fix initialization of kv hash in tests (vllm-project#24273)

Signed-off-by: Mickael Seznec <[email protected]>

[Compile] Fix noop_elimination pass and add tests for noop_elimination (vllm-project#24880)

Signed-off-by: zjy0516 <[email protected]>

Propagate entire tokens to connector for resumed preemptions

Signed-off-by: Qier Li <[email protected]>

Fix pre-commit

Signed-off-by: Qier Li <[email protected]>

Rename field and nullify empty lists

Signed-off-by: Qier Li <[email protected]>

Update vllm/v1/core/sched/scheduler.py

Co-authored-by: Nick Hill <[email protected]>
Signed-off-by: Qier Li <[email protected]>

Add unit test for preemption resumption

Signed-off-by: Qier Li <[email protected]>
simon-mo added a commit that referenced this pull request Oct 7, 2025
xuebwang-amd pushed a commit to xuebwang-amd/vllm that referenced this pull request Oct 10, 2025
choprahetarth pushed a commit to Tandemn-Labs/vllm that referenced this pull request Oct 11, 2025
xuebwang-amd pushed a commit to xuebwang-amd/vllm that referenced this pull request Oct 24, 2025