[ci] fix wheel names for arm wheels #24898
Conversation
Signed-off-by: simon-mo <[email protected]>
Codex Review: Here are some suggestions.
```diff
 else:
     cuda_version = str(get_nvcc_cuda_version())
-    if cuda_version != MAIN_CUDA_VERSION:
+    if cuda_version != envs.MAIN_CUDA_VERSION:
         cuda_version_str = cuda_version.replace(".", "")[:3]
         # skip this for source tarball, required for pypi
```
[P0] Access undefined envs.MAIN_CUDA_VERSION in setup
get_vllm_version() now compares the CUDA/ROCm version against envs.MAIN_CUDA_VERSION, but envs exposes the value under VLLM_MAIN_CUDA_VERSION via __getattr__. Because no attribute named MAIN_CUDA_VERSION exists, importing or running setup.py in a CUDA/ROCm environment will raise AttributeError, stopping wheel builds (including the new arm64 build) before artifacts are produced. The comparison should reference envs.VLLM_MAIN_CUDA_VERSION instead.
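A minimal sketch of the lazy-lookup pattern the comment describes, showing why `envs.MAIN_CUDA_VERSION` fails while `envs.VLLM_MAIN_CUDA_VERSION` resolves. Names and the default value here are illustrative, not vLLM's actual `envs` module:

```python
# Hypothetical stand-in for an environment module that resolves its
# attributes lazily: only registered names succeed, everything else
# falls through to AttributeError.
environment_variables = {
    "VLLM_MAIN_CUDA_VERSION": lambda: "12.8",  # illustrative default
}

class LazyEnvs:
    def __getattr__(self, name):
        # __getattr__ is only invoked when normal attribute lookup fails,
        # so unregistered names raise AttributeError here.
        if name in environment_variables:
            return environment_variables[name]()
        raise AttributeError(f"module has no attribute {name!r}")

envs = LazyEnvs()

print(envs.VLLM_MAIN_CUDA_VERSION)  # resolves via the registry: 12.8
try:
    envs.MAIN_CUDA_VERSION          # not registered
except AttributeError as exc:
    print("AttributeError:", exc)
```

This is why the comparison in `get_vllm_version()` must spell out the full `VLLM_`-prefixed name.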
```shell
docker manifest create vllm/vllm-openai:latest vllm/vllm-openai:latest-x86_64 vllm/vllm-openai:latest-aarch64 --amend
docker manifest create vllm/vllm-openai:v${RELEASE_VERSION} vllm/vllm-openai:v${RELEASE_VERSION}-x86_64 vllm/vllm-openai:v${RELEASE_VERSION}-aarch64 --amend
docker manifest push vllm/vllm-openai:v${RELEASE_VERSION}
```
[P1] Multi-arch image instructions never push :latest manifest
The annotated release instructions create two manifests (vllm/vllm-openai:latest and vllm/vllm-openai:v${RELEASE_VERSION}) but only push the versioned manifest. After executing this script, DockerHub will have the architecture-specific tags and the versioned multi-arch tag, yet the latest manifest remains local and docker pull vllm/vllm-openai:latest will continue to serve the previous release. A docker manifest push vllm/vllm-openai:latest is still required.
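Applied to the snippet above, a corrected ending for the instructions would push both manifests (sketch of an ops fragment; tag names taken from the commands quoted above):

```shell
# Create both multi-arch manifests, then push BOTH of them so that
# :latest also points at the newly released multi-arch image.
docker manifest create vllm/vllm-openai:latest \
    vllm/vllm-openai:latest-x86_64 vllm/vllm-openai:latest-aarch64 --amend
docker manifest create vllm/vllm-openai:v${RELEASE_VERSION} \
    vllm/vllm-openai:v${RELEASE_VERSION}-x86_64 vllm/vllm-openai:v${RELEASE_VERSION}-aarch64 --amend
docker manifest push vllm/vllm-openai:v${RELEASE_VERSION}
docker manifest push vllm/vllm-openai:latest   # previously missing
```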
- Signed-off-by: bbartels <[email protected]>
- [gpt-oss] Add IncompleteDetails to ResponsesRepsonse (vllm-project#24561) Signed-off-by: Andrew Xia <[email protected]>
- [gpt-oss][1a] create_responses stream outputs BaseModel type, api server is SSE still (vllm-project#24759) Signed-off-by: Andrew Xia <[email protected]>
- [Performance] Remove redundant clone() calls in cutlass_mla (vllm-project#24891)
- [Bug] Fix Cutlass Scaled MM Compilation Error (vllm-project#24887) Signed-off-by: yewentao256 <[email protected]>
- [ci] fix wheel names for arm wheels (vllm-project#24898) Signed-off-by: simon-mo <[email protected]>
- [Tests] fix initialization of kv hash in tests (vllm-project#24273) Signed-off-by: Mickael Seznec <[email protected]>
- [Compile] Fix noop_elimination pass and add tests for noop_elimination (vllm-project#24880) Signed-off-by: zjy0516 <[email protected]>
- Propagate entire tokens to connector for resumed preemptions Signed-off-by: Qier Li <[email protected]>
- Fix pre-commit Signed-off-by: Qier Li <[email protected]>
- Rename field and nullify empty lists Signed-off-by: Qier Li <[email protected]>
- Update vllm/v1/core/sched/scheduler.py Co-authored-by: Nick Hill <[email protected]> Signed-off-by: Qier Li <[email protected]>
- Add unit test for preemption resumption Signed-off-by: Qier Li <[email protected]>
Purpose
Use envs.VLLM_MAIN_CUDA_VERSION to prevent the +cu129 wheel-name suffix for arm wheels, and update annotate-release.sh.

Test Plan
Will trigger a release pipeline build
Test Result
ARM was built without suffix https://buildkite.com/vllm/release/builds/8247/steps/canvas?sid=01994ea6-7da0-4e8b-9cb3-e2f25224d543
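The wheel-name behavior being verified can be sketched from the diff above: a +cuXYZ local-version suffix is appended only when the build's CUDA toolkit differs from the main supported version. Helper name and versions below are illustrative, not vLLM's actual setup.py:

```python
MAIN_CUDA_VERSION = "12.8"  # illustrative; vLLM reads this from envs

def wheel_version(base_version: str, cuda_version: str) -> str:
    """Sketch of the suffix rule: append +cuXYZ only for non-main
    CUDA builds, so main-version (and matching arm) wheels stay clean."""
    if cuda_version != MAIN_CUDA_VERSION:
        # "12.9" -> "129": drop dots, keep first three digits
        cuda_version_str = cuda_version.replace(".", "")[:3]
        return f"{base_version}+cu{cuda_version_str}"
    return base_version

print(wheel_version("0.10.2", "12.9"))  # -> 0.10.2+cu129
print(wheel_version("0.10.2", "12.8"))  # -> 0.10.2 (no suffix)
```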