Update release pipeline post PyTorch 2.8.0 update #23960
huydhn wants to merge 7 commits into vllm-project:main
Conversation
Signed-off-by: Huy Do <huydhn@gmail.com>
Code Review
This pull request correctly updates the release pipeline to replace the deprecated CUDA 11.8 build with a CUDA 12.9 build, following the PyTorch 2.8.0 update. It also addresses an arm64 build failure by adding libnuma-dev. However, a critical dependency is missing from the final runtime image stage in the Dockerfile, which will likely lead to runtime errors. The libnuma-dev package needs to be added to the vllm-base stage as well.
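As a hedged sketch of the fix the review suggests (the stage name `vllm-base` is taken from the review; the base images and surrounding Dockerfile content here are placeholders, not the actual vLLM Dockerfile), the package would need to be installed in both the build stage and the final runtime stage:

```dockerfile
# Build stage: libnuma-dev provides the NUMA headers needed to compile (arm64 fix)
FROM ubuntu:22.04 AS base
RUN apt-get update -y \
    && apt-get install -y libnuma-dev \
    && rm -rf /var/lib/apt/lists/*

# Final runtime stage: the shared library must also be present here,
# otherwise extensions compiled against it fail to load at runtime
FROM ubuntu:22.04 AS vllm-base
RUN apt-get update -y \
    && apt-get install -y libnuma-dev \
    && rm -rf /var/lib/apt/lists/*
```

Installing only in the build stage is enough to make compilation pass, which is why this kind of omission typically surfaces as a runtime error rather than a build failure.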
Does this PR mean we can expect vLLM nightlies built against 2.8.0 very soon? :) It might even be good to have a v0.10.1.2 release built against 2.8.0, as v0.10.1.1 was released when 2.8.0 was already out. PyTorch 2.8.0 has an important fix (#18851 (comment)), so it would be very beneficial to have a proper vLLM release built against 2.8.0...
Yup, once this lands, the vLLM nightly wheels (and the next vLLM release) will be on PyTorch 2.8.0
Nit: if you are OK with x86_64, the "nightlies" are already there: https://gallery.ecr.aws/q9t5s3a7/vllm-release-repo Update: a direct wheel can be obtained from a URL like the one in #20358 (comment)
Are there any plans for a service release built against 2.8.0? E.g. v0.10.1.2 or v0.10.2, which would be exactly the v0.10.1.1 code, but built against 2.8.0.
Seems there are Docker images, right? I'm looking for .whl files published to S3/HTTP
Yes, I can see vllm-0.10.1rc2.dev371+g67c14906a-cp38-abi3-manylinux1_x86_64.whl from a build job that uploaded it to S3. Give it a try? Update: #20358 (comment)
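For reference, the wheel filename itself encodes which interpreter, ABI, and platform it targets (PEP 427 naming). A minimal stdlib sketch to pull those tags out of a filename like the one above (the helper name is mine, not part of vLLM):

```python
def parse_wheel_filename(filename: str) -> dict:
    """Split a PEP 427 wheel filename into its components.

    Assumes the simple 5-part form without an optional build tag:
    {name}-{version}-{python tag}-{abi tag}-{platform tag}.whl
    """
    stem = filename.removesuffix(".whl")
    name, version, python_tag, abi_tag, platform_tag = stem.split("-")
    return {
        "name": name,
        "version": version,
        "python_tag": python_tag,     # e.g. cp38
        "abi_tag": abi_tag,           # e.g. abi3 (stable ABI)
        "platform_tag": platform_tag, # e.g. manylinux1_x86_64
    }

info = parse_wheel_filename(
    "vllm-0.10.1rc2.dev371+g67c14906a-cp38-abi3-manylinux1_x86_64.whl"
)
print(info["platform_tag"])  # manylinux1_x86_64
```

The `cp38-abi3` pair means the wheel uses the CPython stable ABI, so one build works on Python 3.8 and newer, and the platform tag is what distinguishes the x86_64 and aarch64 artifacts discussed in this thread.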
Probably from this PR there will be some fresh builds against 2.8.0... I still propose having a service release, as it could actually provide feedback to PyTorch if there are any perf regressions.
malfet left a comment
There is an issue on the PyTorch side to align the CUDA+PyTorch build matrix across x86 and aarch64.
@simon-mo Should we align the CUDA version used between the x86 images and the aarch64 images?
I realize that for the PyTorch project, during the v2.8.0 release, the aarch64 binary wheel was only available for CUDA 12.9. That is probably why we had to use cu129 for the ARM container.
Yes, so my question is: should we also upgrade the CUDA version in the x86 Docker images so that the x86 and aarch64 images have the same CUDA version? Otherwise, it is kind of weird that the same vLLM release images have different CUDA versions on different archs.
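The alignment question above boils down to intersecting the per-arch sets of available CUDA variants. A hedged sketch: the aarch64 entry comes from this thread (cu129 only for PyTorch 2.8.0), while the x86_64 variant list below is an assumption for illustration, not taken from the PR.

```python
# Which CUDA wheel variants exist per architecture for a given PyTorch release.
AVAILABLE = {
    "x86_64": {"cu126", "cu128", "cu129"},  # assumed set, for illustration
    "aarch64": {"cu129"},                   # per the discussion in this thread
}

def common_cuda_variants(arches: list[str]) -> set[str]:
    """Return the CUDA variants available on every requested architecture."""
    return set.intersection(*(AVAILABLE[a] for a in arches))

print(common_cuda_variants(["x86_64", "aarch64"]))  # {'cu129'}
```

Under these assumptions, cu129 is the only variant both archs can share, which is one argument for also moving the x86 images to CUDA 12.9 rather than shipping mixed-CUDA release images.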
youkaichao left a comment
This one looks better than #24020, as it also takes care of the wheel-uploading part.
Please resolve the comments I left. Thanks!
Seeing b72ebd5 aarch64 image (on https://gallery.ecr.aws/q9t5s3a7/vllm-release-repo) as well from this PR. Nice!
Purpose
This is the second part after #20358. This PR does 3 things:

- Replaces the deprecated CUDA 11.8 build with a CUDA 12.9 build, following the PyTorch 2.8.0 update
- Updates the wheel-uploading part of the release pipeline
- Adds libnuma-dev to fix the arm64 build https://buildkite.com/vllm/release/builds/7768#0198f57a-b3ef-4861-8528-97ce129f5c03/114-5868

Test Plan
CI https://buildkite.com/vllm/release/builds/7784
cc @simon-mo @khluu @seemethere