[deps][llm] Upgrade to vLLM 0.19.0 #62349
Signed-off-by: Jeffrey Wang <jeffreywang@anyscale.com>
Code Review
This pull request upgrades the vllm dependency to version 0.19.0 across the codebase, including Docker configurations, requirement files, and lock files. It also refactors the pause operation interface by replacing the wait_for_inflight_requests boolean with a more flexible mode parameter (supporting 'abort', 'wait', and 'keep') and updates internal prompt class references to align with the new vLLM version. Feedback suggests improving type safety in VLLMPauseConfig by using Literal for the mode field to ensure only valid strings are accepted.
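The `Literal` suggestion above could look roughly like the sketch below. This is not the PR's code: `PauseConfigSketch` and `mode_from_legacy_flag` are hypothetical names, a plain dataclass stands in for whatever base class `VLLMPauseConfig` really uses, and the mapping from the old boolean to the new mode is an assumption. Only the field name `mode` and its three values come from the review summary.

```python
from dataclasses import dataclass
from typing import Literal, get_args

# The three mode strings named in the review summary.
PauseMode = Literal["abort", "wait", "keep"]


@dataclass(frozen=True)
class PauseConfigSketch:
    """Hypothetical stand-in for VLLMPauseConfig; only `mode` comes from the PR."""

    mode: PauseMode

    def __post_init__(self) -> None:
        # Literal is only checked by static type checkers, so validate at runtime too.
        if self.mode not in get_args(PauseMode):
            raise ValueError(
                f"mode must be one of {get_args(PauseMode)}, got {self.mode!r}"
            )


def mode_from_legacy_flag(wait_for_inflight_requests: bool) -> PauseMode:
    """Assumed mapping from the old boolean to the new mode (not confirmed by the PR)."""
    return "wait" if wait_for_inflight_requests else "abort"
```

With this shape, a type checker flags `PauseConfigSketch(mode="stop")` before the code runs, and the runtime check catches values that arrive from untyped input such as a JSON request body.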
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> Signed-off-by: Jeffrey Wang <jeffreywang@anyscale.com>
def cleanup_ray_resources():
    """Automatically cleanup Ray resources between tests to prevent conflicts."""
    yield
    _kill_gpu_processes_on_all_nodes()
What's the story with this one?
I observed dangling vLLM processes when the mp backend is used. Test failure: https://console.anyscale-staging.com/cld_wy5a6nhazplvu32526ams61d98/prj_lhlrf1u5yv8qz9qg3xzw8fkiiq/jobs/prodjob_lxrlak5vpsmxtvkigpf8bpw4ct?job-logs-section-tabs=application_logs&job-tab=overview.
The ray backend doesn't suffer from the same problem.
Is there a reason we didn't see this before? Could this cause process cleanup issues for users outside the test env?
Yeah, this could happen outside of the test env. Let me try again without this explicit cleanup.
I bisected the changes from v0.18.0 to v0.19.0 and identified https://github.com/vllm-project/vllm/pull/37131/changes#diff-58339ebbbdb34ac8183ced2a9cb11840321a70ceed13e3b17a68ee9d2e0a2ac8R151 as the root cause. When an external orchestrator like Ray shuts down a rank while NCCL is destroying the collective, ncclCommDestroy can hang, leaving orphaned processes. Here's a fix for vLLM: vllm-project/vllm#39846. For this upgrade to 0.19.0, I think we should proceed with the external cleanup in the test and remove it once we upgrade to 0.20.0.
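For context on what an external cleanup of this kind can look like: the body of `_kill_gpu_processes_on_all_nodes()` is not shown in this thread, so the following is only a single-node sketch under stated assumptions. It assumes a Linux host with `nvidia-smi` on the PATH, and the names `parse_gpu_pids` and `kill_local_gpu_processes` are hypothetical. Fanning this out to every node (for example through one Ray task per node) is left out.

```python
import os
import shutil
import signal
import subprocess
from typing import List


def parse_gpu_pids(nvidia_smi_output: str) -> List[int]:
    """Parse one PID per line, skipping blank and non-numeric lines."""
    pids = []
    for line in nvidia_smi_output.splitlines():
        line = line.strip()
        if line.isdigit():
            pids.append(int(line))
    return pids


def kill_local_gpu_processes() -> List[int]:
    """Send SIGKILL to processes holding a GPU compute context on this node.

    Hypothetical helper, not the PR's implementation.
    """
    if shutil.which("nvidia-smi") is None:
        return []  # No NVIDIA driver tools on this machine; nothing to do.
    result = subprocess.run(
        ["nvidia-smi", "--query-compute-apps=pid", "--format=csv,noheader"],
        capture_output=True,
        text=True,
        check=False,
    )
    killed = []
    for pid in parse_gpu_pids(result.stdout):
        if pid == os.getpid():
            continue  # Never kill the test runner itself.
        try:
            os.kill(pid, signal.SIGKILL)
            killed.append(pid)
        except (ProcessLookupError, PermissionError):
            pass  # Already gone, or owned by another user.
    return killed
```

Because this kills every GPU process it is allowed to signal, it only makes sense on dedicated test machines, which fits the plan to drop the workaround once the vLLM fix lands.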
aslonnie left a comment
approval for dependency change.
the python deps change is quite non-trivial this time.
This reverts commit 07af14f. Signed-off-by: Jeffrey Wang <jeffreywang@anyscale.com>
premerge is blocked waiting for windows tests to be approved, but I don't think they're relevant to this change.
@jeffreywang-anyscale it's the macOS test; we should request a force merge since they are meant to be disabled #62640
## Description
- **Upgrade vLLM** from 0.18.0 to 0.19.0 across requirements, setup.py, Dockerfile, and lock files.
- Adapt to vLLM 0.19.0 **API changes**: pause `wait_for_inflight_requests` → tri-state `mode`, fix `vllm.inputs` import paths.
- **Stabilize test infra**: add GPU process cleanup between batch tests (a temporary workaround, to be removed once vllm-project/vllm#39846 fixes the underlying vLLM issue), lower NIXL ports to avoid ephemeral range conflicts.

## Related issues
N/A

## Additional information
> Optional: Add implementation details, API changes, usage examples, screenshots, etc.

---------

Signed-off-by: Jeffrey Wang <jeffreywang@anyscale.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>