[deps][llm] Upgrade to vLLM 0.19.0 #62349

Merged: aslonnie merged 15 commits into master from vllm-0.19.0 on Apr 15, 2026
jeffreywang88 commented on Apr 4, 2026

Contributor

Description

  • Upgrade vLLM from 0.18.0 to 0.19.0 across requirements, setup.py, Dockerfile, and lock files.
  • Adapt to vLLM 0.19.0 API changes: replace the pause operation's boolean wait_for_inflight_requests with a tri-state mode, and fix the vllm.inputs import paths.
  • Stabilize test infra: add GPU process cleanup between batch tests (a temporary workaround, to be removed once vllm-project/vllm#39846, "[BugFix] Prevent orphaned process on NCCL destroy", fixes the underlying vLLM issue), and lower the NIXL ports to avoid conflicts with the ephemeral port range.
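
To illustrate the ephemeral-range concern in the last bullet: on Linux, the kernel hands out outgoing-connection ports from the range in /proc/sys/net/ipv4/ip_local_port_range (32768 to 60999 by default), so a fixed listening port inside that range can occasionally already be taken. The following is a minimal sketch of such a check, assuming Linux. The function names are hypothetical and are not part of Ray, vLLM, or NIXL, and the port numbers in the usage note are made up for illustration.

```python
from pathlib import Path

# Linux default, used when the sysctl file cannot be read (for example on macOS).
DEFAULT_EPHEMERAL_RANGE = (32768, 60999)


def parse_port_range(text: str) -> tuple[int, int]:
    """Parse the 'low high' pair stored in ip_local_port_range."""
    low, high = text.split()
    return int(low), int(high)


def ephemeral_port_range(
    path: str = "/proc/sys/net/ipv4/ip_local_port_range",
) -> tuple[int, int]:
    """Return the kernel's ephemeral port range, or the Linux default."""
    try:
        return parse_port_range(Path(path).read_text())
    except (OSError, ValueError):
        return DEFAULT_EPHEMERAL_RANGE


def overlaps_ephemeral(
    base_port: int, count: int, port_range: tuple[int, int]
) -> bool:
    """True if any of the ports base_port .. base_port + count - 1 is ephemeral."""
    low, high = port_range
    last_port = base_port + count - 1
    return base_port <= high and last_port >= low
```

For example, `overlaps_ephemeral(40000, 8, ephemeral_port_range())` is true on a host with the default range, while a block starting at 20000 is not affected.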

Related issues

N/A

Signed-off-by: Jeffrey Wang <jeffreywang@anyscale.com>
jeffreywang88 added the "go" label (add ONLY when ready to merge, run all tests) on Apr 4, 2026

@gemini-code-assist gemini-code-assist Bot left a comment

Contributor

Code Review

This pull request upgrades the vllm dependency to version 0.19.0 across the codebase, including Docker configurations, requirement files, and lock files. It also refactors the pause operation interface by replacing the wait_for_inflight_requests boolean with a more flexible mode parameter (supporting 'abort', 'wait', and 'keep') and updates internal prompt class references to align with the new vLLM version. Feedback suggests improving type safety in VLLMPauseConfig by using Literal for the mode field to ensure only valid strings are accepted.
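
The Literal suggestion in the review above could look roughly like the following. This is a minimal sketch, not the actual VLLMPauseConfig from the PR: the class name `PauseConfigSketch`, the helper `mode_from_legacy_flag`, the default of "abort", and the mapping from the old boolean (True to "wait", False to "abort") are all assumptions made for illustration. Only the three mode strings come from the review summary.

```python
from dataclasses import dataclass
from typing import Literal, get_args

# The three mode strings named in the review summary.
PauseMode = Literal["abort", "wait", "keep"]


@dataclass(frozen=True)
class PauseConfigSketch:
    """Hypothetical stand-in for a pause config with a typed mode field."""

    mode: PauseMode = "abort"  # assumed default, not confirmed by the PR

    def __post_init__(self) -> None:
        # Literal is only enforced by static type checkers, so validate at
        # runtime too; a plain dataclass would accept any string here.
        if self.mode not in get_args(PauseMode):
            raise ValueError(
                f"mode must be one of {get_args(PauseMode)}, got {self.mode!r}"
            )


def mode_from_legacy_flag(wait_for_inflight_requests: bool) -> PauseMode:
    """Map the old boolean onto the new mode (assumed mapping).

    "keep" has no boolean equivalent, so it is never returned here.
    """
    return "wait" if wait_for_inflight_requests else "abort"
```

With this shape, a type checker flags `PauseConfigSketch(mode="stop")` statically, and the `__post_init__` check rejects it at runtime as well.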

Comment thread python/ray/llm/_internal/serve/engines/vllm/vllm_engine.py Outdated
jeffreywang88 and others added 9 commits April 4, 2026 15:29
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Signed-off-by: Jeffrey Wang <jeffreywang@anyscale.com>
@jeffreywang88 jeffreywang88 marked this pull request as ready for review April 14, 2026 00:29
def cleanup_ray_resources():
    """Automatically cleanup Ray resources between tests to prevent conflicts."""
    yield
    _kill_gpu_processes_on_all_nodes()

Contributor

What's the story with this one?

@jeffreywang88 jeffreywang88 Apr 14, 2026

Contributor Author

ray backend doesn't suffer from the same problem.

Contributor

Is there a reason we didn't see this before? Could this cause process cleanup issues for users outside the test env?

Contributor Author

Yeah this could happen outside of the test env. Let me try again without this explicit cleanup.

@jeffreywang88 jeffreywang88 Apr 14, 2026

Contributor Author

I bisected the changes from v0.18.0 to v0.19.0 and identified https://github.com/vllm-project/vllm/pull/37131/changes#diff-58339ebbbdb34ac8183ced2a9cb11840321a70ceed13e3b17a68ee9d2e0a2ac8R151 as the root cause. When an external orchestrator like Ray shuts down a rank while NCCL is destroying the collective, ncclCommDestroy can hang, leaving orphaned processes. Here's a fix for vLLM: vllm-project/vllm#39846. For this upgrade to 0.19.0, I think we should proceed with the external cleanup in the test and remove it once we upgrade to 0.20.0.
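
For readers unfamiliar with this kind of external cleanup, here is a rough single-node sketch of the idea. It is not the PR's `_kill_gpu_processes_on_all_nodes`, whose body is not shown in this thread; all function names below are hypothetical. It assumes a Linux host with `nvidia-smi` on the PATH, and a multi-node version would need to run the same logic on every node.

```python
import os
import signal
import subprocess


def parse_compute_pids(csv_output: str) -> list[int]:
    """Extract PIDs from `nvidia-smi --query-compute-apps=pid` CSV output."""
    pids = []
    for line in csv_output.splitlines():
        line = line.strip()
        if line.isdigit():  # skips blank lines and any non-numeric messages
            pids.append(int(line))
    return pids


def list_gpu_compute_pids() -> list[int]:
    """Return PIDs of processes currently holding a GPU compute context."""
    try:
        result = subprocess.run(
            ["nvidia-smi", "--query-compute-apps=pid", "--format=csv,noheader"],
            capture_output=True,
            text=True,
            check=False,
        )
    except FileNotFoundError:
        return []  # nvidia-smi is not installed on this host
    if result.returncode != 0:
        return []
    return parse_compute_pids(result.stdout)


def kill_gpu_processes(pids: list[int]) -> None:
    """Send SIGKILL to each PID, never to the current process."""
    for pid in pids:
        if pid == os.getpid():
            continue
        try:
            os.kill(pid, signal.SIGKILL)
        except (ProcessLookupError, PermissionError):
            pass  # already exited, or owned by another user
```

Called from a fixture's teardown, `kill_gpu_processes(list_gpu_compute_pids())` removes leftover GPU processes so that the next test starts with free devices. It kills every GPU compute process it is allowed to signal, which is why this is only reasonable in a dedicated test environment.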

Contributor

Nicee

ray-gardener (bot) added the "serve" label (Ray Serve Related Issue) on Apr 14, 2026
Signed-off-by: Jeffrey Wang <jeffreywang@anyscale.com>

@aslonnie aslonnie left a comment

Contributor

approval for dependency change.

the python deps change is quite non-trivial this time.

This reverts commit 07af14f.

Signed-off-by: Jeffrey Wang <jeffreywang@anyscale.com>
@jeffreywang88

Contributor Author

premerge is blocked waiting for windows tests to be approved, but I don't think they're relevant to this change.

@eicherseiji

Contributor

@jeffreywang-anyscale it's the macOS test; we should request a force merge since they are meant to be disabled #62640

@kouroshHakha kouroshHakha left a comment

Contributor

STMP

@kouroshHakha kouroshHakha enabled auto-merge (squash) April 15, 2026 18:31
@aslonnie aslonnie disabled auto-merge April 15, 2026 19:29
@aslonnie aslonnie merged commit 36a5d61 into master Apr 15, 2026
6 of 7 checks passed
@aslonnie aslonnie deleted the vllm-0.19.0 branch April 15, 2026 19:29
HLDKNotFound pushed a commit to chichic21039/ray that referenced this pull request Apr 22, 2026
## Description
- **Upgrade vLLM** from 0.18.0 to 0.19.0 across requirements, setup.py,
Dockerfile, and lock files.
- Adapt to vLLM 0.19.0 **API changes**: pause wait_for_inflight_requests
→ tri-state mode, fix `vllm.inputs` import paths.
- **Stabilize test infra**: add GPU process cleanup between batch tests
(temporary cleanup and will be removed once
vllm-project/vllm#39846 fixes the underlying
vLLM issue), lower NIXL ports to avoid ephemeral range conflicts.


## Related issues
N/A

## Additional information
> Optional: Add implementation details, API changes, usage examples,
screenshots, etc.

---------

Signed-off-by: Jeffrey Wang <jeffreywang@anyscale.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Lucas61000 pushed a commit to Lucas61000/ray that referenced this pull request May 15, 2026