[Feat][Executor] Introduce RayExecutorV2 #36836
Signed-off-by: Jeffrey Wang <jeffreywang@anyscale.com>
Hi @jeffreywang-anyscale, the pre-commit checks have failed. Please run:
uv pip install pre-commit
pre-commit install
pre-commit run --all-files
Then, commit the changes and push to your branch.
@njhill FYI this PR is not ready for review yet as I'm iterating on the CI. Will let you know once it's in good shape for review!
Hi @jeffreywang-anyscale, the pre-commit checks have failed. Please run:
uv pip install "pre-commit>=4.5.1"
pre-commit install
pre-commit run --all-files
Then, commit the changes and push to your branch.
kouroshHakha
left a comment
ok beautiful. some broad comments after the first pass.
Ray LLM release tests and premerge tests both pass with the latest non-rebase commit ad8f6d0.
kouroshHakha
left a comment
Looks good overall — the round-1/round-2 feedback has been well addressed. The two-phase worker init and MQ transport selection are clean. A few remaining items below, mostly minor.
Note
This review was co-written with AI assistance (Claude Code).
njhill
left a comment
LGTM, thanks @jeffreywang-anyscale @kouroshHakha
Disclaimer: I mainly reviewed the integration surfaces and changes to common code. I didn't review the ray executor v2 and ray utils impl/changes in detail but @kouroshHakha has already reviewed those
Signed-off-by: Jeffrey Wang <jeffreywang@anyscale.com> Signed-off-by: rishitdholakia13 <rishit+github@cohere.com>
Signed-off-by: Jeffrey Wang <jeffreywang@anyscale.com> Signed-off-by: Rishi Puri <riship@nvidia.com>
Purpose
Introduce RayExecutorV2, which uses MessageQueue (shared memory + TCP fallback) for the control plane instead of Ray compiled graphs. It reuses MultiprocExecutor's MQ-based RPC and NCCL data plane while spawning workers as Ray actors into placement group bundles.
Add the VLLM_USE_RAY_V2_EXECUTOR_BACKEND env var feature flag (default off) to opt into the new executor when distributed_executor_backend="ray".
Enable async scheduling support for the new backend.
For more details, please refer to RFC: #35848.
EEP support is out-of-scope for this PR and is tracked here: #38164.
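For readers skimming the description, the opt-in behavior above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the helper name `select_executor`, the return labels `"ray"` / `"ray_v2"`, and the assumption that the flag is enabled by the value `1` are all hypothetical. Only the env var name and the `distributed_executor_backend="ray"` condition come from the description.

```python
import os
from typing import Mapping, Optional

FLAG = "VLLM_USE_RAY_V2_EXECUTOR_BACKEND"  # env var name from the PR description


def select_executor(
    distributed_executor_backend: str,
    env: Optional[Mapping[str, str]] = None,
) -> str:
    """Hypothetical helper: pick an executor label from backend + feature flag.

    Assumption: the flag is "on" when set to "1"; unset or any other value
    keeps the existing Ray executor (default off, per the description).
    """
    env = os.environ if env is None else env

    # The flag only matters when the Ray backend was requested.
    if distributed_executor_backend != "ray":
        return distributed_executor_backend

    use_v2 = env.get(FLAG, "0").strip() == "1"
    return "ray_v2" if use_v2 else "ray"


if __name__ == "__main__":
    print(select_executor("ray", {}))
    print(select_executor("ray", {FLAG: "1"}))
    print(select_executor("mp", {FLAG: "1"}))
```

The point of the sketch is that the flag is a no-op unless the Ray backend is already selected, so existing deployments are unaffected until they opt in.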
Test Plan
Unit tests
pytest tests/distributed/test_ray_v2_executor.py: executor init, TP/PP combos, placement groups, RPC, worker death, shutdown
pytest tests/utils_/test_ray_utils.py: bundle sorting logic
MessageQueue with test_mq_tcp_multinode.py
Integration tests
pytest tests/distributed/test_ray_v2_executor_e2e.py: creates Ray actors which initialize AsyncLLMEngine internally and verifies that they can serve requests.
pytest tests/distributed/test_pipeline_parallel.py -k "ray": PP correctness with the new backend
pytest tests/basic_correctness/test_basic_correctness.py -k "ray": basic correctness
Test Result
Benchmark results (Qwen/Qwen3-8B on L4)
Server:
Client