[Tests] Modify test cases by amy-why-3459 · Pull Request #2991 · vllm-project/vllm-omni

amy-why-3459 · 2026-04-21T13:25:47Z

Purpose

In vLLM, if distributed_executor_backend is not configured, it is selected automatically based on world_size (together with the platform and Ray availability), as this excerpt shows:

        if self.distributed_executor_backend is None and self.world_size_across_dp > 1:
            # We use multiprocessing by default if world_size fits on the
            # current node and we aren't in a ray placement group.

            from vllm.v1.executor import ray_utils

            backend: DistributedExecutorBackend = "mp"
            ray_found = ray_utils.ray_is_available()
            if current_platform.is_tpu() and envs.VLLM_XLA_USE_SPMD:
                backend = "uni"
            elif current_platform.is_cuda() and self.nnodes > 1:
                backend = "mp"
            elif (
                current_platform.is_cuda()
                and current_platform.device_count() < self.world_size
            ):
                gpu_count = current_platform.device_count()
                raise ValueError(
                    f"World size ({self.world_size}) is larger than the number of "
                    f"available GPUs ({gpu_count}) in this node. If this is "
                    "intentional and you are using:\n"
                    "- ray, set '--distributed-executor-backend ray'.\n"
                    "- multiprocessing, set '--nnodes' appropriately."
                )
            elif self.data_parallel_backend == "ray":
                logger.info(
                    "Using ray distributed inference because "
                    "data_parallel_backend is ray"
                )
                backend = "ray"
            elif ray_found:
                if self.placement_group:
                    backend = "ray"
                else:
                    from ray import is_initialized as ray_is_initialized

                    if ray_is_initialized():
                        from ray.util import get_current_placement_group

                        if get_current_placement_group():
                            backend = "ray"
            self.distributed_executor_backend = backend
            logger.debug("Defaulting to use %s for distributed inference", backend)

        if self.distributed_executor_backend is None and self.world_size == 1:
            self.distributed_executor_backend = "uni"
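The decision order in the excerpt above can be condensed into a standalone function. This is only a sketch for reasoning about the defaults, not vLLM's API: the function name and every parameter are hypothetical stand-ins for the platform and Ray checks, and the three Ray conditions are collapsed into a single `in_ray_placement_group` flag.

```python
from typing import Optional


def pick_default_backend(
    world_size: int,
    world_size_across_dp: Optional[int] = None,
    configured: Optional[str] = None,
    nnodes: int = 1,
    is_cuda: bool = True,
    gpu_count: int = 8,
    tpu_spmd: bool = False,
    data_parallel_backend: str = "mp",
    in_ray_placement_group: bool = False,
) -> Optional[str]:
    """Hypothetical, simplified mirror of the selection order quoted above."""
    if world_size_across_dp is None:
        # Assumption for this sketch: no data parallelism unless stated.
        world_size_across_dp = world_size

    backend = configured
    if backend is None and world_size_across_dp > 1:
        backend = "mp"  # multiprocessing is the fallback for multi-worker runs
        if tpu_spmd:
            backend = "uni"
        elif is_cuda and nnodes > 1:
            backend = "mp"
        elif is_cuda and gpu_count < world_size:
            raise ValueError(
                f"World size ({world_size}) is larger than the number of "
                f"available GPUs ({gpu_count}) in this node."
            )
        elif data_parallel_backend == "ray":
            backend = "ray"
        elif in_ray_placement_group:
            backend = "ray"

    if backend is None and world_size == 1:
        backend = "uni"
    return backend


if __name__ == "__main__":
    print(pick_default_backend(world_size=1))                   # single worker
    print(pick_default_backend(world_size=2))                   # multi worker
    print(pick_default_backend(world_size=1, configured="mp"))  # explicit value
```

Note that an explicitly configured value skips both branches, so a hard-coded "mp" is used even for a single-worker run that would otherwise resolve to "uni".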

Test Plan

Test Result

hsliuustc0106 · 2026-04-21T13:37:08Z

-                "num_prompts": [4, 16, 40],
-                "max_concurrency": [1, 4, 10],
+                "num_prompts": [4, 16, 40, 64],
+                "max_concurrency": [1, 4, 10, 16],


What is the theoretical maximum?

amy-why-3459 · Apr 21, 2026

In async_chunk mode, chunked prefill is not currently supported, so for an input length of 2500 the theoretical maximum supported concurrency is 26 (65536 / 2500 ≈ 26.2, rounded down).
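The arithmetic behind that figure can be checked with a short sketch. It assumes, as the reply implies but does not spell out, that 65536 is a shared budget that must hold every in-flight input whole because inputs cannot be split into chunks. The function name and parameter names are hypothetical.

```python
def max_concurrency(budget: int, input_len: int) -> int:
    """Largest number of whole inputs of length input_len that fit in budget.

    Assumption: without chunked prefill, each concurrent request needs its
    full input to fit, so partial inputs do not count.
    """
    if input_len <= 0:
        raise ValueError("input_len must be positive")
    return budget // input_len


if __name__ == "__main__":
    print(max_concurrency(65536, 2500))  # 26
```

Under this reading, the new test value of 16 for max_concurrency stays below the computed ceiling of 26.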

gcanlin · 2026-04-21T13:38:37Z


    # === Pipeline-wide engine settings (applied uniformly to every stage) ===
    trust_remote_code: bool = True
-    distributed_executor_backend: str = "mp"


Could you help check whether this default "mp" distributed_executor_backend is applied to every stage? If so, it degrades the UX, because vLLM already has complete initialization logic for distributed_executor_backend. How about removing the default and letting each stage choose its own backend, even in the default case? cc @lishunyang12

https://github.com/vllm-project/vllm/blob/7b1e0b07d0ea9fe11f0c4a35001baade2c3b1074/vllm/config/parallel.py#L790-L839

amy-why-3459 · Apr 21, 2026

I completely agree. We can discuss whether distributed_executor_backend needs a default value at all.
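One way to read this suggestion is to leave the field unset by default and forward it to a stage only when the user sets it, so that vLLM's own selection logic applies otherwise. The sketch below is illustrative only: `PipelineEngineSettings` and `stage_engine_kwargs` are hypothetical names, not the actual vllm-omni classes or helpers.

```python
from dataclasses import asdict, dataclass
from typing import Any, Dict, Optional


@dataclass
class PipelineEngineSettings:
    """Hypothetical stand-in for the pipeline-wide settings in the diff above."""

    trust_remote_code: bool = True
    # None means "not configured": the engine picks its own backend.
    distributed_executor_backend: Optional[str] = None


def stage_engine_kwargs(settings: PipelineEngineSettings) -> Dict[str, Any]:
    """Forward only the fields that were set explicitly (hypothetical helper)."""
    return {key: value for key, value in asdict(settings).items() if value is not None}


if __name__ == "__main__":
    print(stage_engine_kwargs(PipelineEngineSettings()))
    print(stage_engine_kwargs(PipelineEngineSettings(distributed_executor_backend="ray")))
```

With this shape, an unset field never reaches the stage, while an explicit choice such as "ray" is still passed through unchanged.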

hsliuustc0106 · 2026-04-21T21:13:11Z

BLOCKING:

Breaking Changes — This PR changes the default distributed_executor_backend from "mp" to "uni" in vllm_omni/config/stage_config.py:422. This is a breaking change that affects all deployments using the default DeployConfig. Please:
1. Split this PR into two: one for the test changes, one for the config default change
2. For the config default change PR, add justification for why "uni" should be the new default
3. Add a migration guide or note in the PR description about how existing deployments will be affected
Documentation — The PR description doesn't mention the config default change, making it unclear that this is a breaking change. Please update the PR title (e.g., [Breaking Change]) and body to document this change.

amy-why-3459 · 2026-04-22T01:53:12Z

I will submit a separate PR to remove the default values from the config file.

Signed-off-by: amy-why-3459 <wuhaiyan17@huawei.com>

amy-why-3459 · 2026-04-22T09:51:18Z

@gcanlin @hsliuustc0106 I think this PR is ready and can be merged.

gcanlin

LGTM

amy-why-3459 requested a review from hsliuustc0106 as a code owner April 21, 2026 13:25

gcanlin added the omni-test label (label to trigger buildkite omni model test in nightly CI) Apr 21, 2026

hsliuustc0106 reviewed Apr 21, 2026

gcanlin reviewed Apr 21, 2026

amy-why-3459 force-pushed the bugfix-tests branch 2 times, most recently from 9c901b8 to 71dd056 April 22, 2026 01:41

gcanlin added the ready label (label to trigger buildkite CI) Apr 22, 2026

amy-why-3459 force-pushed the bugfix-tests branch from 71dd056 to e9aee80 April 22, 2026 04:49

Modify test cases

85abddb

Signed-off-by: amy-why-3459 <wuhaiyan17@huawei.com>

amy-why-3459 force-pushed the bugfix-tests branch from e9aee80 to 85abddb April 22, 2026 05:05

yenuo26 removed the omni-test label (label to trigger buildkite omni model test in nightly CI) Apr 22, 2026

gcanlin approved these changes Apr 22, 2026

gcanlin merged commit 9337bec into vllm-project:main Apr 22, 2026
8 checks passed

qinganrice pushed a commit to qinganrice/vllm-omni that referenced this pull request Apr 23, 2026

[Tests] Modify test cases (vllm-project#2991)

d2f026d

Signed-off-by: amy-why-3459 <wuhaiyan17@huawei.com>
