
[Tests] Modify test cases #2991

Merged
gcanlin merged 1 commit into vllm-project:main from amy-why-3459:bugfix-tests on Apr 22, 2026

Conversation

@amy-why-3459 (Contributor) commented Apr 21, 2026

Purpose

In the vLLM implementation, if distributed_executor_backend is not configured, it is selected automatically based on world_size:

```python
if self.distributed_executor_backend is None and self.world_size_across_dp > 1:
    # We use multiprocessing by default if world_size fits on the
    # current node and we aren't in a ray placement group.

    from vllm.v1.executor import ray_utils

    backend: DistributedExecutorBackend = "mp"
    ray_found = ray_utils.ray_is_available()
    if current_platform.is_tpu() and envs.VLLM_XLA_USE_SPMD:
        backend = "uni"
    elif current_platform.is_cuda() and self.nnodes > 1:
        backend = "mp"
    elif (
        current_platform.is_cuda()
        and current_platform.device_count() < self.world_size
    ):
        gpu_count = current_platform.device_count()
        raise ValueError(
            f"World size ({self.world_size}) is larger than the number of "
            f"available GPUs ({gpu_count}) in this node. If this is "
            "intentional and you are using:\n"
            "- ray, set '--distributed-executor-backend ray'.\n"
            "- multiprocessing, set '--nnodes' appropriately."
        )
    elif self.data_parallel_backend == "ray":
        logger.info(
            "Using ray distributed inference because "
            "data_parallel_backend is ray"
        )
        backend = "ray"
    elif ray_found:
        if self.placement_group:
            backend = "ray"
        else:
            from ray import is_initialized as ray_is_initialized

            if ray_is_initialized():
                from ray.util import get_current_placement_group

                if get_current_placement_group():
                    backend = "ray"
    self.distributed_executor_backend = backend
    logger.debug("Defaulting to use %s for distributed inference", backend)

if self.distributed_executor_backend is None and self.world_size == 1:
    self.distributed_executor_backend = "uni"
```

Test Plan

Test Result



@gcanlin added the omni-test label (triggers the buildkite omni model test in nightly CI) on Apr 21, 2026
"num_prompts": [4, 16, 40],
"max_concurrency": [1, 4, 10],
"num_prompts": [4, 16, 40, 64],
"max_concurrency": [1, 4, 10, 16],
Collaborator

what's the maximal in theory?

Contributor Author

> what's the maximal in theory?

In async_chunk mode, since chunked prefill is not currently supported, the theoretical maximum supported concurrency for a 2500-token input is 26 (65536 // 2500).
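
For reference, the arithmetic behind that bound, assuming (as the comment above implies) a total context budget of 65536 tokens shared across concurrent requests whose full 2500-token prompts must be resident at once when chunked prefill is unavailable:

```python
# Rough theoretical upper bound on concurrency without chunked prefill:
# every in-flight request must fit its whole prompt into the shared budget.
token_budget = 65536   # assumed total budget from the comment above
prompt_len = 2500      # input length used in the benchmark

max_concurrency = token_budget // prompt_len
print(max_concurrency)  # 26
```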


```python
# === Pipeline-wide engine settings (applied uniformly to every stage) ===
trust_remote_code: bool = True
distributed_executor_backend: str = "mp"
```
Collaborator

Could you help check whether this default mp distributed_executor_backend will be applied in every stage? If so, it is a UX degradation, because vLLM already has a complete init path for distributed_executor_backend. How about removing it and letting every stage choose for itself, even in the default case? cc @lishunyang12

https://github.com/vllm-project/vllm/blob/7b1e0b07d0ea9fe11f0c4a35001baade2c3b1074/vllm/config/parallel.py#L790-L839

@amy-why-3459 (Contributor Author) commented Apr 21, 2026

I completely agree with your point. I think we can discuss whether we need to set a default value for the distributed_executor_backend parameter.
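
A minimal sketch of the direction discussed here, assuming a dataclass-style stage config (the class and helper names are hypothetical; only the field names come from the snippet above): defaulting distributed_executor_backend to None and forwarding it only when set would let each stage fall back to vLLM's own selection logic quoted in the PR description.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StageEngineConfig:  # hypothetical name, for illustration only
    trust_remote_code: bool = True
    # No hard-coded "mp": None means "let vLLM's ParallelConfig pick the
    # backend per stage based on world_size, platform, and Ray availability".
    distributed_executor_backend: Optional[str] = None


def engine_args_for_stage(cfg: StageEngineConfig) -> dict:
    args = {"trust_remote_code": cfg.trust_remote_code}
    # Forward the backend only if the user set one explicitly.
    if cfg.distributed_executor_backend is not None:
        args["distributed_executor_backend"] = cfg.distributed_executor_backend
    return args


print(engine_args_for_stage(StageEngineConfig()))
# -> {'trust_remote_code': True}  (vLLM chooses the backend itself)
```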

@hsliuustc0106 (Collaborator) commented:

BLOCKING:

  • Breaking Changes — This PR changes the default distributed_executor_backend from "mp" to "uni" in vllm_omni/config/stage_config.py:422. This is a breaking change that affects all deployments using the default DeployConfig. Please:

    1. Split this PR into two: one for the test changes, one for the config default change
    2. For the config default change PR, add justification for why "uni" should be the new default
    3. Add a migration guide or note in the PR description about how existing deployments will be affected
  • Documentation — The PR description doesn't mention the config default change, making it unclear that this is a breaking change. Please update the PR title (e.g., [Breaking Change]) and body to document this change.

@amy-why-3459 force-pushed the bugfix-tests branch 2 times, most recently from 9c901b8 to 71dd056, on April 22, 2026 01:41
@amy-why-3459 (Contributor Author) commented:

I will submit a separate PR to remove the default values from the config file.

@gcanlin added the ready label (triggers buildkite CI) on Apr 22, 2026
Signed-off-by: amy-why-3459 <wuhaiyan17@huawei.com>
@yenuo26 removed the omni-test label (triggers the buildkite omni model test in nightly CI) on Apr 22, 2026
@amy-why-3459 (Contributor Author) commented:

@gcanlin @hsliuustc0106 I think this PR is ready and can be merged.

@gcanlin (Collaborator) left a comment:

LGTM

@gcanlin gcanlin merged commit 9337bec into vllm-project:main Apr 22, 2026
8 checks passed
qinganrice pushed a commit to qinganrice/vllm-omni that referenced this pull request Apr 23, 2026
Signed-off-by: amy-why-3459 <wuhaiyan17@huawei.com>