[Bugfix] Add regression test for allreduce RMS fusion with PP by robellliu-dev · Pull Request #35960 · vllm-project/vllm

robellliu-dev · 2026-03-04T05:37:21Z

Motivation
Prevent a regression where enable_allreduce_rms_fusion would be incorrectly enabled when pipeline_parallel_size > 1, covering the scenario reported in issue #32730.
Ensure the gating logic for enabling the allreduce+RMS fusion is explicitly tested for tensor-parallel-only and pipeline-parallel cases.
Description
Added a new unit test test_enable_allreduce_rms_fusion_disabled_for_pp in tests/test_config.py that exercises enable_allreduce_rms_fusion under different ParallelConfig settings.
Imported enable_allreduce_rms_fusion into the test module and mocked vllm.utils.flashinfer.has_flashinfer and current_platform capability checks to isolate config gating logic.
Commit uses the [Bugfix] PR title prefix and includes a Signed-off-by: header to satisfy the project's DCO requirement.
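The mocking described above matters because, on a machine without CUDA or FlashInfer, the capability checks would already return False and the "disabled for PP" assertion would pass for the wrong reason. The following is a minimal sketch of that idea, not vLLM code: ToyPlatform, toy_platform, and toy_fusion_gate are hypothetical stand-ins, and the gate's two conditions are an assumption made only for illustration.

```python
from unittest.mock import patch


class ToyPlatform:
    """Hypothetical stand-in for a platform object; not vLLM's current_platform."""

    def is_cuda(self) -> bool:
        # Pretend we are on a CPU-only CI machine.
        return False


toy_platform = ToyPlatform()


def toy_fusion_gate(pipeline_parallel_size: int) -> bool:
    """Hypothetical gate: requires CUDA and no pipeline parallelism."""
    return toy_platform.is_cuda() and pipeline_parallel_size == 1


# Without mocking, both calls return False, so "disabled for PP" proves nothing.
unmocked = (toy_fusion_gate(1), toy_fusion_gate(2))

# With the platform check forced to True, only the PP condition decides the result.
with patch.object(toy_platform, "is_cuda", return_value=True):
    mocked = (toy_fusion_gate(1), toy_fusion_gate(2))
```

Asserting the enabled case first, as the test in this PR does, is what guards against a vacuous pass: the second assertion is only meaningful once the same mocked environment has been shown to enable the fusion.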
Testing
Ran ruff check tests/test_config.py and it passed.
Ran python -m py_compile tests/test_config.py and it succeeded.
Attempted pytest -q tests/test_config.py -k enable_allreduce_rms_fusion_disabled_for_pp, but the run could not complete in this environment: a test dependency (tblib) was missing and network restrictions prevented installing it.

Signed-off-by: Codex <codex@openai.com>

…eate-pr [Bugfix] Add regression test for allreduce RMS fusion with PP

github-actions · 2026-03-04T05:37:28Z

👋 Hi! Thank you for contributing to the vLLM project.

💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels.

Just a reminder: PRs do not trigger a full CI run by default. Only fastcheck CI runs, which covers a small, essential subset of CI tests to catch errors quickly.

You can ask your reviewers to trigger select CI tests on top of fastcheck CI.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can either add the ready label to the PR or enable auto-merge.

If you have any questions, please reach out to us on Slack at https://slack.vllm.ai.

🚀

gemini-code-assist

Code Review

This pull request adds a regression test to ensure enable_allreduce_rms_fusion is correctly disabled when pipeline parallelism is used. The added test is correct and addresses the issue. I've suggested an improvement to make the test more comprehensive by using parameterization to cover all gating conditions in the function, which will enhance its robustness against future regressions.

gemini-code-assist · 2026-03-04T05:39:41Z

+def test_enable_allreduce_rms_fusion_disabled_for_pp():
+    cfg = VllmConfig(
+        parallel_config=ParallelConfig(
+            tensor_parallel_size=2,
+            pipeline_parallel_size=1,
+            data_parallel_size=1,
+        )
+    )
+
+    with (
+        patch("vllm.utils.flashinfer.has_flashinfer", return_value=True),
+        patch.object(current_platform, "is_cuda", return_value=True),
+        patch.object(current_platform, "is_device_capability", return_value=True),
+    ):
+        assert enable_allreduce_rms_fusion(cfg)
+
+        cfg.parallel_config.pipeline_parallel_size = 2
+        assert not enable_allreduce_rms_fusion(cfg)


The current test correctly covers the pipeline parallelism case as intended. However, to make it more robust and prevent future regressions, it would be beneficial to cover all gating conditions within enable_allreduce_rms_fusion, including tensor_parallel_size and data_parallel_size.

A parameterized test would be a clean way to test all ParallelConfig combinations and improve the test's clarity and maintainability.

@pytest.mark.parametrize(
    ("parallel_config", "should_be_enabled"),
    [
        # Should be enabled with only TP > 1
        (ParallelConfig(tensor_parallel_size=2, pipeline_parallel_size=1, data_parallel_size=1), True),
        # Should be disabled with TP <= 1
        (ParallelConfig(tensor_parallel_size=1, pipeline_parallel_size=1, data_parallel_size=1), False),
        # Should be disabled with PP > 1
        (ParallelConfig(tensor_parallel_size=2, pipeline_parallel_size=2, data_parallel_size=1), False),
        # Should be disabled with DP > 1
        (ParallelConfig(tensor_parallel_size=2, pipeline_parallel_size=1, data_parallel_size=2), False),
    ],
    ids=["TP-only", "No-TP", "With-PP", "With-DP"]
)
def test_enable_allreduce_rms_fusion_gating(parallel_config, should_be_enabled):
    cfg = VllmConfig(parallel_config=parallel_config)
    with (
        patch("vllm.utils.flashinfer.has_flashinfer", return_value=True),
        patch.object(current_platform, "is_cuda", return_value=True),
        patch.object(current_platform, "is_device_capability", return_value=True),
    ):
        assert enable_allreduce_rms_fusion(cfg) is should_be_enabled
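The four cases in this suggestion imply a gate that is enabled only when TP > 1, PP == 1, and DP == 1. As a rough, dependency-free sketch of that truth table, assuming the gate reduces to exactly these three conditions (the PR does not show the real function body), ToyParallelConfig and toy_parallel_gate below are hypothetical names, not vLLM APIs:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyParallelConfig:
    """Hypothetical stand-in carrying only the three sizes named in the review."""

    tensor_parallel_size: int = 1
    pipeline_parallel_size: int = 1
    data_parallel_size: int = 1


def toy_parallel_gate(cfg: ToyParallelConfig) -> bool:
    """Assumed gating rule: tensor parallelism only, no PP and no DP."""
    return (
        cfg.tensor_parallel_size > 1
        and cfg.pipeline_parallel_size == 1
        and cfg.data_parallel_size == 1
    )


# Same four cases and ids as the suggested parametrization.
CASES = {
    "TP-only": (ToyParallelConfig(2, 1, 1), True),
    "No-TP": (ToyParallelConfig(1, 1, 1), False),
    "With-PP": (ToyParallelConfig(2, 2, 1), False),
    "With-DP": (ToyParallelConfig(2, 1, 2), False),
}

# Evaluate the assumed gate for every case so it can be compared to the expectation.
results = {name: toy_parallel_gate(cfg) for name, (cfg, _) in CASES.items()}
expected = {name: want for name, (_, want) in CASES.items()}
```

Each case changes a single field relative to the TP-only baseline, so a failure points at one condition rather than a combination of them.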

Signed-off-by: Codex <codex@openai.com>

mergify · 2026-03-17T06:13:46Z

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @robellliu-dev.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

mergify · 2026-04-29T17:15:52Z

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @robellliu-dev.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

robellliu-dev added 3 commits March 4, 2026 11:58

[Bugfix] Add regression test for allreduce RMS fusion with PP

5ceaaee

Signed-off-by: Codex <codex@openai.com>

[Bugfix] Format config regression test for pre-commit

1966511

Signed-off-by: Codex <codex@openai.com>

Merge pull request #3 from robellliu-dev/codex/fix-issue-32730-and-cr…

2b69071

…eate-pr [Bugfix] Add regression test for allreduce RMS fusion with PP

mergify Bot added the bug Something isn't working label Mar 4, 2026

gemini-code-assist Bot reviewed Mar 4, 2026

robellliu-dev added 2 commits March 9, 2026 13:33

Merge branch 'main' into main

a27a45f

Improve allreduce RMS fusion gating regression coverage

20c927b

Signed-off-by: Codex <codex@openai.com>

robellliu-dev requested review from DarkLight1337, NickLucche, aarnphm, chaunceyjiang, robertgshaw2-redhat and russellb as code owners March 9, 2026 05:44

mergify Bot added the frontend label Mar 9, 2026

robellliu-dev mentioned this pull request Mar 9, 2026

[Bugfix] Add regression test for allreduce RMS fusion with PP #36457

Open

5 tasks

Merge branch 'vllm-project:main' into main

ca8b242

mergify Bot added the needs-rebase label Mar 17, 2026

mergify Bot removed the needs-rebase label Apr 29, 2026

mergify Bot added the needs-rebase label Apr 29, 2026

zixi-qi mentioned this pull request May 25, 2026

[Bugfix] Disable allreduce_rms_fusion when pipeline_parallel_size > 1 #43616

Open

robellliu-dev wants to merge 6 commits into vllm-project:main from robellliu-dev:main
