[ROCm] Disable AITER allreduce fusion#41866
[ROCm] Disable AITER allreduce fusion#41866jpvillam-amd wants to merge 4 commits intovllm-project:mainfrom
Conversation
Signed-off-by: jpvillam <juan.villamizar@amd.com>
There was a problem hiding this comment.
Code Review
This pull request refactors the PassManager to simplify the inclusion of the AllReduceFusionPass. Feedback indicates that AllReduceFusionPass is NVIDIA-specific and its usage should be guarded to prevent issues on ROCm platforms. Additionally, the reviewer suggested removing the now-unused RocmAiterAllReduceFusionPass import.
| if rocm_aiter_ops.is_enabled() or current_platform.is_cuda(): | ||
| from .fusion.allreduce_rms_fusion import AllReduceFusionPass |
There was a problem hiding this comment.
The AllReduceFusionPass is an NVIDIA-specific optimization that relies on FlashInfer. Including rocm_aiter_ops.is_enabled() in this import condition is misleading and unnecessary if the usage is properly guarded. It should remain restricted to CUDA platforms to avoid confusing initialization warnings on ROCm. Additionally, since RocmAiterAllReduceFusionPass is no longer used in this file, its import (lines 24-27) should also be removed.
| if rocm_aiter_ops.is_enabled() or current_platform.is_cuda(): | |
| from .fusion.allreduce_rms_fusion import AllReduceFusionPass | |
| if current_platform.is_cuda(): | |
| from .fusion.allreduce_rms_fusion import AllReduceFusionPass |
Signed-off-by: jpvillam <juan.villamizar@amd.com>
|
Hi @jpvillam-amd, the pre-commit checks have failed. Please run: uv pip install pre-commit>=4.5.1
pre-commit install
pre-commit run --all-filesThen, commit the changes and push to your branch. For future commits, Tip Is
|
Signed-off-by: jpvillam <juan.villamizar@amd.com>
Head branch was pushed to by a user without write access
| self.passes += [AsyncTPPass(config)] | ||
|
|
||
| if self.pass_config.fuse_allreduce_rms: | ||
| if rocm_aiter_ops.is_enabled(): |
There was a problem hiding this comment.
@jpvillam-amd @gshtras may I know what is the reason to disable all reduce fusion when we can control through python argument by setting --compilation_config.pass_config.fuse_allreduce_rms=false
There was a problem hiding this comment.
And it is preferred to not remove the path completely as it was working with older aiter version, and we should should disable else where and add a TODO to fix it.
|
@jpvillam-amd which aiter version are you using? During the integration of the kernel, the rmsnorm all reduce fusion pass doesn't seem to be enabled by default. Can you share your vllm serve commands? If you are using https://github.com/vllm-project/vllm/blob/d4b00484040c9a0bbd4ee2d55983df7a50ab1fd3/vllm/config/vllm.py#L239-258 |
|
Hi! I made a new PR which does not require disabling the fusion if someone wants to check it (it was a simple one liner fix) Check #41972 |
|
Closing for #41972 |
Subset of #41816
Singular fix to disable aiter all reduce fusion pass
Purpose
RocmAiterAllReduceFusionPass is currently causing accuracy issues, disable until its fixed
Test Plan
Tested with DeepSeek FP4 and Llama
Test Result
Recovered accuracy after the change
Essential Elements of an Effective PR Description Checklist
supported_models.mdandexamplesfor a new model.