[bugfix] remove the EP buffer allocation introduced by fused-op dispatch_ffn_c…#5284
zzzzwwjj merged 1 commit into vllm-project:main
Code Review
This pull request removes the calculate_ep_buffer_size function and its usage for configuring the buffer size for the 'ep' (expert parallel) process group. This change appears to be a cleanup of obsolete code. Based on the pull request title, this specific buffer allocation was likely introduced for the dispatch_ffn_combine fused operator. The codebase indicates that this operator now uses the 'mc2' communication group, which has a different buffer configuration mechanism, rendering the 'ep' buffer calculation unnecessary. By removing this, the 'ep' group will fall back to using the default buffer size, which is appropriate for its remaining uses. The change is sound and improves code maintainability by removing dead code.
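The fallback behaviour described in the review can be illustrated with a minimal sketch. It is not the vllm-ascend implementation: `DEFAULT_BUFFER_MB`, `resolve_buffer_sizes`, and all sizes below are hypothetical names and placeholder values chosen for illustration. It only shows that once a group's explicit size is removed, a default-based lookup gives that group the default while other groups keep their own configuration.

```python
# Hypothetical sketch: none of these names or values come from vllm-ascend.
DEFAULT_BUFFER_MB = 200  # assumed placeholder default, not a documented value


def resolve_buffer_sizes(groups, overrides):
    """Return the buffer size per group: its override if present, else the default."""
    return {name: overrides.get(name, DEFAULT_BUFFER_MB) for name in groups}


groups = ["ep", "mc2"]

# Before the change: the 'ep' group carries an explicitly calculated size.
before = resolve_buffer_sizes(groups, {"ep": 512, "mc2": 1024})

# After the change: the 'ep' override is gone, so 'ep' falls back to the default,
# while 'mc2' keeps its own configuration.
after = resolve_buffer_sizes(groups, {"mc2": 1024})

print(before)
print(after)
```

In this sketch, removing the `"ep"` entry is the whole change; nothing else about the lookup needs to be touched.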
👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:
If CI fails, you can run linting and testing checks locally according to Contributing and Testing.
…ombine Signed-off-by: Chen Chen <0109chenchen@gmail.com>
…tch_ffn_c… (vllm-project#5284)

### What this PR does / why we need it?

- This PR removes the Expert Parallel (EP) HCCL buffer allocation that was previously introduced by the fused-op `dispatch_ffn_combine` (vllm-project#3532), since the fused-op has switched to the MC2 HCCL buffer (vllm-project#5156).

### Does this PR introduce _any_ user-facing change?

### How was this patch tested?

- vLLM version: release/v0.13.0
- vLLM main: vllm-project/vllm@ad32e3e

Signed-off-by: Chen Chen <0109chenchen@gmail.com>
Signed-off-by: zrj026 <zhangrunjiang026@gmail.com>
What this PR does / why we need it?
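This PR removes the Expert Parallel (EP) HCCL buffer allocation that was previously introduced by the fused-op `dispatch_ffn_combine` (add `dispatch_gmm_combine` kernel #3532), since the fused-op has switched to the MC2 HCCL buffer ([bugfix] Use FUSED_MC2 MoE comm path for the op `dispatch_ffn_combine` #5156).
Does this PR introduce any user-facing change?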
How was this patch tested?