-
Notifications
You must be signed in to change notification settings - Fork 604
[Refactor] Adjustments to moe_comm_method selection process #3001
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:
If CI fails, you can run linting and testing checks locally according Contributing and Testing. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request refactors the Mixture-of-Experts (MoE) communication method selection to use an Enum instead of raw strings, which is a great improvement for code clarity and type safety. The changes are mostly consistent with this goal. However, I've identified a few critical issues, including typos in Enum member access and an incorrect import, which would lead to runtime errors. Please address these issues to ensure the refactoring is robust.
vllm_ascend/ops/common_fused_moe.py
Outdated
| moe_comm_method_name = forward_context.moe_comm_method_name | ||
| if moe_comm_method_name in {"alltoallcommimpl", "mc2commimpl"}: | ||
| moe_comm_method_type = forward_context.moe_comm_method_type | ||
| if moe_comm_method_type in {MoECommImpl.AllTOAll, MoECommImpl.MC2}: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
vllm_ascend/ops/common_fused_moe.py
Outdated
| moe_comm_method_name = forward_context.moe_comm_method_name | ||
| if moe_comm_method_name in {"alltoallcommimpl", "mc2commimpl"}: | ||
| moe_comm_method_type = forward_context.moe_comm_method_type | ||
| if moe_comm_method_type in {MoECommImpl.AllTOAll, MoECommImpl.MC2}: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| from vllm_ascend.ascend_config import (MoECommImpl, | ||
| get_ascend_config) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The MoECommImpl enum is being imported from vllm_ascend.ascend_config, but it is defined in vllm_ascend.ascend_forward_context. This will cause an ImportError at runtime. Please correct the import statement. For better code style, you could also combine this with the existing import from vllm_ascend.ascend_forward_context on the next line.
| from vllm_ascend.ascend_config import (MoECommImpl, | |
| get_ascend_config) | |
| from vllm_ascend.ascend_config import get_ascend_config | |
| from vllm_ascend.ascend_forward_context import MoECommImpl |
9c54be4 to
fc83278
Compare
|
This pull request has conflicts, please resolve those before we can evaluate the pull request. |
e1696a2 to
1a137ac
Compare
f6f84fc to
836011f
Compare
|
This pull request has conflicts, please resolve those before we can evaluate the pull request. |
836011f to
6d02dfb
Compare
6d02dfb to
a6da506
Compare
a6da506 to
ded2318
Compare
Co-Authored-By: weijinqian0 <[email protected]> Signed-off-by: Pr0Wh1teGivee <[email protected]>
c0b68d0 to
9ddd0f6
Compare
…ject#3001) ### What this PR does / why we need it? Fix issues mentioned in vllm-project#2791 and some minor refactoring. 1. Use Enum instead of string. 2. Avoid setting a new property to forward_context in AscendFusedMoE.forward(). 3. Enabling TokenDispatcherWithMoge. 4. Remove redundant code. ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? Qwen3-30B-A3B/Qwen3-30B-A3B-W8A8/DeepSeek-V3-W4A8-Pruing/deepseek-mtp/pangu-pro-moe-pruing: 1. Enable/Disable EP 2. Aclgraph & eager - vLLM version: v0.10.2 - vLLM main: vllm-project/vllm@9607d5e Signed-off-by: Pr0Wh1teGivee <[email protected]> Co-authored-by: weijinqian0 <[email protected]> Signed-off-by: Che Ruan <[email protected]>
…ject#3001) ### What this PR does / why we need it? Fix issues mentioned in vllm-project#2791 and some minor refactoring. 1. Use Enum instead of string. 2. Avoid setting a new property to forward_context in AscendFusedMoE.forward(). 3. Enabling TokenDispatcherWithMoge. 4. Remove redundant code. ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? Qwen3-30B-A3B/Qwen3-30B-A3B-W8A8/DeepSeek-V3-W4A8-Pruing/deepseek-mtp/pangu-pro-moe-pruing: 1. Enable/Disable EP 2. Aclgraph & eager - vLLM version: v0.10.2 - vLLM main: vllm-project/vllm@9607d5e Signed-off-by: Pr0Wh1teGivee <[email protected]> Co-authored-by: weijinqian0 <[email protected]>
…ject#3001) ### What this PR does / why we need it? Fix issues mentioned in vllm-project#2791 and some minor refactoring. 1. Use Enum instead of string. 2. Avoid setting a new property to forward_context in AscendFusedMoE.forward(). 3. Enabling TokenDispatcherWithMoge. 4. Remove redundant code. ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? Qwen3-30B-A3B/Qwen3-30B-A3B-W8A8/DeepSeek-V3-W4A8-Pruing/deepseek-mtp/pangu-pro-moe-pruing: 1. Enable/Disable EP 2. Aclgraph & eager - vLLM version: v0.10.2 - vLLM main: vllm-project/vllm@9607d5e Signed-off-by: Pr0Wh1teGivee <[email protected]> Co-authored-by: weijinqian0 <[email protected]> Signed-off-by: hwhaokun <[email protected]>
…ject#3001) ### What this PR does / why we need it? Fix issues mentioned in vllm-project#2791 and some minor refactoring. 1. Use Enum instead of string. 2. Avoid setting a new property to forward_context in AscendFusedMoE.forward(). 3. Enabling TokenDispatcherWithMoge. 4. Remove redundant code. ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? Qwen3-30B-A3B/Qwen3-30B-A3B-W8A8/DeepSeek-V3-W4A8-Pruing/deepseek-mtp/pangu-pro-moe-pruing: 1. Enable/Disable EP 2. Aclgraph & eager - vLLM version: v0.10.2 - vLLM main: vllm-project/vllm@9607d5e Signed-off-by: Pr0Wh1teGivee <[email protected]> Co-authored-by: weijinqian0 <[email protected]> Signed-off-by: nsdie <[email protected]>
What this PR does / why we need it?
Fix issues mentioned in #2791 and some minor refactoring.
Does this PR introduce any user-facing change?
No
How was this patch tested?
Qwen3-30B-A3B/Qwen3-30B-A3B-W8A8/DeepSeek-V3-W4A8-Pruing/deepseek-mtp/pangu-pro-moe-pruing: