[2/N][Pangu][MoE] Remove Pangu Related Code #5130
wangxiyuan merged 2 commits into vllm-project:main
Conversation
Code Review
This pull request removes Pangu-related code, specifically the AscendW8A8FusedMoEMethod for static W8A8 MoE quantization, along with its registration and associated tests. The changes are mostly clean removals. However, the PR also removes unit tests for generic MoE helper functions (select_experts and _native_grouped_topk) that are still in use by other parts of the codebase. This reduces test coverage for core functionality. I've left a comment suggesting to retain these tests, possibly by moving them to a more appropriate location.
tests/ut/quantization/test_w8a8.py (557-985)
The test classes TestSelectExperts and TestNativeGroupedTopkPartialMock are being removed. However, the functions they test, select_experts and _native_grouped_topk from vllm_ascend/ops/fused_moe/experts_selector.py, are not being removed and are still used in other parts of the codebase (e.g., AscendFusedMoE). Removing these tests reduces test coverage for core MoE functionality. Please consider moving these tests to a more appropriate location, such as a new test file for experts_selector.py, instead of deleting them.
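For context on what the tests being deleted actually cover: `_native_grouped_topk` implements grouped top-k expert routing, where experts are first filtered by group before the final top-k selection. The following is a minimal, framework-free sketch of that idea; the real implementation in `vllm_ascend/ops/fused_moe/experts_selector.py` operates on batched torch tensors and differs in detail (the function name, signature, and list-based logic here are illustrative assumptions, not the actual API).

```python
def grouped_topk(scores, num_groups, topk_groups, topk):
    """Illustrative sketch of grouped top-k expert selection.

    scores: flat per-expert router scores; len(scores) must be
    divisible by num_groups. This mirrors the idea behind
    _native_grouped_topk, not its tensor-based implementation.
    """
    group_size = len(scores) // num_groups
    # Score each group by its best expert.
    group_scores = [max(scores[g * group_size:(g + 1) * group_size])
                    for g in range(num_groups)]
    # Keep only the topk_groups highest-scoring groups.
    kept = set(sorted(range(num_groups), key=group_scores.__getitem__,
                      reverse=True)[:topk_groups])
    # Mask experts outside the kept groups, then take the global top-k.
    masked = [s if i // group_size in kept else float("-inf")
              for i, s in enumerate(scores)]
    return sorted(range(len(scores)), key=masked.__getitem__,
                  reverse=True)[:topk]
```

Because this routing logic stays in use after the PR, a unit test for it (e.g. asserting that `grouped_topk([0.1, 0.9, 0.2, 0.3, 0.8, 0.05, 0.4, 0.35], num_groups=4, topk_groups=2, topk=2)` selects experts 1 and 4) belongs in a file keyed to `experts_selector.py` rather than in `test_w8a8.py`.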
👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:
If CI fails, you can run linting and testing checks locally according to the Contributing and Testing guides.
Signed-off-by: weichen <calvin_zhu0210@outlook.com>
Force-pushed from f0329c9 to a20feac
Signed-off-by: weichen <calvin_zhu0210@outlook.com>
…to eplb_refactor * 'main' of https://github.com/vllm-project/vllm-ascend: (52 commits)
- [Doc] Add the user_guide doc file regarding fine-grained TP. (vllm-project#5084)
- [pref] qwen3_next add triton ops: fused_sigmoid_gating_delta_rule_update (vllm-project#4818)
- [Feature] Add token mask for DispatchGmmCombineDecode operator (vllm-project#5171)
- [CI] Improve CI (vllm-project#5078)
- [Refactor] remove some metadata variables in attention_v1. (vllm-project#5160)
- Add Qwen3-VL-235B-A22B-Instruct tutorials (vllm-project#5167)
- [Doc] Add a perf tune section (vllm-project#5127)
- [Image] Refactor image build (vllm-project#5175)
- [refactor] refactor weight trans nz and transpose (vllm-project#4878)
- [BugFix] Fix precision issue for LoRA feature (vllm-project#4141)
- [Doc] Deepseekv3.1/R1 doc enhancement (vllm-project#4827)
- support basic long_seq feature st (vllm-project#5140)
- [Bugfix] install trition for test_custom_op (vllm-project#5112)
- [2/N][Pangu][MoE] Remove Pangu Related Code (vllm-project#5130)
- [bugfix] Use FUSED_MC2 MoE comm path for the op `dispatch_ffn_combine` (vllm-project#5156)
- [BugFix] Fix top_p,top_k issue with EAGLE and add top_p,top_k in EAGLE e2e (vllm-project#5131)
- [Doc][P/D] Fix MooncakeConnector's name (vllm-project#5172)
- [Bugfix] Fix in_profile_run in mtp_proposer dummy_run (vllm-project#5165)
- [Doc] Refact benchmark doc (vllm-project#5173)
- [Nightly] Avoid max_model_len being smaller than the decoder prompt to prevent single-node-accuray-tests from failing (vllm-project#5174)
- ...

Signed-off-by: 白永斌 <baiyongbin3@h-partners.com>
### What this PR does / why we need it?
Remove Pangu Related Code

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
e2e & ut

- vLLM version: v0.12.0
- vLLM main: vllm-project/vllm@ad32e3e

---

Signed-off-by: weichen <calvin_zhu0210@outlook.com>
Signed-off-by: zrj026 <zhangrunjiang026@gmail.com>