
[2/N][Pangu][MoE] Remove Pangu Related Code #5130

Merged

wangxiyuan merged 2 commits into vllm-project:main from Pr0Wh1teGivee:w8a8 on Dec 19, 2025
Conversation

@Pr0Wh1teGivee (Contributor) commented on Dec 17, 2025

What this PR does / why we need it?

Remove Pangu Related Code

Does this PR introduce any user-facing change?

No

How was this patch tested?

e2e & ut

@gemini-code-assist (Bot) left a comment


Code Review

This pull request removes Pangu-related code, specifically the AscendW8A8FusedMoEMethod for static W8A8 MoE quantization, along with its registration and associated tests. The changes are mostly clean removals. However, the PR also removes unit tests for generic MoE helper functions (select_experts and _native_grouped_topk) that are still in use by other parts of the codebase. This reduces test coverage for core functionality. I've left a comment suggesting to retain these tests, possibly by moving them to a more appropriate location.

I am having trouble creating individual review comments; my feedback is included below.

tests/ut/quantization/test_w8a8.py (lines 557-985), severity: high

The test classes TestSelectExperts and TestNativeGroupedTopkPartialMock are being removed. However, the functions they test, select_experts and _native_grouped_topk from vllm_ascend/ops/fused_moe/experts_selector.py, are not being removed and are still used in other parts of the codebase (e.g., AscendFusedMoE). Removing these tests reduces test coverage for core MoE functionality. Please consider moving these tests to a more appropriate location, such as a new test file for experts_selector.py, instead of deleting them.
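For context, the grouped top-k routing behavior that `_native_grouped_topk` exercises can be illustrated in a few lines. The sketch below is a hedged, self-contained NumPy illustration of the general technique (score experts, keep only the best-scoring groups, then pick the top-k experts within those groups), not the vLLM Ascend implementation; the function name and signature here are simplified assumptions.

```python
import numpy as np

def native_grouped_topk(scores, num_groups, topk_group, top_k):
    # Illustrative grouped top-k routing, NOT the vllm_ascend implementation.
    # scores: (num_tokens, num_experts) router scores.
    num_tokens, num_experts = scores.shape
    group_size = num_experts // num_groups

    # Best expert score within each group of experts.
    grouped = scores.reshape(num_tokens, num_groups, group_size)
    group_scores = grouped.max(axis=-1)

    # Keep only the topk_group highest-scoring groups per token.
    top_groups = np.argsort(group_scores, axis=-1)[:, ::-1][:, :topk_group]
    mask = np.zeros((num_tokens, num_groups), dtype=bool)
    np.put_along_axis(mask, top_groups, True, axis=-1)
    expert_mask = np.repeat(mask, group_size, axis=-1)
    masked = np.where(expert_mask, scores, -np.inf)

    # Top-k experts among the surviving groups.
    topk_ids = np.argsort(masked, axis=-1)[:, ::-1][:, :top_k]
    topk_weights = np.take_along_axis(scores, topk_ids, axis=-1)
    return topk_weights, topk_ids
```

A unit test for such a helper only needs small hand-checkable inputs, which is why keeping tests like `TestNativeGroupedTopkPartialMock` alongside the still-live function is cheap.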

@github-actions commented

👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:

  • A PR should do only one thing; smaller PRs enable faster reviews.
  • Every PR should include unit tests and end-to-end tests to ensure it works and is not broken by future PRs.
  • Write the commit message by filling out the PR description to help reviewers and future developers understand.

If CI fails, you can run the linting and testing checks locally according to the Contributing and Testing guides.

@weijinqian0 added the ready (read for review) and ready-for-test (start test by label for PR) labels on Dec 18, 2025
@Pr0Wh1teGivee changed the title from [2/N] Remove Pangu Related Code to [2/N][Pangu][MoE] Remove Pangu Related Code on Dec 18, 2025
Comment thread vllm_ascend/quantization/w8a8.py Outdated
Signed-off-by: weichen <calvin_zhu0210@outlook.com>
@Pr0Wh1teGivee force-pushed the w8a8 branch 3 times, most recently from f0329c9 to a20feac on December 18, 2025 07:49
Signed-off-by: weichen <calvin_zhu0210@outlook.com>
@wangxiyuan wangxiyuan merged commit ca6f631 into vllm-project:main Dec 19, 2025
25 checks passed
845473182 pushed a commit to 845473182/vllm-ascend that referenced this pull request Dec 19, 2025
…to eplb_refactor

* 'main' of https://github.com/vllm-project/vllm-ascend: (52 commits)
  [Doc]Add the user_guide doc file regarding fine-grained TP. (vllm-project#5084)
  [pref] qwen3_next add triton ops : fused_sigmoid_gating_delta_rule_update (vllm-project#4818)
  [Feature] Add token mask for DispatchGmmCombineDecode operator (vllm-project#5171)
  [CI] Improve CI (vllm-project#5078)
  [Refactor] remove some metadata variables in attention_v1. (vllm-project#5160)
  Add Qwen3-VL-235B-A22B-Instruct tutorials (vllm-project#5167)
  [Doc] Add a perf tune section (vllm-project#5127)
  [Image] Refactor image build (vllm-project#5175)
  [refactor] refactor weight trans nz and transpose (vllm-project#4878)
  [BugFix]Fix precision issue for LoRA feature (vllm-project#4141)
  【Doc】Deepseekv3.1/R1 doc enhancement (vllm-project#4827)
  support basic long_seq feature st (vllm-project#5140)
  [Bugfix] install trition for test_custom_op (vllm-project#5112)
  [2/N][Pangu][MoE] Remove Pangu Related Code (vllm-project#5130)
  [bugfix] Use FUSED_MC2 MoE comm path for the op `dispatch_ffn_combine` (vllm-project#5156)
  [BugFix] Fix top_p,top_k issue with EAGLE and add top_p,top_k in EAGLE e2e (vllm-project#5131)
  [Doc][P/D] Fix MooncakeConnector's name (vllm-project#5172)
  [Bugfix] Fix in_profile_run in mtp_proposer dummy_run (vllm-project#5165)
  [Doc] Refact benchmark doc (vllm-project#5173)
  [Nightly]  Avoid max_model_len being smaller than the decoder prompt to prevent single-node-accuray-tests from failing (vllm-project#5174)
  ...

Signed-off-by: 白永斌 <baiyongbin3@h-partners.com>
chenaoxuan pushed a commit to chenaoxuan/vllm-ascend that referenced this pull request Dec 20, 2025
### What this PR does / why we need it?
Remove Pangu Related Code

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
e2e & ut

- vLLM version: v0.12.0
- vLLM main:
vllm-project/vllm@ad32e3e

---------

Signed-off-by: weichen <calvin_zhu0210@outlook.com>
ZRJ026 pushed commits to ZRJ026/vllm-ascend that referenced this pull request on Feb 28 and Mar 4, 2026, with the same commit message as above.

Signed-off-by: weichen <calvin_zhu0210@outlook.com>
Signed-off-by: zrj026 <zhangrunjiang026@gmail.com>

Labels

module:quantization, module:tests, ready (read for review), ready-for-test (start test by label for PR)


4 participants