[2/N][Pangu][MoE] Remove Pangu Related Code #5130
wangxiyuan merged 2 commits into vllm-project:main
Conversation
Code Review
This pull request removes Pangu-related code, specifically the AscendW8A8FusedMoEMethod for static W8A8 MoE quantization, along with its registration and associated tests. The changes are mostly clean removals. However, the PR also removes unit tests for generic MoE helper functions (select_experts and _native_grouped_topk) that are still in use by other parts of the codebase. This reduces test coverage for core functionality. I've left a comment suggesting to retain these tests, possibly by moving them to a more appropriate location.
tests/ut/quantization/test_w8a8.py (557-985)
The test classes TestSelectExperts and TestNativeGroupedTopkPartialMock are being removed. However, the functions they test, select_experts and _native_grouped_topk from vllm_ascend/ops/fused_moe/experts_selector.py, are not being removed and are still used in other parts of the codebase (e.g., AscendFusedMoE). Removing these tests reduces test coverage for core MoE functionality. Please consider moving these tests to a more appropriate location, such as a new test file for experts_selector.py, instead of deleting them.
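For context on what the tests being deleted actually cover: `_native_grouped_topk` implements grouped top-k expert routing, where experts are first filtered by group before the final top-k selection. The following is a minimal, framework-free sketch of that idea; the real implementation in `vllm_ascend/ops/fused_moe/experts_selector.py` operates on batched torch tensors and differs in detail (the function name, signature, and list-based logic here are illustrative assumptions, not the actual API).

```python
def grouped_topk(scores, num_groups, topk_groups, topk):
    """Illustrative sketch of grouped top-k expert selection.

    scores: flat per-expert router scores; len(scores) must be
    divisible by num_groups. This mirrors the idea behind
    _native_grouped_topk, not its tensor-based implementation.
    """
    group_size = len(scores) // num_groups
    # Score each group by its best expert.
    group_scores = [max(scores[g * group_size:(g + 1) * group_size])
                    for g in range(num_groups)]
    # Keep only the topk_groups highest-scoring groups.
    kept = set(sorted(range(num_groups), key=group_scores.__getitem__,
                      reverse=True)[:topk_groups])
    # Mask experts outside the kept groups, then take the global top-k.
    masked = [s if i // group_size in kept else float("-inf")
              for i, s in enumerate(scores)]
    return sorted(range(len(scores)), key=masked.__getitem__,
                  reverse=True)[:topk]
```

Because this routing logic stays in use after the PR, a unit test for it (e.g. asserting that `grouped_topk([0.1, 0.9, 0.2, 0.3, 0.8, 0.05, 0.4, 0.35], num_groups=4, topk_groups=2, topk=2)` selects experts 1 and 4) belongs in a file keyed to `experts_selector.py` rather than in `test_w8a8.py`.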
👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:
If CI fails, you can run linting and testing checks locally according to the Contributing and Testing guides.
Signed-off-by: weichen <calvin_zhu0210@outlook.com>
Force-pushed from f0329c9 to a20feac
Signed-off-by: weichen <calvin_zhu0210@outlook.com>
…to eplb_refactor * 'main' of https://github.com/vllm-project/vllm-ascend: (52 commits)
- [Doc] Add the user_guide doc file regarding fine-grained TP. (vllm-project#5084)
- [pref] qwen3_next add triton ops: fused_sigmoid_gating_delta_rule_update (vllm-project#4818)
- [Feature] Add token mask for DispatchGmmCombineDecode operator (vllm-project#5171)
- [CI] Improve CI (vllm-project#5078)
- [Refactor] remove some metadata variables in attention_v1. (vllm-project#5160)
- Add Qwen3-VL-235B-A22B-Instruct tutorials (vllm-project#5167)
- [Doc] Add a perf tune section (vllm-project#5127)
- [Image] Refactor image build (vllm-project#5175)
- [refactor] refactor weight trans nz and transpose (vllm-project#4878)
- [BugFix] Fix precision issue for LoRA feature (vllm-project#4141)
- [Doc] Deepseekv3.1/R1 doc enhancement (vllm-project#4827)
- support basic long_seq feature st (vllm-project#5140)
- [Bugfix] install trition for test_custom_op (vllm-project#5112)
- [2/N][Pangu][MoE] Remove Pangu Related Code (vllm-project#5130)
- [bugfix] Use FUSED_MC2 MoE comm path for the op `dispatch_ffn_combine` (vllm-project#5156)
- [BugFix] Fix top_p,top_k issue with EAGLE and add top_p,top_k in EAGLE e2e (vllm-project#5131)
- [Doc][P/D] Fix MooncakeConnector's name (vllm-project#5172)
- [Bugfix] Fix in_profile_run in mtp_proposer dummy_run (vllm-project#5165)
- [Doc] Refact benchmark doc (vllm-project#5173)
- [Nightly] Avoid max_model_len being smaller than the decoder prompt to prevent single-node-accuray-tests from failing (vllm-project#5174)
- ...

Signed-off-by: 白永斌 <baiyongbin3@h-partners.com>
### What this PR does / why we need it?
Remove Pangu Related Code

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
e2e & ut

- vLLM version: v0.12.0
- vLLM main: vllm-project/vllm@ad32e3e

---

Signed-off-by: weichen <calvin_zhu0210@outlook.com>
Signed-off-by: zrj026 <zhangrunjiang026@gmail.com>