Conversation
This reverts commit 45c3c27.
👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:
If CI fails, you can run the linting and testing checks locally according to the Contributing and Testing guides.
Code Review
This pull request reverts the moe_gating_top_k custom operator. The changes consist primarily of removing the operator's implementation files. The key modification is in vllm_ascend/ops/fused_moe/experts_selector.py, where the removed custom operator is replaced with torch_npu.npu_moe_gating_top_k, and the surrounding logic is adjusted to match the native implementation. This appears to be a correct and beneficial update. The pull request is clean and serves its purpose of reverting the feature.
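For readers unfamiliar with what the gating step computes, the sketch below shows the top-k expert selection that a fused gating kernel such as torch_npu.npu_moe_gating_top_k performs: a softmax over the router logits, selection of the k largest weights, and renormalization. This is a minimal pure-Python illustration, not vLLM Ascend's actual code; the real op operates on tensors and takes additional parameters.

```python
import math

def moe_gating_top_k(router_logits, k):
    """Pick the top-k experts for one token from its router logits.

    Illustrative sketch only: softmax over the expert dimension,
    take the k highest weights, renormalize them to sum to 1.
    """
    # numerically stable softmax over experts
    m = max(router_logits)
    exps = [math.exp(x - m) for x in router_logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # indices of the k largest probabilities, highest first
    topk_idx = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    topk_w = [probs[i] for i in topk_idx]
    # renormalize the selected weights so they sum to 1
    s = sum(topk_w)
    topk_w = [w / s for w in topk_w]
    return topk_idx, topk_w

idx, w = moe_gating_top_k([2.0, 0.5, 1.0, -1.0], k=2)
print(idx)  # experts with the two highest logits: [0, 2]
```

A fused kernel does all three steps in one launch, which is why the native op is preferred over a hand-rolled custom operator.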
Reverts vllm-project#5271 It breaks e2e test - vLLM version: v0.13.0 - vLLM main: vllm-project/vllm@45c1ca1 Signed-off-by: f00824209 <fuzhihong4@huawei.com>
…to FIA_rebase * 'main' of https://github.com/vllm-project/vllm-ascend: (88 commits)
- [1/N] Refactor nightly test structure (vllm-project#5479)
- Docs: Remove deprecated --task parameter for embedding models (vllm-project#5257)
- Revert "moe_gating_top_k" (vllm-project#5512)
- [Doc] Fix issue link for 0.12.0 (vllm-project#5500)
- [CI] update triton ascend version (vllm-project#5392)
- moe_gating_top_k (vllm-project#5271)
- [refactor] refactor model runner capture model (vllm-project#5230)
- Update corresponding vllm commit ID to 12 29 (vllm-project#5475)
- [Kernel] update csrc cmakelist for open-source cann (vllm-project#5458)
- [OP] add custom op aclnnMoeInitRoutingCustom (vllm-project#5251)
- [Refactor][EAGLE] 1/N delete __init__ in mtp_proposer (vllm-project#5176)
- [Refactor][Triton] Move reject sample triton kernels into ops/triton (vllm-project#5324)
- [Feature] support eager mode in model runner v2 (vllm-project#5210)
- [feature] fia support sliding windows (vllm-project#5239)
- Optimize some rejectsampler functions to make npu op launch non-blocking (vllm-project#4587)
- [Feature] Support to use fullgraph with eagle (vllm-project#5118)
- [EPLB][refactor] Modification of the initialization logic for expert_map and log2phy (depends on PR 5285) (vllm-project#5311)
- [Refactor] 6/N Extract common code of class AscendMLAImpl (vllm-project#5314)
- [Refactor] cache cos/sin in mla & remove parameter model in builder (vllm-project#5277)
- update vllm pin to 12.27 (vllm-project#5412)
- ...
Reverts #5271
It breaks the e2e test.