[Refactor] 6/N Extract common code of class AscendMLAImpl #5314
weijinqian0 merged 24 commits into vllm-project:main from wujinyuan1:main
Signed-off-by: wujinyuan1 <wjy9595@qq.com>
Code Review
This pull request refactors the AscendMLAImpl classes to extract common code into a base class, which is a good step towards improving code maintainability. However, in vllm_ascend/attention/mla_cp.py, some of the overridden methods (mla_preprocess_prefill and mla_preprocess_decode) still contain significant code duplication from the base class. My review includes suggestions to further refactor these methods to eliminate duplication by calling the superclass implementation for common cases and using helper methods for specialized logic. This will better align with the PR's goal of reducing duplicate code.
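The pattern the review suggests can be sketched as follows. This is a minimal illustration, not the code from the PR: only the method name mla_preprocess_decode comes from the review, while the class names, the Metadata type, the cp_size field, the helper _preprocess_decode_cp, and the list-based arithmetic are hypothetical stand-ins for the real tensors and attention metadata.

```python
from dataclasses import dataclass


@dataclass
class Metadata:
    # Hypothetical stand-in for attention metadata (not the vllm-ascend type).
    num_tokens: int
    cp_size: int = 1  # assumed context-parallel size; 1 means CP is disabled


class BaseMLAImpl:
    """Hypothetical base class that owns the shared preprocessing path."""

    def mla_preprocess_decode(self, q: list[float], meta: Metadata) -> list[float]:
        # Placeholder for the steps both implementations have in common.
        return [x * 2.0 for x in q[: meta.num_tokens]]


class ContextParallelMLAImpl(BaseMLAImpl):
    """Hypothetical subclass that overrides only where behaviour differs."""

    def mla_preprocess_decode(self, q: list[float], meta: Metadata) -> list[float]:
        if meta.cp_size == 1:
            # Common case: defer entirely to the base class.
            return super().mla_preprocess_decode(q, meta)
        # Specialized case: isolated in a helper instead of a copied body.
        return self._preprocess_decode_cp(q, meta)

    def _preprocess_decode_cp(self, q: list[float], meta: Metadata) -> list[float]:
        # Reuse the shared step, then apply the CP-specific part.
        local = super().mla_preprocess_decode(q, meta)
        # Placeholder for a real collective: repeat the local result per rank.
        return local * meta.cp_size
```

With this shape, a change to the shared preprocessing is made once in the base class, and the subclass carries only the logic that is genuinely specific to it.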
👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:
If CI fails, you can run the linting and testing checks locally according to the Contributing and Testing guides.
# Conflicts:
#   vllm_ascend/attention/mla_cp.py
#   vllm_ascend/attention/mla_v1.py
…t#5314)

RFC: vllm-project#4629

Reason:
Eliminate duplicate code for two file(mla_v1.py mla_cp.py) of IMPL classes.

vLLM version: 0.13.0rc3
vLLM main: vllm-project/vllm@ad32e3e

- vLLM version: release/v0.13.0
- vLLM main: vllm-project/vllm@5fbfa8d

---------

Signed-off-by: wujinyuan1 <wjy9595@qq.com>
Co-authored-by: wujinyuan1 <wjy9595@qq.com>
Co-authored-by: weijinqian0 <1184188277@qq.com>
Signed-off-by: Che Ruan <cr623@ic.ac.uk>
RFC: #4629
Reason:
Eliminate duplicated code between the impl classes in two files (mla_v1.py and mla_cp.py).
vLLM version: 0.13.0rc3
vLLM main: vllm-project/vllm@ad32e3e