[Refactor] Provide a framework to accommodate operators for different hardware devices by weijinqian0 · Pull Request #5735 · vllm-project/vllm-ascend

weijinqian0 · 2026-01-08T12:47:10Z

come from: #5463

Reason:

As hardware versions iterate, operators may change frequently, which can lead to short-term compatibility differences. This PR therefore provides an intermediate adaptation layer to absorb those short-term differences in operators.

vLLM version: v0.13.0
vLLM main: vllm-project/vllm@2f4e654

Signed-off-by: weijinqian_v1 <weijinqian@huawei.com>

gemini-code-assist

Code Review

This pull request introduces a good refactoring by creating a DeviceOperator abstraction to handle hardware-specific operations, improving code structure and maintainability. The implementation uses a factory pattern to select the correct operator class based on the device type, which is a solid approach.

However, I've identified a critical issue and a high-severity issue in the new vllm_ascend/device/device_op.py file. The critical issue is the removal of a .contiguous() call, which is likely to cause runtime errors on A5 devices. The high-severity issue relates to an incorrect type hint that undermines static analysis and future maintenance. I have provided specific comments and suggestions to address these points.

gemini-code-assist · 2026-01-08T12:49:01Z

+DeviceOperator: Optional[
+    CommonDeviceOperator.__class__] = get_device_operator()


The type hint for DeviceOperator is incorrect and misleading for a few reasons:

get_device_operator() never returns None, so Optional is incorrect.

CommonDeviceOperator.__class__ resolves to the metaclass (type, for an ordinary class), which is too generic for static type checkers to verify calls to methods like reshape_and_cache. This can allow typos or signature mismatches to become runtime errors.

I am suggesting the removal of the incorrect type hint. For proper type safety, you should add from typing import Type and then annotate DeviceOperator as DeviceOperator: Type[CommonDeviceOperator] = get_device_operator().

DeviceOperator = get_device_operator()
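
The review point above can be illustrated with a minimal, self-contained sketch. It is not the code from vllm_ascend/device/device_op.py: the A5DeviceOperator subclass, the is_a5 parameter, and the placeholder method bodies are assumptions made for illustration. Only the names CommonDeviceOperator, get_device_operator, DeviceOperator, and reshape_and_cache come from the review.

```python
from typing import Type


class CommonDeviceOperator:
    """Default operator implementations (placeholder bodies, not real kernels)."""

    @classmethod
    def reshape_and_cache(cls, key: list, value: list) -> str:
        # Placeholder: a real implementation would call a device kernel.
        return f"common:{len(key)}:{len(value)}"


class A5DeviceOperator(CommonDeviceOperator):
    """Hypothetical subclass that overrides one operator for A5 devices."""

    @classmethod
    def reshape_and_cache(cls, key: list, value: list) -> str:
        return f"a5:{len(key)}:{len(value)}"


def get_device_operator(is_a5: bool = False) -> Type[CommonDeviceOperator]:
    # Factory: always returns a class, never None, so Optional is not needed.
    # The is_a5 flag is an assumption standing in for real device detection.
    return A5DeviceOperator if is_a5 else CommonDeviceOperator


# The __class__ of an ordinary class is its metaclass, so the original
# annotation carried no information about the methods available.
original_hint = CommonDeviceOperator.__class__

# Suggested annotation: "CommonDeviceOperator or any subclass of it".
DeviceOperator: Type[CommonDeviceOperator] = get_device_operator()
result = DeviceOperator.reshape_and_cache([1, 2], [3, 4])
```

With the Type[CommonDeviceOperator] annotation, a static checker such as mypy can resolve DeviceOperator.reshape_and_cache against the base class signature, so a misspelled method name would be reported before runtime.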

github-actions · 2026-01-08T12:51:08Z

👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:

A PR should do only one thing; smaller PRs enable faster reviews.
Every PR should include unit tests and end-to-end tests to ensure it works and is not broken by other future PRs.
Write the commit message by filling in the PR description to help reviewers and future developers understand.

If CI fails, you can run linting and testing checks locally according to Contributing and Testing.

github-actions · 2026-01-10T15:04:14Z

This pull request has conflicts, please resolve those before we can evaluate the pull request.

…to eplb_refactor

* 'main' of https://github.com/vllm-project/vllm-ascend:
  [CI] Unblock 4-cards test (vllm-project#5831)
  [Refactor] Provide a framework to accommodate operators for different hardware devices (vllm-project#5735)
  [Refactor] Modify the binding logic to allocate CPU cores for each NPU card (vllm-project#5555)
  [BugFix] Support setting tp=1 for the Eagle draft model to take effect (vllm-project#5519)
  support triton of mrope (vllm-project#5664)
  [bugfix] A2 Environment Pooling for Memcache Compatibility (vllm-project#5601)
  [Doc] Update community contributors and versioning naming to follow vLLM (vllm-project#5820)
  [Refactor] Add comments for Metadata classes in attention module (vllm-project#5789)
  [Bugfix] bugfix for the order of dummy run pad and sync (vllm-project#5777)
  [CI] Move nightly-a2 test to hk (vllm-project#5807)
  [CI] Show disk usage for CI shared volume (vllm-project#5821)
  Bump actions/checkout from 4 to 6 (vllm-project#5795)
  Bump actions/github-script from 7 to 8 (vllm-project#5796)
  [bugfix](cp) align max_context_chunk to cp_virtual_block_size (vllm-project#5767)
  [bugfix]limit graph replay sync (vllm-project#5761)
  [CI]Add Kimi k2 nightly test (vllm-project#5682)
  [Doc] add tls check to pd disaggregation readme (vllm-project#5638)
  [CI] adpat v0.13.0 change (vllm-project#5793)

… hardware devices (vllm-project#5735)

come from: vllm-project#5463

Reason: During the iteration process of the hardware version, there may be a large number of iterations for the operators, which can lead to short-term compatibility differences. Therefore, an intermediate adaptation layer is provided to accommodate the short-term differences in operators.

- vLLM version: v0.13.0
- vLLM main: vllm-project/vllm@2f4e654

---------

Signed-off-by: weijinqian_v1 <weijinqian@huawei.com>
Signed-off-by: weijinqian0 <1184188277@qq.com>
Co-authored-by: weijinqian_v1 <weijinqian@huawei.com>

weijinqian0 · 2026-01-27T06:47:54Z

epoch3

Chinese test dataset

draft-vocab_size	first-token acceptance rate	acceptance length
37984	0.27	1.33
75968	0.36	1.45
113952	0.36	1.46

English test dataset

draft-vocab_size	first-token acceptance rate	acceptance length
37984	0.39	1.51
75968	0.51	1.77
113952	0.51	1.80

… hardware devices (vllm-project#5735)

come from: vllm-project#5463

Reason: During the iteration process of the hardware version, there may be a large number of iterations for the operators, which can lead to short-term compatibility differences. Therefore, an intermediate adaptation layer is provided to accommodate the short-term differences in operators.

- vLLM version: v0.13.0
- vLLM main: vllm-project/vllm@2f4e654

---------

Signed-off-by: weijinqian_v1 <weijinqian@huawei.com>
Signed-off-by: weijinqian0 <1184188277@qq.com>
Co-authored-by: weijinqian_v1 <weijinqian@huawei.com>
Signed-off-by: zrj026 <zhangrunjiang026@gmail.com>


weijinqian_v1 added 6 commits January 8, 2026 13:39

[Refactor] refactor mask in attention_v1 and mla_v1.

394a1c2

Signed-off-by: weijinqian_v1 <weijinqian@huawei.com>

[Refactor] refactor mask in attention_v1 and mla_v1.

1039ee8

Signed-off-by: weijinqian_v1 <weijinqian@huawei.com>

[Refactor] refactor mask in attention_v1 and mla_v1.

ff6d2e9

Signed-off-by: weijinqian_v1 <weijinqian@huawei.com>

[Refactor] refactor mask in attention_v1 and mla_v1.

66ed782

Signed-off-by: weijinqian_v1 <weijinqian@huawei.com>

[Refactor] refactor mask in attention_v1 and mla_v1.

06f56ad

Signed-off-by: weijinqian_v1 <weijinqian@huawei.com>

[Refactor] refactor mask in attention_v1 and mla_v1.

8ff0923

Signed-off-by: weijinqian_v1 <weijinqian@huawei.com>

weijinqian0 changed the title ~~[Refactor]~~ [Refactor] Provide a framework to accommodate operators for different hardware devices Jan 8, 2026

gemini-code-assist bot reviewed Jan 8, 2026

[Refactor] refactor mask in attention_v1 and mla_v1.

f0e49e5

Signed-off-by: weijinqian_v1 <weijinqian@huawei.com>

weijinqian0 mentioned this pull request Jan 9, 2026

[RFC]: Refactor Attention module #5463

Closed

weijinqian0 added the ready and ready-for-test labels Jan 9, 2026

wangxiyuan reviewed Jan 9, 2026

Comment thread vllm_ascend/device/device_op.py Outdated

weijinqian_v1 added 3 commits January 10, 2026 09:39

[Refactor] rename class name.

3b5299a

Signed-off-by: weijinqian_v1 <weijinqian@huawei.com>

[Refactor] rename class name.

1180120

Signed-off-by: weijinqian_v1 <weijinqian@huawei.com>

[Refactor] rename class name.

fcef387

Signed-off-by: weijinqian_v1 <weijinqian@huawei.com>

github-actions bot added the merge-conflicts label Jan 10, 2026

Merge branch 'main' into main_for_device_adaptor

9672dfe

Signed-off-by: weijinqian0 <1184188277@qq.com>

github-actions bot removed the merge-conflicts label Jan 11, 2026

[Refactor] rename class name.

6bed04f

Signed-off-by: weijinqian_v1 <weijinqian@huawei.com>

yiz-liu approved these changes Jan 12, 2026

weijinqian0 merged commit 1ccb9ac into vllm-project:main Jan 13, 2026
16 checks passed

weijinqian0 merged 12 commits into vllm-project:main from weijinqian0:main_for_device_adaptor

3 participants
