[refact] unified soc_version code by zzzzwwjj · Pull Request #4359 · vllm-project/vllm-ascend

zzzzwwjj · 2025-11-22T10:48:58Z

What this PR does / why we need it?

Currently, the code determines the chip type in two different ways: get_ascend_soc_version calls the get_soc_version API in torch_npu, while is_310p reads _build_info.__soc_version__, which is generated at install time. We need to unify these two paths.

The unification is based on the following points:

1. Chip type detection must be consistent between compile time and run time.
2. At compile time, we need the chip type to compile the ops; at run time, we only need the device type (910B/910_93/310P/910_95/etc.) to choose a code branch.
3. At compile time, torch_npu may not be installed yet, so we cannot use torch_npu's API.

Based on these points, we made the following changes:

1. If the user sets the SOC_VERSION environment variable, use it; otherwise, query soc_version with npu-smi.
2. At compile time, derive device_type from soc_version and write __device_type__ instead of __soc_version__ to _build_info.py.
3. At run time, use __device_type__ to choose the code branch.
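The three changes above can be sketched roughly as follows. This is a minimal illustration, not the PR's actual setup.py: the helper names `resolve_soc_version` and `render_build_info`, the `detect` callback standing in for the npu-smi query, and the small `SOC_TO_DEVICE` table are all assumptions made for the example.

```python
from typing import Callable, Mapping

# Illustrative table only; the real list lives in setup.py as soc_to_device.
SOC_TO_DEVICE = {
    "ascend310p": "310P",
    "ascend910b": "A2",
    "ascend910_93": "A3",
}


def resolve_soc_version(env: Mapping[str, str], detect: Callable[[], str]) -> str:
    """Prefer the SOC_VERSION override; otherwise fall back to detection."""
    value = env.get("SOC_VERSION")
    return value if value else detect()


def render_build_info(soc_version: str) -> str:
    """Return the text of an auto-generated _build_info.py (hypothetical layout)."""
    device_type = SOC_TO_DEVICE[soc_version]
    return f"# Auto-generated file\n__device_type__ = '{device_type}'\n"


if __name__ == "__main__":
    # A real build would pass os.environ and a function that queries npu-smi.
    soc = resolve_soc_version({}, detect=lambda: "ascend910b")
    print(render_build_info(soc))
```

At run time, code would then branch on the generated `__device_type__` value rather than calling into torch_npu.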

Does this PR introduce any user-facing change?

When the SOC_VERSION environment variable is not set, it no longer defaults to ASCEND910B1; instead, soc_version is queried with npu-smi. In addition, SOC_VERSION must be one of the entries in the soc_to_device list in setup.py.

How was this patch tested?

vLLM version: v0.11.0
vLLM main: vllm-project/vllm@2918c1b

github-actions · 2025-11-22T10:49:05Z

👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:

A PR should do only one thing; smaller PRs enable faster reviews.
Every PR should include unit tests and end-to-end tests to ensure it works and is not broken by future PRs.
Write the commit message by filling in the PR description to help reviewers and future developers understand.

If CI fails, you can run linting and testing checks locally according to Contributing and Testing.

gemini-code-assist

Code Review

This pull request is a good refactoring that unifies the SoC version detection and usage across the codebase. By introducing get_chip_info() and centralizing the logic, the code becomes more robust and easier to maintain. The changes are consistently applied across all relevant files, including tests.

I've found a critical bug in the chip detection logic where a string is compared to an integer, which would lead to incorrect hardware detection. I've also added a couple of high-severity comments to improve the robustness of parsing npu-smi output and handling of the SOC_VERSION configuration. Addressing these points will make the implementation more solid.

gemini-code-assist · 2025-11-22T10:51:26Z

+            return "ascend310p"
+        elif "910" in chip_name:
+            # A2 and A3 case
+            return "ascend910b" if chip_count == 1 else "ascend910_93"


The chip_count variable is a string returned from get_value_from_lines. Comparing it with an integer 1 will always evaluate to False, which is a bug. It should be compared with the string '1'.

Suggested change

return "ascend910b" if chip_count == 1 else "ascend910_93"

return "ascend910b" if chip_count == '1' else "ascend910_93"
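A quick way to see the problem: in Python a str never compares equal to an int, so the original branch always falls through to "ascend910_93". The sketch below uses a hypothetical helper, `device_from_chip_count`, and converts the parsed text explicitly instead of comparing against '1'; it only illustrates the comparison and is not the PR's code.

```python
def device_from_chip_count(chip_count: str) -> str:
    """Pick the 910 variant from a chip count that was parsed as text."""
    # int() tolerates surrounding whitespace, so " 1" and "1" behave the same.
    return "ascend910b" if int(chip_count) == 1 else "ascend910_93"


if __name__ == "__main__":
    print("1" == 1)  # a str and an int are never equal
    print(device_from_chip_count("1"))
    print(device_from_chip_count(" 2"))
```

Converting with int() also sidesteps the stray-whitespace case, at the cost of raising ValueError on non-numeric input.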

gemini-code-assist · 2025-11-22T10:51:26Z

+    for line in lines:
+        line = ' '.join(line.split())
+        if key in line:
+            return line.split(':')[-1]


The value returned from line.split(':')[-1] can have leading/trailing whitespace. This can lead to subtle bugs in comparisons later on. For example, if the output of npu-smi is Chip Count : 1, this function will return ' 1'.

Suggested change

return line.split(':')[-1]

return line.split(':')[-1].strip()
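To illustrate the whitespace issue, here is a minimal sketch of a key/value lookup over npu-smi-style lines. The sample lines are invented for the example and are not real npu-smi output; the function reuses the name get_value_from_lines from the diff, but its signature and the None fallback are assumptions.

```python
from typing import Iterable, Optional


def get_value_from_lines(lines: Iterable[str], key: str) -> Optional[str]:
    """Return the stripped value of the first 'key : value' line, or None."""
    for line in lines:
        line = " ".join(line.split())  # collapse runs of whitespace
        if key in line:
            return line.split(":")[-1].strip()
    return None


if __name__ == "__main__":
    sample = ["Chip Name   : 910B1", "Chip Count  : 1"]  # invented sample
    print(repr(get_value_from_lines(sample, "Chip Count")))
    # Without .strip(), the leading space after the colon survives:
    print(repr("Chip Count : 1".split(":")[-1]))
```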

gemini-code-assist · 2025-11-22T10:51:26Z

+    soc_version_dict = {
+        "ascend310p": "310P",
+        "ascend910b": "A2",
+        "ascend910_93": "A3",
+    }


The soc_version variable, which can be set from an environment variable, is used as a key for soc_version_dict without validation. If an unsupported soc_version is provided, this will cause a KeyError and crash the setup process with an unclear error message. It's better to validate soc_version first and provide a helpful error message.

Suggested change

    soc_version_dict = {
        "ascend310p": "310P",
        "ascend910b": "A2",
        "ascend910_93": "A3",
    }

    soc_version_dict = {
        "ascend310p": "310P",
        "ascend910b": "A2",
        "ascend910_93": "A3",
    }
    if soc_version not in soc_version_dict:
        raise ValueError(
            f"Unsupported SOC_VERSION: {soc_version}. Supported versions are: "
            f"{list(soc_version_dict.keys())}")
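For reference, the suggested check can be exercised on its own. The wrapper `lookup_device_type` below is a hypothetical helper used only to make the snippet self-contained; the dictionary contents are copied from the diff above.

```python
SOC_VERSION_DICT = {
    "ascend310p": "310P",
    "ascend910b": "A2",
    "ascend910_93": "A3",
}


def lookup_device_type(soc_version: str) -> str:
    """Map a SoC version to its device type, failing with a readable message."""
    if soc_version not in SOC_VERSION_DICT:
        raise ValueError(
            f"Unsupported SOC_VERSION: {soc_version}. Supported versions are: "
            f"{list(SOC_VERSION_DICT.keys())}")
    return SOC_VERSION_DICT[soc_version]


if __name__ == "__main__":
    print(lookup_device_type("ascend910b"))
    try:
        lookup_device_type("ascend999")
    except ValueError as exc:
        print(exc)
```

Compared with a bare KeyError, the ValueError names both the rejected value and the accepted ones, which is what a user fixing a build environment needs.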

whx-sjtu · 2025-11-24T03:11:44Z

 class AscendSocVersion(Enum):
    A2 = 0
    A3 = 1
-    UNDEFINED = 2


A variable name starting with _ generally indicates private usage, but the _310P here is actually directly used by other modules. Why use _310P instead of 310P here?

An identifier cannot start with a digit.
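This is a language rule rather than a style choice: Python identifiers cannot begin with a digit, so 310P cannot be written as an enum member in class syntax, while _310P can. A small sketch, where the class name `DeviceType` and the member values are illustrative only:

```python
from enum import Enum


class DeviceType(Enum):
    A2 = 0
    A3 = 1
    _310P = 2  # leading underscore only because "310P" is not an identifier


if __name__ == "__main__":
    print("310P".isidentifier(), "_310P".isidentifier())
    print(DeviceType._310P.value)
```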

whx-sjtu · 2025-11-24T03:17:48Z

                             os.path.join(ROOT_DIR, "vllm_ascend", "envs.py"))

+if not envs.SOC_VERSION:
+    envs.SOC_VERSION = get_chip_info()


Add a warning here if the user's SOC_VERSION setting does not match the result of get_chip_info().
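One way the requested warning could look, as a sketch. `check_soc_version` is a hypothetical helper, `detected` stands in for the result of get_chip_info(), and the case-insensitive comparison is an assumption rather than something the thread specifies.

```python
import warnings
from typing import Optional


def check_soc_version(configured: Optional[str], detected: str) -> str:
    """Return the SoC version to use, warning if the override disagrees."""
    if not configured:
        return detected
    if configured.lower() != detected.lower():
        warnings.warn(
            f"SOC_VERSION={configured!r} does not match detected chip "
            f"{detected!r}; using the configured value.")
    return configured


if __name__ == "__main__":
    print(check_soc_version(None, "ascend910b"))
    print(check_soc_version("ascend310p", "ascend910b"))  # emits a warning
```

The override still wins here, which keeps cross-compilation on a machine with a different chip possible; the warning only makes the mismatch visible.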

whx-sjtu · 2025-11-24T03:24:22Z

    with open(package_dir, "w+") as f:
        f.write('# Auto-generated file\n')
-        f.write(f"__soc_version__ = '{soc_version}'\n")
+        f.write(f"__soc_version__ = '{soc_version_dict[soc_version]}'\n")


What if the user sets SOC_VERSION to a malformed value that does not match any key in soc_version_dict?

github-actions · 2025-11-24T09:10:55Z

This pull request has conflicts, please resolve those before we can evaluate the pull request.

ChenxiQ

LGTM

whx-sjtu · 2025-11-25T11:48:01Z

+        "ascend910_9392": "A3",
+        "ascend910_9382": "A3",
+        "ascend910_9362": "A3",
+        "ascend310p1": "300I",


Why use 300I instead of 310P here? I think 300I is confusing because there are machines like the 300I A2.

whx-sjtu · 2025-11-25T11:48:17Z

    A2 = 0
    A3 = 1
-    UNDEFINED = 2
+    _300I = 2




github-actions bot added module:tests module:ops module:core module:quantization labels Nov 22, 2025

gemini-code-assist bot reviewed Nov 22, 2025

zzzzwwjj force-pushed the rm_is_310p branch 3 times, most recently from 370252b to 5d2d853 Compare November 24, 2025 03:00

whx-sjtu reviewed Nov 24, 2025

zzzzwwjj force-pushed the rm_is_310p branch from 5d2d853 to 42336cb Compare November 24, 2025 03:12

whx-sjtu requested changes Nov 24, 2025

zzzzwwjj force-pushed the rm_is_310p branch from 42336cb to bc8c534 Compare November 24, 2025 03:22

whx-sjtu reviewed Nov 24, 2025

zzzzwwjj force-pushed the rm_is_310p branch 4 times, most recently from c4a1a51 to b49b9c3 Compare November 24, 2025 06:47

github-actions bot added the merge-conflicts label Nov 24, 2025

yiz-liu mentioned this pull request Nov 24, 2025

vllm-ascend support Ascend950 with Qwen dense model. #4228

Merged

zzzzwwjj force-pushed the rm_is_310p branch 3 times, most recently from 2df21fa to dc3b196 Compare November 24, 2025 12:57

github-actions bot removed the merge-conflicts label Nov 24, 2025

zzzzwwjj requested a review from whx-sjtu November 24, 2025 13:09

zzzzwwjj force-pushed the rm_is_310p branch 4 times, most recently from 2199c52 to 32b86c5 Compare November 25, 2025 02:19

zzzzwwjj force-pushed the rm_is_310p branch 4 times, most recently from 3652c39 to 1307d03 Compare November 25, 2025 04:17

ChenxiQ approved these changes Nov 25, 2025

zzzzwwjj force-pushed the rm_is_310p branch from 1307d03 to af5c18a Compare November 25, 2025 09:42

weijinqian0 added the ready and ready-for-test labels Nov 25, 2025

whx-sjtu requested changes Nov 25, 2025

weijinqian0 approved these changes Nov 25, 2025

zzzzwwjj force-pushed the rm_is_310p branch from af5c18a to f69bfca Compare November 26, 2025 02:24

[refact] unified soc_version code

f9898ee

Signed-off-by: zzzzwwjj <1183291235@qq.com>

zzzzwwjj force-pushed the rm_is_310p branch from f69bfca to f9898ee Compare November 26, 2025 02:29

MengqingCao approved these changes Nov 26, 2025

MengqingCao merged commit 136ea9f into vllm-project:main Nov 26, 2025
22 checks passed

zzzzwwjj deleted the rm_is_310p branch November 26, 2025 07:27

wangxiyuan mentioned this pull request Dec 18, 2025

Nominate new maintainers @zzzzwwjj @realliujiaxu @LCAIZJ #5152

Merged
