Skip to content

[refact] unified soc_version code#4359

Merged
MengqingCao merged 1 commit intovllm-project:mainfrom
zzzzwwjj:rm_is_310p
Nov 26, 2025
Merged

[refact] unified soc_version code#4359
MengqingCao merged 1 commit intovllm-project:mainfrom
zzzzwwjj:rm_is_310p

Conversation

@zzzzwwjj
Copy link
Copy Markdown
Collaborator

@zzzzwwjj zzzzwwjj commented Nov 22, 2025

What this PR does / why we need it?

Currently, there are two paths to judge the chip type in code, get_ascend_soc_version use get_soc_version api in torch_npu, and is_310p use _build_info.__soc_version__, which generate when install. We need to unify the two paths.

We need to unify these codes based on the following points:

  1. We need to ensure consistency in chip type judgment between compiling and running states;
  2. In compiling state, we need chip type to complete op's compilation, but in running state, we only need device type(910B/910_93/310P/910_95/etc) to make code branch judgement;
  3. In compiling state, torch_npu may not have been installed yet, so we can't use torch_npu's api.

Based on the above points, we have made the following changes:

  1. When user set env SOC_VERSION, use it; when not set, query soc_version by npu-smi;
  2. generate device_type based on soc_version when compiling, and write __device_type__ instead of __soc_version__ in _build_info.py;
  3. In running state, use __device_type__ to judge code branch.

Does this PR introduce any user-facing change?

When not set env SOC_VERSION, it will not be ASCEND910B1 by default, we will query soc_version by npu-smi. And env SOC_VERSION must be in the list soc_to_device in setup.py.

How was this patch tested?

@github-actions
Copy link
Copy Markdown
Contributor

👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:‌‌

  • A PR should do only one thing, smaller PRs enable faster reviews.
  • Every PR should include unit tests and end-to-end tests ‌to ensure it works and is not broken by other future PRs.
  • Write the commit message by fulfilling the PR description to help reviewer and future developers understand.

If CI fails, you can run linting and testing checks locally according Contributing and Testing.

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request is a good refactoring that unifies the SoC version detection and usage across the codebase. By introducing get_chip_info() and centralizing the logic, the code becomes more robust and easier to maintain. The changes are consistently applied across all relevant files, including tests.

I've found a critical bug in the chip detection logic where a string is compared to an integer, which would lead to incorrect hardware detection. I've also added a couple of high-severity comments to improve the robustness of parsing npu-smi output and handling of the SOC_VERSION configuration. Addressing these points will make the implementation more solid.

Comment thread setup.py Outdated
return "ascend310p"
elif "910" in chip_name:
# A2 and A3 case
return "ascend910b" if chip_count == 1 else "ascend910_93"
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

critical

The chip_count variable is a string returned from get_value_from_lines. Comparing it with an integer 1 will always evaluate to False, which is a bug. It should be compared with the string '1'.

Suggested change
return "ascend910b" if chip_count == 1 else "ascend910_93"
return "ascend910b" if chip_count == '1' else "ascend910_93"

Comment thread setup.py Outdated
for line in lines:
line = ' '.join(line.split())
if key in line:
return line.split(':')[-1]
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The value returned from line.split(':')[-1] can have leading/trailing whitespace. This can lead to subtle bugs in comparisons later on. For example, if the output of npu-smi is Chip Count : 1, this function will return ' 1'.

Suggested change
return line.split(':')[-1]
return line.split(':')[-1].strip()

Comment thread setup.py Outdated
Comment on lines +112 to +116
soc_version_dict = {
"ascend310p": "310P",
"ascend910b": "A2",
"ascend910_93": "A3",
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The soc_version variable, which can be set from an environment variable, is used as a key for soc_version_dict without validation. If an unsupported soc_version is provided, this will cause a KeyError and crash the setup process with an unclear error message. It's better to validate soc_version first and provide a helpful error message.

Suggested change
soc_version_dict = {
"ascend310p": "310P",
"ascend910b": "A2",
"ascend910_93": "A3",
}
soc_version_dict = {
"ascend310p": "310P",
"ascend910b": "A2",
"ascend910_93": "A3",
}
if soc_version not in soc_version_dict:
raise ValueError(
f"Unsupported SOC_VERSION: {soc_version}. Supported versions are: "
f"{list(soc_version_dict.keys())}")

@zzzzwwjj zzzzwwjj force-pushed the rm_is_310p branch 3 times, most recently from 370252b to 5d2d853 Compare November 24, 2025 03:00
Comment thread vllm_ascend/utils.py
class AscendSocVersion(Enum):
A2 = 0
A3 = 1
UNDEFINED = 2
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A variable name starting with _ generally indicates private usage, but the _310P here is actually directly used by other modules. Why use _310P instead of 310P here?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cannot start with digit.

Comment thread setup.py Outdated
os.path.join(ROOT_DIR, "vllm_ascend", "envs.py"))

if not envs.SOC_VERSION:
envs.SOC_VERSION = get_chip_info()
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a warning here if user setting of SOC_VERSION doesn't match with results of get_chip_info()

Comment thread setup.py Outdated
with open(package_dir, "w+") as f:
f.write('# Auto-generated file\n')
f.write(f"__soc_version__ = '{soc_version}'\n")
f.write(f"__soc_version__ = '{soc_version_dict[soc_version]}'\n")
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if user sets a value of SOC_VERSION with wrong format that doesn't match with any keys in soc_version_dict?

@zzzzwwjj zzzzwwjj force-pushed the rm_is_310p branch 4 times, most recently from c4a1a51 to b49b9c3 Compare November 24, 2025 06:47
@github-actions
Copy link
Copy Markdown
Contributor

This pull request has conflicts, please resolve those before we can evaluate the pull request.

@zzzzwwjj zzzzwwjj force-pushed the rm_is_310p branch 4 times, most recently from 3652c39 to 1307d03 Compare November 25, 2025 04:17
Copy link
Copy Markdown
Contributor

@ChenxiQ ChenxiQ left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@weijinqian0 weijinqian0 added ready read for review ready-for-test start test by label for PR labels Nov 25, 2025
Comment thread setup.py Outdated
"ascend910_9392": "A3",
"ascend910_9382": "A3",
"ascend910_9362": "A3",
"ascend310p1": "300I",
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why use 300I instead of 310P here? I think 300I is confused because there are machines like 300I A2.

Comment thread vllm_ascend/utils.py Outdated
A2 = 0
A3 = 1
UNDEFINED = 2
_300I = 2
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

Signed-off-by: zzzzwwjj <1183291235@qq.com>
@MengqingCao MengqingCao merged commit 136ea9f into vllm-project:main Nov 26, 2025
22 checks passed
@zzzzwwjj zzzzwwjj deleted the rm_is_310p branch November 26, 2025 07:27
Kurumi5210 pushed a commit to lidenghui1110/vllm-ascend that referenced this pull request Nov 26, 2025
Currently, there are two paths to judge the chip type in code,
`get_ascend_soc_version` use `get_soc_version` api in torch_npu, and
`is_310p` `use _build_info.__soc_version__`, which generate when
install. We need to unify the two paths.

We need to unify these codes based on the following points:

1. We need to ensure consistency in chip type judgment between compiling
and running states;
2. In compiling state, we need chip type to complete op's compilation,
but in running state, we only need device
type(910B/910_93/310P/910_95/etc) to make code branch judgement;
3. In compiling state, torch_npu may not have been installed yet, so we
can't use torch_npu's api.

Based on the above points, we have made the following changes:

1. When user set env `SOC_VERSION`, use it; when not set, query
soc_version by `npu-smi`;
2. generate device_type based on soc_version when compiling, and write
`__device_type__` instead of `__soc_version__` in `_build_info.py`;
3. In running state, use `__device_type__` to judge code branch.

When not set env `SOC_VERSION`, it will not be `ASCEND910B1` by default,
we will query soc_version by `npu-smi`. And env `SOC_VERSION` must be in
the list `soc_to_device` in `setup.py`.

- vLLM version: v0.11.0
- vLLM main:
vllm-project/vllm@2918c1b

Signed-off-by: zzzzwwjj <1183291235@qq.com>
845473182 pushed a commit to 845473182/vllm-ascend that referenced this pull request Nov 29, 2025
Currently, there are two paths to judge the chip type in code,
`get_ascend_soc_version` use `get_soc_version` api in torch_npu, and
`is_310p` `use _build_info.__soc_version__`, which generate when
install. We need to unify the two paths.

We need to unify these codes based on the following points:

1. We need to ensure consistency in chip type judgment between compiling
and running states;
2. In compiling state, we need chip type to complete op's compilation,
but in running state, we only need device
type(910B/910_93/310P/910_95/etc) to make code branch judgement;
3. In compiling state, torch_npu may not have been installed yet, so we
can't use torch_npu's api.

Based on the above points, we have made the following changes:

1. When user set env `SOC_VERSION`, use it; when not set, query
soc_version by `npu-smi`;
2. generate device_type based on soc_version when compiling, and write
`__device_type__` instead of `__soc_version__` in `_build_info.py`;
3. In running state, use `__device_type__` to judge code branch.

When not set env `SOC_VERSION`, it will not be `ASCEND910B1` by default,
we will query soc_version by `npu-smi`. And env `SOC_VERSION` must be in
the list `soc_to_device` in `setup.py`.

- vLLM version: v0.11.0
- vLLM main:
vllm-project/vllm@2918c1b

Signed-off-by: zzzzwwjj <1183291235@qq.com>
845473182 pushed a commit to Skywalker-EP/vllm-ascend that referenced this pull request Dec 1, 2025
Currently, there are two paths to judge the chip type in code,
`get_ascend_soc_version` use `get_soc_version` api in torch_npu, and
`is_310p` `use _build_info.__soc_version__`, which generate when
install. We need to unify the two paths.

We need to unify these codes based on the following points:

1. We need to ensure consistency in chip type judgment between compiling
and running states;
2. In compiling state, we need chip type to complete op's compilation,
but in running state, we only need device
type(910B/910_93/310P/910_95/etc) to make code branch judgement;
3. In compiling state, torch_npu may not have been installed yet, so we
can't use torch_npu's api.

Based on the above points, we have made the following changes:

1. When user set env `SOC_VERSION`, use it; when not set, query
soc_version by `npu-smi`;
2. generate device_type based on soc_version when compiling, and write
`__device_type__` instead of `__soc_version__` in `_build_info.py`;
3. In running state, use `__device_type__` to judge code branch.

When not set env `SOC_VERSION`, it will not be `ASCEND910B1` by default,
we will query soc_version by `npu-smi`. And env `SOC_VERSION` must be in
the list `soc_to_device` in `setup.py`.

- vLLM version: v0.11.0
- vLLM main:
vllm-project/vllm@2918c1b

Signed-off-by: zzzzwwjj <1183291235@qq.com>
Meihan-chen pushed a commit to Meihan-chen/vllm-ascend that referenced this pull request Dec 5, 2025
### What this PR does / why we need it?

Currently, there are two paths to judge the chip type in code,
`get_ascend_soc_version` use `get_soc_version` api in torch_npu, and
`is_310p` `use _build_info.__soc_version__`, which generate when
install. We need to unify the two paths.

We need to unify these codes based on the following points:

1. We need to ensure consistency in chip type judgment between compiling
and running states;
2. In compiling state, we need chip type to complete op's compilation,
but in running state, we only need device
type(910B/910_93/310P/910_95/etc) to make code branch judgement;
3. In compiling state, torch_npu may not have been installed yet, so we
can't use torch_npu's api.

Based on the above points, we have made the following changes:

1. When user set env `SOC_VERSION`, use it; when not set, query
soc_version by `npu-smi`;
2. generate device_type based on soc_version when compiling, and write
`__device_type__` instead of `__soc_version__` in `_build_info.py`;
3. In running state, use `__device_type__` to judge code branch.

### Does this PR introduce _any_ user-facing change?

When not set env `SOC_VERSION`, it will not be `ASCEND910B1` by default,
we will query soc_version by `npu-smi`. And env `SOC_VERSION` must be in
the list `soc_to_device` in `setup.py`.

- vLLM version: v0.11.0
- vLLM main:
vllm-project/vllm@2918c1b

Signed-off-by: zzzzwwjj <1183291235@qq.com>
Clorist33 pushed a commit to Clorist33/vllm-ascend that referenced this pull request Dec 9, 2025
### What this PR does / why we need it?

Currently, there are two paths to judge the chip type in code,
`get_ascend_soc_version` use `get_soc_version` api in torch_npu, and
`is_310p` `use _build_info.__soc_version__`, which generate when
install. We need to unify the two paths.

We need to unify these codes based on the following points:

1. We need to ensure consistency in chip type judgment between compiling
and running states;
2. In compiling state, we need chip type to complete op's compilation,
but in running state, we only need device
type(910B/910_93/310P/910_95/etc) to make code branch judgement;
3. In compiling state, torch_npu may not have been installed yet, so we
can't use torch_npu's api.

Based on the above points, we have made the following changes:

1. When user set env `SOC_VERSION`, use it; when not set, query
soc_version by `npu-smi`;
2. generate device_type based on soc_version when compiling, and write
`__device_type__` instead of `__soc_version__` in `_build_info.py`;
3. In running state, use `__device_type__` to judge code branch.

### Does this PR introduce _any_ user-facing change?

When not set env `SOC_VERSION`, it will not be `ASCEND910B1` by default,
we will query soc_version by `npu-smi`. And env `SOC_VERSION` must be in
the list `soc_to_device` in `setup.py`.

- vLLM version: v0.11.0
- vLLM main:
vllm-project/vllm@2918c1b

Signed-off-by: zzzzwwjj <1183291235@qq.com>
Signed-off-by: tanqingshan (A) <50050625@china.huawei.com>
Clorist33 pushed a commit to Clorist33/vllm-ascend that referenced this pull request Dec 10, 2025
### What this PR does / why we need it?

Currently, there are two paths to judge the chip type in code,
`get_ascend_soc_version` use `get_soc_version` api in torch_npu, and
`is_310p` `use _build_info.__soc_version__`, which generate when
install. We need to unify the two paths.

We need to unify these codes based on the following points:

1. We need to ensure consistency in chip type judgment between compiling
and running states;
2. In compiling state, we need chip type to complete op's compilation,
but in running state, we only need device
type(910B/910_93/310P/910_95/etc) to make code branch judgement;
3. In compiling state, torch_npu may not have been installed yet, so we
can't use torch_npu's api.

Based on the above points, we have made the following changes:

1. When user set env `SOC_VERSION`, use it; when not set, query
soc_version by `npu-smi`;
2. generate device_type based on soc_version when compiling, and write
`__device_type__` instead of `__soc_version__` in `_build_info.py`;
3. In running state, use `__device_type__` to judge code branch.

### Does this PR introduce _any_ user-facing change?

When not set env `SOC_VERSION`, it will not be `ASCEND910B1` by default,
we will query soc_version by `npu-smi`. And env `SOC_VERSION` must be in
the list `soc_to_device` in `setup.py`.

- vLLM version: v0.11.0
- vLLM main:
vllm-project/vllm@2918c1b

Signed-off-by: zzzzwwjj <1183291235@qq.com>
wangxiyuan added a commit that referenced this pull request Dec 18, 2025
I'd like to nominate @zzzzwwjj @realliujiaxu @LCAIZJ to join vLLM Ascend
committer team.

@zzzzwwjj
---
- Review Quality‌:
He has completed 80+reviews since April. 2025, include
#3232 (comment),
#4822 (comment),
#4768 (comment)
high quality review.

- Sustained Contributions
15+ Valuable bug fix and refactor is very good.

https://github.com/vllm-project/vllm-ascend/pulls?q=is%3Apr+author%3Azzzzwwjj+is%3Aclosed+review%3Aapproved
Continuous optimization of code architecture

https://github.com/vllm-project/vllm-ascend/pulls?q=author%3Azzzzwwjj+is%3Amerged

- Quality Contribution‌:
#1229
#1979
#4359
#4878

- Community Involvement‌: 
He lead the #1147, to
refactor AscendFusedMoE at the first time.
He shared topics about large-scale distributed inference and
reinforcement learning on vLLM-Ascend meetup on August 2nd.

@realliujiaxu
---
- Review Quality‌:
He has completed about [40+
reviews](https://github.com/vllm-project/vllm-ascend/pulls?q=is%3Apr+commenter%3Arealliujiaxu+-author%3Arealliujiaxu+)
since September, include
#4868 (comment),
#2275 (comment).

- Sustained Contributions
He has completed (17
commits)[https://github.com/vllm-project/vllm-ascend/pulls?q=is%3Apr+author%3Arealliujiaxu+is%3Amerged],
continuously optimizing the performance of the MoE model.

- Quality Contribution‌:

Contributed the Flash Comm1 feature to the community, supporting both
eager and aclgraph execution modes, while compatible with multiple MoE
models including DeepSeek and GLM4.5.
  - #3334
  - #3420
  - #3015
  
  co-author:
  - #3495
  - #4868

- Community Involvement‌: 
1. Completed two major refactors, enabling vllm-ascend to evolve more
rapidly and robustly: [Linear
module](#2867) and
[rejection
sampler](#4975)
2. [fixed 8
bugs](https://github.com/vllm-project/vllm-ascend/pulls?q=is%3Apr+author%3Arealliujiaxu+is%3Amerged+bugfix+)
in graph mode, spec decoding and async scheduling.

@LCAIZJ
---
- Review Quality‌: He's been the go-to reviewer for virtually all PD
disaggregation and KV Pool related PRs, having completed [30+
reviews](https://github.com/vllm-project/vllm-ascend/pulls?q=is%3Apr+commenter%3ALCAIZJ+is%3Aopen+-author%3ALCAIZJ+)
since May 2025. Notable examples include
[discussion_r2553887360](#4345 (comment)),
[issuecomment-3540994801](#4161 (comment)),
and
[discussion_r2492593988](#3981 (comment)),
all demonstrating thorough and insightful feedback.
- Sustained and Quality Contributions: His contributions reflect a
strong grasp of both ‌vLLM‌ and ‌vLLM Ascend‌ codebases, particularly in
prefill-decode disaggregation and KV pool areas ([7 PRs
merged](https://github.com/vllm-project/vllm-ascend/pulls?q=is%3Apr+author%3ALCAIZJ+is%3Amerged+)).
Prefill-Decode Disaggregation: Delivered KV transfer functionality using
Mooncake TransferEngine and enabled layerwise KV transfer
#1568
#2602
KV Pool: Developed the foundational KV Pool infrastructure and migrated
it to the latest ADXL stack
#2913
#3350
- Quality Contribution‌:
#1568
#2602
#2913
#3350
- Community Involvement‌: 
He actively responds to [community
issues](https://github.com/vllm-project/vllm-ascend/issues?q=is%3Aissue%20commenter%3ALCAIZJ%20is%3Aopen%20-author%3ALCAIZJ),
continuously monitors functionality and accuracy issues related to PD
disaggregation and KV Pool, and proactively delivers [bug
fixes](https://github.com/vllm-project/vllm-ascend/pulls?q=is%3Apr+author%3ALCAIZJ+is%3Amerged+bugfix).
- vLLM version: v0.12.0
- vLLM main:
vllm-project/vllm@ad32e3e

Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
chenaoxuan pushed a commit to chenaoxuan/vllm-ascend that referenced this pull request Dec 20, 2025
…t#5152)

I'd like to nominate @zzzzwwjj @realliujiaxu @LCAIZJ to join vLLM Ascend
committer team.

@zzzzwwjj
---
- Review Quality‌:
He has completed 80+reviews since April. 2025, include
vllm-project#3232 (comment),
vllm-project#4822 (comment),
vllm-project#4768 (comment)
high quality review.

- Sustained Contributions
15+ Valuable bug fix and refactor is very good.

https://github.com/vllm-project/vllm-ascend/pulls?q=is%3Apr+author%3Azzzzwwjj+is%3Aclosed+review%3Aapproved
Continuous optimization of code architecture

https://github.com/vllm-project/vllm-ascend/pulls?q=author%3Azzzzwwjj+is%3Amerged

- Quality Contribution‌:
vllm-project#1229
vllm-project#1979
vllm-project#4359
vllm-project#4878

- Community Involvement‌: 
He lead the vllm-project#1147, to
refactor AscendFusedMoE at the first time.
He shared topics about large-scale distributed inference and
reinforcement learning on vLLM-Ascend meetup on August 2nd.

@realliujiaxu
---
- Review Quality‌:
He has completed about [40+
reviews](https://github.com/vllm-project/vllm-ascend/pulls?q=is%3Apr+commenter%3Arealliujiaxu+-author%3Arealliujiaxu+)
since September, include
vllm-project#4868 (comment),
vllm-project#2275 (comment).

- Sustained Contributions
He has completed (17
commits)[https://github.com/vllm-project/vllm-ascend/pulls?q=is%3Apr+author%3Arealliujiaxu+is%3Amerged],
continuously optimizing the performance of the MoE model.

- Quality Contribution‌:

Contributed the Flash Comm1 feature to the community, supporting both
eager and aclgraph execution modes, while compatible with multiple MoE
models including DeepSeek and GLM4.5.
  - vllm-project#3334
  - vllm-project#3420
  - vllm-project#3015
  
  co-author:
  - vllm-project#3495
  - vllm-project#4868

- Community Involvement‌: 
1. Completed two major refactors, enabling vllm-ascend to evolve more
rapidly and robustly: [Linear
module](vllm-project#2867) and
[rejection
sampler](vllm-project#4975)
2. [fixed 8
bugs](https://github.com/vllm-project/vllm-ascend/pulls?q=is%3Apr+author%3Arealliujiaxu+is%3Amerged+bugfix+)
in graph mode, spec decoding and async scheduling.

@LCAIZJ
---
- Review Quality‌: He's been the go-to reviewer for virtually all PD
disaggregation and KV Pool related PRs, having completed [30+
reviews](https://github.com/vllm-project/vllm-ascend/pulls?q=is%3Apr+commenter%3ALCAIZJ+is%3Aopen+-author%3ALCAIZJ+)
since May 2025. Notable examples include
[discussion_r2553887360](vllm-project#4345 (comment)),
[issuecomment-3540994801](vllm-project#4161 (comment)),
and
[discussion_r2492593988](vllm-project#3981 (comment)),
all demonstrating thorough and insightful feedback.
- Sustained and Quality Contributions: His contributions reflect a
strong grasp of both ‌vLLM‌ and ‌vLLM Ascend‌ codebases, particularly in
prefill-decode disaggregation and KV pool areas ([7 PRs
merged](https://github.com/vllm-project/vllm-ascend/pulls?q=is%3Apr+author%3ALCAIZJ+is%3Amerged+)).
Prefill-Decode Disaggregation: Delivered KV transfer functionality using
Mooncake TransferEngine and enabled layerwise KV transfer
vllm-project#1568
vllm-project#2602
KV Pool: Developed the foundational KV Pool infrastructure and migrated
it to the latest ADXL stack
vllm-project#2913
vllm-project#3350
- Quality Contribution‌:
vllm-project#1568
vllm-project#2602
vllm-project#2913
vllm-project#3350
- Community Involvement‌: 
He actively responds to [community
issues](https://github.com/vllm-project/vllm-ascend/issues?q=is%3Aissue%20commenter%3ALCAIZJ%20is%3Aopen%20-author%3ALCAIZJ),
continuously monitors functionality and accuracy issues related to PD
disaggregation and KV Pool, and proactively delivers [bug
fixes](https://github.com/vllm-project/vllm-ascend/pulls?q=is%3Apr+author%3ALCAIZJ+is%3Amerged+bugfix).
- vLLM version: v0.12.0
- vLLM main:
vllm-project/vllm@ad32e3e

Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
ZRJ026 pushed a commit to ZRJ026/vllm-ascend that referenced this pull request Feb 28, 2026
…t#5152)

I'd like to nominate @zzzzwwjj @realliujiaxu @LCAIZJ to join vLLM Ascend
committer team.

@zzzzwwjj
---
- Review Quality‌:
He has completed 80+reviews since April. 2025, include
vllm-project#3232 (comment),
vllm-project#4822 (comment),
vllm-project#4768 (comment)
high quality review.

- Sustained Contributions
15+ Valuable bug fix and refactor is very good.

https://github.com/vllm-project/vllm-ascend/pulls?q=is%3Apr+author%3Azzzzwwjj+is%3Aclosed+review%3Aapproved
Continuous optimization of code architecture

https://github.com/vllm-project/vllm-ascend/pulls?q=author%3Azzzzwwjj+is%3Amerged

- Quality Contribution‌:
vllm-project#1229
vllm-project#1979
vllm-project#4359
vllm-project#4878

- Community Involvement‌:
He lead the vllm-project#1147, to
refactor AscendFusedMoE at the first time.
He shared topics about large-scale distributed inference and
reinforcement learning on vLLM-Ascend meetup on August 2nd.

@realliujiaxu
---
- Review Quality‌:
He has completed about [40+
reviews](https://github.com/vllm-project/vllm-ascend/pulls?q=is%3Apr+commenter%3Arealliujiaxu+-author%3Arealliujiaxu+)
since September, include
vllm-project#4868 (comment),
vllm-project#2275 (comment).

- Sustained Contributions
He has completed (17
commits)[https://github.com/vllm-project/vllm-ascend/pulls?q=is%3Apr+author%3Arealliujiaxu+is%3Amerged],
continuously optimizing the performance of the MoE model.

- Quality Contribution‌:

Contributed the Flash Comm1 feature to the community, supporting both
eager and aclgraph execution modes, while compatible with multiple MoE
models including DeepSeek and GLM4.5.
  - vllm-project#3334
  - vllm-project#3420
  - vllm-project#3015

  co-author:
  - vllm-project#3495
  - vllm-project#4868

- Community Involvement‌:
1. Completed two major refactors, enabling vllm-ascend to evolve more
rapidly and robustly: [Linear
module](vllm-project#2867) and
[rejection
sampler](vllm-project#4975)
2. [fixed 8
bugs](https://github.com/vllm-project/vllm-ascend/pulls?q=is%3Apr+author%3Arealliujiaxu+is%3Amerged+bugfix+)
in graph mode, spec decoding and async scheduling.

@LCAIZJ
---
- Review Quality‌: He's been the go-to reviewer for virtually all PD
disaggregation and KV Pool related PRs, having completed [30+
reviews](https://github.com/vllm-project/vllm-ascend/pulls?q=is%3Apr+commenter%3ALCAIZJ+is%3Aopen+-author%3ALCAIZJ+)
since May 2025. Notable examples include
[discussion_r2553887360](vllm-project#4345 (comment)),
[issuecomment-3540994801](vllm-project#4161 (comment)),
and
[discussion_r2492593988](vllm-project#3981 (comment)),
all demonstrating thorough and insightful feedback.
- Sustained and Quality Contributions: His contributions reflect a
strong grasp of both ‌vLLM‌ and ‌vLLM Ascend‌ codebases, particularly in
prefill-decode disaggregation and KV pool areas ([7 PRs
merged](https://github.com/vllm-project/vllm-ascend/pulls?q=is%3Apr+author%3ALCAIZJ+is%3Amerged+)).
Prefill-Decode Disaggregation: Delivered KV transfer functionality using
Mooncake TransferEngine and enabled layerwise KV transfer
vllm-project#1568
vllm-project#2602
KV Pool: Developed the foundational KV Pool infrastructure and migrated
it to the latest ADXL stack
vllm-project#2913
vllm-project#3350
- Quality Contribution‌:
vllm-project#1568
vllm-project#2602
vllm-project#2913
vllm-project#3350
- Community Involvement‌:
He actively responds to [community
issues](https://github.com/vllm-project/vllm-ascend/issues?q=is%3Aissue%20commenter%3ALCAIZJ%20is%3Aopen%20-author%3ALCAIZJ),
continuously monitors functionality and accuracy issues related to PD
disaggregation and KV Pool, and proactively delivers [bug
fixes](https://github.com/vllm-project/vllm-ascend/pulls?q=is%3Apr+author%3ALCAIZJ+is%3Amerged+bugfix).
- vLLM version: v0.12.0
- vLLM main:
vllm-project/vllm@ad32e3e

Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
Signed-off-by: zrj026 <zhangrunjiang026@gmail.com>
ZRJ026 pushed a commit to ZRJ026/vllm-ascend that referenced this pull request Mar 4, 2026
…t#5152)

I'd like to nominate @zzzzwwjj @realliujiaxu @LCAIZJ to join vLLM Ascend
committer team.

@zzzzwwjj
---
- Review Quality‌:
He has completed 80+reviews since April. 2025, include
vllm-project#3232 (comment),
vllm-project#4822 (comment),
vllm-project#4768 (comment)
high quality review.

- Sustained Contributions
15+ Valuable bug fix and refactor is very good.

https://github.com/vllm-project/vllm-ascend/pulls?q=is%3Apr+author%3Azzzzwwjj+is%3Aclosed+review%3Aapproved
Continuous optimization of code architecture

https://github.com/vllm-project/vllm-ascend/pulls?q=author%3Azzzzwwjj+is%3Amerged

- Quality Contribution‌:
vllm-project#1229
vllm-project#1979
vllm-project#4359
vllm-project#4878

- Community Involvement‌:
He lead the vllm-project#1147, to
refactor AscendFusedMoE at the first time.
He shared topics about large-scale distributed inference and
reinforcement learning on vLLM-Ascend meetup on August 2nd.

@realliujiaxu
---
- Review Quality‌:
He has completed about [40+
reviews](https://github.com/vllm-project/vllm-ascend/pulls?q=is%3Apr+commenter%3Arealliujiaxu+-author%3Arealliujiaxu+)
since September, include
vllm-project#4868 (comment),
vllm-project#2275 (comment).

- Sustained Contributions
He has completed (17
commits)[https://github.com/vllm-project/vllm-ascend/pulls?q=is%3Apr+author%3Arealliujiaxu+is%3Amerged],
continuously optimizing the performance of the MoE model.

- Quality Contribution‌:

Contributed the Flash Comm1 feature to the community, supporting both
eager and aclgraph execution modes, while compatible with multiple MoE
models including DeepSeek and GLM4.5.
  - vllm-project#3334
  - vllm-project#3420
  - vllm-project#3015

  co-author:
  - vllm-project#3495
  - vllm-project#4868

- Community Involvement‌:
1. Completed two major refactors, enabling vllm-ascend to evolve more
rapidly and robustly: [Linear
module](vllm-project#2867) and
[rejection
sampler](vllm-project#4975)
2. [fixed 8
bugs](https://github.com/vllm-project/vllm-ascend/pulls?q=is%3Apr+author%3Arealliujiaxu+is%3Amerged+bugfix+)
in graph mode, spec decoding and async scheduling.

@LCAIZJ
---
- Review Quality‌: He's been the go-to reviewer for virtually all PD
disaggregation and KV Pool related PRs, having completed [30+
reviews](https://github.com/vllm-project/vllm-ascend/pulls?q=is%3Apr+commenter%3ALCAIZJ+is%3Aopen+-author%3ALCAIZJ+)
since May 2025. Notable examples include
[discussion_r2553887360](vllm-project#4345 (comment)),
[issuecomment-3540994801](vllm-project#4161 (comment)),
and
[discussion_r2492593988](vllm-project#3981 (comment)),
all demonstrating thorough and insightful feedback.
- Sustained and Quality Contributions: His contributions reflect a
strong grasp of both ‌vLLM‌ and ‌vLLM Ascend‌ codebases, particularly in
prefill-decode disaggregation and KV pool areas ([7 PRs
merged](https://github.com/vllm-project/vllm-ascend/pulls?q=is%3Apr+author%3ALCAIZJ+is%3Amerged+)).
Prefill-Decode Disaggregation: Delivered KV transfer functionality using
Mooncake TransferEngine and enabled layerwise KV transfer
vllm-project#1568
vllm-project#2602
KV Pool: Developed the foundational KV Pool infrastructure and migrated
it to the latest ADXL stack
vllm-project#2913
vllm-project#3350
- Quality Contribution‌:
vllm-project#1568
vllm-project#2602
vllm-project#2913
vllm-project#3350
- Community Involvement‌:
He actively responds to [community
issues](https://github.com/vllm-project/vllm-ascend/issues?q=is%3Aissue%20commenter%3ALCAIZJ%20is%3Aopen%20-author%3ALCAIZJ),
continuously monitors functionality and accuracy issues related to PD
disaggregation and KV Pool, and proactively delivers [bug
fixes](https://github.com/vllm-project/vllm-ascend/pulls?q=is%3Apr+author%3ALCAIZJ+is%3Amerged+bugfix).
- vLLM version: v0.12.0
- vLLM main:
vllm-project/vllm@ad32e3e

Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
Signed-off-by: zrj026 <zhangrunjiang026@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants