[Feature] Support kv nz feature for DeepSeek decode node in disagg-prefill scenario#3072

Merged

jianzs merged 7 commits into vllm-project:main from jianzs:gh-kvnz-wo-torchair

Dec 31, 2025

Collaborator

jianzs commented Sep 21, 2025

What this PR does / why we need it?

By converting the KV cache from ND to NZ format when the decode node receives it, this PR ensures that the KV NZ feature works correctly during the decoding phase in the disagg-prefill scenario.
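To make the ND-to-NZ conversion more concrete, here is a minimal pure-Python sketch of a tiled relayout. It assumes that NZ splits the last dimension of a row-major (M, N) block into chunks of 16 elements and stores the block as (N // 16, M, 16). Both the tile width and the exact arrangement are assumptions for illustration; the real layout is defined by the Ascend kernels, and the function names here are hypothetical, not taken from this PR.

```python
def nd_to_nz(block, n0=16):
    """Rearrange a row-major (M, N) block into (N // n0, M, n0) tiles.

    Illustrative only: n0=16 and this tile order are assumptions.
    """
    if not block:
        raise ValueError("block must contain at least one row")
    n = len(block[0])
    if n % n0 != 0:
        raise ValueError("last dimension must be a multiple of n0")
    # For each column chunk, collect that chunk from every row.
    return [
        [row[c * n0:(c + 1) * n0] for row in block]
        for c in range(n // n0)
    ]


def nz_to_nd(tiles):
    """Inverse of nd_to_nz: stitch the column chunks of each row back together."""
    rows = len(tiles[0])
    return [
        [value for tile in tiles for value in tile[r]]
        for r in range(rows)
    ]


# A 4 x 32 block becomes 2 tiles, each holding 4 rows of 16 elements.
block = [[r * 32 + c for c in range(32)] for r in range(4)]
tiles = nd_to_nz(block)
assert nz_to_nd(tiles) == block
```

The round trip at the end only checks that the relayout loses no data. It says nothing about performance, which depends on the hardware kernels that consume the tiled format.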

Does this PR introduce any user-facing change?

Yes. This PR adds an enable_kv_nz configuration option to additional_config.
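As a rough illustration of this option, the snippet below builds an additional_config dictionary and its JSON form. The key name enable_kv_nz comes from this PR. The helper name is hypothetical, and how the dictionary is handed to the engine (as a Python argument or as a JSON string on the command line) is an assumption that should be checked against the vLLM Ascend documentation.

```python
import json


def build_additional_config(enable_kv_nz: bool) -> dict:
    """Hypothetical helper: build the dict carrying the enable_kv_nz option."""
    return {"enable_kv_nz": enable_kv_nz}


config = build_additional_config(True)

# JSON form, e.g. for passing the same setting as a string.
config_json = json.dumps(config)
print(config_json)  # prints {"enable_kv_nz": true}
```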

How was this patch tested?

CI passed.

vLLM version: v0.11.0
vLLM main: vllm-project/vllm@83f478b

Contributor

github-actions bot commented Sep 21, 2025

👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:

A PR should do only one thing; smaller PRs enable faster reviews.
Every PR should include unit tests and end-to-end tests to ensure it works and is not broken by future PRs.
Write the commit message by filling in the PR description to help reviewers and future developers understand the change.

If CI fails, you can run the linting and testing checks locally according to Contributing and Testing.

github-actions bot added the module:core label

gemini-code-assist bot reviewed

Contributor

gemini-code-assist bot left a comment

Code Review

This pull request refactors the enable_kv_nz configuration by moving it from TorchairGraphConfig to the higher-level AscendConfig. This change successfully removes the restriction that kv nz can only be used when torchair is enabled. The implementation looks correct and consistent across the modified files. However, a significant issue is that the unit tests in tests/ut/test_ascend_config.py have not been updated to reflect this refactoring, which could lead to a broken test suite and potential regressions. It is crucial to update these tests to align with the new configuration structure and behavior.

vllm_ascend/ascend_config.py Outdated

                           ascend_scheduler_config)
                       # Todo: Once https://github.com/vllm-project/vllm/issues/22246 is merged in vllm. Remove this config
+                      self.enable_kv_nz = additional_config.get("enable_kv_nz", False)

Contributor

gemini-code-assist bot Sep 21, 2025

While moving enable_kv_nz to AscendConfig is a good refactoring to generalize its usage, the corresponding unit tests in tests/ut/test_ascend_config.py appear to be outdated. The tests still reference torchair_graph_config.enable_kv_nz and check for behavior that was removed (i.e., that enable_kv_nz is only valid with torchair). Please update the tests to reflect these changes to ensure correctness and prevent future regressions.
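The kind of test update requested here might look like the sketch below. It uses a stand-in class that models only the lookup quoted in the diff above (`additional_config.get("enable_kv_nz", False)`). The class name, the test names, and the assumption that a flag nested under `torchair_graph_config` is no longer read are all illustrative and reflect the change as described in this review comment, not the real contents of tests/ut/test_ascend_config.py.

```python
class FakeAscendConfig:
    """Stand-in for the real config class; models only the enable_kv_nz lookup."""

    def __init__(self, additional_config=None):
        additional_config = additional_config or {}
        # Mirrors the line quoted in the diff: top-level key, default False.
        self.enable_kv_nz = additional_config.get("enable_kv_nz", False)


def test_enable_kv_nz_defaults_to_false():
    assert FakeAscendConfig().enable_kv_nz is False
    assert FakeAscendConfig({}).enable_kv_nz is False


def test_enable_kv_nz_is_read_from_top_level():
    assert FakeAscendConfig({"enable_kv_nz": True}).enable_kv_nz is True


def test_nested_torchair_flag_is_not_read():
    # Assumption: after the move, the old nested location has no effect.
    cfg = FakeAscendConfig({"torchair_graph_config": {"enable_kv_nz": True}})
    assert cfg.enable_kv_nz is False
```

These functions follow the pytest naming convention, so a runner such as pytest would collect them without extra setup.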

github-actions bot added documentation module:tests labels

jianzs force-pushed the gh-kvnz-wo-torchair branch from 7281669 to ba1c909 Compare

September 21, 2025 13:44

jianzs marked this pull request as draft

September 21, 2025 15:56

github-actions bot added the merge-conflicts label

Contributor

github-actions bot commented Sep 25, 2025

This pull request has conflicts, please resolve those before we can evaluate the pull request.

jianzs force-pushed the gh-kvnz-wo-torchair branch from 05ffcbd to 4cdd047 Compare

November 11, 2025 15:37

github-actions bot removed the merge-conflicts label

jianzs changed the title ~~[Feature] Remove restriction on kv nz usage without torchair~~ [Feature] Support kv nz feature for DeepSeek decode node in disagg-prefill scenario

jianzs marked this pull request as ready for review

November 11, 2025 15:48

jianzs force-pushed the gh-kvnz-wo-torchair branch from 70d23de to 10bd91d Compare

November 11, 2025 15:49

Collaborator

zzzzwwjj commented Nov 17, 2025

Please test the case without torchair and post the performance data, thanks!

zzzzwwjj approved these changes

github-actions bot added the merge-conflicts label

Contributor

github-actions bot commented Nov 21, 2025

This pull request has conflicts, please resolve those before we can evaluate the pull request.

jianzs force-pushed the gh-kvnz-wo-torchair branch from 244da13 to eabdc89 Compare

December 11, 2025 15:47

github-actions bot removed the merge-conflicts label

jianzs force-pushed the gh-kvnz-wo-torchair branch 2 times, most recently from 24041aa to 51c6cc5 Compare

December 12, 2025 02:59

github-actions bot added the merge-conflicts label

Contributor

github-actions bot commented Dec 12, 2025

This pull request has conflicts, please resolve those before we can evaluate the pull request.

jianzs force-pushed the gh-kvnz-wo-torchair branch from 51c6cc5 to ff0acc5 Compare

December 13, 2025 13:24

github-actions bot removed the merge-conflicts label

jianzs added ready ready-for-test labels

jianzs force-pushed the gh-kvnz-wo-torchair branch 2 times, most recently from 07132ce to 6783787 Compare

December 30, 2025 07:24

github-actions bot added the merge-conflicts label

Contributor

github-actions bot commented Dec 30, 2025

This pull request has conflicts, please resolve those before we can evaluate the pull request.

jianzs force-pushed the gh-kvnz-wo-torchair branch from 6783787 to d541b16 Compare

December 30, 2025 11:54

github-actions bot removed the merge-conflicts label

jianzs and others added 7 commits

December 30, 2025 20:53

478dd82 Add coauthors
Co-authored-by: ghphotoframe <854746559@qq.com>
Co-authored-by: alex101-ops <alex1015718386@gmail.com>
Signed-off-by: Jade Zheng <zheng.shoujian@outlook.com>

f9dd993 support kv nz
Signed-off-by: Jade Zheng <zheng.shoujian@outlook.com>

a90757c update kv nz
Signed-off-by: Jade Zheng <zheng.shoujian@outlook.com>

b6ae129 lint code
Signed-off-by: Jade Zheng <zheng.shoujian@outlook.com>

f382453 fix type
Signed-off-by: Jade Zheng <zheng.shoujian@outlook.com>

8a6f51f lint code
Signed-off-by: Jade Zheng <zheng.shoujian@outlook.com>

1e1ac42 chore: lint code
Signed-off-by: Jade Zheng <zheng.shoujian@outlook.com>

jianzs force-pushed the gh-kvnz-wo-torchair branch from d541b16 to 1e1ac42 Compare

December 30, 2025 12:54

jianzs merged commit 38570cf into vllm-project:main

19 checks passed

wjunLu pushed a commit to wjunLu/vllm-ascend that referenced this pull request


[Feature] Support kv nz feature for DeepSeek decode node in disagg-prefill scenario (vllm-project#3072)

3d41b02

By converting the KV cache from ND to NZ format when the decode node
receives it, this PR ensures that the KV NZ feature works correctly
during the decoding phase in disagg-prefill scenario.

- vLLM version: v0.11.0
- vLLM main: vllm-project/vllm@83f478b

---------

Signed-off-by: Jade Zheng <zheng.shoujian@outlook.com>
Co-authored-by: ghphotoframe <854746559@qq.com>
Co-authored-by: alex101-ops <alex1015718386@gmail.com>
Signed-off-by: wjunLu <wjunlu217@gmail.com>

Rozwel-dx pushed a commit to Rozwel-dx/vllm-ascend that referenced this pull request


[Feature] Support kv nz feature for DeepSeek decode node in disagg-prefill scenario (vllm-project#3072)

7e28969

This was referenced Feb 3, 2026

[Nightly][BugFix] Remove kv_cache nz test case for test_mla_preprocess_nq.py #6505

Merged

[Nightly][BugFix][v0.13.0] Remove kv_cache nz test case for test_mla_preprocess_nq.py #6506

Closed

wangxiyuan pushed a commit that referenced this pull request


[Nightly][BugFix] Remove kv_cache nz test case for test_mla_preprocess_nq.py (#6505)

4d6444d

### What this PR does / why we need it?
Remove the kv_cache nz test case from test_mla_preprocess_nq.py. This case was
added by #3072 but was not tested in the bf16 scenario. Results show that
this is not currently supported.

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
CI passed with existing test.


- vLLM version: v0.15.0
- vLLM main: https://github.com/vllm-project/vllm/commit/v0.15.0

Signed-off-by: whx-sjtu <2952154980@qq.com>

Yikun mentioned this pull request

[v0.13.0rc2] FAQ / Feedback | 问题/反馈 #6186

Closed

chenchuw886 pushed a commit to chenchuw886/vllm-ascend that referenced this pull request


[Nightly][BugFix] Remove kv_cache nz test case for test_mla_preprocess_nq.py (vllm-project#6505)

9a936b3

Signed-off-by: momochenchuw <chenchuw@huawei.com>

ZRJ026 pushed a commit to ZRJ026/vllm-ascend that referenced this pull request


[Feature] Support kv nz feature for DeepSeek decode node in disagg-prefill scenario (vllm-project#3072)

417f38b

Signed-off-by: zrj026 <zhangrunjiang026@gmail.com>

ZRJ026 pushed a commit to ZRJ026/vllm-ascend that referenced this pull request


[Nightly][BugFix] Remove kv_cache nz test case for test_mla_preprocess_nq.py (vllm-project#6505)

Signed-off-by: zrj026 <zhangrunjiang026@gmail.com>

maoxx241 pushed a commit to maoxx241/vllm-ascend that referenced this pull request


[Feature] Support kv nz feature for DeepSeek decode node in disagg-prefill scenario (vllm-project#3072)

a1b12fb

maoxx241 pushed a commit to maoxx241/vllm-ascend that referenced this pull request


[Nightly][BugFix] Remove kv_cache nz test case for test_mla_preprocess_nq.py (vllm-project#6505)

acbb9ac

ZRJ026 pushed a commit to ZRJ026/vllm-ascend that referenced this pull request


[Feature] Support kv nz feature for DeepSeek decode node in disagg-prefill scenario (vllm-project#3072)

9e99410

Signed-off-by: zrj026 <zhangrunjiang026@gmail.com>

ZRJ026 pushed a commit to ZRJ026/vllm-ascend that referenced this pull request


[Nightly][BugFix] Remove kv_cache nz test case for test_mla_preprocess_nq.py (vllm-project#6505)

0db54c8

Signed-off-by: zrj026 <zhangrunjiang026@gmail.com>

LCAIZJ pushed a commit to LCAIZJ/vllm-ascend that referenced this pull request


[Nightly][BugFix] Remove kv_cache nz test case for test_mla_preprocess_nq.py (vllm-project#6505)

a91ebdf

jiangyunfan1 pushed a commit to jiangyunfan1/vllm-ascend that referenced this pull request


[Nightly][BugFix] Remove kv_cache nz test case for test_mla_preprocess_nq.py (vllm-project#6505)

6ad6619

Labels

documentation module:core module:tests ready ready-for-test