[Feat][UT] Support Deepseekv32 FULL_DECODE_ONLY mode and add unit test of sfa_v1 (#3763)
Conversation
Code Review
This pull request adds support for DeepSeek v3.2 in FULL_DECODE_ONLY mode and includes a new end-to-end test to verify this functionality. Additionally, it introduces a comprehensive suite of unit tests for the sfa_v1 attention mechanism. The changes appear to be well-structured and align with the PR's objectives. However, I've identified a critical issue in the newly added unit tests that needs to be addressed.
```python
self.assertEqual(self.impl.W_UK_T.shape[0], self.impl.num_heads)
self.assertEqual(self.impl.W_UK_T.shape[1], self.impl.qk_nope_head_dim)
self.assertEqual(self.impl.W_UK_T.shape[2], self.impl.kv_lora_rank)

self.assertEqual(self.impl.W_UV.shape[0], self.impl.num_heads)
self.assertEqual(self.impl.W_UV.shape[1], self.impl.kv_lora_rank)
self.assertEqual(self.impl.W_UV.shape[2], self.impl.v_head_dim)
```
The test asserts the attributes `W_UK_T` and `W_UV` on the `self.impl` object. However, in the implementation of `process_weights_after_loading` in `vllm_ascend/attention/sfa_v1.py`, the attributes actually set are `kv_b_proj_w_k` and `kv_b_proj_w_v`. This discrepancy will cause the test to fail; the assertions should target the correct attribute names.
```diff
-self.assertEqual(self.impl.W_UK_T.shape[0], self.impl.num_heads)
-self.assertEqual(self.impl.W_UK_T.shape[1], self.impl.qk_nope_head_dim)
-self.assertEqual(self.impl.W_UK_T.shape[2], self.impl.kv_lora_rank)
-self.assertEqual(self.impl.W_UV.shape[0], self.impl.num_heads)
-self.assertEqual(self.impl.W_UV.shape[1], self.impl.kv_lora_rank)
-self.assertEqual(self.impl.W_UV.shape[2], self.impl.v_head_dim)
+self.assertEqual(self.impl.kv_b_proj_w_k.shape[0], self.impl.num_heads)
+self.assertEqual(self.impl.kv_b_proj_w_k.shape[1], self.impl.qk_nope_head_dim)
+self.assertEqual(self.impl.kv_b_proj_w_k.shape[2], self.impl.kv_lora_rank)
+self.assertEqual(self.impl.kv_b_proj_w_v.shape[0], self.impl.num_heads)
+self.assertEqual(self.impl.kv_b_proj_w_v.shape[1], self.impl.kv_lora_rank)
+self.assertEqual(self.impl.kv_b_proj_w_v.shape[2], self.impl.v_head_dim)
```
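The expected shapes in the suggestion follow from how a `kv_b_proj`-style weight is typically split into per-head K and V components in MLA/SFA attention. The sketch below walks through those shapes with plain nested lists and small hypothetical dimensions; the real `process_weights_after_loading` in `vllm_ascend/attention/sfa_v1.py` operates on torch tensors with the model's actual config values.

```python
# Hypothetical, small dimensions for illustration only.
num_heads, qk_nope_head_dim, kv_lora_rank, v_head_dim = 2, 3, 5, 3

# kv_b_proj weight: [(qk_nope + v) * heads, lora_rank], as nested lists.
rows = num_heads * (qk_nope_head_dim + v_head_dim)
kv_b_proj_weight = [[0.0] * kv_lora_rank for _ in range(rows)]

# View as [heads, qk_nope + v, lora_rank], then split the middle dim.
per_head = qk_nope_head_dim + v_head_dim
w = [kv_b_proj_weight[h * per_head:(h + 1) * per_head] for h in range(num_heads)]
kv_b_proj_w_k = [head[:qk_nope_head_dim] for head in w]   # K half, kept as-is
w_vc = [head[qk_nope_head_dim:] for head in w]            # V half, pre-transpose
# Transpose the V half so matmul absorbs it: [heads, lora_rank, v_head_dim].
kv_b_proj_w_v = [list(map(list, zip(*head))) for head in w_vc]

def shape(x):
    """Shape of a regularly nested list."""
    s = []
    while isinstance(x, list):
        s.append(len(x))
        x = x[0]
    return tuple(s)

print(shape(kv_b_proj_w_k))  # [num_heads, qk_nope_head_dim, kv_lora_rank]
print(shape(kv_b_proj_w_v))  # [num_heads, kv_lora_rank, v_head_dim]
```

This matches the three `assertEqual` checks per attribute in the suggested test.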
👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:
If CI fails, you can run the linting and testing checks locally according to the Contributing and Testing guides.
This pull request has conflicts, please resolve those before we can evaluate the pull request.
```python
if forward_context.cudagraph_runtime_mode == CUDAGraphMode.FULL \
        and not self.ascend_config.use_sfa:
```
Why skip SFA if we are to support DeepSeek-v3.2 in FULL_DECODE_ONLY?
The SFA tiling update has already been delegated to the device, so manual host-side updates are no longer required.
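The reply above implies the `use_sfa` exclusion in the FULL-graph check is obsolete once tiling is updated on device. This hypothetical stand-alone sketch (the names `needs_manual_tiling_update` and `sfa_tiling_on_device` are illustrative, not from the codebase) shows the resulting decision logic under that assumption:

```python
from enum import Enum

class CUDAGraphMode(Enum):
    NONE = 0
    PIECEWISE = 1
    FULL = 2

def needs_manual_tiling_update(runtime_mode, use_sfa, sfa_tiling_on_device=True):
    # With device-side SFA tiling, SFA no longer needs a host-side update
    # in FULL capture mode, so the old `not use_sfa` guard can be dropped.
    if runtime_mode == CUDAGraphMode.FULL and use_sfa and sfa_tiling_on_device:
        return False
    # Non-SFA attention in FULL mode still takes the manual-update path.
    return runtime_mode == CUDAGraphMode.FULL

print(needs_manual_tiling_update(CUDAGraphMode.FULL, use_sfa=True))   # False
print(needs_manual_tiling_update(CUDAGraphMode.FULL, use_sfa=False))  # True
```

In other words, SFA is no longer skipped; it simply has nothing left to update on the host.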
There will be a big refactor to …
Thank you for the reminder. If your PR gets merged, I'll update my PR accordingly.
…t of sfa_v1 (vllm-project#3763)

### What this PR does / why we need it?
- Add support for DeepSeek v3.2 in FULL_DECODE_ONLY mode.
- Add unit test for sfa_v1.

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
- vLLM version: v0.11.0
- vLLM main: vllm-project/vllm@83f478b

Signed-off-by: 1Fire4 <wangdingyi2@huawei.com>
Signed-off-by: luolun <luolun1995@cmbchina.com>