Skip to content

[UT]add triton ops ut : test_fused_qkvzba_split_reshape_cat#5474

Merged
wangxiyuan merged 10 commits intovllm-project:mainfrom
ZT-AIA:ops_ut
Jan 5, 2026
Merged

[UT]add triton ops ut : test_fused_qkvzba_split_reshape_cat#5474
wangxiyuan merged 10 commits intovllm-project:mainfrom
ZT-AIA:ops_ut

Conversation

@ZT-AIA
Copy link
Copy Markdown
Contributor

@ZT-AIA ZT-AIA commented Dec 29, 2025

What this PR does / why we need it?

[UT]add triton ops ut : test_fused_qkvzba_split_reshape_cat

Does this PR introduce any user-facing change?

How was this patch tested?

pytest -sv tests/ut/ops/test_fused_qkvzba_split_reshape_cat.py

ZT-AIA and others added 3 commits December 25, 2025 16:20
Signed-off-by: ZT-AIA <1028681969@qq.com>
Signed-off-by: ZT-AIA <1028681969@qq.com>
Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a new unit test for the fused_qkvzba_split_reshape_cat Triton operation. The test coverage provided by the parameterization is good. However, I've identified a couple of high-severity issues that should be addressed to improve the quality and maintainability of the test code. My feedback includes removing dead code that unnecessarily allocates a tensor and refactoring a fragile testing pattern used for generating the reference implementation to make the test more robust.

Comment on lines +68 to +75
hidden_size= 2048

hidden_states = torch.randn(
seq_len,
hidden_size,
dtype=dtype,
device=device
)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The variable hidden_states and its dimension hidden_size are defined and a tensor is allocated on the device, but they are never used in the test. This is dead code that wastes memory and compute resources during test execution and reduces code clarity. Please remove this unused code.

Comment on lines +103 to +108
gdn = Qwen3NextGatedDeltaNet.__new__(Qwen3NextGatedDeltaNet)
gdn.num_k_heads = num_heads_qk
gdn.num_v_heads = num_heads_v
gdn.head_k_dim = head_qk_dim
gdn.head_v_dim = head_v_dim
gdn.tp_size = 1
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

Using __new__ to create an uninitialized object and then monkey-patching its attributes is a fragile testing pattern. It makes the test brittle and tightly coupled to the implementation details of Qwen3NextGatedDeltaNet. If the dependencies of fix_query_key_value_ordering change (e.g., it starts relying on an attribute set in __init__), this test will fail in a non-obvious way.

A more robust approach would be to extract the reference logic into a pure, standalone function within this test file. This would make the test self-contained and resilient to changes in the model class. If that's not feasible, Qwen3NextGatedDeltaNet should be instantiated properly, using mocks if necessary to avoid the overhead of a full model initialization.

@github-actions
Copy link
Copy Markdown
Contributor

👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:‌‌

  • A PR should do only one thing, smaller PRs enable faster reviews.
  • Every PR should include unit tests and end-to-end tests ‌to ensure it works and is not broken by other future PRs.
  • Write the commit message by fulfilling the PR description to help reviewer and future developers understand.

If CI fails, you can run linting and testing checks locally according Contributing and Testing.

ZT-AIA and others added 2 commits December 29, 2025 17:10
Signed-off-by: ZT-AIA <1028681969@qq.com>
@@ -0,0 +1,100 @@
import pytest
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@vllm-ascend-ci vllm-ascend-ci added ready read for review ready-for-test start test by label for PR labels Jan 4, 2026
@wangxiyuan wangxiyuan merged commit 58e8d19 into vllm-project:main Jan 5, 2026
38 of 39 checks passed
zyz111222 pushed a commit to zyz111222/vllm-ascend that referenced this pull request Jan 5, 2026
…ject#5474)

### What this PR does / why we need it?
[UT]add triton ops ut :  test_fused_qkvzba_split_reshape_cat
### Does this PR introduce _any_ user-facing change?

### How was this patch tested?
pytest -sv tests/ut/ops/test_fused_qkvzba_split_reshape_cat.py
- vLLM version: v0.13.0
- vLLM main:
vllm-project/vllm@5326c89

---------

Signed-off-by: ZT-AIA <1028681969@qq.com>
845473182 pushed a commit to 845473182/vllm-ascend that referenced this pull request Jan 6, 2026
…to FIA_rebase

* 'main' of https://github.com/vllm-project/vllm-ascend: (58 commits)
  [Main2Main] Upgrade vllm commit to 0106 (vllm-project#5617)
  [CI]update bisheng version (vllm-project#5621)
  [UT][PCP&DCP] UT for block_table.py (vllm-project#5032)
  [Main2Main] Upgrade vllm commit to 0105 (vllm-project#5595)
  [CI] mv ops to correct path (vllm-project#5615)
  [BugFix] Fix Smoke Testing Bug for DSR1 longseq (vllm-project#5613)
  Revert "[Feat] enable hierarchical mc2 ops on A2 by default (vllm-project#5545)" (vllm-project#5611)
  [TRITON][TEST]Add nightly test for triton split_qkv_rmsnorm_rope (vllm-project#5267)
  [perf] Fix MLAPO weight disposal for KV-consumer MLA in PD-mix deploy... (vllm-project#5192)
  [docs] Correct image about prefill phase of PCP (vllm-project#5598)
  [CI] update triton-ascend version (vllm-project#5584)
  [P/D]Remove mooncake kvpool unused parameter `local_hostname` (vllm-project#5574)
  [Bugfix] record cos and sin cache in AscendRotaryEmbedding (vllm-project#5516)
  [bugfix] fix test_camem failed with triton-ascend (vllm-project#5492)
  [UT]add triton ops ut :  test_fused_qkvzba_split_reshape_cat (vllm-project#5474)
  [CI] Download models from ms (vllm-project#5405)
  Docs: Add A3 Docker image guidance for Atlas A3 machines (vllm-project#5256)
  [Doc] Add NNAL installation guide and requirements (vllm-project#5235)
  Add the requirement of arctic-inference which  speculative decoding with suffix_decode  (vllm-project#5045)
  [BugFix][Fusion] Fix graph fusion failure problem (vllm-project#5253)
  ...
Rozwel-dx pushed a commit to Rozwel-dx/vllm-ascend that referenced this pull request Jan 8, 2026
…ject#5474)

### What this PR does / why we need it?
[UT]add triton ops ut :  test_fused_qkvzba_split_reshape_cat
### Does this PR introduce _any_ user-facing change?

### How was this patch tested?
pytest -sv tests/ut/ops/test_fused_qkvzba_split_reshape_cat.py
- vLLM version: v0.13.0
- vLLM main:
vllm-project/vllm@5326c89

---------

Signed-off-by: ZT-AIA <1028681969@qq.com>
aipaes pushed a commit to aipaes/vllm-ascend that referenced this pull request Jan 15, 2026
…ject#5474)

### What this PR does / why we need it?
[UT]add triton ops ut :  test_fused_qkvzba_split_reshape_cat
### Does this PR introduce _any_ user-facing change?

### How was this patch tested?
pytest -sv tests/ut/ops/test_fused_qkvzba_split_reshape_cat.py
- vLLM version: v0.13.0
- vLLM main:
vllm-project/vllm@5326c89

---------

Signed-off-by: ZT-AIA <1028681969@qq.com>
ZRJ026 pushed a commit to ZRJ026/vllm-ascend that referenced this pull request Feb 28, 2026
…ject#5474)

### What this PR does / why we need it?
[UT]add triton ops ut :  test_fused_qkvzba_split_reshape_cat
### Does this PR introduce _any_ user-facing change?

### How was this patch tested?
pytest -sv tests/ut/ops/test_fused_qkvzba_split_reshape_cat.py
- vLLM version: v0.13.0
- vLLM main:
vllm-project/vllm@5326c89

---------

Signed-off-by: ZT-AIA <1028681969@qq.com>
Signed-off-by: zrj026 <zhangrunjiang026@gmail.com>
maoxx241 pushed a commit to maoxx241/vllm-ascend that referenced this pull request Mar 2, 2026
…ject#5474)

### What this PR does / why we need it?
[UT]add triton ops ut :  test_fused_qkvzba_split_reshape_cat
### Does this PR introduce _any_ user-facing change?

### How was this patch tested?
pytest -sv tests/ut/ops/test_fused_qkvzba_split_reshape_cat.py
- vLLM version: v0.13.0
- vLLM main:
vllm-project/vllm@5326c89

---------

Signed-off-by: ZT-AIA <1028681969@qq.com>
ZRJ026 pushed a commit to ZRJ026/vllm-ascend that referenced this pull request Mar 4, 2026
…ject#5474)

### What this PR does / why we need it?
[UT]add triton ops ut :  test_fused_qkvzba_split_reshape_cat
### Does this PR introduce _any_ user-facing change?

### How was this patch tested?
pytest -sv tests/ut/ops/test_fused_qkvzba_split_reshape_cat.py
- vLLM version: v0.13.0
- vLLM main:
vllm-project/vllm@5326c89

---------

Signed-off-by: ZT-AIA <1028681969@qq.com>
Signed-off-by: zrj026 <zhangrunjiang026@gmail.com>
LCAIZJ pushed a commit to LCAIZJ/vllm-ascend that referenced this pull request Mar 7, 2026
…ject#5474)

### What this PR does / why we need it?
[UT]add triton ops ut :  test_fused_qkvzba_split_reshape_cat
### Does this PR introduce _any_ user-facing change?

### How was this patch tested?
pytest -sv tests/ut/ops/test_fused_qkvzba_split_reshape_cat.py
- vLLM version: v0.13.0
- vLLM main:
vllm-project/vllm@5326c89

---------

Signed-off-by: ZT-AIA <1028681969@qq.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

module:tests ready read for review ready-for-test start test by label for PR

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants