[UT] add triton ops ut: test_fused_qkvzba_split_reshape_cat #5474
wangxiyuan merged 10 commits into vllm-project:main
Conversation
Signed-off-by: ZT-AIA <1028681969@qq.com>
Code Review
This pull request introduces a new unit test for the fused_qkvzba_split_reshape_cat Triton operation. The test coverage provided by the parameterization is good. However, I've identified two high-severity issues that should be addressed to improve the quality and maintainability of the test code: dead code that unnecessarily allocates a tensor, and a fragile pattern used to build the reference implementation, which should be refactored to make the test more robust.
hidden_size = 2048

hidden_states = torch.randn(
    seq_len,
    hidden_size,
    dtype=dtype,
    device=device
)
gdn = Qwen3NextGatedDeltaNet.__new__(Qwen3NextGatedDeltaNet)
gdn.num_k_heads = num_heads_qk
gdn.num_v_heads = num_heads_v
gdn.head_k_dim = head_qk_dim
gdn.head_v_dim = head_v_dim
gdn.tp_size = 1
Using __new__ to create an uninitialized object and then monkey-patching its attributes is a fragile testing pattern: it tightly couples the test to the implementation details of Qwen3NextGatedDeltaNet. If the dependencies of fix_query_key_value_ordering change (e.g., it starts relying on an attribute set in __init__), this test will fail in a non-obvious way.
A more robust approach would be to extract the reference logic into a pure, standalone function within this test file. This would make the test self-contained and resilient to changes in the model class. If that's not feasible, Qwen3NextGatedDeltaNet should be instantiated properly, using mocks where necessary to avoid the overhead of full model initialization.
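As a hypothetical illustration of the suggested refactor (the function name, argument order, and split layout below are assumptions for the sketch, not the actual vllm-ascend code), the reference logic could live in a standalone helper that takes the head dimensions explicitly, so no Qwen3NextGatedDeltaNet instance is needed:

```python
import torch


def reference_qkvz_split(mixed, num_k_heads, head_k_dim, num_v_heads, head_v_dim):
    """Hypothetical standalone reference: split a fused [seq, hidden] projection
    into per-head q/k/v/z tensors using only explicit dimension arguments."""
    # Per-component widths along the hidden dimension (assumed layout: q, k, v, z).
    q_size = num_k_heads * head_k_dim
    k_size = num_k_heads * head_k_dim
    v_size = num_v_heads * head_v_dim
    z_size = num_v_heads * head_v_dim
    q, k, v, z = torch.split(mixed, [q_size, k_size, v_size, z_size], dim=-1)
    seq = mixed.shape[0]
    # Reshape each flat chunk into [seq, num_heads, head_dim].
    return (
        q.view(seq, num_k_heads, head_k_dim),
        k.view(seq, num_k_heads, head_k_dim),
        v.view(seq, num_v_heads, head_v_dim),
        z.view(seq, num_v_heads, head_v_dim),
    )
```

A test parameterized over head counts and dimensions can then call this helper directly, staying resilient to any refactor of the model class's __init__.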
👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:
If CI fails, you can run the linting and testing checks locally according to the Contributing and Testing guides.
@@ -0,0 +1,100 @@
import pytest
What this PR does / why we need it?
[UT] add triton ops ut: test_fused_qkvzba_split_reshape_cat

Does this PR introduce any user-facing change?

How was this patch tested?
pytest -sv tests/ut/ops/test_fused_qkvzba_split_reshape_cat.py

- vLLM version: v0.13.0
- vLLM main: vllm-project/vllm@5326c89
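For context on the testing pattern being discussed, a self-contained parameterized test comparing a fused op against a pure-PyTorch reference could be sketched as follows. The fused_op stub below is a plain-PyTorch stand-in, not the real Triton kernel; the actual kernel name and signature in vllm-ascend are not assumed here:

```python
import pytest
import torch


def fused_op(x, n_heads, head_dim):
    # Stand-in for the fused kernel under test: reshape a flat
    # [seq, n_heads * head_dim] tensor into [seq, n_heads, head_dim].
    return x.view(x.shape[0], n_heads, head_dim)


def reference_op(x, n_heads, head_dim):
    # Pure-PyTorch reference with the same contract.
    return x.reshape(x.shape[0], n_heads, head_dim)


@pytest.mark.parametrize("seq_len", [1, 8])
@pytest.mark.parametrize("dtype", [torch.float16, torch.float32])
def test_fused_matches_reference(seq_len, dtype):
    x = torch.randn(seq_len, 4 * 16, dtype=dtype)
    out = fused_op(x, n_heads=4, head_dim=16)
    ref = reference_op(x, n_heads=4, head_dim=16)
    assert torch.equal(out, ref)
```

Keeping the reference as a standalone function, as the review suggests, means the parameterized cases exercise only the op's contract and never touch model-class internals.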