
[Ops] update causal_conv1d_update #5984

Merged
wangxiyuan merged 5 commits into vllm-project:main from SunnyLee151064:add_update
Jan 21, 2026

Conversation

Collaborator

@SunnyLee151064 SunnyLee151064 commented Jan 19, 2026

What this PR does / why we need it?

Update causal_conv1d_update ops for better perf.

Does this PR introduce any user-facing change?

How was this patch tested?

Signed-off-by: SunnyLee219 <3294305115@qq.com>
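For context, here is a minimal NumPy reference sketch of what a causal conv1d update step computes (this is an illustration of the op's semantics, not the Triton kernel from this PR; the function name `causal_conv1d_update_ref` and the exact shapes are assumptions based on the docstring quoted in the review below):

```python
import numpy as np

def causal_conv1d_update_ref(x, conv_state, weight, bias=None):
    """Reference single-token update for a causal depthwise conv1d.

    x:          (batch, dim)         the new token's features
    conv_state: (batch, dim, width)  rolling window of past inputs
    weight:     (dim, width)         per-channel filter taps
    Returns out of shape (batch, dim); conv_state is updated in place.
    """
    # Shift the window left by one token and append the new input.
    conv_state[:, :, :-1] = conv_state[:, :, 1:]
    conv_state[:, :, -1] = x
    # Depthwise convolution: per-channel dot product over the window.
    out = np.einsum("bdw,dw->bd", conv_state, weight)
    if bias is not None:
        out = out + bias
    return out
```

The performance work in the PR is about how these tensors are laid out in memory for the kernel, not about changing this arithmetic.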
@github-actions
Contributor

👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:

  • A PR should do only one thing; smaller PRs enable faster reviews.
  • Every PR should include unit tests and end-to-end tests to ensure it works and is not broken by future PRs.
  • Write a commit message that fulfills the PR description, to help reviewers and future developers understand the change.

If CI fails, you can run linting and testing checks locally according to Contributing and Testing.

Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request updates the causal_conv1d_update operation for better performance by changing the data layout of several tensors (weight, conv_state, x) to be more cache-friendly for the Triton kernel. The changes throughout the causal_conv1d_update_npu function are consistent with these new data layouts. My main feedback is to remove a leftover debugging print statement.
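The layout change hinges on the fact that transposing a tensor produces a strided view, while materializing it afterwards gives the kernel a contiguous row-major buffer. A small NumPy sketch of the same idea (`np.ascontiguousarray` is the NumPy analogue of PyTorch's `.transpose(0, 1).contiguous()` used in the PR; the shapes here are illustrative):

```python
import numpy as np

# (dim, width) filter taps, as in the original weight layout.
weight = np.zeros((128, 4), dtype=np.float32)

# Transposing yields a (width, dim) *view* with swapped strides;
# its memory is no longer row-major contiguous.
transposed = weight.T
assert not transposed.flags["C_CONTIGUOUS"]

# Materializing the view copies it into a contiguous (width, dim)
# buffer, so a kernel reading along `dim` gets unit-stride accesses.
weight_t = np.ascontiguousarray(transposed)
assert weight_t.flags["C_CONTIGUOUS"]
assert weight_t.shape == (4, 128)
```

Unit-stride access along the innermost loop dimension is what makes the new layout cache-friendly for the Triton kernel.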

out: (batch, dim) or (batch, dim, seqlen) or (num_tokens, dim), same shape as `x`
"""
weight = weight.transpose(0, 1).contiguous()
print("weight's shape: ", weight.size())
Contributor


critical

This print statement appears to be a debugging artifact. It should be removed before merging to avoid polluting logs and to prevent potential performance degradation in production environments, as I/O operations can be costly.
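If the shape information is still useful during development, a common alternative (a suggestion, not part of this PR) is a debug-level log, so the message and its formatting cost are skipped entirely in production:

```python
import logging

logger = logging.getLogger(__name__)

def log_weight_shape(weight):
    # The guard avoids even building the message when DEBUG is off,
    # unlike an unconditional print() in the hot path.
    if logger.isEnabledFor(logging.DEBUG):
        logger.debug("weight's shape: %s", tuple(weight.shape))
```

With the default WARNING level this function is effectively a no-op, so it is safe to leave in place.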

Signed-off-by: SunnyLee219 <3294305115@qq.com>
@wangxiyuan wangxiyuan added the ready (read for review) and ready-for-test (start test by label for PR) labels Jan 19, 2026
@wangxiyuan wangxiyuan enabled auto-merge (squash) January 19, 2026 08:27
auto-merge was automatically disabled January 21, 2026 03:34

Head branch was pushed to by a user without write access

@wangxiyuan wangxiyuan merged commit 2a618d2 into vllm-project:main Jan 21, 2026
20 checks passed
845473182 pushed a commit to 845473182/vllm-ascend that referenced this pull request Jan 22, 2026
…to FIA_rebase

* 'main' of https://github.com/vllm-project/vllm-ascend:
  [CI] Upgrade CANN to 8.5.0 (vllm-project#6070)
  Default enable MLAPO (vllm-project#5952)
  [Doc] Supplement PD separation parameters of DeepSeek V3.1 (vllm-project#6053)
  [Ascend] perf: optimize rope embedding with triton kernel for huge performance gain (vllm-project#5918)
  [Ops] update causal_conv1d_update (vllm-project#5984)
  [CI]Update triton ascend version in 3.2.0 (vllm-project#6067)
  [bugfix] fix the complex and potentially problematic generate_kv_idx. (vllm-project#5957)
starmountain1997 pushed a commit to starmountain1997/vllm-ascend that referenced this pull request Jan 31, 2026
### What this PR does / why we need it?
Update causal_conv1d_update ops for better perf.

- vLLM version: v0.13.0
- vLLM main:
vllm-project/vllm@2c24bc6

---------

Signed-off-by: SunnyLee219 <3294305115@qq.com>
ZRJ026 pushed a commit to ZRJ026/vllm-ascend that referenced this pull request Feb 28, 2026
maoxx241 pushed a commit to maoxx241/vllm-ascend that referenced this pull request Mar 2, 2026
ZRJ026 pushed a commit to ZRJ026/vllm-ascend that referenced this pull request Mar 4, 2026
LCAIZJ pushed a commit to LCAIZJ/vllm-ascend that referenced this pull request Mar 7, 2026

Labels

module:ops, ready (read for review), ready-for-test (start test by label for PR)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants