[Bugfix] Fix the input constraints checks for the mlapo and bmm_transpose operators #5764

Merged
wangxiyuan merged 1 commit into vllm-project:main from rjg-lyh:pr-bugfix-token-limit on Jan 16, 2026

rjg-lyh commented Jan 9, 2026

What this PR does / why we need it?

This PR fixes the input constraint checks for the mlapo and bmm_transpose operators.

Does this PR introduce any user-facing change?

No.

How was this patch tested?

CI passed with newly added and existing tests.

Perf

64K/3K, 1P1D, bs=32

Before this PR:
TPOT 29 ms, TTFT 47 s, TPS 606 tokens/s

After this PR:
TPOT 29 ms, TTFT 48 s, TPS 636 tokens/s
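
To put the before/after figures in relative terms, the following is a small sketch that computes the percentage change for each metric. The numbers are copied from the Perf section; the helper name `relative_change` and the dictionary keys are illustrative only and do not come from the PR.

```python
def relative_change(before: float, after: float) -> float:
    """Return the change from `before` to `after` as a percentage of `before`."""
    return (after - before) / before * 100.0


# Figures copied from the Perf section of this PR (64K/3K, 1P1D, bs=32).
before = {"TPOT_ms": 29, "TTFT_s": 47, "TPS": 606}
after = {"TPOT_ms": 29, "TTFT_s": 48, "TPS": 636}

changes = {name: relative_change(before[name], after[name]) for name in before}

for name, pct in changes.items():
    print(f"{name}: {pct:+.2f}%")
# TPOT_ms: +0.00%
# TTFT_s: +2.13%
# TPS: +4.95%
```

On these single-run figures, throughput rises by roughly 5% while TTFT moves by about 2% and TPOT is unchanged; the thread does not report run-to-run variance.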


github-actions bot commented Jan 9, 2026

👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:

  • A PR should do only one thing; smaller PRs enable faster reviews.
  • Every PR should include unit tests and end-to-end tests to ensure it works and is not broken by other future PRs.
  • Write the commit message by filling in the PR description, to help reviewers and future developers understand the change.

If CI fails, you can run linting and testing checks locally according to Contributing and Testing.


gemini-code-assist bot left a comment


Code Review

This pull request correctly fixes the input constraint checks for the mlapo and bmm_transpose operators by replacing the has_prefill flag with a direct check on the number of input tokens. This change addresses a bug and improves the correctness of operator selection. The associated code simplification by removing the has_prefill logic is also a good improvement. However, this refactoring appears to break existing unit tests, which must be updated to reflect the changes.
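
The difference between gating on a phase flag and gating on the token count can be sketched as follows. This is an illustration only, not the code from `vllm_ascend/attention/sfa_v1.py`: the `Batch` class, both function names, the assumed form of the old check, and the limit `ASSUMED_MAX_TOKENS` are all hypothetical, since the thread does not show the diff or state the operators' actual limits.

```python
from dataclasses import dataclass

# ASSUMPTION: placeholder limit. The real operator constraint is not given in this thread.
ASSUMED_MAX_TOKENS = 1024


@dataclass
class Batch:
    """Hypothetical stand-in for the per-step metadata passed to attention."""
    num_input_tokens: int
    has_prefill: bool


def can_use_fused_op_by_flag(batch: Batch) -> bool:
    # Assumed shape of the old gate: rely on a coarse phase flag.
    return not batch.has_prefill


def can_use_fused_op_by_tokens(batch: Batch, max_tokens: int = ASSUMED_MAX_TOKENS) -> bool:
    # Shape of the new gate described in the review: test the input size directly.
    return batch.num_input_tokens <= max_tokens


# A decode-only batch that exceeds the assumed limit, and a tiny batch with prefill.
large_decode = Batch(num_input_tokens=4096, has_prefill=False)
small_prefill = Batch(num_input_tokens=16, has_prefill=True)

for name, batch in [("large_decode", large_decode), ("small_prefill", small_prefill)]:
    print(name, can_use_fused_op_by_flag(batch), can_use_fused_op_by_tokens(batch))
# large_decode True False
# small_prefill False True
```

Under these assumptions the flag is only a proxy for the constraint: it can admit an oversized batch and reject one that fits, whereas a token-count check tests the limit itself.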

Comment thread vllm_ascend/attention/sfa_v1.py
@github-actions

This pull request has conflicts, please resolve those before we can evaluate the pull request.

rjg-lyh force-pushed the pr-bugfix-token-limit branch from 56d8584 to c2a4d80 on January 15, 2026 02:31
rjg-lyh requested a review from weijinqian0 as a code owner on January 15, 2026 02:31
rjg-lyh force-pushed the pr-bugfix-token-limit branch from c2a4d80 to d2c4609 on January 15, 2026 06:25
rjg-lyh requested a review from wangxiyuan as a code owner on January 15, 2026 06:25
rjg-lyh added the ready and ready-for-test labels and removed the merge-conflicts label on Jan 15, 2026
Comment thread vllm_ascend/attention/sfa_v1.py Outdated
Comment thread vllm_ascend/attention/sfa_v1.py Outdated
rjg-lyh force-pushed the pr-bugfix-token-limit branch 2 times, most recently from fd0113d to 186e916 on January 16, 2026 04:47
…pose operators

Signed-off-by: rjg-lyh <1318825571@qq.com>
rjg-lyh force-pushed the pr-bugfix-token-limit branch from 186e916 to dbea536 on January 16, 2026 05:02
wangxiyuan enabled auto-merge (squash) on January 16, 2026 09:51
wangxiyuan merged commit 3af91e5 into vllm-project:main on Jan 16, 2026
20 checks passed
rjg-lyh added a commit to rjg-lyh/vllm-ascend that referenced this pull request Jan 21, 2026
…bmm_transpose operators (vllm-project#5764)

This PR fix the input constraints checks for the mlapo and bmm_transpose
operators.

No.

CI passed with new added/existing test.

- vLLM version: v0.13.0
- vLLM main:
vllm-project/vllm@2f4e654

64K/3K,1P1D,bs=32

before this pr:
TPOT 29ms, TTFT 47s,TPS 606 token/s

after this pr:
TPOT 29ms, TTFT 48s,TPS 636 token/s

Signed-off-by: rjg-lyh <1318825571@qq.com>
rjg-lyh added a commit to rjg-lyh/vllm-ascend that referenced this pull request Jan 22, 2026
…bmm_transpose operators (vllm-project#5764)
wangxiyuan pushed a commit that referenced this pull request Jan 22, 2026
…bmm_transpose operators (#5764) (#6088)

### What this PR does / why we need it?
This PR cherry-pick #5764

This PR fix the input constraints checks for the mlapo and bmm_transpose
operators.

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
CI passed with new added/existing test.

Signed-off-by: rjg-lyh <1318825571@qq.com>
845473182 pushed a commit to 845473182/vllm-ascend that referenced this pull request Jan 22, 2026
…lm-ascend into FIA_v0.13.0

* 'releases/v0.13.0' of https://github.com/vllm-project/vllm-ascend:
  [EPLB] Config Rename wrapper (vllm-project#6111)
  [v0.13.0][Bugfix] Fix the input constraints checks for the mlapo and bmm_transpose operators (vllm-project#5764) (vllm-project#6088)
starmountain1997 pushed a commit to starmountain1997/vllm-ascend that referenced this pull request Jan 31, 2026
…pose operators (vllm-project#5764)

### What this PR does / why we need it?
This PR fix the input constraints checks for the mlapo and bmm_transpose
operators.

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
CI passed with new added/existing test.

- vLLM version: v0.13.0
- vLLM main:
vllm-project/vllm@2f4e654

### Perf
64K/3K,1P1D,bs=32

before this pr:
TPOT 29ms, TTFT 47s,TPS 606 token/s

after this pr:
TPOT 29ms, TTFT 48s,TPS 636 token/s

Signed-off-by: rjg-lyh <1318825571@qq.com>
starmountain1997 pushed a commit to starmountain1997/vllm-ascend that referenced this pull request Jan 31, 2026
…bmm_transpose operators (vllm-project#5764) (vllm-project#6088)

### What this PR does / why we need it?
This PR cherry-pick vllm-project#5764

This PR fix the input constraints checks for the mlapo and bmm_transpose
operators.

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
CI passed with new added/existing test.

Signed-off-by: rjg-lyh <1318825571@qq.com>
tangtiangu pushed a commit to tangtiangu/jiusi-vllm-ascend that referenced this pull request Feb 24, 2026
…bmm_transpose operators (vllm-project#5764) (vllm-project#6088)

### What this PR does / why we need it?
This PR cherry-pick vllm-project#5764

This PR fix the input constraints checks for the mlapo and bmm_transpose
operators.

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
CI passed with new added/existing test.

Signed-off-by: rjg-lyh <1318825571@qq.com>
ZRJ026 pushed a commit to ZRJ026/vllm-ascend that referenced this pull request Feb 28, 2026
…pose operators (vllm-project#5764)

### What this PR does / why we need it?
This PR fix the input constraints checks for the mlapo and bmm_transpose
operators.

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
CI passed with new added/existing test.

- vLLM version: v0.13.0
- vLLM main:
vllm-project/vllm@2f4e654

### Perf
64K/3K,1P1D,bs=32

before this pr:
TPOT 29ms, TTFT 47s,TPS 606 token/s

after this pr:
TPOT 29ms, TTFT 48s,TPS 636 token/s

Signed-off-by: rjg-lyh <1318825571@qq.com>
Signed-off-by: zrj026 <zhangrunjiang026@gmail.com>
maoxx241 pushed a commit to maoxx241/vllm-ascend that referenced this pull request Mar 2, 2026
…pose operators (vllm-project#5764)

### What this PR does / why we need it?
This PR fix the input constraints checks for the mlapo and bmm_transpose
operators.

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
CI passed with new added/existing test.

- vLLM version: v0.13.0
- vLLM main:
vllm-project/vllm@2f4e654

### Perf
64K/3K,1P1D,bs=32

before this pr:
TPOT 29ms, TTFT 47s,TPS 606 token/s

after this pr:
TPOT 29ms, TTFT 48s,TPS 636 token/s

Signed-off-by: rjg-lyh <1318825571@qq.com>
ZRJ026 pushed a commit to ZRJ026/vllm-ascend that referenced this pull request Mar 4, 2026
…pose operators (vllm-project#5764)
LCAIZJ pushed a commit to LCAIZJ/vllm-ascend that referenced this pull request Mar 7, 2026
…pose operators (vllm-project#5764)

### What this PR does / why we need it?
This PR fix the input constraints checks for the mlapo and bmm_transpose
operators.

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
CI passed with new added/existing test.

- vLLM version: v0.13.0
- vLLM main:
vllm-project/vllm@2f4e654

### Perf
64K/3K,1P1D,bs=32

before this pr:
TPOT 29ms, TTFT 47s,TPS 606 token/s

after this pr:
TPOT 29ms, TTFT 48s,TPS 636 token/s

Signed-off-by: rjg-lyh <1318825571@qq.com>
Labels

ready, ready-for-test

5 participants