[BUGFIX] Fix degenerate strides in TRTLLM query tensors for FlashInfer backend. Fixes issue #32353 (#32417)

Merged
pavanimajety merged 2 commits into vllm-project:main from CentML:vadim/issue32353
Jan 19, 2026

Conversation

@vadiklyutiy (Collaborator) commented Jan 15, 2026

Summary

This PR fixes issue #32353: degenerate strides in query tensors when using the TRTLLM kernels in the FlashInfer attention backend. A .contiguous() call alone does not fix degenerate strides when a dimension has size 1, which can cause issues with kernel execution.

Problem

Query tensors can arrive with degenerate strides, and .contiguous() does not fix them: PyTorch treats a size-1 dimension as contiguous regardless of its stride, so the call is a no-op. In #32353 the query tensor had:

Shape: torch.Size([1, 32, 128])
Stride: (4608, 128, 1)

whereas the canonical contiguous strides for this shape would be (4096, 128, 1).
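A minimal PyTorch sketch of the problem and the fix. The parent-buffer size (36) is illustrative, chosen only so the slice reproduces the shape and stride reported in #32353:

```python
import torch

# Slicing a larger buffer leaves the size-1 batch dim with a stride
# inherited from the parent tensor (36 * 128 = 4608).
base = torch.empty(1, 36, 128)
q = base[:, :32, :]
print(q.shape, q.stride())  # torch.Size([1, 32, 128]) (4608, 128, 1)

# PyTorch skips size-1 dims in its contiguity check, so the tensor is
# already "contiguous" and .contiguous() returns it unchanged.
print(q.is_contiguous(), q.contiguous().stride())  # True (4608, 128, 1)

# Reshaping to the same shape forces canonical contiguous strides.
print(q.contiguous().reshape(q.shape).stride())  # (4096, 128, 1)
```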

Test

vllm serve nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B-BF16 --trust-remote-code

The server starts successfully.

Signed-off-by: Vadim Gimpelson <vadim.gimpelson@gmail.com>
@mergify mergify bot added nvidia v1 bug Something isn't working labels Jan 15, 2026
@gemini-code-assist bot (Contributor) left a comment


Code Review

This pull request effectively addresses the issue of degenerate strides in TRTLLM query tensors for the FlashInfer backend. The addition of .reshape(tensor.shape) after .contiguous() correctly forces non-degenerate strides, resolving the reported bug. The accompanying comments clearly explain the rationale behind this change, enhancing code clarity and maintainability.
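The pattern the review describes could be factored into a small helper along these lines. This is a hedged sketch: the helper name is hypothetical, not the actual function in the vLLM FlashInfer backend.

```python
import torch

def force_canonical_strides(t: torch.Tensor) -> torch.Tensor:
    # .contiguous() alone is a no-op when only a size-1 dim has a
    # degenerate stride, because PyTorch already considers the tensor
    # contiguous. Reshaping to the same shape afterwards rebuilds the
    # view with canonical row-major strides.
    return t.contiguous().reshape(t.shape)

q = torch.empty(1, 36, 128)[:, :32, :]      # stride (4608, 128, 1)
print(force_canonical_strides(q).stride())  # (4096, 128, 1)
```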

@mgoin mgoin added the ready ONLY add when PR is ready to merge/full CI is needed label Jan 15, 2026
@vadiklyutiy (Collaborator, Author):

According to ci-health, v1-test-attention-b200 fails at the top of the tree (i.e., on current main, independent of this PR).

@vadiklyutiy (Collaborator, Author):

The previously failing test was worked around in another commit. CI now passes successfully.

@pavanimajety (Collaborator) left a comment


Thanks for the fixes @vadiklyutiy

@github-project-automation github-project-automation bot moved this to Ready in NVIDIA Jan 19, 2026
@pavanimajety pavanimajety merged commit 6101a26 into vllm-project:main Jan 19, 2026
53 checks passed
@github-project-automation github-project-automation bot moved this from Ready to Done in NVIDIA Jan 19, 2026
@vadiklyutiy vadiklyutiy deleted the vadim/issue32353 branch January 20, 2026 13:54
gopalsarda pushed a commit to gopalsarda/vllm that referenced this pull request Jan 20, 2026
…r backend. Fixes issue vllm-project#32353 (vllm-project#32417)

Signed-off-by: Vadim Gimpelson <vadim.gimpelson@gmail.com>
dsuhinin pushed a commit to dsuhinin/vllm that referenced this pull request Jan 21, 2026
…r backend. Fixes issue vllm-project#32353 (vllm-project#32417)

Signed-off-by: Vadim Gimpelson <vadim.gimpelson@gmail.com>
Signed-off-by: dsuhinin <suhinin.dmitriy@gmail.com>
ItzDEXX pushed a commit to ItzDEXX/vllm that referenced this pull request Feb 19, 2026
…r backend. Fixes issue vllm-project#32353 (vllm-project#32417)

Signed-off-by: Vadim Gimpelson <vadim.gimpelson@gmail.com>

Labels

bug (Something isn't working), nvidia, ready (ONLY add when PR is ready to merge/full CI is needed), v1

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

4 participants