[MRV2] Skip hidden states allocation for PW CUDA graphs #37818

Merged
WoosukKwon merged 1 commit into main from woosuk/mrv2-cudgraph-fix
Mar 22, 2026
Conversation

@WoosukKwon
Collaborator

No description provided.

Signed-off-by: Woosuk Kwon <woosuk@inferact.ai>
WoosukKwon requested a review from njhill as a code owner March 22, 2026 17:40
WoosukKwon added the "ready" label (ONLY add when PR is ready to merge/full CI is needed) Mar 22, 2026
Contributor

gemini-code-assist bot left a comment

Code Review

This pull request introduces an optimization to avoid allocating memory for hidden states when capturing piecewise (PW) CUDA graphs. The changes involve adding a condition to bypass hidden state handling for CUDAGraphMode.PIECEWISE during graph capture and adding a clarifying comment. The implementation appears correct and effectively reduces memory usage for PW graphs without affecting the functionality of full CUDA graphs.
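
For readers without the diff at hand, the condition described in the review can be sketched roughly as follows. This is a minimal illustration, not the code merged in this PR: the `CUDAGraphMode` enum here is a simplified stand-in that only mirrors the name used in the review, `capture_graph` and its parameters are hypothetical, and a plain Python list stands in for a GPU tensor.

```python
from enum import Enum, auto
from typing import List, Optional


class CUDAGraphMode(Enum):
    """Simplified, hypothetical stand-in for the mode enum named in the review."""
    PIECEWISE = auto()
    FULL = auto()


def capture_graph(
    mode: CUDAGraphMode, num_tokens: int, hidden_size: int
) -> Optional[List[float]]:
    """Simulate graph capture (hypothetical helper).

    Returns the hidden states buffer if one was allocated, else None.
    """
    if mode is CUDAGraphMode.PIECEWISE:
        # Assumption for this sketch: piecewise capture does not need a
        # persistent hidden states buffer, so the allocation is skipped.
        return None
    # Full-graph capture keeps a buffer sized for the captured batch.
    return [0.0] * (num_tokens * hidden_size)


if __name__ == "__main__":
    for mode in CUDAGraphMode:
        buf = capture_graph(mode, num_tokens=4, hidden_size=8)
        print(mode.name, "no buffer" if buf is None else f"{len(buf)} elements")
```

The early return keeps the full-graph path unchanged, which matches the review's note that full CUDA graphs are not affected.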

WoosukKwon merged commit ce9b1d7 into main Mar 22, 2026
57 of 60 checks passed
WoosukKwon deleted the woosuk/mrv2-cudgraph-fix branch March 22, 2026 18:47
yzong-rh pushed a commit to yzong-rh/vllm that referenced this pull request Mar 23, 2026
RhizoNymph pushed a commit to RhizoNymph/vllm that referenced this pull request Mar 26, 2026
HenryTangDev pushed a commit to HenryTangMain/vllm that referenced this pull request Mar 27, 2026
SouthWest7 pushed a commit to SouthWest7/vllm that referenced this pull request Mar 27, 2026
khairulkabir1661 pushed a commit to khairulkabir1661/vllm that referenced this pull request Mar 27, 2026
Monishver11 pushed a commit to Monishver11/vllm that referenced this pull request Mar 27, 2026
…#37818)

Signed-off-by: Woosuk Kwon <woosuk@inferact.ai>
Signed-off-by: Monishver Chandrasekaran <monishverchandrasekaran@gmail.com>
nithinvc pushed a commit to nithinvc/vllm that referenced this pull request Mar 27, 2026
…#37818)

Signed-off-by: Woosuk Kwon <woosuk@inferact.ai>

Signed-off-by: Nithin Chalapathi <nithin.ch10@gmail.com>
JiantaoXu pushed a commit to JiantaoXu/vllm that referenced this pull request Mar 28, 2026
vrdn-23 pushed a commit to vrdn-23/vllm that referenced this pull request Mar 30, 2026
…#37818)

Signed-off-by: Woosuk Kwon <woosuk@inferact.ai>
Signed-off-by: Vinay Damodaran <vrdn@hey.com>
Labels

nvidia, ready (ONLY add when PR is ready to merge/full CI is needed), v1

1 participant