[BugFix] Potential bug fix for test_async_tp_pass_correctness #33854

LucasWilkinson wants to merge 1 commit into vllm-project:main

Conversation
Code Review
This pull request applies a bug fix for CUDA graph correctness in FlashAttention by adjusting the range of scheduler metadata that is zeroed out, ensuring a semaphore is correctly reset. My review identifies a potential edge case where `n` could be 0, which might lead to incorrect behavior. I've suggested a more robust implementation to handle this case gracefully in both modified files.
```python
# forward pass because when num_splits == 1, FA3's internal
# semaphore reset uses PyTorch zero_() which isn't captured in
# CUDA graphs.
self.scheduler_metadata[n - 1 :] = 0
```
This change seems correct, but it might introduce a bug if `n` can be 0. If `n = 0`, then `n - 1` becomes `-1`, and `self.scheduler_metadata[-1:] = 0` will zero out only the last element of the buffer. The previous behavior for `n = 0` was to zero out the entire buffer (`self.scheduler_metadata[0:] = 0`), which seems safer for resetting state when there are no requests.

While `n` is likely always >= 1 (since the `scheduler_metadata` size is `batch_size * 4 + 1`), it's safer to handle the `n = 0` case explicitly to prevent potential issues. A more robust implementation would be:
```diff
- self.scheduler_metadata[n - 1 :] = 0
+ self.scheduler_metadata[max(0, n - 1):] = 0
```
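The slicing pitfall described above is easy to see with a plain Python list, since negative slice starts behave the same way as on a PyTorch tensor. The sketch below is illustrative only; `buf` is a hypothetical stand-in for `self.scheduler_metadata`, and the helper names are invented for this example.

```python
# Demonstrates why `buf[n - 1:] = 0` misbehaves when n == 0.

def zero_tail_unsafe(buf, n):
    # When n == 0, n - 1 == -1, so the slice covers only the
    # LAST element instead of the whole buffer.
    buf[n - 1:] = [0] * len(buf[n - 1:])
    return buf

def zero_tail_safe(buf, n):
    # Clamping the start index to 0 makes n == 0 zero the
    # entire buffer, matching the old `buf[0:] = 0` behavior.
    start = max(0, n - 1)
    buf[start:] = [0] * (len(buf) - start)
    return buf

print(zero_tail_unsafe([1, 2, 3, 4], 0))  # [1, 2, 3, 0] -- bug
print(zero_tail_safe([1, 2, 3, 4], 0))    # [0, 0, 0, 0] -- full reset
print(zero_tail_safe([1, 2, 3, 4], 3))    # [1, 2, 0, 0] -- normal case
```

For `n >= 1` both variants behave identically, so the clamp only changes the degenerate empty-batch case.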
see: #34043
Potential fix for: #33802