[BugFix] Fix AssertionError: DCP not support reorder_batch_threshold > 1 now. #28751
Merged
LucasWilkinson merged 1 commit into vllm-project:main on Nov 15, 2025
Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com>
Code Review
This pull request addresses an AssertionError related to reorder_batch_threshold in Decode Context Parallelism (DCP) mode by removing the problematic assertion from GPUModelRunner. Your reasoning that this check is brittle due to its reliance on a global environment variable (VLLM_ATTENTION_BACKEND) is correct. The responsibility for advertising backend capabilities should indeed lie with the backends themselves, and this change aligns with that principle. The logic within each AttentionMetadataBuilder appears to be the more robust place for such enforcement. This removal simplifies the code, improves modularity, and resolves the issue of false positive assertion failures. The change is approved.
mgoin approved these changes on Nov 15, 2025
ym820 pushed a commit to ym820/vllm that referenced this pull request on Nov 16, 2025
… > 1 now.` (vllm-project#28751) Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com> Signed-off-by: ym820 <yikai.mao@outlook.com>
geodavic pushed a commit to geodavic/vllm that referenced this pull request on Nov 16, 2025
… > 1 now.` (vllm-project#28751) Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com> Signed-off-by: George D. Torres <gdavtor@gmail.com>
devpatelio pushed a commit to SumanthRH/vllm that referenced this pull request on Nov 29, 2025
… > 1 now.` (vllm-project#28751) Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com>
kitaekatt pushed a commit to kitaekatt/vllm that referenced this pull request on Dec 1, 2025
… > 1 now.` (vllm-project#28751) Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com>
Fix `AssertionError: DCP not support reorder_batch_threshold > 1 now.`, caused by #27363.

Simply removing the assert; it has produced more false positives than true positives, causing unnecessary thrash. The GPU model runner should not be responsible for tracking which attention backends are supported and ensuring that they advertise correct reorder batch thresholds; this would be better enforced via something like #28750.
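
For illustration only, here is a minimal sketch of what backend-side enforcement could look like, where each metadata builder resolves its own threshold instead of the model runner asserting on it. This is not the content of #28750 and not vLLM's actual API: the class names, the `dcp_world_size` field, and the `supports_dcp_with_threshold_gt_1` flag are all hypothetical, and clamping to 1 (rather than raising) is an assumed policy.

```python
from dataclasses import dataclass


@dataclass
class ParallelSketchConfig:
    # Hypothetical stand-in for the runner's parallel configuration.
    dcp_world_size: int = 1


class SketchMetadataBuilder:
    """Hypothetical builder base class; NOT vLLM's AttentionMetadataBuilder."""

    # Threshold the backend would like to use when DCP is disabled.
    reorder_batch_threshold: int = 1
    # Hypothetical capability flag: can this backend use a threshold > 1 under DCP?
    supports_dcp_with_threshold_gt_1: bool = False

    def __init__(self, config: ParallelSketchConfig) -> None:
        self.config = config
        # The builder, not the model runner, decides the effective threshold.
        self.reorder_batch_threshold = self._resolve_threshold()

    def _resolve_threshold(self) -> int:
        cls = type(self)
        wanted = cls.reorder_batch_threshold
        dcp_enabled = self.config.dcp_world_size > 1
        if dcp_enabled and wanted > 1 and not cls.supports_dcp_with_threshold_gt_1:
            # Assumed policy: fall back to 1 instead of failing an assertion.
            return 1
        return wanted


class SpecDecodeBuilder(SketchMetadataBuilder):
    # Hypothetical backend that wants a larger threshold but lacks DCP support for it.
    reorder_batch_threshold = 4


class DcpAwareBuilder(SketchMetadataBuilder):
    # Hypothetical backend that supports a larger threshold under DCP.
    reorder_batch_threshold = 4
    supports_dcp_with_threshold_gt_1 = True
```

With this shape, the runner only reads `builder.reorder_batch_threshold` and never needs to know which backend is active or how it was selected.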