[BugFix] Don’t compute reorder threshold when there are no attention groups #27861
Merged
LucasWilkinson merged 1 commit into vllm-project:main on Oct 31, 2025
LucasWilkinson
approved these changes
Oct 31, 2025
Collaborator
LucasWilkinson
left a comment
Thanks for the contribution! Overall LGTM, left a couple nits
Signed-off-by: Huamin Li <3ericli@gmail.com>
a92ec36 to
2e0dc6a
Contributor
Author
Thanks @LucasWilkinson for reviewing! I just updated this PR to address the comments, please take another look!
This was referenced Oct 31, 2025
ZhengHongming888
pushed a commit
to ZhengHongming888/vllm
that referenced
this pull request
Nov 8, 2025
rtourgeman
pushed a commit
to rtourgeman/vllm
that referenced
this pull request
Nov 10, 2025
devpatelio
pushed a commit
to SumanthRH/vllm
that referenced
this pull request
Nov 29, 2025
Purpose
This PR fixes a startup crash in the v1 runtime for attention-free models (e.g., Terratorch) introduced after #27809. The engine unconditionally computed the batch reorder threshold even when no attention backends were created, which led to the failures seen in the nightly run (https://buildkite.com/vllm/ci/builds/37041/steps/canvas?sid=019a386d-1b25-4c07-9a9b-085c1e07ea05, https://buildkite.com/vllm/ci/builds/37041/steps/canvas?sid=019a386d-1b26-4f42-b55f-f0125da20368).
This PR (1) skips the calculation when there are no attention groups, and (2) makes calculate_reorder_batch_threshold() defensive by resolving an empty list to None.
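For illustration only, the two guards can be sketched as below. This is not the vLLM implementation: `AttentionGroup`, its `reorder_batch_threshold` field, the `maybe_compute_threshold` helper, and the use of `min()` to combine per-group values are all assumptions made for the example. Only the function name `calculate_reorder_batch_threshold` and the "empty list resolves to None" behavior come from the description above.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class AttentionGroup:
    """Hypothetical stand-in for an attention group (not vLLM's class)."""
    reorder_batch_threshold: Optional[int]


def calculate_reorder_batch_threshold(
    groups: Sequence[AttentionGroup],
) -> Optional[int]:
    """Combine per-group thresholds, resolving an empty list to None.

    Assumption: the combined value is the minimum over the groups that
    define a threshold. A bare min() over an empty list would raise
    ValueError, so the empty case is handled explicitly.
    """
    thresholds = [
        g.reorder_batch_threshold
        for g in groups
        if g.reorder_batch_threshold is not None
    ]
    if not thresholds:
        return None
    return min(thresholds)


def maybe_compute_threshold(
    groups: Sequence[AttentionGroup],
) -> Optional[int]:
    """Hypothetical caller: skip the calculation when there are no groups."""
    if not groups:
        # Attention-free model: nothing to reorder, so nothing to compute.
        return None
    return calculate_reorder_batch_threshold(groups)


if __name__ == "__main__":
    print(maybe_compute_threshold([]))
    print(maybe_compute_threshold([AttentionGroup(4), AttentionGroup(1)]))
```

Either guard alone would avoid the empty-list case in this sketch; having both keeps the helper safe even if another caller forgets the check.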
Test Plan
CI
Test Result
Basic Models Tests (Extra Initialization) 1 + Basic Models Tests (Extra Initialization) 2
https://buildkite.com/vllm/ci/builds/37052/steps/canvas?sid=019a3901-5ee6-45ba-bace-2ccb858b53a1
Basic Models Tests (Initialization)
https://buildkite.com/vllm/ci/builds/37052/steps/canvas?sid=019a3901-5ee5-48b7-ac1d-0cf8db1b054d