-
-
Notifications
You must be signed in to change notification settings - Fork 11.1k
[Bugfix][CI] Move resolving cudagraph_mode before initializing attn_metadata_builder #27427
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Signed-off-by: fhl2000 <[email protected]>
Signed-off-by: fhl2000 <[email protected]>
Signed-off-by: fhl2000 <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request correctly addresses a bug where cudagraph_mode was resolved after the initialization of attn_metadata_builder, which could lead to incorrect behavior. The fix refactors the initialization logic in GPUModelRunner to ensure that cudagraph_mode is determined based on attention backend capabilities before the metadata builders are instantiated. This is achieved by first collecting all attention backend classes, determining the minimum supported CUDA graph level, and only then initializing the attention groups. The changes are logical, well-implemented, and effectively resolve the issue. The associated refactoring improves the code's clarity and correctness. The changes look solid.
Signed-off-by: fhl2000 <[email protected]>
|
Documentation preview: https://vllm--27427.org.readthedocs.build/en/27427/ |
…etadata_builder (vllm-project#27427) Signed-off-by: fhl2000 <[email protected]>
…etadata_builder (vllm-project#27427) Signed-off-by: fhl2000 <[email protected]>
…etadata_builder (vllm-project#27427) Signed-off-by: fhl2000 <[email protected]>
…etadata_builder (vllm-project#27427) Signed-off-by: fhl2000 <[email protected]> Signed-off-by: 0xrushi <[email protected]>
…etadata_builder (vllm-project#27427) Signed-off-by: fhl2000 <[email protected]> Signed-off-by: 0xrushi <[email protected]>
Purpose
Previously, resolving cudagraph_mode happened after attn_metadata_builder initialisation, which means attn_metadata_builder is not aware of the potential changed cudagraph_mode. (could lead to querying invalid cudagraph_capturing_sizes list if the resolved mode is NONE).
This should also fix the CI for #26016
Test Plan
To see if the CI is fixed in #26016 after merging this.
Test Result
Essential Elements of an Effective PR Description Checklist
supported_models.mdandexamplesfor a new model.