
[Model Runner V2] Do not error on attention backends#32820

Merged
WoosukKwon merged 1 commit into main from woosuk/v2-attn-backends on Jan 22, 2026

Conversation

@WoosukKwon (Collaborator)

No description provided.

Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
@mergify mergify Bot added the v1 label Jan 22, 2026
@gemini-code-assist Bot (Contributor) left a comment

Code Review

This pull request removes a hardcoded check in GPUModelRunner that restricted the supported attention backends to FLASH_ATTN, FLASHINFER, and FLASHINFER_MLA. By deleting this validation, the model runner becomes more generic and can now accommodate other attention backends. This change is consistent with the removed TODO comment which indicated the need to support more backends. The modification is clean, correct, and improves the flexibility of the system. I approve this change.
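For illustration, the removed validation was likely a simple allowlist check of the kind sketched below. This is a hedged reconstruction, not vLLM's actual code: the names `AttentionBackend`, `_SUPPORTED_BACKENDS`, and `validate_backend` are hypothetical stand-ins, and only the three backend names come from the review above.

```python
from enum import Enum


class AttentionBackend(Enum):
    """Hypothetical stand-in for vLLM's attention backend identifiers."""
    FLASH_ATTN = "FLASH_ATTN"
    FLASHINFER = "FLASHINFER"
    FLASHINFER_MLA = "FLASHINFER_MLA"
    TRITON_ATTN = "TRITON_ATTN"  # example of a backend outside the old allowlist


# Before this PR, a hardcoded allowlist like this rejected every other backend.
_SUPPORTED_BACKENDS = {
    AttentionBackend.FLASH_ATTN,
    AttentionBackend.FLASHINFER,
    AttentionBackend.FLASHINFER_MLA,
}


def validate_backend(backend: AttentionBackend) -> None:
    """Sketch of the removed check: raise for any backend not in the allowlist.

    The PR deletes this kind of validation so the model runner accepts
    whatever backend the attention layer selects.
    """
    if backend not in _SUPPORTED_BACKENDS:
        raise NotImplementedError(
            f"Attention backend {backend.value} is not supported by Model Runner V2."
        )
```

Deleting the check means a backend such as `TRITON_ATTN` no longer raises at model-runner initialization, which is what makes the runner generic across backends.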

@WoosukKwon WoosukKwon merged commit 5e00b56 into main Jan 22, 2026
12 of 13 checks passed
@WoosukKwon WoosukKwon deleted the woosuk/v2-attn-backends branch January 22, 2026 01:02
monajafi-amd pushed a commit to monajafi-amd/vllm that referenced this pull request Jan 23, 2026

Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
Signed-off-by: mohammad najafi <mohammad.najafi@amd.com>
cwazai pushed a commit to cwazai/vllm that referenced this pull request Jan 25, 2026

Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
Signed-off-by: Chen Jianhua <1647430658@qq.com>
lapy pushed a commit to lapy/vllm that referenced this pull request Jan 27, 2026

Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>


2 participants