[Attention] Refactor check_and_update_config #33600
vllm-bot merged 17 commits into vllm-project:main
Conversation
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
This pull request has merge conflicts that must be resolved before it can be merged.
Code Review
This pull request refactors the attention backend selection logic in check_and_update_config and get_attn_backend_cls. The changes significantly improve code clarity and maintainability by centralizing the backend selection logic into a new select_attention_backend method and introducing get_preferred_block_size for determining block sizes. This is a great improvement. I've found one issue with a type hint that should be addressed.
Hi @MatthewBonanni, the pre-commit checks have failed. Please run:

uv pip install pre-commit
pre-commit install
pre-commit run --all-files

Then, commit the changes and push to your branch.
pavanimajety left a comment:
Thanks for the PR, @MatthewBonanni!
Looks good to me, pending clean CI and minor nits!
I think this PR broke the model initialization and V1 Others tests; checking in https://buildkite.com/vllm/ci/builds/52009/steps/canvas. For comparison, the build for the previous PR: https://buildkite.com/vllm/ci/builds/52001/steps/canvas
Do we understand how this was merged with so many breakages? Have we given up on the rule that we don't force merge without certainty that CI failures are unrelated (even if it seems obvious that they are)? Has @vllm-bot gone rogue?

I went rogue, sorry. I looked at the failures but thought they were not related.

My fault too, sorry. I asked @mgoin yesterday morning whether we should consider a force merge on this. I also assumed the failures were unrelated.
This reverts commit 7743152.
Purpose

check_and_update_config unnecessarily duplicates much of the logic from the attention selector in order to set an appropriate block size. This PR refactors check_and_update_config to use the selector, which will be simpler to maintain going forward.

- If the user sets both --block-size and --attention-backend and the backend doesn't support the block size, we raise an error rather than overriding a user selection.
- If the user sets --block-size but not the backend, the backend selector respects that block size choice and tries to find a backend which is compatible, raising an error if no valid backends are found.
- If the user sets --attention-backend only, an appropriate block size is selected.

Test Plan
Automatic selection
yields
Setting bad block size
yields