
Use AttentionSelectorConfig in get_attn_backend_cls#1313

Merged
karan merged 2 commits into main from attnselector
Dec 15, 2025
Conversation

@karan
Collaborator

@karan karan commented Dec 15, 2025

Description

vllm-project/vllm#30212 refactored the individual attention parameters into AttentionSelectorConfig. That broke our platform implementation; this PR fixes it.

I also added **kwargs to the signature. IMO, we should keep that there as a safety net: if vLLM adds more platform-specific args to the selector call later, it won't immediately break us.
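A minimal sketch of the pattern described above (the class name, field names, and backend paths here are illustrative, not the actual vLLM or platform API): a platform hook that accepts the selector config object plus **kwargs, so extra keyword arguments added by the caller later are absorbed instead of raising TypeError.

```python
from dataclasses import dataclass


@dataclass
class AttentionSelectorConfig:
    # Illustrative fields only; the real vLLM class defines its own set.
    head_size: int
    dtype: str
    use_mla: bool = False


class MyPlatform:
    @classmethod
    def get_attn_backend_cls(cls, selector_config: AttentionSelectorConfig, **kwargs):
        # Unknown future arguments land in kwargs instead of breaking the call.
        if selector_config.use_mla:
            return "my_platform.attention.MlaBackend"
        return "my_platform.attention.FlashBackend"


# The override keeps working even if the caller later passes new kwargs.
backend = MyPlatform.get_attn_backend_cls(
    AttentionSelectorConfig(head_size=128, dtype="bfloat16"),
    some_future_flag=True,  # hypothetical arg a future vLLM version might add
)
```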

Tests

Ran a model locally:

vllm serve openai/gpt-oss-120b --max-model-len=9216 --max-num-batched-tokens=1024 --max-num-seqs=128 --no-enable-prefix-caching --gpu-memory-utilization=0.98 --tensor-parallel-size=4 --kv-cache-dtype=fp8 --async-scheduling

Checklist

Before submitting this PR, please make sure:

  • I have performed a self-review of my code.
  • I have necessary comments in my code, particularly in hard-to-understand areas.
  • I have made or will make corresponding changes to any relevant documentation.

@github-actions

Description

Start with a short description of what the PR does and how this is a change from
the past.

The rest of the description includes relevant details and context, examples:

  • why is this change being made,
  • the problem being solved and any relevant context,
  • why this is a good solution,
  • some information about the specific implementation,
  • shortcomings of the solution and possible future improvements.

If the change fixes a Github issue, please include a link, e.g.,:
FIXES: #123456

Tests

Please describe how you tested this change, and include any instructions and/or
commands to reproduce.

Checklist

Before submitting this PR, please make sure:

  • I have performed a self-review of my code.
  • I have necessary comments in my code, particularly in hard-to-understand areas.
  • I have made or will make corresponding changes to any relevant documentation.

Signed-off-by: karan <karangoel@google.com>
@QiliangCui QiliangCui added the ready ONLY add when PR is ready to merge/full CI is needed label Dec 15, 2025
Signed-off-by: karan <karangoel@google.com>
@karan karan merged commit ddf8b6c into main Dec 15, 2025
40 checks passed
@wdhongtw wdhongtw deleted the attnselector branch April 7, 2026 09:53

Labels

ready ONLY add when PR is ready to merge/full CI is needed

Projects

None yet

Development


2 participants