
Use AttentionSelectorConfig in get_attn_backend_cls#1313

Merged
karan merged 2 commits into main from attnselector
Dec 15, 2025
Conversation

@karan
Collaborator

@karan karan commented Dec 15, 2025

Description

vllm-project/vllm#30212 refactored the individual attention parameters into AttentionSelectorConfig. That broke our platform implementation; this PR fixes it.

I also added **kwargs to the signature. IMO, we should keep that there as a safety net: if vLLM adds more platform-specific args to the selector call later, it won't immediately break us.
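A minimal sketch of the pattern described above (the class name, field names, and backend paths here are illustrative, not the actual vLLM or platform API): a platform hook that accepts the selector config object plus **kwargs, so extra keyword arguments added by the caller later are absorbed instead of raising TypeError.

```python
from dataclasses import dataclass


@dataclass
class AttentionSelectorConfig:
    # Illustrative fields only; the real vLLM class defines its own set.
    head_size: int
    dtype: str
    use_mla: bool = False


class MyPlatform:
    @classmethod
    def get_attn_backend_cls(cls, selector_config: AttentionSelectorConfig, **kwargs):
        # Unknown future arguments land in kwargs instead of breaking the call.
        if selector_config.use_mla:
            return "my_platform.attention.MlaBackend"
        return "my_platform.attention.FlashBackend"


# The override keeps working even if the caller later passes new kwargs.
backend = MyPlatform.get_attn_backend_cls(
    AttentionSelectorConfig(head_size=128, dtype="bfloat16"),
    some_future_flag=True,  # hypothetical arg a future vLLM version might add
)
```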

Tests

Ran a model locally:

vllm serve openai/gpt-oss-120b --max-model-len=9216 --max-num-batched-tokens=1024 --max-num-seqs=128 --no-enable-prefix-caching --gpu-memory-utilization=0.98 --tensor-parallel-size=4 --kv-cache-dtype=fp8 --async-scheduling

Checklist

Before submitting this PR, please make sure:

  • I have performed a self-review of my code.
  • I have necessary comments in my code, particularly in hard-to-understand areas.
  • I have made or will make corresponding changes to any relevant documentation.

@github-actions

Description

Start with a short description of what the PR does and how this is a change from
the past.

The rest of the description includes relevant details and context, examples:

  • why is this change being made,
  • the problem being solved and any relevant context,
  • why this is a good solution,
  • some information about the specific implementation,
  • shortcomings of the solution and possible future improvements.

If the change fixes a Github issue, please include a link, e.g.,:
FIXES: #123456

Tests

Please describe how you tested this change, and include any instructions and/or
commands to reproduce.

Checklist

Before submitting this PR, please make sure:

  • I have performed a self-review of my code.
  • I have necessary comments in my code, particularly in hard-to-understand areas.
  • I have made or will make corresponding changes to any relevant documentation.

Signed-off-by: karan <karangoel@google.com>
@QiliangCui QiliangCui added the ready ONLY add when PR is ready to merge/full CI is needed label Dec 15, 2025
Signed-off-by: karan <karangoel@google.com>
@karan karan merged commit ddf8b6c into main Dec 15, 2025
40 checks passed
@wdhongtw wdhongtw deleted the attnselector branch April 7, 2026 09:53

Labels

ready ONLY add when PR is ready to merge/full CI is needed

Projects

None yet

Development


2 participants