Skip to content

[Fix] Remove divisibility requirement between num_kv_heads and tp_size in bailing_moe#26876

Merged
youkaichao merged 2 commits intovllm-project:mainfrom
ant-yy:ling_v2_fix
Oct 15, 2025
Merged

[Fix] Remove divisibility requirement between num_kv_heads and tp_size in bailing_moe#26876
youkaichao merged 2 commits intovllm-project:mainfrom
ant-yy:ling_v2_fix

Conversation

@ant-yy
Copy link
Copy Markdown
Contributor

@ant-yy ant-yy commented Oct 15, 2025

Purpose

The previous implementation in bailing_moe.py required that num_kv_heads be divisible by tp_size, which restricted the valid options for tp_size. This PR addresses and fixes that limitation.

Signed-off-by: vito.yy <vito.yy@antgroup.com>
Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request aims to remove the divisibility requirement between num_kv_heads and tp_size in bailing_moe.py. While the change correctly removes the assertion and adjusts the calculation of num_kv_heads in BailingAttention, it introduces an inconsistency with the QKVParallelLinear layer. This layer is used by BailingAttention and still enforces a divisibility constraint, which will likely lead to an assertion failure at runtime if total_kv_heads is not divisible by tp_size (or vice-versa, depending on which is larger). This makes the fix incomplete and could cause crashes.

@youkaichao youkaichao enabled auto-merge (squash) October 15, 2025 06:39
@github-actions github-actions bot added the ready ONLY add when PR is ready to merge/full CI is needed label Oct 15, 2025
@youkaichao youkaichao disabled auto-merge October 15, 2025 08:43
@youkaichao youkaichao merged commit 5c3bae1 into vllm-project:main Oct 15, 2025
49 of 50 checks passed
bbartels pushed a commit to bbartels/vllm that referenced this pull request Oct 16, 2025
…e in bailing_moe (vllm-project#26876)

Signed-off-by: vito.yy <vito.yy@antgroup.com>
Signed-off-by: bbartels <benjamin@bartels.dev>
albertoperdomo2 pushed a commit to albertoperdomo2/vllm that referenced this pull request Oct 16, 2025
…e in bailing_moe (vllm-project#26876)

Signed-off-by: vito.yy <vito.yy@antgroup.com>
Signed-off-by: Alberto Perdomo <aperdomo@redhat.com>
lywa1998 pushed a commit to lywa1998/vllm that referenced this pull request Oct 20, 2025
…e in bailing_moe (vllm-project#26876)

Signed-off-by: vito.yy <vito.yy@antgroup.com>
alhridoy pushed a commit to alhridoy/vllm that referenced this pull request Oct 24, 2025
…e in bailing_moe (vllm-project#26876)

Signed-off-by: vito.yy <vito.yy@antgroup.com>
0xrushi pushed a commit to 0xrushi/vllm that referenced this pull request Oct 26, 2025
…e in bailing_moe (vllm-project#26876)

Signed-off-by: vito.yy <vito.yy@antgroup.com>
Signed-off-by: 0xrushi <6279035+0xrushi@users.noreply.github.com>
0xrushi pushed a commit to 0xrushi/vllm that referenced this pull request Oct 26, 2025
…e in bailing_moe (vllm-project#26876)

Signed-off-by: vito.yy <vito.yy@antgroup.com>
Signed-off-by: 0xrushi <6279035+0xrushi@users.noreply.github.com>
rtourgeman pushed a commit to rtourgeman/vllm that referenced this pull request Nov 10, 2025
…e in bailing_moe (vllm-project#26876)

Signed-off-by: vito.yy <vito.yy@antgroup.com>
devpatelio pushed a commit to SumanthRH/vllm that referenced this pull request Nov 29, 2025
…e in bailing_moe (vllm-project#26876)

Signed-off-by: vito.yy <vito.yy@antgroup.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ready ONLY add when PR is ready to merge/full CI is needed

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants