[BugFix] Mistakenly passing `num_reqs_padded` as `num_reqs` in `_dummy_run` (#34121)
Selkh wants to merge 0 commits into vllm-project:main
Conversation
Code Review
This pull request fixes a bug in _dummy_run where num_reqs_padded was incorrectly passed as the num_reqs argument to _build_attention_metadata. While this change is correct, the call is still incomplete as it's missing the num_tokens_padded and num_reqs_padded arguments, which can lead to incorrect behavior when CUDA graph padding is enabled. I've suggested a more complete fix to ensure that padded values are used correctly when pad_attn is true.
This pull request has merge conflicts that must be resolved before it can be merged.
@LucasWilkinson Still not fixed after #34187 when SP is enabled and the attention backend requires a uniform batch.
LucasWilkinson left a comment
Thanks for the contribution! Let's just make this fully match `execute_model`, i.e.
num_reqs=num_reqs,
num_reqs_padded=num_reqs_padded if pad_attn else None,
"dummy_run" may need unpadded num_reqs and num_tokens, rather than matching execute_model. Consider the following two scenarios:
Can you please provide a reproducer?

Maybe similar to #35243
Purpose
In "_dummy_run", "num_tokens_padded" was mistakenly passed as num_tokens, leading to an Assertion Error when "split_decodes_and_prefills" for an attention backend "requires uniform" when "enable_sp" is ON.
Test Plan
Test Result