[Bugfix] Fix error with penalties when speculative decoding and structural output are enabled by southfreebird · Pull Request #26586 · vllm-project/vllm

southfreebird · 2025-10-10T13:18:21Z

Fixes an error that has appeared since #19482 when logit processors (such as penalties) are enabled together with speculative decoding and structural output. Example of the error:

File "/vllm/model_executor/layers/utils.py", line 45, in get_token_bin_counts_and_mask
     bin_counts.scatter_add_(1, tokens, torch.ones_like(tokens))
RuntimeError: Expected index [24, 4162] to be no larger than self [21, 201089] apart from dimension 1 and to be no larger size than src [24, 4162]
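
The shapes in the message can be checked by hand. `scatter_add_` along `dim` requires the index tensor to be no larger than `self` in every other dimension, and no larger than `src` in every dimension (the constraint documented for `torch.Tensor.scatter_add_`). The sketch below restates that rule in plain Python. `scatter_add_shape_violations` is a hypothetical helper written for illustration only; it is not part of PyTorch or vLLM.

```python
def scatter_add_shape_violations(self_shape, index_shape, src_shape, dim):
    """Return the violated shape constraints (empty list if shapes are valid).

    Hypothetical helper: mirrors the documented scatter_add_ shape rule,
    not PyTorch's actual implementation.
    """
    problems = []
    sizes = zip(self_shape, index_shape, src_shape)
    for d, (n_self, n_index, n_src) in enumerate(sizes):
        # index must fit inside src along every dimension
        if n_index > n_src:
            problems.append(f"dim {d}: index {n_index} > src {n_src}")
        # index must fit inside self along every dimension except `dim`
        if d != dim and n_index > n_self:
            problems.append(f"dim {d}: index {n_index} > self {n_self}")
    return problems


# Shapes copied from the error message above.
violations = scatter_add_shape_violations(
    self_shape=(21, 201089),
    index_shape=(24, 4162),
    src_shape=(24, 4162),
    dim=1,
)
print(violations)
```

Only dimension 0 fails the check: the token tensor has 24 rows, while the bin-count tensor has only 21.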

Purpose

Test Plan

Test Result

gemini-code-assist

Code Review

This pull request addresses a critical bug that causes a RuntimeError when speculative decoding and structured output are used together with logit processors. The root cause is that stale speculative token data could persist in InputBatch if the scheduler drops all draft tokens for a request, leading to out-of-bounds errors in subsequent penalty calculations. The fix correctly ensures that InputBatch.spec_token_ids is always updated, even with an empty list of tokens, thus preventing state corruption. The change is logical, well-commented, and effectively resolves the issue. The implementation looks correct.
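
The stale-state failure described in the review can be reproduced with a toy model. This is a minimal sketch, not vLLM's code: `ToyInputBatch`, `update_spec_tokens`, and the `always_update` flag are hypothetical names, and the scheduler output is assumed to be a plain dict from request ID to draft tokens that omits requests whose drafts were all dropped.

```python
from dataclasses import dataclass, field


@dataclass
class ToyInputBatch:
    """Toy stand-in for persistent per-request state (not vLLM's InputBatch)."""

    req_ids: list
    spec_token_ids: list = field(default_factory=list)

    def __post_init__(self):
        # One (initially empty) draft-token list per request.
        if not self.spec_token_ids:
            self.spec_token_ids = [[] for _ in self.req_ids]


def update_spec_tokens(batch, scheduled_spec_tokens, always_update):
    """Copy this step's draft tokens into the persistent batch.

    always_update=False models the buggy behaviour: requests with no draft
    tokens this step are skipped, so their old tokens linger.
    always_update=True models the fix: every request is overwritten, with
    an empty list when the scheduler dropped all of its drafts.
    """
    for i, req_id in enumerate(batch.req_ids):
        tokens = scheduled_spec_tokens.get(req_id, [])
        if tokens or always_update:
            batch.spec_token_ids[i] = list(tokens)


def run(always_update):
    batch = ToyInputBatch(req_ids=["a", "b"])
    # Step 1: both requests have draft tokens.
    update_spec_tokens(batch, {"a": [5, 6], "b": [7]}, always_update)
    # Step 2: all drafts for "a" were dropped, so "a" is absent.
    update_spec_tokens(batch, {"b": [8]}, always_update)
    return batch.spec_token_ids


stale = run(always_update=False)
fresh = run(always_update=True)
print(stale)
print(fresh)
```

With the skip-if-empty update, request "a" still carries its step-1 draft tokens after step 2, so any later computation that counts those tokens sees more data than the current step actually scheduled. Updating unconditionally leaves "a" with an empty list.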

benchislett · 2025-10-10T13:27:38Z

vllm/v1/worker/gpu_model_runner.py

+            # meet the structural schema. This means that
+            # scheduler_output.scheduled_spec_decode_tokens might be empty,
+            # even when speculative decoding is enabled. So, we moved this line
+            # from the 'if' block above.


Please rephrase the comment so that it explains the state of the code and not the change to the code. Comments about moved lines can become less meaningful over time with refactoring.

benchislett

LGTM, Thanks!

njhill · 2025-10-18T22:13:58Z

@southfreebird could you rebase on latest main?

Fix error with penalties when speculative decoding and structural out…

e773ad5

…put are enabled Signed-off-by: southfreebird <yvorott@gmail.com>

southfreebird requested review from WoosukKwon, alexm-redhat, comaniac, njhill, robertgshaw2-redhat and ywang96 as code owners October 10, 2025 13:18

mergify bot added the v1 label Oct 10, 2025

gemini-code-assist bot reviewed Oct 10, 2025

benchislett reviewed Oct 10, 2025

vllm/v1/worker/gpu_model_runner.py

Fix comments + pre-commit

e4cc330

Signed-off-by: southfreebird <yvorott@gmail.com>

benchislett reviewed Oct 16, 2025

vllm/v1/worker/gpu_model_runner.py

benchislett approved these changes Oct 16, 2025

njhill added the ready label Oct 18, 2025

njhill approved these changes Oct 18, 2025

njhill enabled auto-merge (squash) October 18, 2025 18:16

Merge branch 'vllm-project:main' into fix/logit-processors-with-spec-…

1c3d127

…dec-and-structural-output

njhill merged commit f6fdacd into vllm-project:main Oct 19, 2025
46 checks passed

lywa1998 pushed a commit to lywa1998/vllm that referenced this pull request Oct 20, 2025

[Bugfix] Fix error with penalties when speculative decoding and struc…

5f388e9

…tural output are enabled (vllm-project#26586) Signed-off-by: southfreebird <yvorott@gmail.com>

adabeyta pushed a commit to adabeyta/vllm that referenced this pull request Oct 20, 2025

[Bugfix] Fix error with penalties when speculative decoding and struc…

447e3d1

…tural output are enabled (vllm-project#26586) Signed-off-by: southfreebird <yvorott@gmail.com>

ilmarkov pushed a commit to neuralmagic/vllm that referenced this pull request Nov 7, 2025

[Bugfix] Fix error with penalties when speculative decoding and struc…

ad28c6e

…tural output are enabled (vllm-project#26586) Signed-off-by: southfreebird <yvorott@gmail.com>

rtourgeman pushed a commit to rtourgeman/vllm that referenced this pull request Nov 10, 2025

[Bugfix] Fix error with penalties when speculative decoding and struc…

2761fe7

…tural output are enabled (vllm-project#26586) Signed-off-by: southfreebird <yvorott@gmail.com>

devpatelio pushed a commit to SumanthRH/vllm that referenced this pull request Nov 29, 2025

[Bugfix] Fix error with penalties when speculative decoding and struc…

9067136

…tural output are enabled (vllm-project#26586) Signed-off-by: southfreebird <yvorott@gmail.com>

njhill merged 3 commits into vllm-project:main from southfreebird:fix/logit-processors-with-spec-dec-and-structural-output
