
[bugfix] fix bug when top_logprobs=0 with spec decoding#30059

Merged
njhill merged 6 commits into vllm-project:main from realliujiaxu:fix-top-logprobs-0
Dec 12, 2025

Conversation

@realliujiaxu
Contributor

@realliujiaxu realliujiaxu commented Dec 4, 2025

Purpose

#26060 adds support for returning logprobs for v1 spec decoding. However, if top_logprobs is unset or set to 0, the server returns an error:

{"error":{"message":"list index out of range","type":"Internal Server Error","param":null,"code":500}}

The root cause is that the rejection sampler returns logprobs as None in this case. The fix follows the standard sampler's approach: even when top_logprobs=0, the sampled token's logprob should still be collected.
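The fix changes a truthiness check on sampling_metadata.max_num_logprobs to an explicit is not None check. A minimal, self-contained sketch of why that distinction matters (the class and function names here are illustrative stand-ins, not vLLM's actual code):

```python
# Minimal sketch of the bug and the fix; SamplingMetadata and the helper
# names below are illustrative, not vLLM's exact code.


class SamplingMetadata:
    def __init__(self, max_num_logprobs):
        # None => logprobs not requested at all;
        # 0 => logprobs requested, with 0 top alternatives per token.
        self.max_num_logprobs = max_num_logprobs


def needs_logprobs_buggy(meta: SamplingMetadata) -> bool:
    # Truthiness check: wrongly treats max_num_logprobs == 0 like None,
    # so no logprobs are gathered and downstream code fails with
    # "list index out of range".
    return bool(meta.max_num_logprobs)


def needs_logprobs_fixed(meta: SamplingMetadata) -> bool:
    # Explicit None check: top_logprobs=0 still collects the sampled
    # token's logprob, matching the standard sampler's behavior.
    return meta.max_num_logprobs is not None


print(needs_logprobs_buggy(SamplingMetadata(0)))  # False: logprobs skipped (the bug)
print(needs_logprobs_fixed(SamplingMetadata(0)))  # True: logprobs still collected
```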

Test Plan

Run server

vllm serve Qwen/Qwen3-8B \
  --speculative-config '{
    "model": "RedHatAI/Qwen3-8B-speculator.eagle3",
    "num_speculative_tokens": 3,
    "method": "eagle3"
  }'

Test with top_logprobs=0

curl --location 'http://127.0.0.1:8000/v1/chat/completions' --header 'Content-Type: application/json' --data '{
    "temperature": 0,
    "max_tokens": 10,
    "messages": [
        {
        "role": "user",
        "content": "who are you"
        }
    ],
    "logprobs": true,
    "top_logprobs": 0
    }'

Test Result

{"id":"chatcmpl-961ba782ebf17147","object":"chat.completion","created":1764852213,"model":"Qwen/Qwen3-8B","choices":[{"index":0,"message":{"role":"assistant","content":"<think>\nOkay, the user asked \"who are","refusal":null,"annotations":null,"audio":null,"function_call":null,"tool_calls":[],"reasoning":null,"reasoning_content":null},"logprobs":{"content":[{"token":"<think>","logprob":-2.3841855067985307e-07,"bytes":[60,116,104,105,110,107,62],"top_logprobs":[]},{"token":"\n","logprob":-1.0132738680113107e-05,"bytes":[10],"top_logprobs":[]},{"token":"Okay","logprob":-0.0039456626400351524,"bytes":[79,107,97,121],"top_logprobs":[]},{"token":",","logprob":-6.270212179515511e-05,"bytes":[44],"top_logprobs":[]},{"token":" the","logprob":-0.017913110554218292,"bytes":[32,116,104,101],"top_logprobs":[]},{"token":" user","logprob":-1.645074735279195e-05,"bytes":[32,117,115,101,114],"top_logprobs":[]},{"token":" asked","logprob":-0.19733361899852753,"bytes":[32,97,115,107,101,100],"top_logprobs":[]},{"token":" \"","logprob":-0.578165590763092,"bytes":[32,34],"top_logprobs":[]},{"token":"who","logprob":-0.0009124883217737079,"bytes":[119,104,111],"top_logprobs":[]},{"token":" are","logprob":-4.768360213347478e-06,"bytes":[32,97,114,101],"top_logprobs":[]}]},"finish_reason":"length","stop_reason":null,"token_ids":null}],"service_tier":null,"system_fingerprint":null,"usage":{"prompt_tokens":11,"total_tokens":21,"completion_tokens":10,"prompt_tokens_details":null},"prompt_logprobs":null,"prompt_token_ids":null,"kv_transfer_params":null}

Essential Elements of an Effective PR Description Checklist
  • The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
  • The test plan, such as providing test command.
  • The test results, such as pasting the results comparison before and after, or e2e results
  • (Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.
  • (Optional) Release notes update. If your change is user facing, please update the release notes draft in the Google Doc.

Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request effectively resolves a bug in speculative decoding where setting top_logprobs=0 would cause a server error. The fix, which changes the condition for checking sampling_metadata.max_num_logprobs from a truthiness check to an explicit is not None check, is correct and idiomatic. This ensures that logprobs are computed when top_logprobs is 0, aligning the rejection sampler's behavior with the standard sampler and preventing the reported server error. The change is minimal and precisely targets the bug.

Signed-off-by: realliujiaxu <realliujiaxu@163.com>
@njhill
Member

njhill commented Dec 4, 2025

Thank you @realliujiaxu! Would you be willing to add a simple test for this?

Signed-off-by: realliujiaxu <realliujiaxu@163.com>
@realliujiaxu
Contributor Author

realliujiaxu commented Dec 5, 2025

Thank you @realliujiaxu! Would you be willing to add a simple test for this?

Done. I've added a simple test, and tested it locally. Thanks! @njhill

Before the fix, the newly added unit test failed:

======================================================== short test summary info ========================================================
FAILED tests/v1/sample/test_logprobs.py::test_spec_decode_logprobs[model_setup0-raw_logits] - AssertionError: assert 10 == 0
FAILED tests/v1/sample/test_logprobs.py::test_spec_decode_logprobs[model_setup0-raw_logprobs] - AssertionError: assert 10 == 0
FAILED tests/v1/sample/test_logprobs.py::test_spec_decode_logprobs[model_setup0-processed_logits] - AssertionError: assert 10 == 0
FAILED tests/v1/sample/test_logprobs.py::test_spec_decode_logprobs[model_setup0-processed_logprobs] - AssertionError: assert 10 == 0

Signed-off-by: realliujiaxu <realliujiaxu@163.com>
@realliujiaxu realliujiaxu requested a review from njhill December 9, 2025 07:17
@realliujiaxu
Contributor Author

@njhill Can we merge this PR?

@njhill njhill added the ready ONLY add when PR is ready to merge/full CI is needed label Dec 9, 2025
@njhill njhill enabled auto-merge (squash) December 9, 2025 17:22
@njhill
Member

njhill commented Dec 10, 2025

Signed-off-by: realliujiaxu <realliujiaxu@163.com>
auto-merge was automatically disabled December 10, 2025 03:39

Head branch was pushed to by a user without write access

@realliujiaxu
Contributor Author

@njhill the test failure is fixed, please review again.

@njhill
Member

njhill commented Dec 10, 2025

Thanks @realliujiaxu ... re the test fix, it looks like you changed max_num_logprobs from 0 to None in the test itself. But it seems that having it set to 0 caused this error in the rejection sampler which seems wrong?

[2025-12-09T18:17:38Z]                 logits,
[2025-12-09T18:17:38Z]                 target_logits if self.is_processed_logprobs_mode else raw_target_logits,
[2025-12-09T18:17:38Z] >               bonus_sampler_output.logprobs_tensors.logprobs,
[2025-12-09T18:17:38Z]                 output_token_ids,
[2025-12-09T18:17:38Z]             )
[2025-12-09T18:17:38Z] E           AttributeError: 'NoneType' object has no attribute 'logprobs'

Member

@njhill njhill left a comment


Just blocking until answer to prior question is understood!

@realliujiaxu
Contributor Author

Thanks @realliujiaxu ... re the test fix, it looks like you changed max_num_logprobs from 0 to None in the test itself. But it seems that having it set to 0 caused this error in the rejection sampler which seems wrong?

[2025-12-09T18:17:38Z]                 logits,
[2025-12-09T18:17:38Z]                 target_logits if self.is_processed_logprobs_mode else raw_target_logits,
[2025-12-09T18:17:38Z] >               bonus_sampler_output.logprobs_tensors.logprobs,
[2025-12-09T18:17:38Z]                 output_token_ids,
[2025-12-09T18:17:38Z]             )
[2025-12-09T18:17:38Z] E           AttributeError: 'NoneType' object has no attribute 'logprobs'

(If I understand correctly, you are wondering why I changed the test, and whether having max_num_logprobs set to 0 would still cause an error in the rejection sampler.)

bonus_sampler_output.logprobs_tensors is None because mock_sampler_output stubs rejection_sampler.sampler's return value, with logprobs_tensors hardcoded to None:

def mock_sampler_output(
    rejection_sampler: RejectionSampler, bonus_token_ids: torch.Tensor
):
    # Stub the underlying sampler: always return the bonus tokens,
    # with logprobs_tensors hardcoded to None.
    rejection_sampler.sampler.return_value = SamplerOutput(
        sampled_token_ids=bonus_token_ids, logprobs_tensors=None
    )

If the actual sampler were executed, logprobs_tensors would not be None. Therefore, the rejection sampler code itself is not at fault; the test case should be modified instead. To make mock_sampler_output consistent with the actual execution result, there are two ways to modify the test:

  1. (The approach I’m currently taking) Set max_num_logprobs to None – the actual sampler execution will also produce logprobs_tensors = None.
  2. Modify mock_sampler_output by preparing a fake value for logprobs_tensors, following the same logic as the sampler.

Perhaps option 2 is more reasonable? Looking forward to your further advice. @njhill
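For reference, option 2 could look something like the following sketch. FakeLogprobsTensors and FakeSamplerOutput are hypothetical stand-ins for vLLM's real classes, and plain lists replace torch tensors for brevity:

```python
# Hypothetical sketch of option 2: give the mocked sampler a fabricated
# logprobs_tensors instead of None. The Fake* classes are stand-ins,
# not vLLM's actual types.
from dataclasses import dataclass
from typing import Optional
from unittest.mock import MagicMock


@dataclass
class FakeLogprobsTensors:
    # Illustrative stand-in for the sampler's logprobs container.
    logprobs: list
    logprob_token_ids: list


@dataclass
class FakeSamplerOutput:
    # Illustrative stand-in for the sampler output.
    sampled_token_ids: list
    logprobs_tensors: Optional[FakeLogprobsTensors]


def mock_sampler_output_with_logprobs(rejection_sampler, bonus_token_ids):
    # Instead of hardcoding logprobs_tensors=None, fabricate a value shaped
    # like what the real sampler would produce, so test code that reads
    # bonus_sampler_output.logprobs_tensors.logprobs no longer hits
    # AttributeError when max_num_logprobs == 0.
    fake = FakeLogprobsTensors(
        logprobs=[0.0 for _ in bonus_token_ids],
        logprob_token_ids=list(bonus_token_ids),
    )
    rejection_sampler.sampler.return_value = FakeSamplerOutput(
        sampled_token_ids=bonus_token_ids, logprobs_tensors=fake
    )


rejection_sampler = MagicMock()
mock_sampler_output_with_logprobs(rejection_sampler, [11, 22, 33])
out = rejection_sampler.sampler.return_value
print(out.logprobs_tensors.logprobs)  # [0.0, 0.0, 0.0], not None
```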

@njhill
Member

njhill commented Dec 11, 2025

Thanks @realliujiaxu, makes sense, I'm happy with whichever you think is best.

@realliujiaxu
Contributor Author

@njhill I think the current approach is simple and clear enough. Can we merge this PR?

@njhill njhill merged commit d2c919d into vllm-project:main Dec 12, 2025
44 checks passed
Lucaskabela pushed a commit to Lucaskabela/vllm that referenced this pull request Dec 15, 2025
Majid-Taheri pushed a commit to Majid-Taheri/vllm that referenced this pull request Dec 23, 2025
…#30059)

Signed-off-by: realliujiaxu <realliujiaxu@163.com>
Signed-off-by: Ubuntu <mjtaheri68@gmail.com>
dsuhinin pushed a commit to dsuhinin/vllm that referenced this pull request Jan 21, 2026
…#30059)

Signed-off-by: realliujiaxu <realliujiaxu@163.com>
Signed-off-by: dsuhinin <suhinin.dmitriy@gmail.com>

Labels

ready ONLY add when PR is ready to merge/full CI is needed v1
