[BugFix] Fix returned logprobs with spec decode + prefill chunking#29216

Merged
DarkLight1337 merged 1 commit into vllm-project:main from njhill:fix-sd-logprobs-cpf
Nov 22, 2025
Conversation

@njhill njhill (Member) commented Nov 22, 2025

The returned logprobs lists were incorrect in this case because sampled-token "discards" were wrongly taken into account when computing offsets into the logprobs tensors.

Also:

  • Fix error with bf16 models and "raw_logits" mode
  • Use smaller model for test

Original PR which had this mistake: #26060

cc @TheEpicDolphin
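To illustrate the offset bug described above (this is a simplified sketch with hypothetical names, not vLLM's actual code): with chunked prefill, a request that is still mid-prefill has its sampled token discarded, but under the assumption that the logprobs tensor still contains rows for every request, offsets into that tensor must be accumulated from the tensor layout alone. Skipping discarded requests while accumulating shifts every later request's slice:

```python
# Each request contributes num_sampled[i] rows to the logprobs tensor
# (spec decode can sample several tokens per request per step).
# discard[i] marks a request whose sampled token is thrown away
# because its prefill is still being chunked.

def offsets_buggy(num_sampled, discard):
    # Bug: discarded requests are skipped when accumulating the offset,
    # even though their rows are still present in the tensor.
    out, off = [], 0
    for n, d in zip(num_sampled, discard):
        out.append(off)
        if not d:
            off += n
    return out

def offsets_fixed(num_sampled, discard):
    # Fix: compute offsets purely from the tensor layout, before any
    # discards are applied.
    out, off = [], 0
    for n in num_sampled:
        out.append(off)
        off += n
    return out

num_sampled = [3, 1, 2]          # rows per request in the logprobs tensor
discard = [False, True, False]   # request 1 is mid-prefill

print(offsets_buggy(num_sampled, discard))  # [0, 3, 3]
print(offsets_fixed(num_sampled, discard))  # [0, 3, 4]
```

In this toy model the buggy version starts request 2's slice at row 3 instead of row 4, so it reads request 1's logprobs, which is the kind of misalignment the PR description refers to.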

Signed-off-by: Nick Hill <nhill@redhat.com>
@mergify mergify bot added the v1 label Nov 22, 2025
@gemini-code-assist gemini-code-assist bot (Contributor) left a comment
Code Review

This pull request addresses a bug where logprobs were incorrectly calculated during speculative decoding with prefill chunking. The main fix in vllm/v1/worker/gpu_model_runner.py correctly computes token offsets for logprobs before any tokens are discarded, which resolves the issue. Additionally, the PR includes a fix for handling bfloat16 models in raw_logits mode and updates the relevant test to use a smaller model and reliably trigger the chunking behavior. The changes are well-implemented and the logic is sound. I approve of this pull request.

@njhill njhill added the ready ONLY add when PR is ready to merge/full CI is needed label Nov 22, 2025
@DarkLight1337 DarkLight1337 (Member) left a comment
Thanks for fixing!

@DarkLight1337 DarkLight1337 merged commit d44a63c into vllm-project:main Nov 22, 2025
51 checks passed
@njhill njhill deleted the fix-sd-logprobs-cpf branch November 22, 2025 18:34
@rasmith rasmith (Contributor) commented Nov 22, 2025

Yes, this works, thanks for fixing @njhill !

ywang96 pushed a commit to ywang96/vllm that referenced this pull request Nov 23, 2025
lpapavassiliou pushed a commit to lpapavassiliou/vllm that referenced this pull request Nov 24, 2025
RunkaiTao pushed a commit to RunkaiTao/vllm that referenced this pull request Nov 24, 2025
…llm-project#29216)

Signed-off-by: Nick Hill <nhill@redhat.com>
Signed-off-by: Runkai Tao <rt572@physics.rutgers.edu>
devpatelio pushed a commit to SumanthRH/vllm that referenced this pull request Nov 29, 2025
kitaekatt pushed a commit to kitaekatt/vllm that referenced this pull request Dec 1, 2025

Labels

ready ONLY add when PR is ready to merge/full CI is needed v1

3 participants