[bugfix] Fix prompt logprobs on request eviction during chunked prefill by joa-stdn · Pull Request #41411 · vllm-project/vllm

joa-stdn · 2026-04-30T20:39:26Z

Summary

Fix prompt logprobs computation for chunked prefill: the computed_prefill < prompt_lens - 1 check incorrectly skipped the last prompt token, so prompt logprobs were not computed in cases where they should have been. The check is now computed_prefill < prompt_lens.
Move in_progress_prompt_logprobs_cpu state from InputBatch dict to CachedRequestState, ensuring prompt logprobs accumulation is tied to the request lifecycle rather than the batch.
Include prompt_logprobs in VllmRunner test helper output and add None handling to _logprobs_match for prompt logprob entries.
Add prompt_logprobs=2 test cases to async scheduling e2e tests.
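
For readers skimming the summary, here is a toy sketch of the two ideas above. It is not the vLLM code: ToyRequestState, ToyBatch, old_check and new_check are hypothetical names invented for illustration, and only the two comparisons are taken from the summary. The first part shows where the comparisons disagree; the second shows how accumulation owned by a batch is dropped when a request leaves the batch, while accumulation owned by the request survives.

```python
from dataclasses import dataclass, field


def old_check(computed_prefill: int, prompt_len: int) -> bool:
    # Comparison quoted in the summary as the previous behaviour.
    return computed_prefill < prompt_len - 1


def new_check(computed_prefill: int, prompt_len: int) -> bool:
    # Comparison quoted in the summary as the fixed behaviour.
    return computed_prefill < prompt_len


@dataclass
class ToyRequestState:
    """Hypothetical stand-in for a per-request cache entry."""
    req_id: str
    prompt_len: int
    # Accumulated per-token values; this list lives as long as the request.
    in_progress: list[float] = field(default_factory=list)


class ToyBatch:
    """Hypothetical stand-in for a batch that requests join and leave."""

    def __init__(self) -> None:
        self.members: dict[str, ToyRequestState] = {}
        # Batch-owned accumulation keyed by request id (the "before" layout).
        self.batch_owned: dict[str, list[float]] = {}

    def add(self, req: ToyRequestState) -> None:
        self.members[req.req_id] = req

    def remove(self, req_id: str) -> None:
        # Leaving the batch drops everything the batch owns for the request.
        self.members.pop(req_id, None)
        self.batch_owned.pop(req_id, None)

    def run_chunk(self, req_id: str, values: list[float]) -> None:
        # Record one prefill chunk in both layouts so they can be compared.
        req = self.members[req_id]
        self.batch_owned.setdefault(req_id, []).extend(values)
        req.in_progress.extend(values)


# The two comparisons differ only when exactly one prompt token is left.
boundary = [(c, old_check(c, 5), new_check(c, 5)) for c in range(3, 6)]

req = ToyRequestState("r0", prompt_len=5)
batch = ToyBatch()
batch.add(req)
batch.run_chunk("r0", [-0.1, -0.2])  # first prefill chunk
batch.remove("r0")                   # request leaves the batch mid-prefill
batch.add(req)                       # and rejoins later
batch.run_chunk("r0", [-0.3, -0.4])  # second prefill chunk

batch_view = batch.batch_owned["r0"]  # only the second chunk is left
request_view = req.in_progress        # both chunks are still present
```

How a real runner resumes an evicted request (for example, from which token offset) is not stated in this thread, so the sketch only illustrates ownership of the accumulated state.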

Test Plan

Added dict(prompt_logprobs=2) and dict(prompt_logprobs=2, logprobs=2) to both test_without_spec_decoding and test_with_eagle3_spec_decoding
pytest tests/v1/e2e/general/test_async_scheduling.py -v

claude

Claude Code Review

This pull request is from a fork — automated review is disabled. A repository maintainer can comment @claude review to run a one-time review.

gemini-code-assist

Code Review

This pull request refactors the management of prompt logprobs by moving the in_progress_prompt_logprobs_cpu state from a dictionary within InputBatch to the CachedRequestState object. This change streamlines how logprob tensors are tracked across prefill steps and ensures proper cleanup during request removal or updates. I have no feedback to provide.

njhill

Thanks @joa-stdn, very clean fix!

Hopefully this same bug isn't in model runner v2, maybe you could check that too? (gpu/model_runner.py)

It would also be great to add or extend a test to cover this. You could look for example at https://github.com/vllm-project/vllm/blob/main/tests/v1/e2e/general/test_async_scheduling.py which forces preemptions.

njhill

Thanks again @joa-stdn!

njhill · 2026-05-01T18:18:21Z

The test changes are failing I think because we need to modify the test to also take prompt logprobs into account when comparing outputs.
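
As a rough illustration of what comparing outputs with prompt logprobs involves, here is a minimal sketch of a tolerance-based comparison that also accepts None entries. It is not the test suite's _logprobs_match: entries_match, sequences_match, the dict-of-floats layout and the tolerances are all assumptions made for this example. The assumed case is a position that carries no logprob (such as a first prompt token with nothing before it) and is therefore reported as None.

```python
import math
from typing import Optional

# Assumed layout: one entry per token position, mapping token id -> logprob,
# or None when the position has no logprob.
LogprobEntry = Optional[dict[int, float]]


def entries_match(a: LogprobEntry, b: LogprobEntry,
                  rel_tol: float = 1e-3, abs_tol: float = 1e-6) -> bool:
    # None only matches None; a None on one side alone is a mismatch.
    if a is None or b is None:
        return a is None and b is None
    # Both sides must report the same token ids.
    if a.keys() != b.keys():
        return False
    return all(
        math.isclose(a[k], b[k], rel_tol=rel_tol, abs_tol=abs_tol) for k in a
    )


def sequences_match(xs: Optional[list[LogprobEntry]],
                    ys: Optional[list[LogprobEntry]]) -> bool:
    # A whole sequence may be absent when logprobs were not requested.
    if xs is None or ys is None:
        return xs is None and ys is None
    return len(xs) == len(ys) and all(
        entries_match(a, b) for a, b in zip(xs, ys)
    )


reference = [None, {5: -0.10, 7: -2.30}]
same = sequences_match(reference, [None, {5: -0.1000001, 7: -2.30}])
different = sequences_match(reference, [{5: -0.10}, {5: -0.10, 7: -2.30}])
```

Treating a one-sided None as a mismatch, rather than skipping it, keeps the comparison from silently passing when one run produced no prompt logprobs at all.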

joa-stdn · 2026-05-01T19:46:13Z

Thanks a lot for your review!

> Hopefully this same bug isn't in model runner v2

I just checked my repro in model runner v2 and everything is fine there!

> The test changes are failing I think because we need to modify the test to also take prompt logprobs into account when comparing outputs.

Yeah thanks I'll look into it!

mergify · 2026-05-02T01:56:05Z

Hi @joa-stdn, the pre-commit checks have failed. Please run:

uv pip install "pre-commit>=4.5.1"
pre-commit install
pre-commit run --all-files

Then, commit the changes and push to your branch.

For future commits, pre-commit will run automatically on changed files before each commit.

Tip

Is mypy failing?

mypy is run differently in CI. If the failure is related to this check, please use the following command to run it locally:

# For mypy (substitute "3.10" with the failing version if needed)
pre-commit run --hook-stage manual mypy-3.10

mergify · 2026-05-02T02:00:18Z

Documentation preview: https://vllm--41411.org.readthedocs.build/en/41411/

mergify · 2026-05-04T00:07:38Z

Documentation preview: https://vllm--41411.org.readthedocs.build/en/41411/

njhill

Thanks again @joa-stdn, especially for also exposing and fixing the MRV2 bug!

joa-stdn requested a review from njhill as a code owner April 30, 2026 20:39

claude Bot reviewed Apr 30, 2026

mergify Bot added the v1 and bug (Something isn't working) labels Apr 30, 2026

joa-stdn force-pushed the joachim/fix_prompt_logprobs branch from 8d24122 to 6bd85f8 Compare April 30, 2026 20:40

gemini-code-assist Bot reviewed Apr 30, 2026

joa-stdn force-pushed the joachim/fix_prompt_logprobs branch from 2fba291 to f445e2d Compare April 30, 2026 22:09

njhill reviewed Apr 30, 2026

Comment thread vllm/v1/worker/gpu_input_batch.py Outdated

njhill added the verified label (Run pre-commit for new contributors without triggering other tests) Apr 30, 2026

joa-stdn force-pushed the joachim/fix_prompt_logprobs branch from b85cf89 to 271f156 Compare May 1, 2026 01:28

njhill added the ready label (ONLY add when PR is ready to merge/full CI is needed) May 1, 2026

njhill approved these changes May 1, 2026

njhill enabled auto-merge (squash) May 1, 2026 14:51

auto-merge was automatically disabled May 1, 2026 21:05
Head branch was pushed to by a user without write access

joa-stdn force-pushed the joachim/fix_prompt_logprobs branch 2 times, most recently from d4c74bb to cae7f8f Compare May 2, 2026 01:50

joa-stdn requested review from ApostaC, WoosukKwon, alexm-redhat, gshtras, heheda12345, hmellor, markmc, noooop, orozery, robertgshaw2-redhat, tjtanaa and ywang96 as code owners May 2, 2026 01:50

joa-stdn force-pushed the joachim/fix_prompt_logprobs branch from cae7f8f to d4c74bb Compare May 2, 2026 01:55

joa-stdn force-pushed the joachim/fix_prompt_logprobs branch from a27e561 to f4b4cfa Compare May 2, 2026 01:59

joa-stdn force-pushed the joachim/fix_prompt_logprobs branch 2 times, most recently from 071cace to cc9d9e1 Compare May 4, 2026 00:07

joa-stdn requested review from 22quinn and sighingnow as code owners May 4, 2026 00:07

mergify Bot added the qwen (Related to Qwen models), intel-gpu (Related to Intel GPU) and cpu (Related to CPU backends) labels May 4, 2026

joa-stdn and others added 7 commits May 4, 2026 00:09

wip

96b9a65

Signed-off-by: Joachim Studnia <joachim@mistral.ai>

wip

3c10b9c

Signed-off-by: Joachim Studnia <joachim@mistral.ai>

wip

1db02a2

Signed-off-by: Joachim Studnia <joachim@mistral.ai>

wip

1f8bbfb

Signed-off-by: Joachim Studnia <joachim@mistral.ai>

wip

69dc01e

Signed-off-by: Joachim Studnia <joachim@mistral.ai>

Fix mypy type error: wrap logprobs error string in AssertionError

7ab6dd8

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Joachim Studnia <joachim@mistral.ai>

wip

5bb1613

Signed-off-by: Joachim Studnia <joachim@mistral.ai>

joa-stdn force-pushed the joachim/fix_prompt_logprobs branch from 815ef4b to 5bb1613 Compare May 4, 2026 00:09

Revert error handling refactor, keep only prompt logprobs changes

79f8d01

Signed-off-by: Joachim Studnia <joachim@mistral.ai>

ywang96 enabled auto-merge (squash) May 4, 2026 00:24

njhill approved these changes May 4, 2026

Merge branch 'main' into joachim/fix_prompt_logprobs

5adc7a6

ywang96 merged commit 422dd02 into vllm-project:main May 4, 2026
63 checks passed

haosdent mentioned this pull request May 8, 2026

[CI] De-flake failure CI test "test_async_scheduling::test_without_spec_decoding" #42041

Closed

rishaps mentioned this pull request May 8, 2026

[Bug]: prompt_logprobs depends on request order when prefix caching is enabled #42019

Open

1 task

factnn mentioned this pull request May 10, 2026

[Bugfix] Fix prompt_logprobs non-determinism with prefix caching (issue #42019) #42245

Open

This was referenced May 12, 2026

Revert "[Model Runner V2] Bug fix: logprob dtype int64/int32 issue" (#41761) #42418

Closed

[CI] Fix test_async_scheduling.py flakiness #42455

Merged

ywang96 merged 9 commits into vllm-project:main from joa-stdn:joachim/fix_prompt_logprobs

3 participants
