[Core][Bookkeeping] Update cu_num_accepted_tokens for all req_index by Jialin · Pull Request #27629 · vllm-project/vllm

Jialin · 2025-10-28T05:30:43Z

Purpose

This PR ensures cu_num_accepted_tokens is updated for every req_index, so that positions are not skipped or shifted when sampled_ids is empty or None.

Test Plan & Test Result

> pytest tests/v1/sample/test_logprobs.py -k test_spec_decode_logprobs
> pytest tests/v1/sample/test_rejection_sampler.py

gemini-code-assist

Code Review

This pull request correctly addresses a bug in the bookkeeping logic for speculative decoding. By moving the update of cu_num_accepted_tokens to before the check for empty sampled_ids, it ensures that the cumulative token counts are accurate for all requests, including those that don't produce new tokens in a given step. This prevents potential indexing errors and position shifts. The implementation is clean and effectively resolves the described issue. I have no further suggestions.
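To make the position shift concrete, here is a minimal sketch of the ordering problem described above. It is not the code from gpu_model_runner.py: the function names are hypothetical, and plain Python lists stand in for the arrays used in the real bookkeeping. Only the idea of extending the cumulative count before the empty check comes from this PR.

```python
def cu_counts_skipping_empty(sampled_ids_per_req):
    """Old ordering (illustrative): the empty check runs first, so requests
    with empty or None sampled_ids never get a cumulative entry."""
    cu_num_accepted_tokens = [0]
    for sampled_ids in sampled_ids_per_req:
        if not sampled_ids:
            continue  # the cumulative list is not extended for this request
        cu_num_accepted_tokens.append(cu_num_accepted_tokens[-1] + len(sampled_ids))
    return cu_num_accepted_tokens


def cu_counts_all_requests(sampled_ids_per_req):
    """New ordering (illustrative): every request index gets an entry, and
    the empty check only guards the remaining per-request updates."""
    cu_num_accepted_tokens = [0]
    for sampled_ids in sampled_ids_per_req:
        num_sampled_ids = len(sampled_ids) if sampled_ids else 0
        cu_num_accepted_tokens.append(cu_num_accepted_tokens[-1] + num_sampled_ids)
        if not sampled_ids:
            continue
        # Per-request updates that need sampled tokens would go here.
    return cu_num_accepted_tokens


# Hypothetical batch: requests 1 and 2 produced no tokens in this step.
batch = [[11, 12], [], None, [13]]
skipping = cu_counts_skipping_empty(batch)
complete = cu_counts_all_requests(batch)
```

With the old ordering the list has fewer entries than there are requests, so an index lookup for request 3 lands on the wrong position or runs off the end. With the new ordering there is exactly one entry per request after the leading zero.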

Jialin · 2025-10-28T05:46:20Z

Including the authors and reviewers of #26060 to confirm this is the right update.

CC @TheEpicDolphin @22quinn @njhill

TheEpicDolphin · 2025-10-28T16:41:28Z

Thanks for catching this @Jialin, looks good to me!

Jialin · 2025-10-28T17:26:32Z

Gentle nudge @njhill @22quinn @yeqcharlotte @houseroad

22quinn

👍

njhill

Thanks for catching @Jialin!

Any chance we could extend test(s) added in #26060 to catch this case?

vllm/v1/worker/gpu_model_runner.py

Jialin · 2025-10-28T18:13:57Z

> Thanks for catching @Jialin!
>
> Any chance we could extend test(s) added in #26060 to catch this case?

@njhill I skimmed through the tests added in #26060, but most of them seem to be e2e tests, so it might be a bit hard to add a unit test that covers this.

However, after this change, bookkeeping only updates numpy arrays. A potential follow-up is to introduce a numba JIT function to further speed up bookkeeping, and I could definitely add unit tests to cover this case afterwards against the more self-contained JIT function. WDYT?

njhill · 2025-10-28T18:21:49Z

>> Thanks for catching @Jialin!
>> Any chance we could extend test(s) added in #26060 to catch this case?
>
> @njhill I skimmed through the tests added in #26060, but most of them seem to be e2e tests, so it might be a bit hard to add a unit test that covers this.
>
> However, after this change, bookkeeping only updates numpy arrays. A potential follow-up is to introduce a numba JIT function to further speed up bookkeeping, and I could definitely add unit tests to cover this case afterwards against the more self-contained JIT function. WDYT?

@Jialin fair enough, I wasn't sure whether we could tweak the e2e test to trigger the bug. If that's nontrivial then sounds ok re leaving to future unit tests.
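As a rough idea of what such a future unit test might look like, the sketch below assumes the bookkeeping is factored into a small standalone function. Both helper names are hypothetical and do not exist in vLLM. Plain Python is used here, and whether the same function could later be compiled with numba is left open.

```python
def accepted_token_offsets(num_sampled_ids):
    """Return cumulative offsets: a leading 0 plus one entry per request."""
    offsets = [0]
    for count in num_sampled_ids:
        offsets.append(offsets[-1] + count)
    return offsets


def rows_for_request(offsets, req_index):
    """Return the flat row positions that belong to one request."""
    return list(range(offsets[req_index], offsets[req_index + 1]))


def test_empty_requests_keep_their_slot():
    # Requests 1 and 2 accepted nothing this step; request 3 accepted one token.
    offsets = accepted_token_offsets([2, 0, 0, 1])
    assert offsets == [0, 2, 2, 2, 3]
    assert rows_for_request(offsets, 0) == [0, 1]
    assert rows_for_request(offsets, 1) == []
    assert rows_for_request(offsets, 2) == []
    assert rows_for_request(offsets, 3) == [2]
```

A test of this shape only needs per-request counts as input, which is what makes it easier to write than an e2e test that has to provoke an empty sampled_ids entry through a full decoding run.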

mergify bot added the v1 label Oct 28, 2025

gemini-code-assist bot reviewed Oct 28, 2025

Jialin requested review from 22quinn and njhill October 28, 2025 05:45

Jialin requested review from houseroad and yeqcharlotte October 28, 2025 17:26

22quinn approved these changes Oct 28, 2025

22quinn enabled auto-merge (squash) October 28, 2025 17:45

github-actions bot added the ready label (ONLY add when PR is ready to merge/full CI is needed) Oct 28, 2025

njhill reviewed Oct 28, 2025

vllm/v1/worker/gpu_model_runner.py (outdated, resolved)

auto-merge was automatically disabled October 28, 2025 18:07
Head branch was pushed to by a user without write access

Jialin force-pushed the logprob branch from e8c7ff2 to 7212385 Compare October 29, 2025 17:50

njhill approved these changes Oct 29, 2025

Jialin added 3 commits October 29, 2025 15:07

[Core][Bookkeeping] Update cu_num_accepted_tokens for all req_index

db7d81a

Signed-off-by: Jialin Ouyang <Jialin.Ouyang@gmail.com>

fix pyre complains

481a6a4

Signed-off-by: Jialin Ouyang <Jialin.Ouyang@gmail.com>

sampled_id_cnt -> num_sampled_ids

535721c

Signed-off-by: Jialin Ouyang <Jialin.Ouyang@gmail.com>

Jialin force-pushed the logprob branch from 7212385 to 535721c Compare October 29, 2025 22:07

njhill merged commit 4574d48 into vllm-project:main Oct 30, 2025
46 checks passed

ZhengHongming888 pushed a commit to ZhengHongming888/vllm that referenced this pull request Nov 8, 2025

[Core][Bookkeeping] Update cu_num_accepted_tokens for all req_index (v…

f5955ef

…llm-project#27629) Signed-off-by: Jialin Ouyang <Jialin.Ouyang@gmail.com>

rtourgeman pushed a commit to rtourgeman/vllm that referenced this pull request Nov 10, 2025

[Core][Bookkeeping] Update cu_num_accepted_tokens for all req_index (v…

de7b374

…llm-project#27629) Signed-off-by: Jialin Ouyang <Jialin.Ouyang@gmail.com>

devpatelio pushed a commit to SumanthRH/vllm that referenced this pull request Nov 29, 2025

[Core][Bookkeeping] Update cu_num_accepted_tokens for all req_index (v…

1357d5a

…llm-project#27629) Signed-off-by: Jialin Ouyang <Jialin.Ouyang@gmail.com>

njhill merged 3 commits into vllm-project:main from Jialin:logprob
