Skip to content

[scheduler] fix: correcting extend_logprob_start_len calculation#15922

Merged
ch-wan merged 11 commits intomainfrom
cheng/dev/simplify-logprob_start_len
Dec 28, 2025
Merged

[scheduler] fix: correcting extend_logprob_start_len calculation#15922
ch-wan merged 11 commits intomainfrom
cheng/dev/simplify-logprob_start_len

Conversation

@ch-wan
Copy link
Copy Markdown
Collaborator

@ch-wan ch-wan commented Dec 27, 2025

Motivation

The original calculation for extend_logprob_start_len was incorrect (code). When kv retraction happens, it should be determined based on len(fill_ids) (the total number of original input ids and output ids) when return_logprob is False but logprob_start_len is set as len(origin_input_ids) - 1.

Modifications

We set logprob_start_len as -1 by default when return_logprob is False so that extend_logprob_start_len can be set as len(fill_ids) - 1 during prepare_for_extend. We also reverted changes in #12115 as it's no longer needed.

Accuracy Tests, Benchmarking and Profiling

Tests in #9748 and #12115 are reproduced for verifying the robustness, accuracy and performance of this PR.

Test test_logprobs.py:

max Δ=0
mean Δ=0
logprobs returned for 500 samples (expected: 500)

✅ Test completed successfully!

Checklist

@gemini-code-assist
Copy link
Copy Markdown
Contributor

Warning

You have reached your daily quota limit. Please wait up to 24 hours and I will start processing your requests again!

@ch-wan
Copy link
Copy Markdown
Collaborator Author

ch-wan commented Dec 27, 2025

/tag-and-rerun-ci

@ch-wan ch-wan force-pushed the cheng/dev/simplify-logprob_start_len branch from b3fa171 to 49734da Compare December 27, 2025 08:14
@ch-wan ch-wan force-pushed the cheng/dev/simplify-logprob_start_len branch from 49734da to 6ade519 Compare December 27, 2025 10:08
# When return_logprob = False, only need last token per request
num_tokens_for_logprob = local_batch.batch_size()
)
if not local_batch.return_logprob:
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it should be removed before this PR getting merged

@ch-wan ch-wan changed the title refactor: simplify logprob_start_len handling [scheduler] fix: correcting extend_logprob_start_len calculation Dec 28, 2025
@ch-wan
Copy link
Copy Markdown
Collaborator Author

ch-wan commented Dec 28, 2025

/rerun-failed-ci

f"extend_logprob_start_lens={local_batch.extend_logprob_start_lens}"
)
print(f"extend_lens={local_batch.extend_lens}")
assert (
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add a test case so the old code will fail / crash

@ch-wan ch-wan merged commit 6f9d0a8 into main Dec 28, 2025
138 of 159 checks passed
@ch-wan ch-wan deleted the cheng/dev/simplify-logprob_start_len branch December 28, 2025 22:57
# If return_logprob is True, return the logprobs for output tokens by default
req.logprob_start_len = len(req.origin_input_ids) - 1
else:
# If return_logprob is False, only the last token requires logprob computation
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the previous code, we didn't propogate logprob_start_len = -1 beyond the Scheduler. The value was reset to len(req.origin_input_ids) - 1. Hence, for prefill-only, we followed the same way.

Given, with this change we are propogating logprob_start_len = -1 into scheduler_batch, we can do the same logic for prefill_only as well instead of resetting to len(req.origin_input_ids) in line: 1531.

But, looks like, the current change doesn't break the prefill-only logprobs computation, so we are good.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants