Fix: PrefillAdder.add_chunked_req with negative rem_total_tokens with pp by strgrb · Pull Request #13698 · sgl-project/sglang

strgrb · 2025-11-21T04:19:57Z

Motivation

I need to run the Ring-1T and Ling-1T models with tp8pp4, and I hit an error.

The error message may differ between runs, but new-token is negative every time. The error does not appear without pp.
This PR tries to fix the problem.

I eventually found that self.rem_total_tokens becomes negative in PrefillAdder.add_chunked_req. It is computed as available_and_evictable - self.rem_total_token_offset, and the budgets for running requests are added to self.rem_total_token_offset.
After last_batch and running_batch are merged, this budget increases. If it grows beyond the available size, PrefillAdder.add_chunked_req computes a negative extend input length.
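The arithmetic above can be illustrated with a toy model. This is a minimal sketch, not SGLang code: the class `ToyPrefillBudget`, its methods, and all the numbers are hypothetical, and the assumption that the chunk length is the minimum of the remaining budget, the chunk size, and the remaining input is made only for illustration.

```python
from dataclasses import dataclass


@dataclass
class ToyPrefillBudget:
    """Hypothetical stand-in for the budget bookkeeping described above."""

    available_and_evictable: int     # tokens that are free or can be evicted
    rem_total_token_offset: int = 0  # tokens reserved for running requests

    @property
    def rem_total_tokens(self) -> int:
        # Same formula as in the description: available minus reserved.
        return self.available_and_evictable - self.rem_total_token_offset

    def reserve_for_running(self, budgets: list[int]) -> None:
        # Each running request adds its budget to the offset.
        self.rem_total_token_offset += sum(budgets)

    def chunk_extend_len(self, chunk_size: int, remaining_input: int) -> int:
        # Assumed formula with no clamp at zero, so a negative budget
        # produces a negative extend length.
        return min(self.rem_total_tokens, chunk_size, remaining_input)


budget = ToyPrefillBudget(available_and_evictable=10_000)
budget.reserve_for_running([4_000, 4_000])
before_merge = budget.chunk_extend_len(chunk_size=8_192, remaining_input=6_000)

# Merging another batch into the running batch reserves more budget.
budget.reserve_for_running([4_000])
after_merge = budget.chunk_extend_len(chunk_size=8_192, remaining_input=6_000)

print(before_merge, after_merge)
```

With these made-up numbers the reserved budget (12,000) exceeds the available size (10,000) after the merge, so the computed chunk length drops below zero.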

Modifications

Since chunked_req's req_pool_idx is freed in Scheduler.get_next_batch_to_run, we should check the remaining tokens there, excluding the budgets for running requests, and avoid freeing chunked_req's req_pool_idx when that budget is insufficient. When the budget is not enough, it is time to decode, and chunked_req is suspended until enough budget is available.
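The check described above might look roughly like the following. This is a sketch under stated assumptions, not the PR's actual diff: `should_defer_chunked_req` and `next_step` are hypothetical helpers, and the real scheduler state is reduced to two plain values.

```python
def should_defer_chunked_req(available_and_evictable: int,
                             running_budgets: list[int]) -> bool:
    """Return True when no budget is left for the chunked request's next chunk.

    Hypothetical helper: the remaining budget is the available size minus
    the budgets already reserved for running requests.
    """
    remaining = available_and_evictable - sum(running_budgets)
    return remaining <= 0


def next_step(available_and_evictable: int, running_budgets: list[int]) -> str:
    # When the chunked request is deferred, its req_pool_idx would be kept
    # (not freed) and the scheduler would run a decode step instead.
    if should_defer_chunked_req(available_and_evictable, running_budgets):
        return "decode"
    return "prefill_chunk"


print(next_step(10_000, [4_000, 4_000]))         # budget left for a chunk
print(next_step(10_000, [4_000, 4_000, 4_000]))  # over-reserved, so decode
```

The design point is that the decision is made before any state of the chunked request is released, so a deferred request can resume once decoding frees budget.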

Accuracy Tests

Benchmarking and Profiling

Checklist

Format your code according to the Format code with pre-commit.
Add unit tests according to the Run and add unit tests.
Update documentation according to Write documentations.
Provide accuracy and speed benchmark results according to Test the accuracy and Benchmark the speed.
Follow the SGLang code style guidance.
Work with maintainers to merge your PR. See the PR Merge Process

gemini-code-assist · 2025-11-21T04:20:01Z

Warning

You have reached your daily quota limit. Please wait up to 24 hours and I will start processing your requests again!

ShangmingCai · 2025-11-21T05:31:22Z

Will review this PR today. @strgrb Can you test whether this bug is still happening in #11852?

cc: @XucSh @whybeyoung

strgrb · 2025-11-21T05:46:37Z

> Will review this PR today. @strgrb Can you test whether this bug is still happening in #11852?
>
> cc: @XucSh @whybeyoung

OK, I'll try it.

ShangmingCai

The changes don't look related to PP; they seem more likely related to hybrid memory.

CC: @xiezhq-hermann @yizhang2077

strgrb · 2025-11-24T02:48:25Z

> Changes don't look like it is related to PP, more likely related to hybrid memory.
>
> CC: @xiezhq-hermann @yizhang2077

It's not related to hybrid memory. The change only touches the budget calculation, which happens to be where hybrid memory is used. The real cause is the budget itself: the budget for the chunked request becomes negative when pp is enabled.

strgrb · 2025-11-24T03:25:45Z

@ShangmingCai Should I move this budget check logic to PrefillAdder, where rem_total_tokens() can be used directly?

ShangmingCai · 2025-11-25T02:48:51Z

> @ShangmingCai Should I move this budget check logic to PrefillAdder , which rem_total_tokens() can be used directly?

@strgrb If it is related to pp only, maybe you could try reverting this commit locally: #13144. If the bug still exists, then it might not be related to pp. Also, you could try changing the attention backend to test whether this is a flashinfer bug.

xiezhq-hermann · 2025-11-27T23:35:44Z

Let's get this problem fixed after this refactoring, which simplifies the logic.

fix new-token is negative without mem leak.

d7f9b82

strgrb requested review from Ying1123, hnyls2002, merrymercy, xiezhq-hermann and zhyncs as code owners November 21, 2025 04:19

zhyncs added the run-ci and high priority labels Nov 21, 2025

zhyncs assigned Fridge003 and ShangmingCai Nov 21, 2025

Merge branch 'main' into fix/pp-negative-rem-tokens

99b4f39

ShangmingCai reviewed Nov 21, 2025


xiezhq-hermann assigned yizhang2077 and xiezhq-hermann Nov 21, 2025

strgrb mentioned this pull request Dec 1, 2025

Simplify schedule policy implementation #13939

Open

6 tasks

strgrb closed this Feb 27, 2026

strgrb wants to merge 2 commits into sgl-project:main from antgroup:fix/pp-negative-rem-tokens


6 participants
