[sgl] improve accuracy of additional page requirement during spec decode by 2022tgoel · Pull Request #22406 · sgl-project/sglang

2022tgoel · 2026-04-09T02:12:45Z

Motivation

The new_tokens_required_next_decode calculation was very conservative when estimating how many pages a batch will consume during speculative decoding. I would like to replace it with a more realistic estimate that mirrors the logic in eagle_info_v2.py.
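To illustrate the gap between a conservative and a more realistic page estimate, here is a minimal sketch. It is not the code from this PR or from eagle_info_v2.py: the function names `conservative_new_tokens` and `realistic_new_tokens`, the local `ceil_align` helper, and the assumption that every request appends a fixed number of draft tokens per decode step are all hypothetical and chosen only for illustration.

```python
from typing import List


def ceil_align(x: int, align: int) -> int:
    """Round x up to the nearest multiple of align (hypothetical local helper)."""
    return (x + align - 1) // align * align


def conservative_new_tokens(seq_lens: List[int], num_draft_tokens: int, page_size: int) -> int:
    """Upper bound: ignore free slots left in each request's last page."""
    return len(seq_lens) * ceil_align(num_draft_tokens, page_size)


def realistic_new_tokens(seq_lens: List[int], num_draft_tokens: int, page_size: int) -> int:
    """Count only the page-aligned slots that must actually be newly allocated."""
    total = 0
    for seq_len in seq_lens:
        already_allocated = ceil_align(seq_len, page_size)
        needed = ceil_align(seq_len + num_draft_tokens, page_size)
        total += needed - already_allocated
    return total


if __name__ == "__main__":
    seq_lens = [10, 16, 31]  # illustrative sequence lengths, not measured data
    print(conservative_new_tokens(seq_lens, num_draft_tokens=4, page_size=16))
    print(realistic_new_tokens(seq_lens, num_draft_tokens=4, page_size=16))
```

In this sketch the first request still has six free slots in its last page, so it needs no new page, while the conservative bound charges it a full page anyway. Under these assumptions the realistic count is never larger than the conservative one.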

Modifications

Accuracy Tests

Speed Tests and Profiling

Checklist

- Format your code according to the Format code with pre-commit.
- Add unit tests according to the Run and add unit tests.
- Update documentation according to Write documentations.
- Provide accuracy and speed benchmark results according to Test the accuracy and Benchmark the speed.
- Follow the SGLang code style guidance.

Review and Merge Process

Ping Merge Oncalls to start the process. See the PR Merge Process.
Get approvals from CODEOWNERS and other reviewers.
Trigger CI tests with comments or contact authorized users to do so.
- Common commands include /tag-and-rerun-ci, /tag-run-ci-label, /rerun-failed-ci
After green CI and required approvals, ask Merge Oncalls or people with Write permission to merge the PR.

gemini-code-assist · 2026-04-09T02:12:49Z

Warning

You have reached your daily quota limit. Please wait up to 24 hours and I will start processing your requests again!

ispobock · 2026-04-10T08:13:45Z

@2022tgoel could you fix the lint first?

Qiaolin-Yu

I feel like it's correct. But also cc @hnyls2002 for another check in case I might be missing some context.

hnyls2002 · 2026-04-16T22:26:48Z

/tag-and-rerun-ci

hnyls2002 · 2026-04-16T22:27:00Z

/rerun-test test_eagle_infer_a.py test_eagle_infer_b.py test_eagle_infer_beta.py test_eagle3_basic.py test_specv2_kvcache_offloading.py test_swa_radix_cache_kl.py test_swa_unittest.py test_eagle_dp_attention.py

github-actions · 2026-04-16T22:28:19Z

✅ 1-gpu-h100: View workflow run

cd test/ && python3 registered/spec/eagle/test_eagle_infer_a.py

✅ 1-gpu-h100: View workflow run

cd test/ && python3 registered/spec/eagle/test_eagle_infer_b.py

✅ 1-gpu-5090: View workflow run

cd test/ && python3 registered/spec/eagle/test_eagle_infer_beta.py

✅ 1-gpu-5090: View workflow run

cd test/ && python3 registered/spec/eagle/test_eagle3_basic.py

✅ 1-gpu-5090: View workflow run

cd test/ && python3 registered/disaggregation/test_specv2_kvcache_offloading.py

✅ 1-gpu-h100: View workflow run

cd test/ && python3 registered/radix_cache/test_swa_radix_cache_kl.py

✅ 1-gpu-h100: View workflow run

cd test/ && python3 registered/unit/mem_cache/test_swa_unittest.py

✅ 4-gpu-h100: View workflow run

cd test/ && python3 registered/spec/eagle/test_eagle_dp_attention.py

…ode (sgl-project#22406)

fix

9ae89c7

2022tgoel requested review from Ying1123, hanming-lu, hnyls2002, hzh0425, ispobock, merrymercy, xiezhq-hermann and yizhang2077 as code owners April 9, 2026 02:12

ispobock assigned Qiaolin-Yu Apr 10, 2026

Qiaolin-Yu reviewed Apr 10, 2026

2022tgoel and others added 2 commits April 10, 2026 19:13

fix lint

04337a6

simplify with ceil_align

e88823c

hnyls2002 approved these changes Apr 16, 2026

github-actions Bot added the run-ci label Apr 16, 2026

hnyls2002 merged commit 2211b4d into sgl-project:main Apr 16, 2026
82 of 131 checks passed

jmamou pushed a commit to jmamou/sglang that referenced this pull request Apr 20, 2026

[sgl] improve accuracy of additional page requirement during spec dec…

c5175f7

…ode (sgl-project#22406)

yhyang201 pushed a commit to yhyang201/sglang that referenced this pull request Apr 22, 2026

[sgl] improve accuracy of additional page requirement during spec dec…

8d78acb

…ode (sgl-project#22406)

zhangying098 pushed a commit to zhangying098/sglang that referenced this pull request Apr 23, 2026

[sgl] improve accuracy of additional page requirement during spec dec…

9b64b07

…ode (sgl-project#22406)

kyx1999 pushed a commit to KMSorSMS/sglang that referenced this pull request Apr 27, 2026

[sgl] improve accuracy of additional page requirement during spec dec…

e640448

…ode (sgl-project#22406)

hnyls2002 mentioned this pull request Apr 29, 2026

Deepseek V4 #23882

Merged

hnyls2002 merged 3 commits into sgl-project:main from 2022tgoel:tarushii/mtp-6

4 participants
