[core] Introduce MemoryPoolConfigurator class hierarchy #22389
Conversation
85165a2 to d235d6d
/rerun-test test_swa_unittest.py test_mimo_models.py test_deepseek_v3_mtp.py test_dsa_models_mtp.py test_qwen3_next_models_mtp.py test_qwen35_models.py test_triton_sliding_window.py test_mamba_unittest.py test_mamba2_mixer.py test_nvidia_nemotron_nano_v2.py test_nvidia_nemotron_3_super_bf16.py test_mla_deepseek_v3.py test_generation_models.py
Summary
- `MemoryPoolConfigurator` base class with a unified coeff+bias interface (`calculate_pool_sizes` / `calculate_pool_sizes_from_max_tokens`)
- `DefaultPoolConfigurator` for MHA/MLA/NSA/FP4: absorbs `get_cell_size_per_token`, num_layers deduction, DFLASH scaling
- `HybridSWAPoolConfigurator` for Gemma2/Command-R/MiMo: absorbs `resolve_hybrid_swa_tokens` with full/swa pool splitting
- `create_memory_pool_configurator()` factory
- `_resolve_memory_pool_config` now goes through the configurator flow
- `profile_max_num_token`, `_resolve_hybrid_swa_tokens`
- `MemoryPoolConfig` moved from `model_runner_kv_cache_mixin.py` to `pool_configurator.py`
- `_apply_token_constraints`

Follows up on #22384. The Mamba configurator is a separate follow-up.
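A minimal sketch of how the hierarchy described above might fit together. Class and method names (`MemoryPoolConfigurator`, `DefaultPoolConfigurator`, `create_memory_pool_configurator`, `calculate_pool_sizes`, `calculate_pool_sizes_from_max_tokens`, `MemoryPoolConfig`) come from this PR's summary; the signatures, fields, and method bodies are illustrative assumptions, not the actual implementation:

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Optional


@dataclass
class MemoryPoolConfig:
    # Per the behavioral-changes note, max_running_requests is now
    # Optional and filled in by the consumer after the configurator runs.
    full_tokens: int
    max_running_requests: Optional[int] = None


class MemoryPoolConfigurator(ABC):
    """Base class: unified coeff+bias interface for sizing KV-cache pools."""

    @abstractmethod
    def cell_size_per_token(self) -> int:
        """Bytes of KV-cache storage per token (illustrative signature)."""

    def calculate_pool_sizes_from_max_tokens(self, max_tokens: int) -> MemoryPoolConfig:
        # Subclasses with multiple pools (e.g. hybrid SWA) would split
        # the token budget here; the default is a single full pool.
        return MemoryPoolConfig(full_tokens=max_tokens)

    def calculate_pool_sizes(self, memory_budget_bytes: int) -> MemoryPoolConfig:
        # Convert a byte budget into a token budget, then delegate.
        tokens = memory_budget_bytes // self.cell_size_per_token()
        return self.calculate_pool_sizes_from_max_tokens(tokens)


class DefaultPoolConfigurator(MemoryPoolConfigurator):
    """MHA/MLA/NSA/FP4-style models: one full-attention pool."""

    def __init__(self, num_layers: int, bytes_per_layer_token: int):
        self.num_layers = num_layers
        self.bytes_per_layer_token = bytes_per_layer_token

    def cell_size_per_token(self) -> int:
        return self.num_layers * self.bytes_per_layer_token


def create_memory_pool_configurator(arch: str, **kwargs) -> MemoryPoolConfigurator:
    # Factory: a hybrid-SWA arch (Gemma2/Command-R/MiMo) would return a
    # HybridSWAPoolConfigurator here; this sketch covers the default only.
    return DefaultPoolConfigurator(**kwargs)
```

The point of the coeff+bias split is that callers can either hand the configurator a memory budget or a token cap and get the same `MemoryPoolConfig` shape back.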
Behavioral changes
- `_cell_size` now uses the ratio-weighted formula (F*nf + r*S*ns), so `--max-total-tokens` correctly constrains `full_tokens` rather than inflating it through a memory-budget round-trip (a pre-existing issue in the old code)
- `MemoryPoolConfig` is now used directly; the `max_running_requests` default changed from a required `int` to `Optional[int] = None` (filled in by the consumer after the configurator runs)

Test plan
/rerun-stage stage-a-test-1
/rerun-stage stage-b-test-small-1-gpu
/rerun-stage stage-b-test-large-1-gpu
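The ratio-weighted cell-size formula from the Behavioral changes section can be worked through numerically. My reading of the symbols, which is an assumption rather than something the PR spells out: F and S are per-token byte costs of a full-attention layer and a sliding-window layer, nf and ns are the respective layer counts, and r is the swa-to-full token ratio. Under that reading:

```python
def cell_size(F: int, nf: int, S: int, ns: int, r: float) -> float:
    # Ratio-weighted per-full-token cost: each full-attention token
    # carries r sliding-window tokens' worth of SWA-layer storage.
    return F * nf + r * S * ns


def full_tokens_for_budget(budget_bytes: int, F: int, nf: int,
                           S: int, ns: int, r: float) -> int:
    # Dividing the memory budget by the weighted cell size yields
    # full_tokens directly, so a --max-total-tokens cap can clamp it
    # without the memory-budget round-trip the old code needed.
    return int(budget_bytes // cell_size(F, nf, S, ns, r))


# Hypothetical model: 4 full layers at 100 B/token, 28 SWA layers at
# 50 B/token, with the SWA pool sized at a quarter of the full pool.
print(cell_size(100, 4, 50, 28, 0.25))            # weighted bytes per full token
print(full_tokens_for_budget(7_500_000, 100, 4, 50, 28, 0.25))
```

Because the weighted divisor already accounts for the SWA pool, capping `full_tokens` is a single clamp rather than re-deriving a byte budget and converting back.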