[EPLB] Simplify EPLB rearrange by only returning one map by SageMoore · Pull Request #36267 · vllm-project/vllm

SageMoore · 2026-03-06T17:26:24Z

Purpose

On main, EPLB rearrange returns three maps: physical expert -> logical expert, logical expert -> physical expert, and logical expert -> replica count. The latter two are derivable from the physical expert -> logical expert mapping.
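To make the "derivable" claim concrete, here is a minimal sketch in plain Python for a single layer. The function name `derive_logical_maps` and the list/dict representation are assumptions for illustration only; the PR itself operates on tensors and its actual helper is not shown in this thread.

```python
from collections import defaultdict


def derive_logical_maps(physical_to_logical):
    """Derive both logical-side maps from one layer's physical -> logical map.

    physical_to_logical[i] is the logical expert hosted in physical slot i.
    Hypothetical helper, not the function added by this PR.
    """
    logical_to_physical = defaultdict(list)
    for physical_id, logical_id in enumerate(physical_to_logical):
        # Every physical slot that hosts a logical expert is one replica of it.
        logical_to_physical[logical_id].append(physical_id)
    # The replica count is just the number of slots found per logical expert.
    replica_count = {
        logical_id: len(slots) for logical_id, slots in logical_to_physical.items()
    }
    return dict(logical_to_physical), replica_count


# Six physical slots hosting four logical experts; experts 0 and 1 are replicated.
physical_to_logical = [0, 1, 2, 0, 3, 1]
logical_to_physical, replica_count = derive_logical_maps(physical_to_logical)
```

Because both outputs are pure functions of the input list, nothing is lost by storing only the physical -> logical map and recomputing the rest when needed.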

With this change, we only have to keep track of the physical expert -> logical expert map when running EPLB. Once weight transfer has finished and EPLB is ready to make the changes visible to the model, it will generate the other two maps and write them directly into the model state.

This change deletes a lot of code and reduces the amount of state that EPLB has to maintain when running async. Functionally, the only change is "delaying" the creation of the latter two maps until after weight transfer has completed when running async. The rest is purely cosmetic.
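The delayed-derivation flow described above could be sketched roughly as follows. All names here (`ModelState`, `PendingRearrange`, `commit_if_ready`, `transfer_done`) are hypothetical and chosen for illustration; they are not taken from `eplb_state.py`, and the real code works on tensors rather than lists and dicts.

```python
from dataclasses import dataclass, field


@dataclass
class ModelState:
    # The three maps the model reads during the forward pass (hypothetical layout).
    physical_to_logical: list = field(default_factory=list)
    logical_to_physical: dict = field(default_factory=dict)
    logical_replica_count: dict = field(default_factory=dict)


@dataclass
class PendingRearrange:
    # The only map carried while the async weight transfer is in flight.
    new_physical_to_logical: list
    transfer_done: bool = False


def commit_if_ready(pending, model_state):
    """Publish the new placement only after weight transfer has finished."""
    if not pending.transfer_done:
        return False  # model keeps using the old maps
    # Derive the two logical-side maps at commit time, not at rearrange time.
    logical_to_physical = {}
    for physical_id, logical_id in enumerate(pending.new_physical_to_logical):
        logical_to_physical.setdefault(logical_id, []).append(physical_id)
    model_state.physical_to_logical = list(pending.new_physical_to_logical)
    model_state.logical_to_physical = logical_to_physical
    model_state.logical_replica_count = {
        logical_id: len(slots) for logical_id, slots in logical_to_physical.items()
    }
    return True
```

The point of the sketch is that the in-flight state shrinks to a single map plus a completion flag, and all three model-visible maps change together in one step.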

It also opens the door for post-processing the result of EPLB rearrange without having to re-update all of the maps.

Test Plan

Sync EPLB Server Command

vllm serve deepseek-ai/DeepSeek-V2-Lite -dp 2 --enable-expert-parallel --all2all_backend deepep_low_latency --enable-eplb --eplb-config '{"window_size":20,"step_interval":20}'

Async EPLB Server Command

vllm serve deepseek-ai/DeepSeek-V2-Lite -dp 2 --enable-expert-parallel --all2all_backend deepep_low_latency --enable-eplb --eplb-config '{"window_size":20,"step_interval":20,"use_async":"true"}'

Test Result

Sync EPLB

|Tasks|Version|     Filter     |n-shot|  Metric   |   |Value |   |Stderr|
|-----|------:|----------------|-----:|-----------|---|-----:|---|-----:|
|gsm8k|      3|flexible-extract|     5|exact_match|↑  |0.3867|±  |0.0282|
|     |       |strict-match    |     5|exact_match|↑  |0.3833|±  |0.0281|

Async EPLB

|Tasks|Version|     Filter     |n-shot|  Metric   |   |Value |   |Stderr|
|-----|------:|----------------|-----:|-----------|---|-----:|---|-----:|
|gsm8k|      3|flexible-extract|     5|exact_match|↑  |0.3733|±  | 0.028|
|     |       |strict-match    |     5|exact_match|↑  |0.3733|±  | 0.028|
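As a rough sanity check on the two tables, the gap between the sync and async scores can be compared against their reported standard errors. This sketch assumes the two runs are independent samples, which the PR does not state; the helper name `z_score` is made up for illustration.

```python
import math


def z_score(value_a, stderr_a, value_b, stderr_b):
    """Difference between two scores in units of their combined standard error.

    Assumes the two measurements are independent (an assumption, not a fact
    stated in the PR).
    """
    combined_stderr = math.sqrt(stderr_a ** 2 + stderr_b ** 2)
    return (value_a - value_b) / combined_stderr


# Values copied from the Sync EPLB and Async EPLB tables above.
flexible_z = z_score(0.3867, 0.0282, 0.3733, 0.028)
strict_z = z_score(0.3833, 0.0281, 0.3733, 0.028)
```

Both values come out below one combined standard error, so under this assumption the sync and async results are statistically indistinguishable at this sample size.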

Signed-off-by: Sage Moore <sage@neuralmagic.com>

gemini-code-assist

Code Review

This pull request refactors the EPLB rearrange logic to simplify state management by deriving two expert mapping tensors from a single one, instead of computing and storing all three. This is a good simplification that reduces complexity. The overall changes look correct and align with the stated purpose. However, I've found a couple of critical issues in the new compute_logical_maps helper function where it doesn't correctly handle negative indices for unused experts, which could lead to incorrect behavior or crashes. I've provided detailed comments and suggestions to address these bugs.
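The kind of hazard flagged here can be illustrated in plain Python, where a negative index silently wraps to the end of a list instead of raising. The sketch below assumes unused physical slots are marked with -1 (consistent with the later commit "added support for -1 logical expert ids") and that the logical -> physical map is padded with -1 to a fixed width. Both function names are hypothetical, and this is not the PR's `compute_logical_maps` implementation.

```python
def count_replicas_naive(physical_to_logical, num_logical):
    """Buggy version: a -1 entry is used directly as an index."""
    counts = [0] * num_logical
    for logical_id in physical_to_logical:
        counts[logical_id] += 1  # -1 wraps around and hits the LAST expert
    return counts


def compute_logical_maps_sketch(physical_to_logical, num_logical, max_replicas):
    """Safer version: skip unused slots and pad missing replicas with -1."""
    logical_to_physical = [[-1] * max_replicas for _ in range(num_logical)]
    counts = [0] * num_logical
    for physical_id, logical_id in enumerate(physical_to_logical):
        if logical_id < 0:
            continue  # unused physical slot, contributes to no logical expert
        # Raises IndexError if an expert has more than max_replicas replicas.
        logical_to_physical[logical_id][counts[logical_id]] = physical_id
        counts[logical_id] += 1
    return logical_to_physical, counts


# Slots 2 and 5 are unused.
layout = [0, 1, -1, 2, 0, -1]
naive_counts = count_replicas_naive(layout, num_logical=3)
safe_map, safe_counts = compute_logical_maps_sketch(
    layout, num_logical=3, max_replicas=2
)
```

In the naive version the two unused slots are wrongly credited to the last logical expert, while the guarded version ignores them and leaves the padding entries at -1.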

Note: Security Review is unavailable for this PR.

ilmarkov

Makes sense. Thank you!

SageMoore added 5 commits March 5, 2026 23:08

sync eplb works, async hangs

686b024

Signed-off-by: Sage Moore <sage@neuralmagic.com>

more cleanup

7e7db8f

Signed-off-by: Sage Moore <sage@neuralmagic.com>

padding fix

d6f5445

Signed-off-by: Sage Moore <sage@neuralmagic.com>

rebalance cleanup

765b836

Signed-off-by: Sage Moore <sage@neuralmagic.com>

more rebalance cleanup

42f4b4b

Signed-off-by: Sage Moore <sage@neuralmagic.com>

gemini-code-assist bot reviewed Mar 6, 2026

vllm/distributed/eplb/eplb_state.py (resolved)

vllm/distributed/eplb/eplb_state.py (outdated, resolved)

SageMoore added 5 commits March 6, 2026 17:44

fix test_eplb_algo unit test

83e7219

Signed-off-by: Sage Moore <sage@neuralmagic.com>

added support for -1 logical expert ids

faba44f

Signed-off-by: Sage Moore <sage@neuralmagic.com>

comments

933e70c

Signed-off-by: Sage Moore <sage@neuralmagic.com>

test cleanup

a0534f3

Signed-off-by: Sage Moore <sage@neuralmagic.com>

Merge branch 'main' into sage/eplb-rearrange-refactor

a7df38e

SageMoore marked this pull request as ready for review March 11, 2026 15:40

Merge branch 'main' into sage/eplb-rearrange-refactor

b197841

ilmarkov approved these changes Mar 16, 2026

tlrmchlsmth added the ready label Mar 18, 2026

tlrmchlsmth approved these changes Mar 18, 2026

Merge branch 'main' into sage/eplb-rearrange-refactor

ded557f

tlrmchlsmth merged commit c32a58c into vllm-project:main Mar 19, 2026
56 checks passed

SageMoore deleted the sage/eplb-rearrange-refactor branch March 19, 2026 13:01

fxdawnn pushed a commit to fxdawnn/vllm that referenced this pull request Mar 19, 2026

[EPLB] Simplify EPLB rearrange by only returning one map (vllm-projec…

e4b9683

…t#36267) Signed-off-by: Sage Moore <sage@neuralmagic.com> Co-authored-by: Tyler Michael Smith <tyler@neuralmagic.com>

JiantaoXu pushed a commit to JiantaoXu/vllm that referenced this pull request Mar 28, 2026

[EPLB] Simplify EPLB rearrange by only returning one map (vllm-projec…

8c6dddf

…t#36267) Signed-off-by: Sage Moore <sage@neuralmagic.com> Co-authored-by: Tyler Michael Smith <tyler@neuralmagic.com>

tlrmchlsmth merged 12 commits into vllm-project:main from neuralmagic:sage/eplb-rearrange-refactor
