
[dsv4] support eplb #25948

Merged
ch-wan merged 2 commits into sgl-project:main from SYChen123:dsv4-eplb
May 24, 2026

Conversation

@SYChen123
Contributor

@SYChen123 SYChen123 commented May 21, 2026

Motivation

Support EPLB for DeepSeek V4 (with megaMoE disabled).

Modifications

Add the EPLB context to the forward pass of DeepseekV4Model. Without this context, the layer_idx used by the EPLB distribution gatherer is None, which causes a crash on the prefill node and inaccurate statistics on the decode node.

Accuracy Tests

Not relevant.

Speed Tests and Profiling

With this PR, EPLB works normally on both the prefill and decode nodes.

Server launch command
SGLANG_OPT_SWA_SPLIT_LEAF_ON_INSERT=1 \
SGLANG_DEEPEP_NUM_MAX_DISPATCH_TOKENS_PER_RANK=128 \
python3 -m sglang.launch_server \
  --trust-remote-code \
  --model-path ./models/deepseek-ai/DeepSeek-V4-Pro/ \
  --disable-cuda-graph \
  --tp 8 --dp 8 \
  --enable-dp-attention \
  --enable-dp-lm-head \
  --moe-a2a-backend deepep \
  --chunked-prefill-size 32768 \
  --enable-prefill-delayer \
  --swa-full-tokens-ratio 0.1 \
  --mem-fraction-static 0.87 \
  --max-running-requests 512 \
  --deepep-mode normal \
  --disaggregation-mode prefill \
  --disaggregation-transfer-backend mooncake \
  --disaggregation-ib-device mlx5_4 \
  --dist-init-addr 192.168.184.178:5757 \
  --nnodes 1 --node-rank 0 \
  --host 0.0.0.0 --port 9091 \
  --enable-expert-distribution-metrics \
  --enable-eplb \
  --eplb-min-rebalancing-utilization-threshold 0.6 \
  --eplb-rebalance-num-iterations 5000 \
  --ep-num-redundant-experts 32 \
  --ep-dispatch-algorithm dynamic \
  --eplb-algorithm deepseek \
  --init-expert-location ./expert_distribution_recorder.pt


Checklist

Review and Merge Process

  1. Ping Merge Oncalls to start the process. See the PR Merge Process.
  2. Get approvals from CODEOWNERS and other reviewers.
  3. Trigger CI tests with comments or contact authorized users to do so.
    • Common commands include /tag-and-rerun-ci, /tag-run-ci-label, /rerun-failed-ci
  4. After green CI and required approvals, ask Merge Oncalls or people with Write permission to merge the PR.

CI States

Latest PR Test (Base): ❌ Run #26363979763
Latest PR Test (Extra): ❌ Run #26363979729

@gemini-code-assist
Contributor

Warning

You have reached your daily quota limit. Please wait up to 24 hours and I will start processing your requests again!

@fzyzcjy
Collaborator

fzyzcjy commented May 21, 2026

general direction LGTM, cc @ch-wan for this assigned bug to double check

@Kangyan-Zhou Kangyan-Zhou requested a review from ch-wan May 22, 2026 05:09
@ch-wan ch-wan self-assigned this May 24, 2026
@ch-wan ch-wan merged commit 7f45bcd into sgl-project:main May 24, 2026
75 of 87 checks passed
Shunkangz pushed a commit to Shunkangz/sglang that referenced this pull request May 27, 2026
Co-authored-by: xutizhou <xutingz@nvidia.com>
mqhc2020 pushed a commit to mqhc2020/sglang that referenced this pull request Jun 2, 2026
Co-authored-by: xutizhou <xutingz@nvidia.com>
alphabetc1 pushed a commit to alphabetc1/sglang that referenced this pull request Jun 4, 2026
Co-authored-by: xutizhou <xutingz@nvidia.com>

3 participants