Draft: FP4 Disagg + MTP Configs by trevor-m · Pull Request #48 · ishandhanani/srt-slurm

trevor-m · 2025-12-02T23:35:10Z

Adds two configs:

1p2d-mtp - Required minimal changes from the original 1p2d config. I just had to add the speculative args and reduce the memory fraction on the decode nodes.
max-tpt-2-mtp - This one required a lot of workarounds:
- To stay under the SGLANG_DEEPEP_NUM_MAX_DISPATCH_TOKENS_PER_RANK cap of 1024, I adjusted all of the token-related settings by a factor of two to account for speculative-num-steps=1:
  - max running requests: 67584->33792
  - cuda graph max bs: 1024->512 (later reduced to 256)
  - num reserved decode tokens: 112->224
- Use a shell script patch for --speculative-moe-a2a-backend (remove once sgl-project/sglang#13115, "support mtp with deepseek r1 nvfp4 model", is merged).
- After those changes I encountered OOMs on the decode side during draft mode cuda capture. I reduced the mem fraction from 0.83 to 0.73, but it didn't appear to make a difference. I ultimately just reduced the cuda graph bs to 256.
- On the prefill side, there is a bug with single batch overlap (SBO) and the speculative layer. I just disabled SBO for now.
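The batch-size halving described above can be sanity-checked with a little arithmetic. The sketch below is illustrative only: `tokens_per_rank` and `fits_dispatch_cap` are hypothetical helpers, not part of srt-slurm or SGLang, and the model that each running request contributes one token plus one token per speculative step is an assumption inferred from the "halved to account for speculative-num-steps=1" remark, not something confirmed in this PR.

```python
# Hypothetical helpers for illustration; not part of srt-slurm or SGLang.

# Cap on SGLANG_DEEPEP_NUM_MAX_DISPATCH_TOKENS_PER_RANK, as stated in the PR description.
DISPATCH_CAP = 1024


def tokens_per_rank(cuda_graph_max_bs: int, speculative_num_steps: int) -> int:
    """Estimate tokens dispatched per rank in one decode batch.

    ASSUMPTION: every running request carries one regular token plus one
    extra token per speculative step.
    """
    return cuda_graph_max_bs * (speculative_num_steps + 1)


def fits_dispatch_cap(cuda_graph_max_bs: int,
                      speculative_num_steps: int,
                      cap: int = DISPATCH_CAP) -> bool:
    """Return True if the estimated token count stays within the cap."""
    return tokens_per_rank(cuda_graph_max_bs, speculative_num_steps) <= cap


if __name__ == "__main__":
    # Batch sizes mentioned in the PR: the original 1024, then 512, then 256.
    for bs in (1024, 512, 256):
        n = tokens_per_rank(bs, speculative_num_steps=1)
        print(f"cuda graph max bs={bs}: ~{n} tokens/rank, "
              f"fits cap: {fits_dispatch_cap(bs, 1)}")
```

Under that assumed model, a batch size of 1024 with one speculative step would exceed the cap, while 512 sits exactly at it and 256 leaves headroom.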

ishandhanani · 2025-12-02T23:38:41Z

Can you rename that shell script and add a comment explaining why it's used and when we should remove it? I'm assuming we will remove it once we ship sgl > 0.5.6.

trevor-m · 2025-12-02T23:45:43Z

Thanks, yes it can be removed once sgl-project/sglang#13115 is merged.

I also updated the PR description with all of the changes I made to the configs.

Fridge003 · 2025-12-03T18:30:32Z

@trevor-m Is there any accuracy or performance data?

ishandhanani · 2025-12-04T07:09:31Z

configs/checkout-branch.sh

@@ -0,0 +1,8 @@
+#!/bin/bash


Can we rename this file to gb200-fp4-mtp-setup.sh?

I don't have a good way of organizing these lol, so descriptive naming is probably the best.

trevor-m · Dec 4, 2025

What about checkout-pr-13115.sh?
I also found out we need to disable the engine patch since this PR is based on main. I tried rebasing to 0.5.5, but it might have some dependencies on other PRs.

ishandhanani · Dec 5, 2025

I see. I can assist with this tomorrow.

trevor-m · 2025-12-04T22:58:44Z

@Fridge003

> @trevor-m Is there any accuracy or performance data?

This is the pareto with the low latency config:

Let me try to check the accuracy.
The high throughput one is still not ready.

Fridge003 · 2025-12-06T09:01:35Z

@trevor-m @ishandhanani
sgl-project/sglang#13115 is merged

FP4 disagg + MTP recipe

c399d70

ishandhanani reviewed Dec 4, 2025

ishandhanani approved these changes Dec 4, 2025

Fridge003 merged commit e526265 into ishandhanani:main Dec 6, 2025

ishandhanani mentioned this pull request Dec 17, 2025

sgl-router and docs #61

Merged

Fridge003 merged 1 commit into ishandhanani:main from trevor-m:disagg-mtp
