Draft: FP4 Disagg + MTP Configs #48
Can you rename that shell script and add a comment explaining why it's used and when we should remove it? I'm assuming we will remove it once we ship sgl > 0.5.6.
Thanks, yes, it can be removed once sgl-project/sglang#13115 is merged. I also updated the PR description with all of the changes I made to the configs.
@trevor-m Is there any accuracy or performance data?
@@ -0,0 +1,8 @@
#!/bin/bash
Can we rename this file to gb200-fp4-mtp-setup.sh?
I don't have a good way of organizing these lol, so descriptive naming is probably the best option.
What about checkout-pr-13115.sh?
I also found out we need to disable the engine patch since this PR is based on main. I tried rebasing onto 0.5.5, but it might depend on other PRs.
I see. I can assist with this tomorrow
This is the Pareto with the low-latency config: [image]
@trevor-m @ishandhanani

Adds two configs:
Given the SGLANG_DEEPEP_NUM_MAX_DISPATCH_TOKENS_PER_RANK cap of 1024, I basically halved all of the settings related to tokens to account for speculative-num-steps=1.
--speculative-moe-a2a-backend (remove once "support mtp with deepseek r1 nvfp4 model" sgl-project/sglang#13115 is merged)
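
For anyone checking the halving, here is a rough sketch of the arithmetic. It assumes each request dispatches (speculative-num-steps + 1) tokens per decode step, which is my reading of why the token settings were halved, not something confirmed in the PR. The function name `max_requests_per_rank` is hypothetical and is not an SGLang API.

```python
def max_requests_per_rank(dispatch_token_cap: int, speculative_num_steps: int) -> int:
    """Largest per-rank decode batch that stays under the dispatch-token cap.

    Assumption (not confirmed by the PR): every request contributes
    (speculative_num_steps + 1) tokens to a single dispatch.
    """
    if dispatch_token_cap <= 0:
        raise ValueError("dispatch_token_cap must be positive")
    if speculative_num_steps < 0:
        raise ValueError("speculative_num_steps must be non-negative")
    tokens_per_request = speculative_num_steps + 1
    return dispatch_token_cap // tokens_per_request


# Cap taken from the PR description; the step counts are the two cases compared.
CAP = 1024
without_mtp = max_requests_per_rank(CAP, speculative_num_steps=0)
with_mtp = max_requests_per_rank(CAP, speculative_num_steps=1)
print(f"no speculation: {without_mtp} requests per rank")
print(f"speculative-num-steps=1: {with_mtp} requests per rank")
```

Under that assumption, one speculative step doubles the tokens per request, so any setting sized in requests has to drop by half to stay within the same 1024-token cap.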