Update Qwen3.5 B200 FP8 MTP SGLang recipe #263

Open
faradawn wants to merge 2 commits into sgl-project:main from faradawn:qwen35-b200-fp8-mtp-1065
faradawn (Contributor) commented May 1, 2026

Add B200 FP8 + MTP support: when MTP is enabled, prepend SGLANG_ENABLE_SPEC_V2=1 and emit --tokenizer-worker-num 6. This PR stacks on top of #262, which added the B200 FP8 base block. Based on SemiAnalysisAI/InferenceX#1065.
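
For reviewers, the conditional behavior described above can be sketched roughly as follows. This is an illustrative sketch only, not the code in this PR: the function name `build_launch_command`, the `mtp_enabled` parameter, and the `MODEL_PATH` placeholder are hypothetical, and the real recipe generator may structure this differently. Only the environment variable and the `--tokenizer-worker-num 6` flag come from the description.

```python
import shlex


def build_launch_command(model_path: str, mtp_enabled: bool) -> str:
    """Build an SGLang launch command string (hypothetical helper).

    When MTP is enabled, the environment variable is prepended to the
    command and the tokenizer worker flag is appended to the arguments.
    """
    env_prefix = []  # KEY=VALUE assignments placed before the command
    args = ["python", "-m", "sglang.launch_server", "--model-path", model_path]

    if mtp_enabled:
        # Both additions are gated on MTP, per the PR description.
        env_prefix.append("SGLANG_ENABLE_SPEC_V2=1")
        args += ["--tokenizer-worker-num", "6"]

    # shlex.join quotes any argument that needs shell escaping.
    return " ".join(env_prefix + [shlex.join(args)])


if __name__ == "__main__":
    print(build_launch_command("MODEL_PATH", mtp_enabled=True))
    print(build_launch_command("MODEL_PATH", mtp_enabled=False))
```

With MTP disabled, the sketch leaves the base command untouched, which is the property worth checking in the actual diff.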

Signed-off-by: Faradawn Yang <73060648+faradawn@users.noreply.github.com>