Update Qwen3.5 B200 FP4 MTP SGLang recipe #266

Open
faradawn wants to merge 5 commits into sgl-project:main from faradawn:qwen35-b200-fp4-mtp-1257


@faradawn faradawn commented May 4, 2026

Extend the SGLANG_ENABLE_SPEC_V2=1 env prefix to cover the B200 FP4 + MTP configuration. The other flags (--enable-symm-mem, --speculative-* EAGLE, prefill 16384, stream 50, --tokenizer-worker-num, etc.) are already handled by the FP4 base block (#264) and the speculative option, so this change only touches the env prefix. This PR stacks on #263 (envPrefix scaffold) and #264 (FP4 base flags) and is based on SemiAnalysisAI/InferenceX#1257.
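
For reviewers, here is a minimal sketch of the behavior described above: the env prefix is prepended only when the selected configuration matches. It is not the recipe generator's actual code. The function name `build_command`, the `ENV_PREFIX_CONFIGS` set, and the hardware/quantization keys are hypothetical, and only the combination named in this PR is modeled.

```python
# Hypothetical sketch: names and keys below are illustrative assumptions,
# not the actual recipe generator touched by this PR.

# Configurations that receive the env prefix. Only the combination this PR
# names is listed; any combinations covered before this PR are not shown.
ENV_PREFIX_CONFIGS = {
    ("b200", "fp4"),
}

ENV_PREFIX = "SGLANG_ENABLE_SPEC_V2=1"


def build_command(hardware: str, quantization: str, mtp: bool, base_cmd: str) -> str:
    """Return base_cmd, prefixed with ENV_PREFIX when MTP is enabled
    on a hardware/quantization pair listed in ENV_PREFIX_CONFIGS."""
    if mtp and (hardware, quantization) in ENV_PREFIX_CONFIGS:
        return f"{ENV_PREFIX} {base_cmd}"
    return base_cmd


if __name__ == "__main__":
    # "<model>" is a placeholder, not a real model path.
    cmd = "python -m sglang.launch_server --model-path <model>"
    print(build_command("b200", "fp4", True, cmd))
    print(build_command("b200", "fp4", False, cmd))
```

Keeping the prefix decision separate from the flag list mirrors the split described in the PR: the flags come from the FP4 base block and the speculative option, while the prefix depends only on the selected combination.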

Signed-off-by: Faradawn Yang <73060648+faradawn@users.noreply.github.com>
Comment thread docs/autoregressive/Qwen/Qwen3.5.md


2 participants