Update Qwen3.5 B200 FP4 SGLang recipe by faradawn · Pull Request #264 · sgl-project/sgl-cookbook

faradawn · 2026-05-01T17:39:41Z

Update the Qwen3.5 B200 FP4 SGLang recipe to match the latest validated benchmark: switch B200 FP4 to tp=2 and mem=0.8, drop --fp4-gemm-backend and --max-running-requests, add --enable-symm-mem and --mamba-ssm-dtype bfloat16, lower the prefill/chunked sizes to 16384, and bump --stream-interval to 50. Based on SemiAnalysisAI/InferenceX#1018.

Signed-off-by: Faradawn Yang <73060648+faradawn@users.noreply.github.com>
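
For readers who want to see how the settings listed in the description might fit together on a launch command line, here is a minimal sketch, not the recipe file from this PR. It assumes the server is started with `python3 -m sglang.launch_server`, that "tp=2" and "mem=0.8" correspond to `--tp` and `--mem-fraction-static`, and that the "prefill/chunked sizes" correspond to `--max-prefill-tokens` and `--chunked-prefill-size`. The model path is a placeholder because the description does not name one, and `build_argv` is a hypothetical helper.

```python
import shlex

# Placeholder: the PR description does not name a model path.
MODEL_PATH = "<model-path>"

# Settings named in the PR description. The flag spellings for tp, mem, and
# the prefill/chunked sizes are assumptions; the description only says
# "tp=2 mem=0.8" and "prefill/chunked sizes to 16384".
RECIPE = {
    "--tp": 2,
    "--mem-fraction-static": 0.8,
    "--chunked-prefill-size": 16384,
    "--max-prefill-tokens": 16384,
    "--mamba-ssm-dtype": "bfloat16",
    "--stream-interval": 50,
    "--enable-symm-mem": True,  # boolean switch, emitted without a value
}

# Flags the PR description says were dropped from the recipe.
DROPPED = ("--fp4-gemm-backend", "--max-running-requests")


def build_argv(model_path, recipe):
    """Hypothetical helper: turn a flag-to-value mapping into an argv list."""
    argv = ["python3", "-m", "sglang.launch_server", "--model-path", model_path]
    for flag, value in recipe.items():
        if value is True:
            argv.append(flag)  # bare switch
        elif value is not False and value is not None:
            argv.extend([flag, str(value)])
    return argv


argv = build_argv(MODEL_PATH, RECIPE)

# Sanity check: none of the dropped flags appear in the command.
assert not any(flag in argv for flag in DROPPED)

# Print a shell-quoted command line for inspection.
print(shlex.join(argv))
```

Keeping the settings in one mapping makes it easy to compare against the validated benchmark configuration before launching anything; the authoritative values are the ones in the recipe changed by this PR.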

Update Qwen3.5 B200 FP4 SGLang recipe

930a1ac


functionstackx mentioned this pull request May 1, 2026

[NV] Update Qwen3.5 FP4 B200 SGLang SemiAnalysisAI/InferenceX#1018

Merged

hshrivastava-droid mentioned this pull request May 4, 2026

[NV] qwen35 b200 MTP update sglang config SemiAnalysisAI/InferenceX#1065

Merged

faradawn mentioned this pull request May 4, 2026

Update Qwen3.5 B200 FP4 MTP SGLang recipe #266

Open

faradawn wants to merge 1 commit into sgl-project:main from faradawn:qwen35-b200-fp4-1018
