Update Qwen3.5 B200 FP8 SGLang recipe by faradawn · Pull Request #262 · sgl-project/sgl-cookbook

faradawn · 2026-04-30T18:06:31Z

Update the Qwen3.5 B200 FP8 SGLang recipe to match the latest validated benchmark. The recipe now emits --enable-symm-mem, --disable-radix-cache, --moe-runner-backend flashinfer_trtllm, --max-prefill-tokens 16384, --chunked-prefill-size 16384, and --stream-interval 50, and no longer emits --enable-flashinfer-allreduce-fusion. Based on SemiAnalysisAI/InferenceX#1027.

Signed-off-by: Faradawn Yang <73060648+faradawn@users.noreply.github.com>
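
For illustration, the flag change described above could be sketched as a small Python helper that assembles the launch command. This is a sketch under stated assumptions, not the recipe's actual generator code: `build_launch_command`, `ADDED_FLAGS`, `DROPPED_FLAGS`, and the `<model-path>` placeholder are hypothetical names, and the PR description does not state the model path, parallelism settings, or any other flags the recipe emits. Only the flag names and values come from the description.

```python
import shlex

# Flags the PR description lists as emitted by the updated recipe.
ADDED_FLAGS = [
    "--enable-symm-mem",
    "--disable-radix-cache",
    "--moe-runner-backend", "flashinfer_trtllm",
    "--max-prefill-tokens", "16384",
    "--chunked-prefill-size", "16384",
    "--stream-interval", "50",
]

# Flag the PR description says the recipe drops.
DROPPED_FLAGS = {"--enable-flashinfer-allreduce-fusion"}


def build_launch_command(model_path, base_args=()):
    """Return an argv list for launching an SGLang server.

    model_path and base_args are assumptions for illustration: the PR
    description does not specify them.
    """
    # Remove any dropped flag that an older base configuration still carries.
    kept = [arg for arg in base_args if arg not in DROPPED_FLAGS]
    return [
        "python", "-m", "sglang.launch_server",
        "--model-path", model_path,
        *kept,
        *ADDED_FLAGS,
    ]


if __name__ == "__main__":
    cmd = build_launch_command(
        "<model-path>",  # hypothetical placeholder, not from the PR
        base_args=["--enable-flashinfer-allreduce-fusion"],
    )
    # Print a shell-quoted command line for inspection.
    print(shlex.join(cmd))
```

Keeping the added and dropped flags in separate constants makes the change in this PR easy to compare against an older configuration. The helper only builds the argument list and does not start a server.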

Update Qwen3.5 B200 FP8 SGLang recipe

0fb0081

Signed-off-by: Faradawn Yang <73060648+faradawn@users.noreply.github.com>

hshrivastava-droid mentioned this pull request Apr 30, 2026

Update Qwen3.5 FP8 B200 SGLang SemiAnalysisAI/InferenceX#1027

Merged

faradawn mentioned this pull request May 1, 2026

Update Qwen3.5 B200 FP8 MTP SGLang recipe #263

Open

faradawn wants to merge 1 commit into sgl-project:main from faradawn:qwen35-b200-fp8
