Add SGLANG_ENABLE_SPEC_V2=1 for Qwen3.5 FP8 H200 MTP by faradawn · Pull Request #2 · faradawn/sgl-cookbook

faradawn · 2026-04-14T16:41:03Z

Enable SGLang speculative decoding v2 engine for the Qwen3.5 FP8 H200 + speculative decoding (MTP) combination by prepending `SGLANG_ENABLE_SPEC_V2=1` to the serve command. Based on SemiAnalysisAI/InferenceX#1017.
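As a minimal sketch of what "prepending" means here: in a POSIX shell, a `VAR=value` prefix sets the variable only for the process launched on that line. The snippet below uses `sh -c` as a stand-in for the real serve command, and the commented launch line uses a placeholder model path (an assumption, not taken from this PR) and leaves out the cookbook's existing serve flags.

```shell
# Stand-in for the serve command: a child process that reports what it sees.
seen_by_child=$(SGLANG_ENABLE_SPEC_V2=1 sh -c 'printf %s "${SGLANG_ENABLE_SPEC_V2:-unset}"')

# The prefix does not export the variable into the calling shell.
seen_by_parent="${SGLANG_ENABLE_SPEC_V2:-unset}"

echo "child sees:  ${seen_by_child}"
echo "parent sees: ${seen_by_parent}"

# Applied to an SGLang launch, the same prefix would look roughly like this
# (<model-path> is a placeholder; keep whatever flags the command already has):
# SGLANG_ENABLE_SPEC_V2=1 python -m sglang.launch_server --model-path <model-path>
```

Because the variable is scoped to that single command, other cookbook combinations that do not carry the prefix are unaffected.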

Enable SGLang speculative decoding v2 engine for the Qwen3.5 FP8 H200 + MTP combination, as validated in SemiAnalysisAI/InferenceX#1017.

Signed-off-by: Faradawn Yang <73060648+faradawn@users.noreply.github.com>

faradawn · 2026-04-14T16:42:36Z

Closing in favor of a PR directly against sgl-project/sgl-cookbook.

faradawn closed this Apr 14, 2026

faradawn wants to merge 1 commit into main from qwen35-h200-fp8-mtp-specv2