Add SGLANG_ENABLE_SPEC_V2=1 for Qwen3.5 FP8 H200 MTP #2

Closed
faradawn wants to merge 1 commit into main from qwen35-h200-fp8-mtp-specv2

Conversation

@faradawn
Owner

Enable the SGLang speculative decoding v2 engine for Qwen3.5 FP8 on H200 with MTP (multi-token prediction) speculative decoding by prepending `SGLANG_ENABLE_SPEC_V2=1` to the serve command. Based on SemiAnalysisAI/InferenceX#1017.
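
As a rough illustration of what "prepending" means here: a `VAR=value` prefix sets the variable only in the environment of that one command. The sketch below uses `sh -c` as a stand-in for the server launcher so it runs without SGLang installed. The commented serve command at the end is an assumption about the general shape (`<MODEL_PATH>` and the remaining flags are placeholders), not the exact command changed in this PR.

```shell
# A VAR=value prefix applies only to the process launched on that line.
# `sh -c` is a stand-in for the real server launcher (assumption for illustration).
child_value=$(SGLANG_ENABLE_SPEC_V2=1 sh -c 'printf %s "$SGLANG_ENABLE_SPEC_V2"')
echo "launched process sees SGLANG_ENABLE_SPEC_V2=$child_value"

# The calling shell is not modified by the prefix; this prints whatever was
# already set in the shell, or "unset" if nothing was.
echo "calling shell has SGLANG_ENABLE_SPEC_V2=${SGLANG_ENABLE_SPEC_V2:-unset}"

# Applied to a serve command, the shape would be roughly (placeholders only):
#   SGLANG_ENABLE_SPEC_V2=1 python -m sglang.launch_server --model-path <MODEL_PATH> <other flags>
```

Because the prefix is scoped to a single invocation, it leaves other commands in the same shell session unaffected, unlike an `export`.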

Enable SGLang speculative decoding v2 engine for the Qwen3.5 FP8 H200 + MTP combination, as validated in SemiAnalysisAI/InferenceX#1017.

Signed-off-by: Faradawn Yang <73060648+faradawn@users.noreply.github.com>
@faradawn
Owner Author

Closing in favor of a PR directly against sgl-project/sgl-cookbook.

@faradawn closed this on Apr 14, 2026