Add SGLANG_ENABLE_SPEC_V2=1 for Qwen3.5 FP8 H200 MTP #240

Open
faradawn wants to merge 1 commit into sgl-project:main from faradawn:qwen35-h200-fp8-mtp-specv2

@faradawn
Contributor

Enable the SGLang speculative decoding v2 engine for the Qwen3.5 FP8 configuration on H200 with MTP speculative decoding. For this combination, the ConfigGenerator now prepends `SGLANG_ENABLE_SPEC_V2=1` to the serve command. Based on SemiAnalysisAI/InferenceX#1017.
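
For reviewers who want the gist of the change without reading the diff, the conditional prepend can be sketched roughly as below. This is only an illustration in Python, not the actual ConfigGenerator code: the function name `maybe_enable_spec_v2`, its parameters, and the string values used to identify the model, quantization, and hardware are all hypothetical. Only the `SGLANG_ENABLE_SPEC_V2=1` variable comes from this PR.

```python
# Environment variable assignment introduced by this PR.
SPEC_V2_ENV = "SGLANG_ENABLE_SPEC_V2=1"


def maybe_enable_spec_v2(
    serve_command: str,
    model: str,
    quantization: str,
    hardware: str,
    mtp: bool,
) -> str:
    """Prepend SGLANG_ENABLE_SPEC_V2=1 for the Qwen3.5 FP8 H200 MTP combination.

    Hypothetical helper: the selector values ("qwen3.5", "fp8", "h200")
    are assumptions made for this sketch, not names taken from the repo.
    """
    needs_spec_v2 = (
        model == "qwen3.5"
        and quantization == "fp8"
        and hardware == "h200"
        and mtp
    )
    # Skip the prepend if the command already carries the variable.
    if needs_spec_v2 and not serve_command.startswith(SPEC_V2_ENV):
        return f"{SPEC_V2_ENV} {serve_command}"
    return serve_command
```

With a placeholder base command such as `python3 -m sglang.launch_server --model-path <model>`, the helper returns the command unchanged for every other combination and returns it prefixed with the environment variable only when all four conditions hold.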

Enable SGLang speculative decoding v2 engine for the Qwen3.5 FP8 H200 + MTP combination, as validated in SemiAnalysisAI/InferenceX#1017.

Signed-off-by: Faradawn Yang <73060648+faradawn@users.noreply.github.com>
