Add SGLANG_ENABLE_SPEC_V2=1 for Qwen3.5 FP8 H200 MTP by faradawn · Pull Request #240 · sgl-project/sgl-cookbook

faradawn · 2026-04-14T16:42:44Z

Enable the SGLang speculative decoding v2 engine for the Qwen3.5 FP8 H200 + speculative decoding (MTP) combination by prepending `SGLANG_ENABLE_SPEC_V2=1` to the serve command that the ConfigGenerator emits. This change is based on SemiAnalysisAI/InferenceX#1017.
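For illustration only, the prepending logic could look roughly like the sketch below. It is not the cookbook's actual ConfigGenerator code: the function names, the selection keys (`"qwen3.5"`, `"fp8"`, `"h200"`), and the `<model-path>` placeholder are all hypothetical. Only the `SGLANG_ENABLE_SPEC_V2=1` variable and the idea of prefixing it to the serve command come from this PR.

```python
# Hypothetical sketch: names and selection keys are illustrative assumptions,
# not taken from the sgl-cookbook ConfigGenerator source.
SPEC_V2_ENV = "SGLANG_ENABLE_SPEC_V2=1"


def needs_spec_v2(model: str, quantization: str, hardware: str, speculative: bool) -> bool:
    """Return True only for the Qwen3.5 + FP8 + H200 + MTP combination."""
    return speculative and (model, quantization, hardware) == ("qwen3.5", "fp8", "h200")


def build_serve_command(base_command: str, *, model: str, quantization: str,
                        hardware: str, speculative: bool) -> str:
    """Prefix the env var to the serve command when the combination matches."""
    if needs_spec_v2(model, quantization, hardware, speculative):
        return f"{SPEC_V2_ENV} {base_command}"
    return base_command


if __name__ == "__main__":
    # Placeholder command; the real model path and flags are not given in this PR text.
    base = "python -m sglang.launch_server --model-path <model-path>"
    print(build_serve_command(base, model="qwen3.5", quantization="fp8",
                              hardware="h200", speculative=True))
```

Setting the variable as a shell-style prefix keeps it scoped to the one launched server process, and other hardware or quantization selections get the unmodified command.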

Enable SGLang speculative decoding v2 engine for the Qwen3.5 FP8 H200 + MTP combination, as validated in SemiAnalysisAI/InferenceX#1017.

Signed-off-by: Faradawn Yang <73060648+faradawn@users.noreply.github.com>

hshrivastava-droid mentioned this pull request Apr 14, 2026

[NV] Update: sglang v2 Qwen3.5 h200 MTP SemiAnalysisAI/InferenceX#1017

Merged

faradawn mentioned this pull request Apr 16, 2026

Add Qwen3.5 FP8 B300 MTP config with spec v2 #247

Open

faradawn wants to merge 1 commit into sgl-project:main from faradawn:qwen35-h200-fp8-mtp-specv2
