docs: add AMD MI300X/MI325X/MI350X/MI355X to Hunyuan 3 Preview cookbook by andyluo7 · Pull Request #23582 · sgl-project/sglang

andyluo7 · 2026-04-23T19:31:18Z

Summary

Adds AMD MI300X / MI325X / MI350X / MI355X to the Hunyuan 3 Preview cookbook entry that landed in #23532. The recipe was validated end to end on a single 8×MI300X (gfx942) node and an 8×MI355X (gfx950) node with TP=8, BF16, both with and without MTP speculative decoding. MI325X (gfx942) and MI350X (gfx950) are listed by hardware parity with the validated GPUs; the same image, file overlay, and flags apply.

Changes

docs_new/cookbook/autoregressive/Tencent/Hunyuan3-Preview.mdx

metatags.description: mention NVIDIA + AMD instead of NVIDIA only.
§2 SGLang Installation: add two AMD rows to the Docker-image table:
- AMD MI300X / MI325X → rocm/sgl-dev:v0.5.10.post1-rocm720-mi30x-20260423
- AMD MI350X / MI355X → rocm/sgl-dev:v0.5.10.post1-rocm720-mi35x-20260423
- Caption updated to point AMD users to the Configuration Tips AMD subsection until #23533 (Hy3-preview model code) and #23581 (HIP CUDA-graph fix) ship in rocm/sgl-dev.
§3.2 Configuration Tips: add a Hardware Requirements: AMD BF16 block mirroring the NVIDIA one, plus a new AMD MI300X / MI325X / MI350X / MI355X subsection with the three-step recipe:
1. Pull a current AMD nightly image
2. Overlay the 11 model/config files from PR #23533 onto the editable /sgl-workspace/sglang install, then run pip install -U "transformers>=5.6.0"
3. Set SGLANG_USE_AITER_AR=0 (the default flip is filed in #23581)
The recipe is followed by full no-MTP and MTP python3 -m sglang.launch_server … examples and a forward-looking note that none of the workaround steps will be needed once #23533 and #23581 ship in rocm/sgl-dev.

docs_new/src/snippets/autoregressive/hunyuan3-preview-deployment.jsx

hardware items: add MI300X, MI325X, MI350X, MI355X.
modelConfigs: add the four AMD platforms with tp=8, mem=0.85 (matches the in-recipe text).
generateCommand:
- Detects AMD and prepends SGLANG_USE_AITER_AR=0 to the generated command.
- On AMD MTP, uses --speculative-num-steps 1 --speculative-num-draft-tokens 2 (model card's MTP recipe), versus 3 / 4 on NVIDIA.
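
The generateCommand behavior described above can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the `AMD_HARDWARE` set, the option names, the `<model-path>` placeholder, and the `"H200"` key are assumptions made for the example, and the `tp` / `mem` defaults are the AMD values quoted above (NVIDIA values are not stated here). A real MTP launch will likely need further speculative-decoding flags from the model card, which are omitted.

```javascript
// Hypothetical sketch of the AMD-aware command generation described in this PR.
// Identifiers and option shapes are illustrative, not taken from the .jsx file.
const AMD_HARDWARE = new Set(["mi300x", "mi325x", "mi350x", "mi355x"]);

function generateCommand({ hardware, modelPath, tp = 8, mem = 0.85, mtp = false }) {
  const isAmd = AMD_HARDWARE.has(hardware.toLowerCase());

  const args = [
    "python3 -m sglang.launch_server",
    `--model-path ${modelPath}`,
    `--tp-size ${tp}`,
    `--mem-fraction-static ${mem}`,
  ];

  if (mtp) {
    // AMD follows the model card's MTP recipe (1 step, 2 draft tokens);
    // other platforms keep the 3 / 4 values mentioned for NVIDIA.
    const [steps, draftTokens] = isAmd ? [1, 2] : [3, 4];
    args.push(
      `--speculative-num-steps ${steps}`,
      `--speculative-num-draft-tokens ${draftTokens}`
    );
  }

  // Join with shell line continuations for readability.
  const cmd = args.join(" \\\n  ");

  // Workaround until #23581 ships: disable the AITER custom all-reduce on AMD.
  return isAmd ? `SGLANG_USE_AITER_AR=0 ${cmd}` : cmd;
}

// Usage: "<model-path>" is a placeholder, not a real model identifier.
console.log(generateCommand({ hardware: "MI355X", modelPath: "<model-path>", mtp: true }));
```

Keeping the environment-variable prefix inside the generator means the workaround can be dropped in one place once the default changes upstream.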

Why no AMD perf table in this PR

Performance on AMD will improve once #23581 lands (the workaround forces a non-AITER all-reduce path). Posting AMD perf numbers under the workaround configuration would be a snapshot of a transient state. We can add an AMD perf section in a follow-up PR after #23581 merges and the AMD-tuned MTP defaults stabilize.

Validation

The recipe text was sanity-tested against a real deployment: the server boots cleanly at TP=8 with CUDA-graph capture, and smoke and concurrent inference tests produce coherent output on both the MI300X and MI355X nodes.

Refs

- #23532 (docs: add Hunyuan 3 Preview cookbook): original Hunyuan 3 Preview cookbook PR (NVIDIA-only)
- #23533 (support Hy3 preview): Hy3-preview model code support (in progress)
- #23580 ([Bug] Hy3-preview cuda-graph crash on AMD MI300X/MI355X due to AITER custom all-reduce stream invalidation): bug report for HIP CUDA-graph capture invalidation by AITER custom all-reduce
- #23581 ([AMD] Default SGLANG_USE_AITER_AR to false to avoid HIP graph capture invalidation): fix for the bug (SGLANG_USE_AITER_AR default flipped on HIP)



andyluo7 requested a review from wisclmy0611 as a code owner April 23, 2026 19:31

andyluo7 wants to merge 1 commit into sgl-project:main from andyluo7:add-amd-hy3-preview-cookbook
