[Hy3-preview] Add AMD MI300X/MI325X/MI350X/MI355X support #368
Conversation
Code Review
This pull request adds support for AMD MI300 and MI350 series GPUs to the Hy3-preview model, updating hardware metadata and providing comprehensive installation and deployment instructions. The review feedback identifies a path mismatch in the installation script that would prevent the environment from being set up correctly and suggests improvements for consistency, such as using the official model ID in examples and standardizing the CLI flag format for speculative configuration.
-e PYTHONPATH=/work/build/vllm rocm/vllm-dev:nightly bash
git clone -b feature/support_hy_v3 \
The PYTHONPATH environment variable is set to /work/build/vllm, but the git clone command on the following line creates the repository at /work/vllm. This mismatch will prevent the workaround for the /app/vllm namespace conflict from working correctly. Please ensure the paths are consistent (e.g., by changing the PYTHONPATH to /work/vllm).
Suggested change:

-e PYTHONPATH=/work/vllm rocm/vllm-dev:nightly bash
git clone -b feature/support_hy_v3 \

export VLLM_ROCM_USE_AITER_RMSNORM=1
export VLLM_ROCM_USE_AITER_LINEAR=1
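The namespace conflict behind this fix can be reproduced without the ROCm image at all: an empty package that appears earlier on `sys.path` shadows the real one, which is why `PYTHONPATH` must point at the cloned tree. A minimal sketch using throwaway temp directories (the package name `pkg` and directory names are illustrative, not the actual image layout):

```python
import subprocess
import sys
import tempfile
from pathlib import Path

# Build two fake installs of a package called "pkg":
#   broken/pkg: an empty package (like the stale /app/vllm namespace)
#   good/pkg:   a package that actually defines SamplingParams
root = Path(tempfile.mkdtemp())
(root / "broken" / "pkg").mkdir(parents=True)
(root / "broken" / "pkg" / "__init__.py").write_text("")  # empty, no symbols
(root / "good" / "pkg").mkdir(parents=True)
(root / "good" / "pkg" / "__init__.py").write_text(
    "class SamplingParams: pass\n"
)

def can_import(first: str, second: str) -> bool:
    """Try 'from pkg import SamplingParams' with a given PYTHONPATH order."""
    proc = subprocess.run(
        [sys.executable, "-c", "from pkg import SamplingParams"],
        env={"PYTHONPATH": f"{root / first}:{root / second}"},
        capture_output=True,
    )
    return proc.returncode == 0

# With the empty copy first (the bug), the import fails; putting the real
# tree first on PYTHONPATH (the fix) makes it succeed.
print(can_import("broken", "good"), can_import("good", "broken"))
```

The key point is that Python does not fall through to the second `sys.path` entry once a regular package matches, so the empty `/app/vllm` wins unless the cloned tree is listed first.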
```bash
vllm serve /path/to/Hy3-preview \
```

MTP (recommended on AMD for lower latency, same flags as the NVIDIA path):

```bash
vllm serve /path/to/Hy3-preview \
  --tensor-parallel-size 8 \
  --speculative-config '{"method":"mtp","num_speculative_tokens":1}' \
```
The AMD section uses a JSON string for --speculative-config, while the NVIDIA section (lines 208-209) uses the dot-notation (--speculative-config.method). For consistency across the guide, it is recommended to use the same format.
--speculative-config.method mtp \
--speculative-config.num_speculative_tokens 1 \

- Fix PYTHONPATH path mismatch: clone target is /work/vllm (not /work/build/vllm), so PYTHONPATH must point at /work/vllm to make the editable install actually shadow the empty /app/vllm namespace.
- Use the Hugging Face model id 'tencent/Hy3-preview' in both AMD serve commands instead of '/path/to/Hy3-preview', matching the style of the existing NVIDIA section.
- Switch the AMD MTP example to the dot-notation form (--speculative-config.method mtp / --speculative-config.num_speculative_tokens 1) to match the NVIDIA section's format.

Refs: vllm-project#368 (review)
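Per the review, the two spellings configure the same settings: the JSON string is just an inline encoding of the key/value pairs that the dot-notation flags set one at a time. A quick check of what the JSON form decodes to, using only the standard library (no vLLM required):

```python
import json

# The JSON form used in the AMD MTP example.
spec = json.loads('{"method":"mtp","num_speculative_tokens":1}')

# The dot-notation flags from the NVIDIA section set the same two keys:
#   --speculative-config.method mtp
#   --speculative-config.num_speculative_tokens 1
assert spec == {"method": "mtp", "num_speculative_tokens": 1}
print(spec["method"], spec["num_speculative_tokens"])
```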
Thanks @gemini-code-assist! Addressed all four comments in commit b7f1a48:
Thanks for the update, @andyluo7. The changes in commit
@andyluo7 LGTM! Can you sign off your commits?
Tencent Hy3-preview works on AMD ROCm via vLLM PR #40681
(stevenkuang-tencent/vllm@feature/support_hy_v3). End-to-end
validated on a single 8xMI300X (gfx942) node and an 8xMI355X
(gfx950) node with TP=8, BF16, both with and without MTP
speculative decoding. MI325X and MI350X are listed as verified by
hardware parity (gfx942 / gfx950 respectively); the same image and
flags apply.
Changes:
meta.hardware:
+ mi300x: verified
+ mi325x: verified
+ mi350x: verified
+ mi355x: verified
meta.performance_headline: extended to mention AMD platforms.
hardware_overrides.amd:
install_note explaining that until PR #40681 merges, AMD users
must build vLLM editable from the PR branch into the published
rocm/vllm-dev:nightly image. Includes the canonical reproducer
(docker run + pip install) and the PYTHONPATH workaround for the
/app/vllm namespace conflict in the base image.
extra_env enables the AITER fast paths used during validation:
VLLM_ROCM_USE_AITER=1
VLLM_ROCM_USE_AITER_MOE=1
VLLM_ROCM_USE_AITER_MHA=1
VLLM_ROCM_USE_AITER_RMSNORM=1
VLLM_ROCM_USE_AITER_LINEAR=1
guide:
Adds a 'Serving on 8xAMD MI300X / MI325X / MI350X / MI355X'
section with the standalone serve commands (with and without
MTP). The existing NVIDIA section is preserved unchanged.
Refs: vllm-project/vllm#40681
Validated with: node scripts/build-recipes-api.mjs
Result: '✓ JSON API: 78 models, 8 strategies' with no errors.
Signed-off-by: Andy Luo <andy.linluo@gmail.com>
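The five AITER toggles listed in the commit message above are plain environment variables, so they can be kept in one mapping and applied to the serve process in a single place. A sketch (the commented `subprocess` launch is illustrative only and is not executed here):

```python
import os

# AITER fast-path toggles from the commit message (values as validated).
AITER_ENV = {
    "VLLM_ROCM_USE_AITER": "1",
    "VLLM_ROCM_USE_AITER_MOE": "1",
    "VLLM_ROCM_USE_AITER_MHA": "1",
    "VLLM_ROCM_USE_AITER_RMSNORM": "1",
    "VLLM_ROCM_USE_AITER_LINEAR": "1",
}

def serve_env() -> dict:
    """Copy of the current environment with the AITER flags applied."""
    env = os.environ.copy()
    env.update(AITER_ENV)
    return env

# Illustrative launch, mirroring the guide's serve command:
# subprocess.run(["vllm", "serve", "tencent/Hy3-preview",
#                 "--tensor-parallel-size", "8"], env=serve_env())
print(sorted(k for k in serve_env() if k.startswith("VLLM_ROCM_USE_AITER")))
```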
- Fix PYTHONPATH path mismatch: clone target is /work/vllm (not /work/build/vllm), so PYTHONPATH must point at /work/vllm to make the editable install actually shadow the empty /app/vllm namespace.
- Use the Hugging Face model id 'tencent/Hy3-preview' in both AMD serve commands instead of '/path/to/Hy3-preview', matching the style of the existing NVIDIA section.
- Switch the AMD MTP example to the dot-notation form (--speculative-config.method mtp / --speculative-config.num_speculative_tokens 1) to match the NVIDIA section's format.

Refs: vllm-project#368 (review)

Signed-off-by: Andy Luo <andy.linluo@gmail.com>
Done — signed off both commits and force-pushed (b7f1a48 to cd7b6da).
Summary
Adds AMD MI300X / MI325X / MI350X / MI355X to the verified hardware list for the `tencent/Hy3-preview` recipe.

End-to-end validated on a single 8×MI300X (gfx942) node (SVR08) and an 8×MI355X (gfx950) node (Tensorwave mia1-p01-g07) with TP=8, BF16, both with and without MTP speculative decoding. MI325X (gfx942) and MI350X (gfx950) are listed as verified by hardware parity; the same image, build, and flags apply to those GPUs.

Changes

`models/tencent/Hy3-preview.yaml` only. No code changes elsewhere.

- `meta.hardware`: add `mi300x: verified`, `mi325x: verified`, `mi350x: verified`, `mi355x: verified`.
- `meta.performance_headline`: extended to mention AMD platforms.
- `hardware_overrides.amd`:
  - `install_note` explaining that until vLLM PR #40681 (Hy3-preview model code) merges, AMD users must build vLLM editable from the PR branch into the published `rocm/vllm-dev:nightly` image. Includes the canonical reproducer (`docker run` + `pip install`) and the `PYTHONPATH=/work/build/vllm` workaround for the `/app/vllm` namespace conflict in the base image (it silently breaks `from vllm import SamplingParams` in subprocesses otherwise).
  - `extra_env` enables the AITER fast paths used during validation: `VLLM_ROCM_USE_AITER=1`, `VLLM_ROCM_USE_AITER_MOE=1`, `VLLM_ROCM_USE_AITER_MHA=1`, `VLLM_ROCM_USE_AITER_RMSNORM=1`, `VLLM_ROCM_USE_AITER_LINEAR=1`.
- `guide`: adds a "Serving on 8×AMD MI300X / MI325X / MI350X / MI355X" section with both no-MTP and MTP command examples. The existing NVIDIA section is preserved unchanged.

Validation
YAML parses cleanly: `node scripts/build-recipes-api.mjs` reports '✓ JSON API: 78 models, 8 strategies' with no errors.

Refs

vllm-project/vllm#40681

models/tencent/Hunyuan-A13B-Instruct.yaml.
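As a final sanity check, the `meta.hardware` entries this PR adds can be asserted programmatically. A sketch with the values inlined (the YAML loader is omitted; the dict below mirrors only the PR's additions, not the full recipe):

```python
# Excerpt mirroring the meta.hardware additions from this PR.
recipe_meta = {
    "hardware": {
        "mi300x": "verified",
        "mi325x": "verified",
        "mi350x": "verified",
        "mi355x": "verified",
    }
}

def verified_amd(meta: dict) -> list:
    """AMD Instinct entries marked 'verified', sorted by name."""
    return sorted(name for name, status in meta["hardware"].items()
                  if name.startswith("mi") and status == "verified")

print(verified_amd(recipe_meta))  # all four Instinct parts
```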