fix(rocm): Enable non-gated MoE (is_act_and_mul=False) support on ROCm#32244

Merged: tjtanaa merged 1 commit into vllm-project:main from rabi:fix_rocm on Jan 16, 2026

Conversation

@rabi (Contributor) commented Jan 13, 2026

Purpose

Models like NemotronH use non-gated MoE with activations like relu2_no_mul. Previously, this was blocked on ROCm because the platform check only allowed CUDA.
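To make the gated/non-gated distinction concrete, here is a minimal NumPy sketch (illustrative helper names, not vLLM's actual kernels): a non-gated activation like relu2_no_mul acts on the whole projection, while a gated activation like silu-and-mul splits the projection in half and multiplies the activated gate half into the other half.

```python
import numpy as np

def relu2_no_mul(x: np.ndarray) -> np.ndarray:
    # Non-gated: a single projection, activation applied elementwise
    # to the whole tensor; output width equals input width.
    r = np.maximum(x, 0.0)
    return r * r

def silu_and_mul(x: np.ndarray) -> np.ndarray:
    # Gated: the projection produces 2*d columns; the SiLU-activated
    # "gate" half multiplies the "up" half, halving the output width.
    d = x.shape[-1] // 2
    gate, up = x[..., :d], x[..., d:]
    return (gate / (1.0 + np.exp(-gate))) * up

x = np.random.randn(4, 8)
print(relu2_no_mul(x).shape)  # (4, 8) -- same width
print(silu_and_mul(x).shape)  # (4, 4) -- halved width
```

Kernels written only for the gated form assume the halved-width layout, which is why they cannot serve activations like relu2_no_mul directly.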

  • Updates platform check from is_cuda() to is_cuda_alike() to allow ROCm
  • Disables AITER kernel for non-gated MoE since AITER only supports gated activations (silu/gelu)
  • Falls back to Triton implementation which properly handles non-gated activations via apply_moe_activation()
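The dispatch change described above can be sketched roughly as follows; the function and parameter names are hypothetical, not vLLM's real API, but the branching mirrors the PR's logic.

```python
def select_fused_moe_backend(is_act_and_mul: bool,
                             is_cuda_alike: bool,
                             aiter_enabled: bool) -> str:
    """Pick a fused-MoE backend. Illustrative only; names are made up."""
    if not is_cuda_alike:
        # Before this PR the check was effectively is_cuda(), which
        # rejected ROCm even though the Triton path works there.
        raise NotImplementedError("non-gated MoE needs a CUDA-like platform")
    if aiter_enabled and not is_act_and_mul:
        # AITER only provides gated (silu/gelu *_and_mul) kernels, so
        # non-gated activations fall back to Triton, which applies the
        # activation generically via apply_moe_activation().
        return "triton"
    return "aiter" if aiter_enabled else "triton"

print(select_fused_moe_backend(is_act_and_mul=False,
                               is_cuda_alike=True,
                               aiter_enabled=True))  # triton
```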

Test Plan

Tested on AMD MI210 GPU with NemotronH model.

Test Result

Model loads and serves successfully.

@mergify mergify bot added the rocm (Related to AMD ROCm) label Jan 13, 2026
@DarkLight1337 DarkLight1337 requested a review from tjtanaa January 13, 2026 08:28
@gemini-code-assist bot left a comment

Code Review

This pull request correctly enables non-gated MoE support on ROCm. The changes are well-targeted: the platform check is updated from is_cuda() to is_cuda_alike() to include ROCm, and the AITER kernel is disabled for non-gated MoE, since it only supports gated activations, letting the system fall back to the Triton implementation, which handles this case. I have no further comments.

@tjtanaa (Collaborator) left a comment

LGTM

@tjtanaa tjtanaa added the ready (ONLY add when PR is ready to merge/full CI is needed) label Jan 14, 2026

mergify bot commented Jan 16, 2026

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @rabi.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork


mergify bot commented Jan 16, 2026

Hi @rabi, the pre-commit checks have failed. Please run:

uv pip install pre-commit
pre-commit install
pre-commit run --all-files

Then, commit the changes and push to your branch.

For future commits, pre-commit will run automatically on changed files before each commit.

Tip

Is mypy or markdownlint failing?
mypy and markdownlint are run differently in CI. If the failure is related to either of these checks, please use the following commands to run them locally:
# For mypy (substitute "3.10" with the failing version if needed)
pre-commit run --hook-stage manual mypy-3.10
# For markdownlint
pre-commit run --hook-stage manual markdownlint

@tjtanaa (Collaborator) left a comment

Thank you for the fix.

@tjtanaa tjtanaa merged commit b66b0d6 into vllm-project:main Jan 16, 2026
52 of 53 checks passed
akh64bit pushed a commit to akh64bit/vllm that referenced this pull request Jan 16, 2026
dsuhinin pushed a commit to dsuhinin/vllm that referenced this pull request Jan 21, 2026
ItzDEXX pushed a commit to ItzDEXX/vllm that referenced this pull request Feb 19, 2026

Labels

ready (ONLY add when PR is ready to merge/full CI is needed), rocm (Related to AMD ROCm)
