feat(moe): Add is_act_and_mul=False support for Triton MoE kernels by rabi · Pull Request #31645 · vllm-project/vllm

rabi · 2026-01-03T11:27:55Z

Purpose

Add support for non-fused activations (relu2_no_mul, silu_no_mul, gelu_no_mul) in Triton MoE kernels for models like Nemotron-H that use non-SwiGLU activations.

- Add is_act_and_mul flag to FusedMoEQuantConfig
- Implement non-fused activations in modular_kernel.py
- Fix workspace sizes in TritonExperts for is_act_and_mul=False
- Enable on ROCm when AITER is disabled
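
To illustrate the distinction the flag captures, here is a minimal pure-Python sketch; it is not the code from this PR. It assumes the usual conventions: a gated ("act and mul") activation splits each row in half and returns act(gate) * up, so the output is half as wide as the input, while a "no_mul" activation is applied elementwise and keeps the width. It also assumes relu2 means squared ReLU. The helper names apply_activation and activation_out_dim are hypothetical and exist only for this example.

```python
import math


def silu(v: float) -> float:
    # SiLU / swish: v * sigmoid(v)
    return v / (1.0 + math.exp(-v))


def gelu(v: float) -> float:
    # Exact (erf-based) GELU
    return 0.5 * v * (1.0 + math.erf(v / math.sqrt(2.0)))


def relu2(v: float) -> float:
    # Squared ReLU (assumed meaning of "relu2")
    r = max(v, 0.0)
    return r * r


# Hypothetical lookup tables for this sketch only.
NO_MUL = {"silu_no_mul": silu, "gelu_no_mul": gelu, "relu2_no_mul": relu2}
GATED = {"silu": silu, "gelu": gelu}


def apply_activation(name: str, row: list[float]) -> list[float]:
    """Apply a gated or non-gated activation to one row of the first GEMM output."""
    if name in NO_MUL:
        f = NO_MUL[name]
        # Elementwise: output width equals input width.
        return [f(v) for v in row]
    f = GATED[name]
    d = len(row) // 2
    # Gated: first half is the gate, second half is the up projection.
    return [f(g) * u for g, u in zip(row[:d], row[d:])]


def activation_out_dim(n: int, is_act_and_mul: bool) -> int:
    """Width of the activation output for a first-GEMM output of width n."""
    return n // 2 if is_act_and_mul else n
```

Under these assumptions, any intermediate buffer sized as N // 2 for the gated case has to be sized as N when is_act_and_mul=False, which is the kind of mismatch the workspace fix above addresses.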

Test Plan

Add test_triton_moe_no_act_mul.py for CUDA and ROCm
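
The test file itself is not shown in this conversation. As a rough idea of what a kernel test could compare against, here is a small reference MoE forward pass for a single token with a non-gated activation. Everything in it is an assumption made for illustration: the function name reference_moe_no_mul, the weight layouts (w1[e] is N x K, w2[e] is K x N), and the choice of squared ReLU as the activation.

```python
def matvec(w: list[list[float]], x: list[float]) -> list[float]:
    # Dense matrix-vector product: one output per row of w.
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]


def relu2(v: float) -> float:
    # Squared ReLU, used here as the non-gated activation (assumption).
    r = max(v, 0.0)
    return r * r


def reference_moe_no_mul(
    x: list[float],
    w1: list[list[list[float]]],
    w2: list[list[list[float]]],
    topk_ids: list[int],
    topk_weights: list[float],
) -> list[float]:
    """Reference MoE output for one token when is_act_and_mul=False.

    x is the hidden vector of length K. For each selected expert e,
    w1[e] has shape N x K (not 2N x K, since there is no gate half)
    and w2[e] has shape K x N.
    """
    out = [0.0] * len(x)
    for e, weight in zip(topk_ids, topk_weights):
        h = [relu2(v) for v in matvec(w1[e], x)]  # width stays N
        y = matvec(w2[e], h)                      # project back to K
        out = [o + weight * yi for o, yi in zip(out, y)]
    return out
```

A kernel test could then check the fused output against such a reference within a tolerance suited to the dtype under test.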

Test Result

Tests pass in a local environment and will also run in CI.

gemini-code-assist

Code Review

This pull request enables MoE models with is_act_and_mul=False to run on ROCm by leveraging Triton kernels. The changes are well-structured, introducing is_act_and_mul to FusedMoEQuantConfig, updating workspace sizing calculations, and adding support for non-fused activations. The inclusion of a new test file for ROCm is a great addition for ensuring correctness. I have one suggestion to enhance the performance of the new activation function implementations by minimizing intermediate tensor allocations.

vllm/model_executor/layers/fused_moe/modular_kernel.py

tests/kernels/moe/test_rocm_moe_no_act_mul.py

tests/kernels/moe/test_triton_moe_no_act_mul.py

mergify · 2026-01-06T08:41:44Z

Hi @rabi, the pre-commit checks have failed. Please run:

uv pip install pre-commit
pre-commit install
pre-commit run --all-files

Then, commit the changes and push to your branch.

For future commits, pre-commit will run automatically on changed files before each commit.

Tip

Is mypy or markdownlint failing?

mypy and markdownlint are run differently in CI. If the failure is related to either of these checks, please use the following commands to run them locally:

# For mypy (substitute "3.10" with the failing version if needed)
pre-commit run --hook-stage manual mypy-3.10
# For markdownlint
pre-commit run --hook-stage manual markdownlint

Add support for non-fused activations (relu2_no_mul, silu_no_mul, gelu_no_mul) in Triton MoE kernels for models like Nemotron-H that use non-SwiGLU activations.

- Add is_act_and_mul flag to FusedMoEQuantConfig
- Implement non-fused activations in modular_kernel.py
- Fix workspace sizes in TritonExperts for is_act_and_mul=False
- Enable on ROCm when AITER is disabled
- Add test_triton_moe_no_act_mul.py for CUDA and ROCm

Signed-off-by: rabi <ramishra@redhat.com>

tjtanaa

LGTM

mgoin · 2026-01-08T16:11:20Z

@tjtanaa @rabi @danielafrimi can we actually revert this PR and land #31528 instead? I feel this fix adding is_act_and_mul to the quant config and the activation implementations are not as nice as the refactor in the other PR

rabi requested review from mgoin, pavanimajety and tjtanaa as code owners January 3, 2026 11:27

mergify bot added the rocm label (Related to AMD ROCm) Jan 3, 2026

gemini-code-assist bot reviewed Jan 3, 2026

vllm/model_executor/layers/fused_moe/modular_kernel.py (outdated, resolved)

rabi force-pushed the is_act_and_mul branch from 413c98b to 381b0f7 Compare January 3, 2026 11:49

tjtanaa reviewed Jan 5, 2026

tests/kernels/moe/test_rocm_moe_no_act_mul.py (outdated, resolved)

rabi force-pushed the is_act_and_mul branch 3 times, most recently from 373609e to e586bef Compare January 5, 2026 08:11

tjtanaa reviewed Jan 6, 2026

tests/kernels/moe/test_triton_moe_no_act_mul.py (resolved)

rabi force-pushed the is_act_and_mul branch from e586bef to df6dd1a Compare January 6, 2026 08:13

rabi requested review from WoosukKwon, tlrmchlsmth and yewentao256 as code owners January 6, 2026 08:13

rabi changed the title from "feat(rocm): Support is_act_and_mul=False MoE with Triton" to "feat(moe): Add is_act_and_mul=False support for Triton MoE kernels" Jan 6, 2026

rabi force-pushed the is_act_and_mul branch from df6dd1a to 0f32bf5 Compare January 6, 2026 08:51

tjtanaa added the ready label (ONLY add when PR is ready to merge/full CI is needed) Jan 6, 2026

Merge branch 'main' into is_act_and_mul

3a21f32

tjtanaa requested a review from robertgshaw2-redhat January 7, 2026 09:11

tjtanaa approved these changes Jan 8, 2026

tjtanaa merged commit 25eef3d into vllm-project:main Jan 8, 2026
52 checks passed

danielafrimi mentioned this pull request Jan 8, 2026

[FIX] Add NO_MUL activation support for modular kernel path #31528

Merged

mgoin mentioned this pull request Jan 8, 2026

Revert "feat(moe): Add is_act_and_mul=False support for Triton MoE kernels" #31978

Merged

yugong333 pushed a commit to yugong333/vllm that referenced this pull request Jan 9, 2026

feat(moe): Add is_act_and_mul=False support for Triton MoE kernels (v…

f51b334

…llm-project#31645) Signed-off-by: rabi <ramishra@redhat.com>

akh64bit pushed a commit to akh64bit/vllm that referenced this pull request Jan 16, 2026

feat(moe): Add is_act_and_mul=False support for Triton MoE kernels (v…

375d986

…llm-project#31645) Signed-off-by: rabi <ramishra@redhat.com>

dsuhinin pushed a commit to dsuhinin/vllm that referenced this pull request Jan 21, 2026

feat(moe): Add is_act_and_mul=False support for Triton MoE kernels (v…

d76d913

…llm-project#31645) Signed-off-by: rabi <ramishra@redhat.com> Signed-off-by: dsuhinin <suhinin.dmitriy@gmail.com>

ItzDEXX pushed a commit to ItzDEXX/vllm that referenced this pull request Feb 19, 2026

feat(moe): Add is_act_and_mul=False support for Triton MoE kernels (v…

2f6c3b4

…llm-project#31645) Signed-off-by: rabi <ramishra@redhat.com>

tjtanaa merged 2 commits into vllm-project:main from rabi:is_act_and_mul

3 participants
