[XPU] enable is_act_and_mul for xpu by xuechendi · Pull Request #37481 · vllm-project/vllm

xuechendi · 2026-03-18T21:25:57Z

Purpose

Test nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B-bf16 on XPU and enable relu2_no_mul.

Dependencies:

vllm-xpu-kernels#232 (Support sycl impl relu2_no_mul for NVIDIA-Nemotron-3-Nano-30B-A3B-bf16), which adds the 'relu2_no_mul' kernel
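
For readers unfamiliar with the naming, here is a minimal pure-Python sketch of the difference between a gated "act and mul" activation and a "no mul" activation. It assumes that relu2 denotes squared ReLU and that `no_mul` means the gating multiply is skipped (the `is_act_and_mul=False` case). The functions below are illustrative only and are not the SYCL kernel from vllm-xpu-kernels.

```python
import math


def relu2_no_mul(x):
    """Squared ReLU applied element-wise; output width equals input width."""
    return [max(v, 0.0) ** 2 for v in x]


def silu_and_mul(x):
    """Gated form: split the input in half, apply SiLU to the first half,
    then multiply element-wise by the second half (output is half as wide)."""
    d = len(x) // 2
    gate, up = x[:d], x[d:]
    return [(g / (1.0 + math.exp(-g))) * u for g, u in zip(gate, up)]


x = [-1.0, 0.5, 2.0, 3.0]
print(relu2_no_mul(x))  # four values: no split, no multiply
print(silu_and_mul(x))  # two values: gate half times up half
```

The practical consequence is the shape: the gated form halves the last dimension, while the no-mul form preserves it, so the MoE layer has to know which variant is in use.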

Test Plan

lm_eval   \
--model vllm   \
--model_args pretrained=/mnt/data/nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B-bf16,tensor_parallel_size=4,block_size=16,trust_remote_code=True,enable_expert_parallel=True,attention_backend=TRITON_ATTN   \
--tasks gsm8k   \
--num_fewshot 5   \
--batch_size auto \
--limit 1319 \
--apply_chat_template \
--fewshot_as_multiturn

Test Result

lm_eval   \
--model vllm   \
--model_args pretrained=/mnt/data/nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B-bf16,tensor_parallel_size=4,trust_remote_code=True,enable_expert_parallel=True,attention_backend=TRITON_ATTN   \
--tasks gsm8k   \
--num_fewshot 5   \
--batch_size auto

Accuracy meets the requirement in https://github.com/vllm-project/vllm/blob/main/.buildkite/lm-eval-harness/configs/NVIDIA-Nemotron-3-Nano-30B-A3B-BF16.yaml

Tasks	Version	Filter	n-shot	Metric		Value		Stderr
gsm8k	3	flexible-extract	5	exact_match	↑	0.4594	±	0.0137
		strict-match	5	exact_match	↑	0.6945	±	0.0127

gemini-code-assist

Code Review

This pull request enables is_act_and_mul=False for XPU platforms in FusedMoE layers. The change correctly adds current_platform.is_xpu() to the supported platforms check. I have one suggestion to improve the clarity of an error message related to this change.

gemini-code-assist · 2026-03-18T21:28:15Z

+        if not self.moe_config.is_act_and_mul and not (
+            current_platform.is_cuda_alike() or current_platform.is_xpu()
+        ):


Since this change adds support for XPU when is_act_and_mul=False, the NotImplementedError message raised within this if block is now outdated. It would be beneficial to update it to include 'XPU' to avoid confusion for future developers. For example: "is_act_and_mul=False is supported only for CUDA, ROCm, and XPU for now".
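
The guard and the suggested message can be sketched in isolation as follows. This is a hedged illustration, not the vLLM source: `FakePlatform` and `check_act_and_mul` are hypothetical names standing in for `current_platform` and the check inside the FusedMoE layer, and the message text is the one proposed in the review comment above.

```python
from dataclasses import dataclass


@dataclass
class FakePlatform:
    """Hypothetical stand-in for vLLM's current_platform object."""

    name: str

    def is_cuda_alike(self) -> bool:
        # Assumption for this sketch: "CUDA-alike" covers CUDA and ROCm.
        return self.name in ("cuda", "rocm")

    def is_xpu(self) -> bool:
        return self.name == "xpu"


def check_act_and_mul(is_act_and_mul: bool, platform: FakePlatform) -> None:
    """Reject is_act_and_mul=False on platforms without a no-mul path."""
    if not is_act_and_mul and not (
        platform.is_cuda_alike() or platform.is_xpu()
    ):
        raise NotImplementedError(
            "is_act_and_mul=False is supported only for CUDA, ROCm, and XPU for now"
        )


check_act_and_mul(False, FakePlatform("xpu"))  # accepted after this change
try:
    check_act_and_mul(False, FakePlatform("cpu"))
except NotImplementedError as exc:
    print(exc)
```

Keeping the message in sync with the condition means a user on an unsupported platform sees the full list of supported ones.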

jikunshang · 2026-03-19T01:41:43Z

Better to mark this as a draft for now, since we need the vllm-xpu-kernels dependency.

xuechendi · 2026-03-19T16:09:12Z

Actually, I think we can merge this first, since it will assert on the vllm-xpu-kernels side.
Once the 'no_mul' path is enabled in vllm-xpu-kernels, it will pass automatically.

mergify · 2026-03-25T06:28:37Z

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @xuechendi.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

xuechendi · 2026-04-17T00:27:09Z

@jikunshang, this PR needs the main branch of vllm-xpu-kernels instead of 0.1.5.
Do you think it is OK to merge this PR first, so that anyone who builds vllm-xpu-kernels from source can run Nemotron, or do we need to wait for the next vllm-xpu-kernels release?
cc @Dboyqiao

mergify · 2026-04-17T00:29:14Z

Hi @xuechendi, the pre-commit checks have failed. Please run:

uv pip install "pre-commit>=4.5.1"
pre-commit install
pre-commit run --all-files

Then, commit the changes and push to your branch.

For future commits, pre-commit will run automatically on changed files before each commit.

Tip

Is mypy failing?

mypy is run differently in CI. If the failure is related to this check, please use the following command to run it locally:

# For mypy (substitute "3.10" with the failing version if needed)
pre-commit run --hook-stage manual mypy-3.10

jikunshang · 2026-04-17T00:32:42Z

@xuechendi we will have the next vllm-xpu-kernels release this weekend or next Monday, then I will merge this.

Signed-off-by: Chendi Xue <chendi.xue@intel.com>

jikunshang · 2026-04-17T00:55:30Z

The pre-commit failure is fixed in #40078.

jikunshang · 2026-04-27T14:51:25Z

The v0.1.7 bump PR is #41019. Please rebase after it is merged. Thanks.

xuechendi requested review from mgoin and pavanimajety as code owners March 18, 2026 21:25

gemini-code-assist Bot reviewed Mar 18, 2026

mergify Bot added the needs-rebase and intel-gpu (Related to Intel GPU) labels and removed the needs-rebase label Mar 25, 2026

xuechendi force-pushed the wip_nemotron_h_xpu_fusedmoe branch from 133439a to 2ca3d18 April 17, 2026 00:37

enable is_act_and_mul for xpu

ff47abf

Signed-off-by: Chendi Xue <chendi.xue@intel.com>

xuechendi force-pushed the wip_nemotron_h_xpu_fusedmoe branch from 2ca3d18 to ff47abf April 17, 2026 00:42

Merge branch 'main' into wip_nemotron_h_xpu_fusedmoe

41d2ca8

jikunshang approved these changes Apr 27, 2026

Merge branch 'main' into wip_nemotron_h_xpu_fusedmoe

2d59d14

xuechendi wants to merge 3 commits into vllm-project:main from xuechendi:wip_nemotron_h_xpu_fusedmoe
