[XPU] enable is_act_and_mul for xpu #37481

Open
xuechendi wants to merge 3 commits into vllm-project:main from xuechendi:wip_nemotron_h_xpu_fusedmoe

@xuechendi
Collaborator

@xuechendi xuechendi commented Mar 18, 2026

Purpose

Test nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B-bf16 on XPU and enable relu2_no_mul.

Dependencies:

  1. vllm-xpu-kernels#232 ("Support sycl impl relu2_no_mul for NVIDIA-Nemotron-3-Nano-30B-A3B-bf16"), which adds the 'relu2_no_mul' kernel.
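
For readers unfamiliar with the naming, here is a minimal sketch of the difference between a gated ("act and mul") activation and a no-mul one. It assumes relu2 means squared ReLU, relu(x)**2, and it uses plain Python lists rather than the SYCL kernel from vllm-xpu-kernels#232. The function names are illustrative and are not the kernel's API.

```python
import math


def silu_and_mul(x):
    """Gated activation: split the input in half, apply SiLU to the
    first half (gate) and multiply elementwise by the second half (up)."""
    d = len(x) // 2
    gate, up = x[:d], x[d:]
    return [g / (1.0 + math.exp(-g)) * u for g, u in zip(gate, up)]


def relu2_no_mul(x):
    """No-mul activation (assumed to be squared ReLU): applied elementwise,
    with no gate/up split, so the output width equals the input width."""
    return [max(v, 0.0) ** 2 for v in x]


if __name__ == "__main__":
    row = [-1.0, 2.0, 3.0, 0.5]
    print(len(silu_and_mul(row)))  # half the input width
    print(len(relu2_no_mul(row)))  # same as the input width
```

The gated form halves the width, while the no-mul form preserves it. This is presumably why the is_act_and_mul=False case needs its own kernel path.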

Test Plan

lm_eval   \
--model vllm   \
--model_args pretrained=/mnt/data/nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B-bf16,tensor_parallel_size=4,block_size=16,trust_remote_code=True,enable_expert_parallel=True,attention_backend=TRITON_ATTN   \
--tasks gsm8k   \
--num_fewshot 5   \
--batch_size auto \
--limit 1319 \
--apply_chat_template \
--fewshot_as_multiturn

Test Result

lm_eval   \
--model vllm   \
--model_args pretrained=/mnt/data/nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B-bf16,tensor_parallel_size=4,trust_remote_code=True,enable_expert_parallel=True,attention_backend=TRITON_ATTN   \
--tasks gsm8k   \
--num_fewshot 5   \
--batch_size auto

Accuracy meets the requirement in https://github.com/vllm-project/vllm/blob/main/.buildkite/lm-eval-harness/configs/NVIDIA-Nemotron-3-Nano-30B-A3B-BF16.yaml

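| Tasks | Version | Filter           | n-shot | Metric      | Value  | Stderr   |
|-------|---------|------------------|--------|-------------|--------|----------|
| gsm8k | 3       | flexible-extract | 5      | exact_match | 0.4594 | ± 0.0137 |
|       |         | strict-match     | 5      | exact_match | 0.6945 | ± 0.0127 |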


Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request enables is_act_and_mul=False for XPU platforms in FusedMoE layers. The change correctly adds current_platform.is_xpu() to the supported platforms check. I have one suggestion to improve the clarity of an error message related to this change.

Comment on lines +596 to +598
if not self.moe_config.is_act_and_mul and not (
    current_platform.is_cuda_alike() or current_platform.is_xpu()
):
high

Since this change adds support for XPU when is_act_and_mul=False, the NotImplementedError message raised within this if block is now outdated. It would be beneficial to update it to include 'XPU' to avoid confusion for future developers. For example: "is_act_and_mul=False is supported only for CUDA, ROCm, and XPU for now".
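
A minimal sketch of the guard with the suggested message applied follows. `FakePlatform` and `check_act_and_mul_support` are hypothetical stand-ins written for illustration; they are not vLLM's `current_platform` object or the actual FusedMoE code, and only mirror the condition quoted above.

```python
from dataclasses import dataclass


@dataclass
class FakePlatform:
    """Hypothetical stand-in for vLLM's current_platform (illustration only)."""
    name: str

    def is_cuda_alike(self) -> bool:
        # Assumption for this sketch: "CUDA-alike" covers CUDA and ROCm.
        return self.name in ("cuda", "rocm")

    def is_xpu(self) -> bool:
        return self.name == "xpu"


def check_act_and_mul_support(is_act_and_mul: bool, platform: FakePlatform) -> None:
    """Raise if the no-mul path is requested on an unsupported platform."""
    if not is_act_and_mul and not (
        platform.is_cuda_alike() or platform.is_xpu()
    ):
        raise NotImplementedError(
            "is_act_and_mul=False is supported only for CUDA, ROCm, and XPU for now"
        )


if __name__ == "__main__":
    check_act_and_mul_support(False, FakePlatform("xpu"))  # passes silently
    try:
        check_act_and_mul_support(False, FakePlatform("cpu"))
    except NotImplementedError as exc:
        print(exc)
```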

@jikunshang
Collaborator

Better to mark this as a draft for now, since we need the vllm-xpu-kernels dependency.

@xuechendi
Collaborator Author

Actually, I think we can merge this first, since it will assert on the vllm-xpu-kernels side.
Once the 'no_mul' path is enabled in vllm-xpu-kernels, it will pass automatically.

@mergify
Contributor

mergify Bot commented Mar 25, 2026

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @xuechendi.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

@mergify mergify Bot added the needs-rebase and intel-gpu (Related to Intel GPU) labels and removed the needs-rebase label Mar 25, 2026
@xuechendi
Collaborator Author

@jikunshang, this PR needs the main branch of vllm-xpu-kernels instead of 0.1.5.
Do you think it is OK to merge this PR first, so that those who build vllm-xpu-kernels from source can run Nemotron, or do we need to wait for the next vllm-xpu-kernels release?
cc @Dboyqiao

@mergify
Contributor

mergify Bot commented Apr 17, 2026

Hi @xuechendi, the pre-commit checks have failed. Please run:

uv pip install "pre-commit>=4.5.1"
pre-commit install
pre-commit run --all-files

Then, commit the changes and push to your branch.

For future commits, pre-commit will run automatically on changed files before each commit.

Tip

Is mypy failing?
mypy is run differently in CI. If the failure is related to this check, please use the following command to run it locally:
# For mypy (substitute "3.10" with the failing version if needed)
pre-commit run --hook-stage manual mypy-3.10

@jikunshang
Collaborator

@xuechendi we will have the next vllm-xpu-kernels release this weekend or next Monday; then I will merge this.

@xuechendi xuechendi force-pushed the wip_nemotron_h_xpu_fusedmoe branch from 133439a to 2ca3d18 April 17, 2026 00:37
Signed-off-by: Chendi Xue <chendi.xue@intel.com>

@xuechendi xuechendi force-pushed the wip_nemotron_h_xpu_fusedmoe branch from 2ca3d18 to ff47abf April 17, 2026 00:42

@jikunshang
Collaborator

The pre-commit failure is fixed in #40078.

@jikunshang
Collaborator

The v0.1.7 bump PR is #41019. Please rebase after it is merged. Thanks.

Labels

intel-gpu Related to Intel GPU
