
[MoE Refactor] Rename FusedMoE.make_expert_params_mapping to fused_moe_make_expert_params_mapping #40671

Merged

robertgshaw2-redhat merged 3 commits into vllm-project:main from neuralmagic:rename-method on Apr 23, 2026

Conversation

@bnellnm (Collaborator) commented Apr 23, 2026

Purpose

This is prep work for deleting the FusedMoE class and replacing it with MoERunner. This PR simply renames FusedMoE.make_expert_params_mapping to fused_moe_make_expert_params_mapping and updates the models that use it.
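The forwarding pattern behind the rename can be illustrated with a minimal runnable sketch. The class and function bodies below are simplified stand-ins, not vLLM's actual implementation; only the shape of the migration (classmethod wrapped by a free function that call sites switch to) reflects the PR.

```python
class FusedMoE:
    @classmethod
    def make_expert_params_mapping(cls, ckpt_gate_proj_name, ckpt_down_proj_name,
                                   ckpt_up_proj_name, num_experts,
                                   num_redundant_experts=0):
        # Stand-in body: the real method maps checkpoint weight names
        # to fused expert parameters.
        names = (ckpt_gate_proj_name, ckpt_up_proj_name, ckpt_down_proj_name)
        return [(f"experts.e{e}.{name}", name, e)
                for e in range(num_experts) for name in names]


def fused_moe_make_expert_params_mapping(model, ckpt_gate_proj_name,
                                         ckpt_down_proj_name, ckpt_up_proj_name,
                                         num_experts, num_redundant_experts=0):
    # Temporary forwarder, as the PR's diff comment notes; the `model`
    # parameter anticipates the future MoERunner-based implementation.
    return FusedMoE.make_expert_params_mapping(
        ckpt_gate_proj_name, ckpt_down_proj_name, ckpt_up_proj_name,
        num_experts, num_redundant_experts)


mapping = fused_moe_make_expert_params_mapping(
    None, "gate_proj", "down_proj", "up_proj", num_experts=2)
```

Call sites migrate by swapping `FusedMoE.make_expert_params_mapping(...)` for `fused_moe_make_expert_params_mapping(model, ...)`; once every model does so, the classmethod (and eventually the class) can be deleted.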

Test Plan

CI

Test Result

cc @robertgshaw2-redhat , @yzong-rh


Essential Elements of an Effective PR Description Checklist
  • The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
  • The test plan, such as providing a test command.
  • The test results, such as pasting the results comparison before and after, or e2e results.
  • (Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.

Signed-off-by: Bill Nell <bnell@redhat.com>

@claude claude Bot left a comment


Claude Code Review

This pull request is from a fork — automated review is disabled. A repository maintainer can comment @claude review to run a one-time review.

@mergify mergify Bot added the deepseek (Related to DeepSeek models), llama (Related to Llama models), qwen (Related to Qwen models), gpt-oss (Related to GPT-OSS models), and speculative-decoding labels Apr 23, 2026
Signed-off-by: Bill Nell <bnell@redhat.com>
@gemini-code-assist gemini-code-assist Bot (Contributor) left a comment


Code Review

This pull request introduces a standalone function fused_moe_make_expert_params_mapping in the fused_moe layer and updates numerous model implementations to use this new interface for expert parameter mapping. A critical issue was identified in the new function signature, which lacks default values for several parameters; this will cause runtime TypeError exceptions at call sites that only provide a subset of the required arguments. Providing default values for these parameters is necessary to maintain compatibility with existing model implementations.

Comment on lines +1621 to +1628

```python
# This is a temporary forwarding method which will be removed/modified later.
def fused_moe_make_expert_params_mapping(
    model: torch.nn.Module,
    ckpt_gate_proj_name: str,
    ckpt_down_proj_name: str,
    ckpt_up_proj_name: str,
    num_experts: int,
    num_redundant_experts: int = 0,
```

critical

The new function fused_moe_make_expert_params_mapping is missing default values for ckpt_up_proj_name and num_experts. Many models (e.g., AXK1, afmoe, bailing_moe, deepseek_eagle) call this function with only 3 arguments (model, ckpt_gate_proj_name, ckpt_down_proj_name), which will now result in a TypeError at runtime because the function expects 5 required arguments.

You should provide default values for these parameters to maintain compatibility with existing model implementations that do not specify them.

```python
def fused_moe_make_expert_params_mapping(
    model: torch.nn.Module,
    ckpt_gate_proj_name: str,
    ckpt_down_proj_name: str,
    ckpt_up_proj_name: Optional[str] = None,
    num_experts: int = 0,
    num_redundant_experts: int = 0,
) -> list[tuple[str, str, int, str]]:
```
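The failure mode the reviewer describes can be reproduced with a stub carrying the same parameter list as the merged signature (the body is elided; only the signature matters for the TypeError):

```python
# Stub with the merged signature: ckpt_up_proj_name and num_experts
# have no defaults, so they are required positional arguments.
def fused_moe_make_expert_params_mapping(model, ckpt_gate_proj_name,
                                         ckpt_down_proj_name, ckpt_up_proj_name,
                                         num_experts, num_redundant_experts=0):
    return []


# A hypothetical 3-argument call site of the kind the reviewer flags:
try:
    fused_moe_make_expert_params_mapping(None, "gate_proj", "down_proj")
    raised = False
except TypeError:
    # Python rejects the call before the body runs: two required
    # positional arguments are missing.
    raised = True
```

Whether any real call sites in the merged PR actually pass only three arguments is the point under review; the snippet only demonstrates why missing defaults would surface as a runtime rather than an import-time error.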

@github-project-automation github-project-automation Bot moved this from To Triage to Ready in gpt-oss Issues & Enhancements Apr 23, 2026
@robertgshaw2-redhat robertgshaw2-redhat enabled auto-merge (squash) April 23, 2026 03:01
@github-actions github-actions Bot added the ready (ONLY add when PR is ready to merge/full CI is needed) label Apr 23, 2026
Signed-off-by: Bill Nell <bnell@redhat.com>
auto-merge was automatically disabled April 23, 2026 03:05

Head branch was pushed to by a user without write access

@robertgshaw2-redhat robertgshaw2-redhat merged commit 1c2c1eb into vllm-project:main Apr 23, 2026
71 checks passed
@bnellnm bnellnm deleted the rename-method branch April 24, 2026 19:40
avinashsingh77 pushed a commit to avinashsingh77/vllm that referenced this pull request Apr 27, 2026
…e_make_expert_params_mapping (vllm-project#40671)

Signed-off-by: Bill Nell <bnell@redhat.com>
Signed-off-by: Avinash Singh <avinashsingh.rcoem@gmail.com>
Lafunamor pushed a commit to Lafunamor/vllm that referenced this pull request May 1, 2026
…e_make_expert_params_mapping (vllm-project#40671)

Signed-off-by: Bill Nell <bnell@redhat.com>
Signed-off-by: Adrian <info@zzit.ch>
Copilot AI pushed a commit to hongbolv/vllm that referenced this pull request May 7, 2026
…e_make_expert_params_mapping (vllm-project#40671)

Signed-off-by: Bill Nell <bnell@redhat.com>
Co-authored-by: hongbolv <33214277+hongbolv@users.noreply.github.com>
wangxiyuan pushed a commit to vllm-project/vllm-ascend that referenced this pull request May 8, 2026
### What this PR does / why we need it?
Based on #8856.

Sync to vLLM `4d51588e2381018348f1022dfa3a7698899805b7`.

Fix:

---

- MoE refactor @wxsIcey, introduced by vllm-project/vllm#35782, vllm-project/vllm#35949, vllm-project/vllm#40560, vllm-project/vllm#40671.
- `TypeError: rejection_sample() got an unexpected keyword argument 'synthetic_mode'` -> Add `synthetic_mode` and `synthetic_conditional_rates` params to ascend `rejection_sample()`.
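The `rejection_sample()` fix above is a signature-compatibility change; it can be sketched as follows. The real ascend `rejection_sample()` has a different body and more parameters; this stub only illustrates accepting the new keyword arguments with backward-compatible defaults so upstream call sites stop raising TypeError.

```python
def rejection_sample(target_probs, draft_token_ids, *,
                     synthetic_mode=False,
                     synthetic_conditional_rates=None):
    # Placeholder acceptance logic; the point is only that the new kwargs
    # are part of the signature (accepted, and here simply unused).
    del synthetic_mode, synthetic_conditional_rates
    return list(draft_token_ids)


# Upstream vLLM now passes the new kwargs; with the defaults in place,
# older callers that omit them keep working too.
out = rejection_sample([0.9, 0.1], [1, 2], synthetic_mode=True,
                       synthetic_conditional_rates=[0.5, 0.5])
```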

---

| # | Error | Category | Upstream Commit | Affected vllm-ascend Path | Fix |
| :- | :-- | :-- | :-- | :-- | :-- |
| 1 | `encoder_compilation_time` AttributeError | Code Bug | `c08f3b2a6` ([#39240](vllm-project/vllm#39240)) | `worker/worker.py:567` | `getattr` fallback |
| 2 | `AscendRMSNormGated activation` TypeError | Code Bug | `893611813` ([#40245](vllm-project/vllm#40245)) | `ops/layernorm.py:160`, `_310p/ops/layernorm.py:43` | Accept `activation` kwarg |
| 3 | `AscendFusedMoEMethod.apply topk_weights` TypeError | Code Bug | many (e.g., `5e584ce9e` ([#35782](vllm-project/vllm#35782)), `809d83c2d` ([#40560](vllm-project/vllm#40560)), `4d51588e2` ([#40860](vllm-project/vllm#40860))) | `ops/fused_moe/fused_moe.py:107` | Major refactor; follow-up PR |
| 4 | `_all_lora_classes` is tuple | Code Bug | `a250f1bd5` ([#35077](vllm-project/vllm#35077)) | `lora/utils.py:188` | Rebuild tuple instead of `.add()` |
| 5 | `ProfilingChunkScheduler hash_block_size` TypeError | Code Bug | `7b1bc0a3e` ([#40946](vllm-project/vllm#40946)) | `core/scheduler_profiling_chunk.py:57` | Forward new kwarg |
| 6 | `_moe_C.topk_softmax` AttributeError | Code Bug | MoE router refactor | router dispatch override needed | Provide `torch_npu` topk-softmax (with Issue 4) |
| 7 | global experts shape mismatch | Code Bug | follow-on of Issue 4 | `quantization/methods/w8a8_dynamic.py:198` | Resolve once Issue 4 is fixed |

- vLLM main:
vllm-project/vllm@d886c26

---------

Signed-off-by: wxsIcey <1790571317@qq.com>
Signed-off-by: Shanshan Shen <87969357+shen-shanshan@users.noreply.github.com>
Signed-off-by: gcanlin <canlinguosdu@gmail.com>
Signed-off-by: shen-shanshan <467638484@qq.com>
Co-authored-by: wxsIcey <1790571317@qq.com>
Co-authored-by: gcanlin <canlinguosdu@gmail.com>

Labels

deepseek (Related to DeepSeek models), gpt-oss (Related to GPT-OSS models), llama (Related to Llama models), qwen (Related to Qwen models), ready (ONLY add when PR is ready to merge/full CI is needed), speculative-decoding

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

2 participants