
fix(gguf): Make GGUFMoEMethod.apply() parameters optional#30423

Closed
kitaekatt wants to merge 1 commit into vllm-project:main from kitaekatt:fix/gguf-moe-method-signature

Conversation

@kitaekatt
Contributor

Summary

  • Makes top_k and renormalize keyword arguments with defaults in GGUFMoEMethod.apply()
  • These parameters were required but neither passed by the caller nor used in the method body
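
The change can be sketched as follows (the parameter names come from the PR description; the `None` defaults are an assumption for illustration, not necessarily the values used in the actual patch):

```python
class GGUFMoEMethodSketch:
    """Sketch of the patched GGUFMoEMethod.apply() signature.

    The None defaults are an assumption for illustration; the real
    patch may choose different default values.
    """

    def apply(self, layer, x, router_logits, top_k=None, renormalize=None):
        # top_k and renormalize are accepted for signature compatibility
        # but unused, matching the PR: the real method body calls
        # layer.select_experts() without them.
        return router_logits


# The caller passes only layer, x, and router_logits, which now works
# without raising a TypeError.
method = GGUFMoEMethodSketch()
print(method.apply(layer=None, x=None, router_logits="logits"))
```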

Problem

GGUF MoE models (e.g., Qwen3-30B-A3B-Instruct-GGUF) fail with:

TypeError: GGUFMoEMethod.apply() missing 2 required positional arguments: 'top_k' and 'renormalize'

Root Cause

  • FusedMoE.forward_impl() in layer.py:1946 calls quant_method.apply(layer=self, x=..., router_logits=...)
  • Base class FusedMoEMethodBase.apply() only requires (layer, x, router_logits)
  • GGUFMoEMethod.apply() had extra required positional args that weren't being passed
  • These args were also not used in the method body (it calls layer.select_experts() without them)
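
The mismatch described above can be reproduced with a minimal standalone sketch (class and function names here are simplified stand-ins for the vLLM code, and the `None` defaults in the fixed version are an assumption):

```python
class BrokenMoEMethod:
    # Mirrors the old GGUFMoEMethod.apply(): top_k and renormalize are
    # required positional parameters that the caller never supplies.
    def apply(self, layer, x, router_logits, top_k, renormalize):
        return router_logits


class FixedMoEMethod:
    # The proposed fix: keyword arguments with defaults, matching how
    # the caller invokes the method.
    def apply(self, layer, x, router_logits, top_k=None, renormalize=None):
        return router_logits


def forward_impl(quant_method):
    # Mirrors FusedMoE.forward_impl(): only three arguments are passed.
    return quant_method.apply(layer=None, x=None, router_logits="logits")


try:
    forward_impl(BrokenMoEMethod())
except TypeError as exc:
    # e.g. "...apply() missing 2 required positional arguments:
    # 'top_k' and 'renormalize'"
    print(f"TypeError: {exc}")

print(forward_impl(FixedMoEMethod()))  # the fixed signature succeeds
```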

Test plan

  • Test with Qwen3-30B-A3B-Instruct-GGUF model
  • Verify GGUF MoE inference works correctly

🤖 Generated with Claude Code

The GGUFMoEMethod.apply() method had required positional parameters
(top_k, renormalize) that were not being passed by the caller in
FusedMoE.forward_impl(). The base class FusedMoEMethodBase.apply()
only requires (layer, x, router_logits).

These parameters were also not used in the method body - it calls
layer.select_experts() without passing them.

This change makes top_k and renormalize keyword arguments with defaults,
matching how the caller invokes the method.

Fixes GGUF MoE models (e.g., Qwen3-30B-A3B-Instruct-GGUF) failing with:
TypeError: GGUFMoEMethod.apply() missing 2 required positional arguments: 'top_k' and 'renormalize'
@mergify

mergify bot commented Dec 10, 2025

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @kitaekatt.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

@mergify mergify bot added the needs-rebase label Dec 10, 2025
Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request correctly addresses a TypeError in GGUFMoEMethod.apply() by making the top_k and renormalize parameters optional. According to the pull request description, these parameters were required but not always passed by the caller, and as they are not used within the method body, making them optional with default values is a safe and effective fix. The change is minimal and directly resolves the bug affecting GGUF MoE models. The implementation looks good.

@kitaekatt
Contributor Author

Closing this PR as it's no longer needed.

Upstream PR #29066 ([MoE][Refactor] Remove most arguments to FusedMoEMethodBase.apply) resolved the underlying issue by removing the extra arguments from the method signature entirely.

Thanks to the vLLM team for the comprehensive MoE refactor!

@kitaekatt kitaekatt closed this Dec 10, 2025
@kitaekatt kitaekatt deleted the fix/gguf-moe-method-signature branch December 10, 2025 21:45
