[Bugfix][ROCm] Fix WNA16 MoE quant config init and Qwen3-VL tie_word_embeddings#34630
laudney wants to merge 2 commits into vllm-project:main
Conversation
Both MoeWNA16Method and CompressedTensorsWNA16MoEMethod pass self.moe_quant_config to fused_experts() without ensuring it has been initialized. When it is still None, fused_experts() falls back to FUSED_MOE_UNQUANTIZED_CONFIG (use_int4_w4a16=False), so the int4 packed-weight dimension assertion fails (hidden_size 2048 != w1 1024). Add a lazy-init guard in both apply() methods so the quant config is built on first use if ensure_moe_quant_config_init() hasn't run yet.
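The lazy-init guard described above can be sketched as follows. This is a minimal, self-contained illustration of the pattern, not vLLM's actual classes: `FakeQuantMethod` and the dict returned by `get_fused_moe_quant_config` are stand-ins for the real method objects and `FusedMoEQuantConfig`.

```python
# Sketch of the lazy-init guard pattern (illustrative names, not vLLM's API).
class FakeQuantMethod:
    def __init__(self):
        # May still be None when apply() runs first, e.g. if
        # ensure_moe_quant_config_init() was bypassed.
        self.moe_quant_config = None

    def get_fused_moe_quant_config(self, layer):
        # Stand-in for the real config builder; returns a plain dict here.
        return {"use_int4_w4a16": True, "layer": layer}

    def apply(self, layer):
        # Guard: build the quant config on first use so fused_experts()
        # never sees None and falls back to the unquantized config.
        if self.moe_quant_config is None:
            self.moe_quant_config = self.get_fused_moe_quant_config(layer)
        return self.moe_quant_config


method = FakeQuantMethod()
cfg = method.apply("layer0")
```

After the first `apply()` call the config is cached, so subsequent calls reuse the same object instead of rebuilding it.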
Some Qwen3-VL MoE configs lack tie_word_embeddings, causing an AttributeError during model init. Use getattr with a False default.
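The defensive access works as in this small sketch; `SimpleNamespace` stands in for a checkpoint config object that happens to be missing the field.

```python
from types import SimpleNamespace

# A config object that lacks the tie_word_embeddings field entirely,
# as some Qwen3-VL MoE checkpoints do.
config = SimpleNamespace(hidden_size=2048)

# Direct access (config.tie_word_embeddings) would raise AttributeError;
# getattr with a False default degrades gracefully instead.
tie = getattr(config, "tie_word_embeddings", False)
```

When the field is present, `getattr` returns its actual value, so behavior is unchanged for configs that set it.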
Code Review
This pull request introduces defensive bug fixes for ROCm/RDNA4 environments and Qwen3-VL MoE models. Specifically, it addresses an issue where the FusedMoEQuantConfig was not initialized before the first apply() call in the WNA16 quantization path, which could lead to incorrect kernel execution or failures during the first inference pass, especially when using torch.compile. Additionally, it adds safety to the tie_word_embeddings attribute access in Qwen3MoeLLMForCausalLM to prevent AttributeError when the field is missing from checkpoint configurations. These changes improve the robustness of the model executor without altering existing behavior for standard configurations.
```python
if self.moe_quant_config is None:
    self.moe_quant_config = self.get_fused_moe_quant_config(layer)
```
The lazy initialization of moe_quant_config here is critical for correctness when the standard initialization sequence is bypassed, such as during the first compiled forward pass. Without this, fused_experts would default to an unquantized configuration, leading to incorrect results for WNA16 quantized layers.
```python
if self.moe_quant_config is None:
    self.moe_quant_config = self.get_fused_moe_quant_config(layer)
```
Similar to the fix in compressed_tensors_moe.py, this lazy initialization ensures that the quantization configuration is available before the first kernel invocation. This is particularly important for backends that rely on fused_experts receiving a valid quant_config to select the appropriate optimized kernels.
```diff
     prefix=maybe_prefix(prefix, "lm_head"),
 )
-if self.config.tie_word_embeddings:
+if getattr(self.config, "tie_word_embeddings", False):
```
Do you have this PR in your branch? I think that it should have solved the quant config issue.
Thanks for the pointer — PR #34371 does cover the WNA16 quant config init issue. The other change here (the defensive `getattr` for `tie_word_embeddings`) is separate from that PR.
Summary
Two small bug fixes found while testing on ROCm/RDNA4:
- Lazy quant config init in `apply()`: the `FusedMoEQuantConfig` was not being set up before the first forward pass in the WNA16 quantization path, causing failures on first inference.
- `tie_word_embeddings` `AttributeError`: some Qwen3-VL MoE checkpoint configs lack the `tie_word_embeddings` field entirely. Changed direct attribute access to `getattr(..., False)` for safety.

Both fixes are defensive and should not affect existing behavior on any platform.
Test plan
- Verified model init with `tie_word_embeddings` missing from the config