Fix Qwen3OmniMoe Talker weight loading and config initialization #43084
ArthurZucker merged 3 commits into huggingface:main from
Conversation
Signed-off-by: HuiyingLi <willwin.lee@gmail.com>
```python
@auto_docstring
class Qwen3OmniMoeTalkerForConditionalGeneration(Qwen3OmniMoeThinkerTextPreTrainedModel, GenerationMixin):
    _tied_weights_keys = {"lm_head.weight": "model.embed_tokens.weight"}
```
I believe the correct way is `codec_head: model.codec_embedding.weight`. That will allow users to tie weights if needed. We just need to make sure that the model is not tying weights; I see that the default is already `tie_word_embeddings=False` in the config.
Thank you very much @zucchini-nlp! Fixed.
```python
self.audio_start_token_id = audio_start_token_id
self.vision_start_token_id = vision_start_token_id
self.speaker_id = speaker_id
self.initializer_range = self.text_config.initializer_range
```
Since we're using the text config's init range in any case, we can instead update `_init_weights` to use `config.get_text_config().initializer_range`.
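A minimal, self-contained sketch of the delegation pattern the reviewer suggests (class and helper names here are illustrative, not the actual transformers code): the composite config exposes a `get_text_config()` accessor, so weight initialization can read the range from the nested text config without the composite config duplicating the attribute.

```python
# Illustrative sketch only; names do not match the real transformers classes.

class TextConfig:
    def __init__(self, initializer_range=0.02):
        self.initializer_range = initializer_range

class CompositeConfig:
    """Stands in for a multimodal config wrapping a text config."""
    def __init__(self, text_config=None):
        self.text_config = text_config or TextConfig()

    def get_text_config(self):
        # Delegate text-related attributes to the nested text config
        # instead of copying them onto the composite config.
        return self.text_config

def init_std(config):
    # What _init_weights could do: look up the range via get_text_config(),
    # so the composite config never needs its own initializer_range.
    return config.get_text_config().initializer_range

print(init_std(CompositeConfig()))
```

This removes the need for the `self.initializer_range = self.text_config.initializer_range` line in the composite config's `__init__`.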
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
[For maintainers] Suggested jobs to run (before merge): run-slow: qwen3_omni_moe
…gingface#43084)

* fix modular_qwen3_omni_moe
* update generated configuration and modeling file
* fix tie weight keys

Signed-off-by: HuiyingLi <willwin.lee@gmail.com>
What does this PR do?
This PR fixes several issues preventing Qwen3OmniMoeForConditionalGeneration from loading and running correctly from the Qwen/Qwen3-Omni-30B-A3B-Instruct checkpoint.

Issues Fixed
Issue: #43083
Incorrect inherited _tied_weights_keys in Talker
Qwen3OmniMoeTalkerForConditionalGeneration inherits from Qwen3MoeForCausalLM, which defines _tied_weights_keys = {"lm_head.weight": "model.embed_tokens.weight"}. However, the Talker model uses codec_head instead of lm_head and doesn't tie weights. The loader therefore expected keys that don't exist in the checkpoint, causing loading errors and garbled audio output.
Fix: Override _tied_weights_keys = {} in the Talker class.
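A minimal sketch of the class-attribute override described in this fix, with stand-in class names (the review discussion above later settled on mapping codec_head to the codec embedding instead, but the override mechanism is the same): a subclass inherits the parent's tie map unless it redefines the attribute.

```python
# Illustrative sketch; class names are placeholders, not the real models.

class CausalLMBase:
    # Parent ties lm_head onto the input embeddings.
    _tied_weights_keys = {"lm_head.weight": "model.embed_tokens.weight"}

class TalkerNoTie(CausalLMBase):
    # The Talker has codec_head instead of lm_head and does not tie
    # weights, so the inherited mapping must be overridden.
    _tied_weights_keys = {}

print(TalkerNoTie._tied_weights_keys)
```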
Incorrect _tp_plan and _pp_plan references
The inherited tensor/pipeline parallelism plans reference lm_head, but the Talker uses codec_head.
Fix: Override with _tp_plan = {"codec_head": "colwise_rep"} and _pp_plan = {"codec_head": (["hidden_states"], ["logits"])}.
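For concreteness, the overridden plans from the fix above can be sketched as class attributes (class name is a placeholder; the dictionary contents follow the PR description):

```python
# Illustrative sketch; TalkerPlansSketch stands in for the Talker class.

class TalkerPlansSketch:
    # Tensor-parallel plan: shard codec_head column-wise with a
    # replicated output, mirroring the plan the parent used for lm_head.
    _tp_plan = {"codec_head": "colwise_rep"}
    # Pipeline-parallel plan: codec_head consumes hidden_states and
    # produces logits.
    _pp_plan = {"codec_head": (["hidden_states"], ["logits"])}
```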
Missing initializer_range in config classes
Qwen3OmniMoeTalkerConfig, Qwen3OmniMoeCode2WavConfig, and Qwen3OmniMoeConfig were missing the initializer_range attribute. This caused AttributeError during _init_weights() calls.
Fix: Add initializer_range attribute to all affected config classes.
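A minimal sketch of the config change (class name is a placeholder; the 0.02 default follows the common transformers convention and is an assumption here): once the attribute exists on the config, _init_weights no longer raises AttributeError when reading it.

```python
# Illustrative sketch; stands in for configs like Qwen3OmniMoeCode2WavConfig.

class CodecConfigSketch:
    def __init__(self, initializer_range=0.02, **kwargs):
        # Attribute that _init_weights reads; its absence was the bug.
        # 0.02 default is an assumption matching common transformers configs.
        self.initializer_range = initializer_range

print(CodecConfigSketch().initializer_range)
```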
Who can review?
@zucchini-nlp @ArthurZucker @Cyrilvallez