Fix Qwen3OmniMoe Talker loading and config initialization#43091

Closed
Krish2002 wants to merge 10 commits into huggingface:main from Krish2002:fix-qwen3-omni-moe-loading

Conversation

@Krish2002

What does this PR do?

This PR fixes two issues that prevented the model
Qwen/Qwen3-Omni-30B-A3B-Instruct from loading correctly with
AutoModelForMultimodalLM.


Fix 1: AttributeError: Qwen3OmniMoeTalkerForConditionalGeneration has no attribute 'lm_head'

Issue

Qwen3OmniMoeTalkerForConditionalGeneration deletes lm_head in its __init__
method, but it inherits _tied_weights_keys from its parent class
(Qwen3MoeForCausalLM), which references lm_head.weight.

During model loading, mark_tied_weights_as_initialized() attempts to access
lm_head.weight, resulting in an AttributeError.
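The failure pattern can be reproduced with toy stand-in classes (these are NOT the real transformers classes, just a minimal sketch of the inheritance problem: the child deletes lm_head, but the inherited tied-weight key still points at it, so resolving the key raises AttributeError):

```python
class CausalLMParent:
    # Parent declares lm_head.weight as a tied weight, mirroring what
    # Qwen3MoeForCausalLM does.
    _tied_weights_keys = ["lm_head.weight"]

    def __init__(self):
        self.lm_head = object()  # placeholder for the output projection


class TalkerChild(CausalLMParent):
    def __init__(self):
        super().__init__()
        del self.lm_head  # the Talker model removes lm_head in __init__


def resolve_tied_weight_keys(model):
    # Sketch of what a loader-side helper such as
    # mark_tied_weights_as_initialized() must do: walk each dotted key.
    for key in model._tied_weights_keys:
        obj = model
        for attr in key.split("."):
            obj = getattr(obj, attr)  # fails here: no lm_head attribute


try:
    resolve_tied_weight_keys(TalkerChild())
except AttributeError as exc:
    print(f"AttributeError: {exc}")
```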

Fix

Explicitly set _tied_weights_keys = {} for
Qwen3OmniMoeTalkerForConditionalGeneration, since this model does not use tied
weights.

This change is implemented in:

  • src/transformers/models/qwen3_omni_moe/modular_qwen3_omni_moe.py
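A hedged sketch of the fix, again with toy classes (the real change lives in modular_qwen3_omni_moe.py): the subclass overrides the inherited attribute with an empty mapping, so the loader has nothing to resolve.

```python
class CausalLMParent:
    # Stand-in for Qwen3MoeForCausalLM, which ties lm_head.weight.
    _tied_weights_keys = ["lm_head.weight"]


class TalkerChild(CausalLMParent):
    # Stand-in for Qwen3OmniMoeTalkerForConditionalGeneration: this model
    # deletes lm_head, so it must not inherit the parent's tied-weight keys.
    _tied_weights_keys = {}


# The class attribute shadows the parent's, so loading code that iterates
# _tied_weights_keys sees nothing to resolve.
assert CausalLMParent._tied_weights_keys == ["lm_head.weight"]
assert TalkerChild._tied_weights_keys == {}
```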

Fix 2: AttributeError: ... object has no attribute 'initializer_range'

Issue

Several composite config classes were missing the initializer_range attribute:

  • Qwen3OmniMoeTalkerConfig
  • Qwen3OmniMoeCode2WavConfig
  • Qwen3OmniMoeConfig

When _initialize_missing_keys() runs during model loading, it may call
_init_weights() for modules that were not loaded from the checkpoint.
_init_weights() reads self.config.initializer_range, and because the attribute
was missing, this raised an AttributeError.
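A toy sketch of this second failure (the names mirror the transformers pattern but are stand-ins): weight initialization reads config.initializer_range, which the composite configs did not define.

```python
class CompositeConfigWithoutRange:
    # Stand-in for a composite config missing initializer_range.
    pass


class ModuleSketch:
    def __init__(self, config):
        self.config = config

    def _init_weights(self):
        # A typical implementation does something like
        # module.weight.data.normal_(mean=0.0, std=self.config.initializer_range)
        return self.config.initializer_range


try:
    ModuleSketch(CompositeConfigWithoutRange())._init_weights()
except AttributeError as exc:
    print(f"AttributeError: {exc}")
```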

Fix

Add an initializer_range parameter (default: 0.02) and store it as an
attribute in the affected config classes.

This change is implemented in:

  • src/transformers/models/qwen3_omni_moe/modular_qwen3_omni_moe.py
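A minimal sketch of the config-side fix, using a stand-in class (not the real Qwen3OmniMoeTalkerConfig): accept initializer_range in __init__ with a 0.02 default and store it as an attribute, so _init_weights can always read it.

```python
class TalkerConfigSketch:
    # Stand-in for the affected composite config classes.
    def __init__(self, initializer_range=0.02, **kwargs):
        # Stored unconditionally, so self.config.initializer_range is
        # always available to _init_weights().
        self.initializer_range = initializer_range


assert TalkerConfigSketch().initializer_range == 0.02
assert TalkerConfigSketch(initializer_range=0.01).initializer_range == 0.01
```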

Breaking changes

None.


Tests

Added a unit-level regression test in:

tests/models/qwen3_omni_moe/test_configuration_and_loading.py

The test verifies that:

  • Qwen3OmniMoeTalkerForConditionalGeneration has empty tied weight keys.
  • Qwen3OmniMoeTalkerConfig, Qwen3OmniMoeCode2WavConfig, and
    Qwen3OmniMoeConfig all define the initializer_range attribute.
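The checks above might look roughly like the following sketch, with stub classes standing in for the real models and configs imported in the test file:

```python
class TalkerModelStub:
    # Stand-in for Qwen3OmniMoeTalkerForConditionalGeneration after the fix.
    _tied_weights_keys = {}


class TalkerConfigStub:
    # Stand-in for the fixed composite config classes.
    def __init__(self, initializer_range=0.02):
        self.initializer_range = initializer_range


def test_talker_has_empty_tied_weight_keys():
    assert TalkerModelStub._tied_weights_keys == {}


def test_configs_define_initializer_range():
    assert hasattr(TalkerConfigStub(), "initializer_range")


test_talker_has_empty_tied_weight_keys()
test_configs_define_initializer_range()
```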

Additionally, verified locally that the model loads successfully using the
following reproduction script:

from transformers import AutoModelForMultimodalLM
import torch

model = AutoModelForMultimodalLM.from_pretrained(
    "Qwen/Qwen3-Omni-30B-A3B-Instruct",
    torch_dtype=torch.bfloat16,
    attn_implementation="flash_attention_2",
)

@github-actions
Contributor

github-actions bot commented Jan 5, 2026

[For maintainers] Suggested jobs to run (before merge)

run-slow: qwen3_omni_moe

@Rocketknight1
Member

cc @zucchini-nlp

@zucchini-nlp
Member

resolved in #43084, closing as duplicate

