Fix Qwen3OmniMoe Talker loading and config initialization #43091

Closed

Krish2002 wants to merge 10 commits into huggingface:main from
Conversation
Contributor

[For maintainers] Suggested jobs to run (before merge): run-slow: qwen3_omni_moe

Member

resolved in #43084, closing as duplicate
What does this PR do?
This PR fixes two issues that prevented the model `Qwen/Qwen3-Omni-30B-A3B-Instruct` from loading correctly with `AutoModelForMultimodalLM`.

Fix 1: `AttributeError: Qwen3OmniMoeTalkerForConditionalGeneration has no attribute 'lm_head'`

Issue
`Qwen3OmniMoeTalkerForConditionalGeneration` deletes `lm_head` in its `__init__` method, but it inherits `_tied_weights_keys` from its parent class (`Qwen3MoeForCausalLM`), which references `lm_head.weight`. During model loading, `mark_tied_weights_as_initialized()` attempts to access `lm_head.weight`, resulting in an `AttributeError`.

Fix
Explicitly set `_tied_weights_keys = {}` for `Qwen3OmniMoeTalkerForConditionalGeneration`, since this model does not use tied weights.

This change is implemented in: `src/transformers/models/qwen3_omni_moe/modular_qwen3_omni_moe.py`
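For illustration, a minimal sketch of the change, with the surrounding class body heavily simplified (the real modular file contains much more):

```python
# Sketch of the Fix 1 change (class body simplified for illustration).
class Qwen3OmniMoeTalkerForConditionalGeneration(Qwen3MoeForCausalLM):
    # The inherited mapping references "lm_head.weight", but this class deletes
    # lm_head in __init__, so mark_tied_weights_as_initialized() would hit an
    # AttributeError during loading. No weights are tied here, so override it.
    _tied_weights_keys = {}

    def __init__(self, config):
        super().__init__(config)
        del self.lm_head
```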
Fix 2: `AttributeError: ... object has no attribute 'initializer_range'`

Issue
Several composite config classes were missing the `initializer_range` attribute: `Qwen3OmniMoeTalkerConfig`, `Qwen3OmniMoeCode2WavConfig`, and `Qwen3OmniMoeConfig`. When `_initialize_missing_keys()` runs during model loading, it may call `_init_weights()` for modules that are not loaded from the checkpoint. `_init_weights()` expects `self.config.initializer_range`, which caused an `AttributeError`.

Fix
Add an `initializer_range` parameter (default: `0.02`) and store it as an attribute in the affected config classes.

This change is implemented in: `src/transformers/models/qwen3_omni_moe/modular_qwen3_omni_moe.py`
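A minimal sketch of the config-side change, shown for one of the three classes (the real `__init__` signatures take many more parameters; only the relevant addition appears here):

```python
from transformers import PretrainedConfig


# Sketch of the Fix 2 change, using Qwen3OmniMoeTalkerConfig as the example;
# Qwen3OmniMoeCode2WavConfig and Qwen3OmniMoeConfig get the same treatment.
class Qwen3OmniMoeTalkerConfig(PretrainedConfig):
    def __init__(self, initializer_range=0.02, **kwargs):
        super().__init__(**kwargs)
        # Stored so _init_weights() can read self.config.initializer_range
        # when initializing modules that are missing from the checkpoint.
        self.initializer_range = initializer_range
```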
Breaking changes

None.
Tests
Added a unit-level regression test in: `tests/models/qwen3_omni_moe/test_configuration_and_loading.py`
The test verifies that:

- `Qwen3OmniMoeTalkerForConditionalGeneration` has empty tied weight keys.
- `Qwen3OmniMoeTalkerConfig`, `Qwen3OmniMoeCode2WavConfig`, and `Qwen3OmniMoeConfig` all define the `initializer_range` attribute.
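As a sketch of what these checks could look like (the actual test file may differ; the import paths assume the usual transformers module layout):

```python
# Sketch of the regression checks; not the PR's actual test file.
from transformers.models.qwen3_omni_moe.configuration_qwen3_omni_moe import (
    Qwen3OmniMoeCode2WavConfig,
    Qwen3OmniMoeConfig,
    Qwen3OmniMoeTalkerConfig,
)
from transformers.models.qwen3_omni_moe.modeling_qwen3_omni_moe import (
    Qwen3OmniMoeTalkerForConditionalGeneration,
)


def test_talker_has_no_tied_weight_keys():
    # The class attribute must be empty: the talker deletes lm_head.
    assert Qwen3OmniMoeTalkerForConditionalGeneration._tied_weights_keys == {}


def test_composite_configs_define_initializer_range():
    for config_cls in (
        Qwen3OmniMoeTalkerConfig,
        Qwen3OmniMoeCode2WavConfig,
        Qwen3OmniMoeConfig,
    ):
        assert hasattr(config_cls(), "initializer_range")
```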
Additionally, verified locally that the model loads successfully using the following reproduction script:
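The PR's original script is not reproduced here; a minimal loading script along these lines exercises the fixed code path (the `dtype` and `device_map` arguments are optional conveniences for a 30B MoE checkpoint, not part of the fix):

```python
# Sketch of a reproduction script (not the PR's original). Before this PR,
# from_pretrained raised AttributeError for 'lm_head' / 'initializer_range'.
from transformers import AutoModelForMultimodalLM

model = AutoModelForMultimodalLM.from_pretrained(
    "Qwen/Qwen3-Omni-30B-A3B-Instruct",
    dtype="auto",       # keep checkpoint precision instead of upcasting to fp32
    device_map="auto",  # requires accelerate; shards the model across devices
)
print(type(model).__name__)  # loads without raising after the fix
```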