Skip to content

Deprecate dtype per sub config#42990

Merged
zucchini-nlp merged 5 commits intohuggingface:mainfrom
zucchini-nlp:dtype-mulltimodal
Jan 13, 2026
Merged

Deprecate dtype per sub config#42990
zucchini-nlp merged 5 commits intohuggingface:mainfrom
zucchini-nlp:dtype-mulltimodal

Conversation

@zucchini-nlp
Copy link
Copy Markdown
Member

What does this PR do?

Fixes #42968, #42641

As discussed, this PR deprecated different dtypes per backbone and falls back to the main dtype if a user is requesting different dtypes.

There is no clean way to support different dtypes when the backbone can have a task head. We either cast hidden states to task dtype in all model files, or we need a workaround to infer task head and force dtype on it when loading. So the better solution is to deprecate dict dtypes, apparently it is not used much

@zucchini-nlp
Copy link
Copy Markdown
Member Author

run-slow: llava, llava_next, llava_next_video, llava_onevision

@github-actions
Copy link
Copy Markdown
Contributor

This comment contains run-slow, running the specified jobs:

models: ["models/llava", "models/llava_next", "models/llava_next_video", "models/llava_onevision"]
quantizations: []

@HuggingFaceDocBuilderDev
Copy link
Copy Markdown

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@github-actions
Copy link
Copy Markdown
Contributor

CI Results

Workflow Run ⚙️

✅ No failing test specific to this PR 🎉 !

Comment thread tests/utils/test_modeling_utils.py Outdated
Co-authored-by: Albert Villanova del Moral <8515462+albertvillanova@users.noreply.github.com>
Comment thread src/transformers/modeling_utils.py Outdated
Comment on lines +869 to +874
logger.warning_once(
f"The dtype for {sub_config_key}={sub_dtype} is different from the main dtype={main_dtype}. "
"Setting different dtypes per backbone model might cause device errors downstream, setting the "
f"dtype={main_dtype} for all modules. Note, using different dtypes per module is deprecated and will "
"be removed in future versions."
)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It means that in the future it will raise an exception, right?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, after a few released we can switch to an error

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It means that if a user sets dtype="auto", all sub modules will be forcibly cast to the specified main_dtype?

Copy link
Copy Markdown
Member

@Cyrilvallez Cyrilvallez left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we do that, then we need to plainly remove dtype being passed as a dict as well!

Comment thread src/transformers/modeling_utils.py Outdated
Copy link
Copy Markdown
Member

@Cyrilvallez Cyrilvallez left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alright, if there is not many models with different dtypes in subconfigs, it's indeed better to only show the warning when explicitly passing a dict, and silently skipping otherwise!

@zucchini-nlp zucchini-nlp merged commit 7918b15 into huggingface:main Jan 13, 2026
25 checks passed
SangbumChoi pushed a commit to SangbumChoi/transformers that referenced this pull request Jan 23, 2026
* deprecate dtypes per sub config

* fix test

* Update tests/utils/test_modeling_utils.py

Co-authored-by: Albert Villanova del Moral <8515462+albertvillanova@users.noreply.github.com>

* address comment

---------

Co-authored-by: Albert Villanova del Moral <8515462+albertvillanova@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

LlavaNextForConditionalGeneration forward pass broken

6 participants