Load generation config from nested configs by zucchini-nlp · Pull Request #42922 · huggingface/transformers

zucchini-nlp · 2025-12-17T13:13:55Z

What does this PR do?

Fixes #42794. The issue was that the model's generation params are saved in the text config, but we couldn't call config.get_text_config() on a dict object. We can delete the second load_pretrained because each model already gets a default generation config from its model config in __init__.

We could technically get the text config from the dict, but I don't see a reason to add more code when we can delete redundant blocks instead.
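To make the failure mode described above concrete, here is a minimal, self-contained sketch. `ToyTextConfig`, `ToyCompositeConfig`, `GENERATION_KEYS`, and `default_generation_params` are hypothetical stand-ins invented for illustration, not transformers classes or the code changed in this PR; only the `get_text_config()` method name comes from the description. It shows that a config object can resolve its nested text config, while the same data held as a plain dict has no such method.

```python
from dataclasses import dataclass, field


@dataclass
class ToyTextConfig:
    # Hypothetical stand-in for a text sub-config that stores generation params.
    bos_token_id: int = 1
    eos_token_id: int = 2
    max_length: int = 20

    def get_text_config(self):
        # A text-only config acts as its own text config.
        return self


@dataclass
class ToyCompositeConfig:
    # Hypothetical stand-in for a multimodal config with a nested text config.
    text_config: ToyTextConfig = field(default_factory=ToyTextConfig)

    def get_text_config(self):
        return self.text_config


# Assumed set of generation-related keys, chosen only for this example.
GENERATION_KEYS = ("bos_token_id", "eos_token_id", "max_length")


def default_generation_params(config):
    """Collect generation params from the (possibly nested) text config."""
    text_config = config.get_text_config()
    return {key: getattr(text_config, key) for key in GENERATION_KEYS}


# Works on a config object: the nested text config is resolved first.
config = ToyCompositeConfig(text_config=ToyTextConfig(max_length=64))
params = default_generation_params(config)

# The same data as a plain dict has no get_text_config() method.
config_dict = {"text_config": {"bos_token_id": 1, "eos_token_id": 2, "max_length": 64}}
try:
    default_generation_params(config_dict)
    dict_error = None
except AttributeError as exc:
    dict_error = exc
```

In this toy setup the dict path fails with an `AttributeError`, which mirrors why building the defaults from the model config object is the simpler route than teaching the dict path about nesting.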

HuggingFaceDocBuilderDev · 2025-12-17T13:22:39Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

vasqu

Nice, thx for the test. Just nits, and let's double-check possibly affected models (with run-slow or so).

src/transformers/generation/utils.py

vasqu · 2025-12-17T17:01:27Z

tests/generation/test_utils.py

        pixel_values = floats_tensor((2, 3, 30, 30))
        conditioning_input = torch.tensor([[10], [10]])  # this should be the 2nd output token, after the BOS token
-        model = AutoModelForVision2Seq.from_pretrained(
+        model = AutoModelForImageTextToText.from_pretrained(


Looks unrelated? Nothing to change, just noticed.

the mapping was deprecated in favor of AutoModelForImageTextToText, so just making sure we don't use deprecated stuff in tests

src/transformers/generation/utils.py

zucchini-nlp · 2025-12-18T10:50:50Z

run-slow: t5, bart, mbart, umt5

github-actions · 2025-12-18T10:51:59Z

This comment contains run-slow, running the specified jobs:

models: ["models/bart", "models/mbart", "models/t5", "models/umt5"]
quantizations: []
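The mapping from a `run-slow:` comment to the job list the bot echoes back can be sketched as below. This is a hypothetical illustration inferred only from the two comment/response pairs in this thread (names prefixed with `models/` and listed alphabetically); `parse_run_slow` is an invented helper, not the actual workflow code, and returning an empty `quantizations` list is an assumption.

```python
def parse_run_slow(comment: str) -> dict:
    """Turn 'run-slow: a, b' into the job listing shown by the bot (assumed format)."""
    prefix = "run-slow:"
    if not comment.startswith(prefix):
        raise ValueError("not a run-slow comment")
    # Split the comma-separated model names and drop empty entries.
    names = {name.strip() for name in comment[len(prefix):].split(",") if name.strip()}
    return {
        # Assumption: each name maps to a 'models/<name>' entry, sorted alphabetically.
        "models": sorted(f"models/{name}" for name in names),
        # Assumption: no quantization jobs are requested through this comment form.
        "quantizations": [],
    }


jobs = parse_run_slow("run-slow: t5, bart, mbart, umt5")
```

For the comment above, `jobs["models"]` comes out in the same order as the bot's listing.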

github-actions · 2025-12-18T11:17:35Z

CI Results

Workflow Run ⚙️

Model CI Report

❌ Failed tests

bart:
tests/models/bart/test_modeling_bart.py::FastIntegrationTests::test_xsum_1_1_batch_generation
tests/models/bart/test_modeling_bart.py::FastIntegrationTests::test_xsum_1_1_generation
tests/models/bart/test_modeling_bart.py::BartModelIntegrationTests::test_decoder_attention_mask
mbart:
tests/models/mbart/test_modeling_mbart.py::MBartEnroIntegrationTest::test_enro_generate_batch

zucchini-nlp · 2025-12-18T11:53:39Z

super weird, I am getting a failure with these test cases even on main branch. Will try also by ssh-ing to runners

edit: i am blind, i was using an incorrect python env and thus checking against this same branch 🤦🏻

zucchini-nlp · 2025-12-18T17:26:12Z

run-slow: t5, bart, mbart, umt5, llama, gemma3, mistral

github-actions · 2025-12-18T17:27:16Z

This comment contains run-slow, running the specified jobs:

models: ["models/bart", "models/gemma3", "models/llama", "models/mbart", "models/mistral", "models/t5", "models/umt5"]
quantizations: []

github-actions · 2025-12-18T17:54:11Z

CI Results

Workflow Run ⚙️

✅ No failing test specific to this PR 🎉 !

zucchini-nlp · 2025-12-18T18:17:52Z

run-slow: t5, bart, mbart, umt5, llama, gemma3, mistral

zucchini-nlp · 2025-12-18T18:33:36Z

Didn't see that slow tests were triggered. All pass, merging!

* fix
* add comment
* add a test
* wording
* this is it!

zucchini-nlp added 3 commits December 17, 2025 13:43

fix

bf730c3

add comment

7b469ff

add a test

83d985c

zucchini-nlp requested a review from vasqu December 17, 2025 13:13

vasqu approved these changes Dec 17, 2025

zucchini-nlp added 2 commits December 17, 2025 18:57

Merge remote-tracking branch 'upstream/main' into load-generation-config

dcf820f

wording

d628f35

this is it!

71b057d

zucchini-nlp merged commit f2c6d2a into huggingface:main Dec 18, 2025
26 checks passed

SangbumChoi pushed a commit to SangbumChoi/transformers that referenced this pull request Jan 23, 2026

Load generation config from nested configs (huggingface#42922)

e5476f4

* fix
* add comment
* add a test
* wording
* this is it!

3 participants
