Load generation config from nested configs #42922
zucchini-nlp merged 6 commits into huggingface:main
Conversation
vasqu left a comment:
Nice, thanks for the test! Just nits, and let's double-check possibly affected models (with run-slow or so).
  pixel_values = floats_tensor((2, 3, 30, 30))
  conditioning_input = torch.tensor([[10], [10]])  # this should be the 2nd output token, after the BOS token
- model = AutoModelForVision2Seq.from_pretrained(
+ model = AutoModelForImageTextToText.from_pretrained(
Looks unrelated? Nothing to change, just noticed.
The mapping was deprecated in favor of AutoModelForImageTextToText, so just making sure we don't use deprecated stuff in tests.
run-slow: t5, bart, mbart, umt5
This comment contains models: ["models/bart", "models/mbart", "models/t5", "models/umt5"]
CI Results — Model CI Report: ❌ Failed tests
Super weird, I am getting a failure with these test cases even on the main branch. Will also try SSH-ing into the runners. Edit: I am blind, I was using an incorrect Python env and thus checking against this same branch 🤦🏻
run-slow: t5, bart, mbart, umt5, llama, gemma3, mistral
This comment contains models: ["models/bart", "models/gemma3", "models/llama", "models/mbart", "models/mistral", "models/t5", "models/umt5"]
CI Results: ✅ No failing test specific to this PR 🎉!
> run-slow: t5, bart, mbart, umt5, llama, gemma3, mistral
Didn't see that slow tests were triggered. All pass, merging!
Commits:
* fix
* add comment
* add a test
* wording
* this is it!
What does this PR do?
Fixes #42794. The issue was that the model's generation params are saved in the text config, but we couldn't call `config.get_text_config()` on a dict object. We can delete the second `load_pretrained` because each model gets a default generation config from the model config in `__init__`. We could technically get the text config from the dict, but I don't see a reason for adding more code if we can delete redundant blocks.
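The failure mode can be sketched in isolation. This is a minimal mock, not the actual transformers classes — `CompositeConfig` and `TextConfig` here are hypothetical stand-ins for a multimodal config whose generation params live in a nested text config:

```python
class TextConfig:
    """Stand-in for a nested text config carrying generation params."""
    def __init__(self, bos_token_id=1, eos_token_id=2):
        self.bos_token_id = bos_token_id
        self.eos_token_id = eos_token_id


class CompositeConfig:
    """Stand-in for a vision-text config wrapping a text sub-config."""
    def __init__(self, text_config, vision_config=None):
        self.text_config = text_config
        self.vision_config = vision_config

    def get_text_config(self):
        # Composite configs expose their nested text config this way.
        return self.text_config


# Works when we have a proper config object:
config = CompositeConfig(TextConfig(bos_token_id=0))
print(config.get_text_config().bos_token_id)  # prints 0

# The bug: the loading path sometimes held a plain dict (e.g. parsed
# straight from config.json), and dicts have no get_text_config().
raw = {"text_config": {"bos_token_id": 0}, "vision_config": {}}
try:
    raw.get_text_config()
except AttributeError as e:
    print(type(e).__name__)  # prints AttributeError
```

The PR sidesteps the dict case entirely by dropping the redundant load, since the model already builds a default generation config from the model config in `__init__`.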