
Conversation

@ArthurZucker
Collaborator

What does this PR do?

This adds the same support that we have in PretrainedConfig, where additional kwargs are automatically set on the config.
This will allow users to re-use the GenerationConfig class for most use cases without having to add a model-specific class. I was trying to load the following generation_config and got half of my additional arguments deleted 😉
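
For illustration, a minimal sketch of the intended behavior, where leftover kwargs are kept as attributes instead of being dropped (the class below is a toy for this thread, not the actual GenerationConfig implementation):

# Toy sketch (assumption, not the real transformers code): unknown kwargs are
# stored as attributes, mirroring the behavior PretrainedConfig already has.
class ToyGenerationConfig:
    def __init__(self, **kwargs):
        # a couple of known generation arguments with defaults
        self.max_new_tokens = kwargs.pop("max_new_tokens", None)
        self.do_sample = kwargs.pop("do_sample", False)
        # whatever is left is kept rather than silently discarded
        for key, value in kwargs.items():
            setattr(self, key, value)


config = ToyGenerationConfig(do_sample=True, my_custom_penalty=0.3)
print(config.my_custom_penalty)  # 0.3 -- the additional argument survives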

@ArthurZucker ArthurZucker requested review from gante and sgugger and removed request for gante January 23, 2023 20:50
Contributor

@gante gante left a comment

We can probably remove self.generation_kwargs = kwargs.pop("generation_kwargs", {}) on L277; it was intended as a restricted version of these changes

(for context, the model config has the same lines here)
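
For the comparison, a rough sketch of the two patterns (simplified; not the exact code on L277 or in the model config):

# Restricted pattern: extra generation options are collected in a single dict.
class BucketConfig:
    def __init__(self, **kwargs):
        self.generation_kwargs = kwargs.pop("generation_kwargs", {})


# General pattern (what this PR moves toward): each leftover kwarg becomes an
# attribute of the config itself.
class AttributeConfig:
    def __init__(self, **kwargs):
        for key, value in kwargs.items():
            setattr(self, key, value)


print(BucketConfig(generation_kwargs={"foo": 1}).generation_kwargs)  # {'foo': 1}
print(AttributeConfig(foo=1).foo)  # 1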

@HuggingFaceDocBuilderDev

HuggingFaceDocBuilderDev commented Jan 23, 2023

The documentation is not available anymore as the PR was closed or merged.

Collaborator

@sgugger sgugger left a comment

Thanks for adding this!

@ArthurZucker
Collaborator Author

Also will have to add a test + this is apparently breaking a lot of things haha

@ArthurZucker
Collaborator Author

ArthurZucker commented Jan 24, 2023

Okay, after talking a bit with @gante and testing, this is not the best approach, so this PR will focus on other missing functionalities, mostly the addition of the dict_torch_dtype_to_str function, as a dtype could be passed to the generation config 😉
The problem is mostly that if we process all the additional kwargs, we pick up all of the arguments from the model's configuration.json, which mixes things up.
The simplest solution is either to store them in generate_kwargs or to re-write the configuration for the model. I thought this was cumbersome, but it is actually the most logical and cleanest way to do it.

EDIT: going to just add a condition: if the kwargs come from a config file, they are not added.
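
To make the EDIT concrete, a hedged sketch of the guard and of the dtype-to-string conversion described above (names and details are simplified assumptions, not the exact code in this PR):

import torch


class ToyGenerationConfig:
    def __init__(self, **kwargs):
        # True when the config was built from a model's config.json, which
        # carries many unrelated model arguments we do not want to absorb.
        self._from_model_config = kwargs.pop("_from_model_config", False)
        self.do_sample = kwargs.pop("do_sample", False)
        if not self._from_model_config:
            # only keep extra kwargs that were passed explicitly by the user
            for key, value in kwargs.items():
                setattr(self, key, value)

    def dict_torch_dtype_to_str(self, d):
        # serialize a torch dtype as a plain string, e.g. torch.float16 -> "float16",
        # so the dict can be written back to JSON
        if d.get("torch_dtype") is not None and not isinstance(d["torch_dtype"], str):
            d["torch_dtype"] = str(d["torch_dtype"]).split(".")[1]


cfg = ToyGenerationConfig(do_sample=True, custom_arg=1)
d = {"torch_dtype": torch.float16}
cfg.dict_torch_dtype_to_str(d)
print(d["torch_dtype"])  # "float16"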

@ArthurZucker
Collaborator Author

Now the only thing left is to add a pretty test with all the different edge cases I encountered.
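
For instance, a sketch of the kind of check such a test could make (the test name is hypothetical, and this assumes a transformers version that already includes this change):

from transformers import GenerationConfig


def test_extra_kwargs_are_kept():
    # an unknown kwarg should survive as an attribute instead of being dropped
    config = GenerationConfig(do_sample=True, my_custom_penalty=0.3)
    assert config.my_custom_penalty == 0.3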

# remove all the arguments that are in the config_dict

config = cls(**config_dict, **kwargs)
unused_kwargs = config.update(**kwargs)
Contributor

This line now only exists to obtain unused_kwargs, as the kwargs get written to the config in the line above, correct?

Collaborator Author

Well yes, for example when _from_model_config is set to True, it is still in the kwargs; I think I saw something like this.
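
As an aside, a minimal sketch of how an update method can hand back the kwargs it did not consume (illustrative only; simplified compared to the real config.update):

class ToyConfig:
    def __init__(self):
        self.do_sample = False
        self.max_new_tokens = None

    def update(self, **kwargs):
        # apply kwargs that match existing attributes; return the rest untouched
        unused = {}
        for key, value in kwargs.items():
            if hasattr(self, key):
                setattr(self, key, value)
            else:
                unused[key] = value
        return unused


config = ToyConfig()
print(config.update(do_sample=True, some_internal_flag=True))  # {'some_internal_flag': True}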

self.transformers_version = kwargs.pop("transformers_version", __version__)

# Additional attributes without default values
if not self._from_model_config:
Contributor

I'd add a comment here explaining why we need this if, otherwise we may be like "wtf?" in the future

Collaborator Author

Sure, thanks for the comment!

ArthurZucker and others added 2 commits January 24, 2023 17:22
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
@ArthurZucker ArthurZucker requested a review from sgugger January 24, 2023 17:10
@ArthurZucker ArthurZucker merged commit 94a7edd into huggingface:main Jan 24, 2023