Adding `optimizer_cls_and_kwargs` to `Trainer.__init__` #34358
ArthurZucker merged 10 commits into huggingface:main from apoorvkh:trainer-init-optimizer-cls
muellerzr
left a comment
Nice! Seems like a great fix :)
|
If you do |
SunMarc
left a comment
Thanks for the PR! LGTM! I'm curious why the `optimizers` arg is not enough for your use case?
|
Great, thanks both! @muellerzr I had already run
|
|
@SunMarc I am writing a codebase that uses I believe the |
|
Most checks pass now! A few tests in |
|
Makes sense! Thanks for explaining! Maybe you can add a note explaining that in the description of the arg. As for the CI, the failures are indeed unrelated to this PR. We will fix them shortly!
|
For sure! Done. |
|
Sorry for the wait! Once the CI is fixed on our side, we will merge the PR, @apoorvkh.
|
Sounds great, thanks! |
ArthurZucker
left a comment
Thanks, would you like to add a bit of documentation for discoverability? With an example use case maybe? 🤗
|
Sure, how do those changes to |
…34358)
* Adding `optimizer_cls_and_kwargs` to `Trainer.__init__`
* formatting
* make fix-copies docstring
* added more docs for optimizer_cls_and_kwargs
* add docs for Trainer(optimizer_cls_and_kwargs)
* reverting anchor names
What does this PR do?
Currently, `Trainer` must be extended (as follows) to provide a custom `torch.optim.Optimizer`. The existing `optimizers` argument for `Trainer` assumes the model is already initialized on the correct devices (which is usually handled by `Trainer`).

This PR adds an `optimizer_cls_and_kwargs` argument to `Trainer`. This simply allows a user to pass `Type[torch.optim.Optimizer]` and `Dict[str, Any]` when initializing the `Trainer` rather than having to extend the class (as above).

I am making this PR after #31875 (cc: @amyeroberts @muellerzr)
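To make the motivation concrete, here is a minimal sketch of the pattern in plain Python. It is not the `Trainer` implementation: `ToyModel`, `ToySGD`, and `ToyTrainer` are hypothetical stand-ins, and the assumption that device placement swaps in new parameter storage is chosen only to show why an optimizer built too early can end up pointing at stale parameters.

```python
from typing import Any, Dict, List, Optional, Tuple, Type


class ToyModel:
    """Hypothetical stand-in for a model; not a torch.nn.Module."""

    def __init__(self) -> None:
        self.device = "cpu"
        self.params: List[float] = [0.0, 0.0]

    def to(self, device: str) -> "ToyModel":
        # Assumption for illustration: placement swaps in new parameter
        # storage, so references taken earlier go stale.
        self.device = device
        self.params = list(self.params)
        return self


class ToySGD:
    """Hypothetical optimizer that captures its parameters at construction."""

    def __init__(self, params: List[float], lr: float = 0.01) -> None:
        self.params = params
        self.lr = lr


class ToyTrainer:
    """Hypothetical trainer that owns device placement."""

    def __init__(
        self,
        model: ToyModel,
        optimizer_cls_and_kwargs: Optional[Tuple[Type[ToySGD], Dict[str, Any]]] = None,
    ) -> None:
        self.model = model.to("accelerator")  # placement happens here
        self.optimizer_cls_and_kwargs = optimizer_cls_and_kwargs
        self.optimizer: Optional[ToySGD] = None

    def create_optimizer(self) -> ToySGD:
        # Built only after placement, from the class and kwargs supplied up front.
        if self.optimizer is None:
            if self.optimizer_cls_and_kwargs is None:
                optimizer_cls, kwargs = ToySGD, {}
            else:
                optimizer_cls, kwargs = self.optimizer_cls_and_kwargs
            self.optimizer = optimizer_cls(self.model.params, **kwargs)
        return self.optimizer


# Eager construction: the optimizer is bound to the pre-placement parameters.
eager_model = ToyModel()
eager_optimizer = ToySGD(eager_model.params, lr=0.1)
eager_trainer = ToyTrainer(eager_model)
stale = eager_optimizer.params is not eager_trainer.model.params

# Deferred construction: pass the class and kwargs, let the trainer build it.
trainer = ToyTrainer(ToyModel(), optimizer_cls_and_kwargs=(ToySGD, {"lr": 0.1}))
optimizer = trainer.create_optimizer()
fresh = optimizer.params is trainer.model.params
```

With the real `Trainer`, the corresponding call would presumably look like `Trainer(..., optimizer_cls_and_kwargs=(torch.optim.AdamW, {"lr": 1e-3}))`. The tuple form is an assumption based on the description above, so check the `Trainer` docstring for the exact signature.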