-
Notifications
You must be signed in to change notification settings - Fork 212
PoC: Revamp optimizer and scheduler experience using registries #777
PoC: Revamp optimizer and scheduler experience using registries #777
Conversation
…ort for HF transformers provided schedulers.
Dear @karthikrangasai, I added some modifications. Still wip. Do you think we should remove entirely Best, |
Codecov Report
@@ Coverage Diff @@
## master #777 +/- ##
==========================================
- Coverage 85.18% 78.49% -6.69%
==========================================
Files 228 230 +2
Lines 12566 12666 +100
==========================================
- Hits 10704 9942 -762
- Misses 1862 2724 +862
Flags with carried forward coverage won't be shown. Click here to find out more.
Continue to review full report at Codecov.
|
…, Any]]. Added necessary tests as well.
…nd not optimizer names.
…d of all text libraries.
optimizer_kwargs: Additional kwargs to use when creating the optimizer (if not passed as an instance). | ||
scheduler: The scheduler or scheduler class to use. | ||
scheduler_kwargs: Additional kwargs to use when creating the scheduler (if not passed as an instance). | ||
lr_scheduler: The scheduler or scheduler class to use. | ||
learning_rate: Learning rate to use for training, defaults to ``1e-3``. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's remove the learning rate which is obselete now.
Hey @SeanNaren @ethanwharris. Should we drop With @karthikrangasai, we are currently thinking to remove it entirely. It will make accessibility worse but it will be fully consistent across the codebase. Best, |
Similar to the discussion of deprecating arguments in the Trainer, it's a convenience for the user. Do we feel it isn't warranted to add to the Task itself? Do we think asking users to just use kwargs all the time is intuitive enough to drop it? I'm personally unsure but curious what others have to say! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After those updates, it should be good to go.
What does this PR do?
Fixes #752
Optimizer and scheduler can currently be an instance but that doesn't comply with the way the code works since you need the model to the instantiate optimizer and need the optimizer to instantiate the scheduler. The main idea is to use the
FlashRegistry
class for this. This blends in well with the idea of Flash being a library for fast prototyping of ML models.This PR focuses on fixing this issue using a few ideas:
How the API looks after this PR is appiled:
The optimizer of choice can be passed as a
The scheduler of choice can be passed as a
You can also register you own custom scheduler recipes beforehand and use them shown as above:
This keeps the work of instantiation and setup of optimizer and scheduler from the flash library to mitigate any errors a user might make if we allowed class names to be passed.
Before submitting
PR review
Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in Github issues there's a high chance it will not be merged.
Did you have fun?
Make sure you had fun coding 🙃