
Add support for optimizers and learning rate schedulers to LightningCLI #8093

Merged 19 commits into Lightning-AI:master from mauvilsa:cli_optimizers_and_lr_schedulers on Jul 1, 2021

Conversation

@mauvilsa (Contributor) commented Jun 23, 2021

What does this PR do?

New feature for LightningCLI that makes optimizers and learning rate schedulers easily configurable. It implements the proposal mentioned in #7576 (comment), except for minor differences required during development. How it should be used is explained in lightning_cli.rst; a sketch follows below.

This PR does not close #7576.
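
For context, the usage pattern documented in lightning_cli.rst looks roughly like the following sketch. The method names come from this PR's docs, but treat the details as illustrative rather than authoritative; MyModel is a placeholder for any LightningModule defined elsewhere.

import torch
from pytorch_lightning.utilities.cli import LightningCLI

class MyLightningCLI(LightningCLI):
    def add_arguments_to_parser(self, parser):
        # Register a single optimizer/scheduler class so their init
        # parameters become options under --optimizer.* and --lr_scheduler.*.
        parser.add_optimizer_args(torch.optim.Adam)
        parser.add_lr_scheduler_args(torch.optim.lr_scheduler.ExponentialLR)

# MyModel stands in for any LightningModule:
# cli = MyLightningCLI(MyModel)

With this in place, values can be set from the command line, e.g. --optimizer.lr=0.01 --lr_scheduler.gamma=0.95, or from a config file.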

Before submitting

  • Was this discussed/approved via a GitHub issue? (not for typos and docs)
  • Did you read the contributor guideline, Pull Request section?
  • Did you make sure your PR does only one thing, instead of bundling different changes together?
  • Did you make sure to update the documentation with your changes? (if necessary)
  • Did you write any new necessary tests? (not for typos and docs)
  • Did you verify new and existing tests pass locally with your changes?
  • Did you update the CHANGELOG? (not for typos, docs, test updates, or internal minor changes/refactorings)

PR review

Anyone in the community is free to review the PR once the tests have passed.
Before you start reviewing, make sure you have read the review guidelines. In short, see the following bullet list:

  • Is this pull request ready for review? (if not, please submit in draft mode)
  • Check that all items from Before submitting are resolved
  • Make sure the title is self-explanatory and the description concisely explains the PR
  • Add labels and milestones (and optionally projects) to the PR so it can be classified

Did you have fun?

Make sure you had fun coding 🙃

@pep8speaks commented Jun 23, 2021

Hello @mauvilsa! Thanks for updating this PR.

There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻

Comment last updated at 2021-06-30 20:45:35 UTC

@codecov bot commented Jun 23, 2021

Codecov Report

Merging #8093 (f971dbb) into master (57dce72) will decrease coverage by 5%.
The diff coverage is 99%.

@@           Coverage Diff           @@
##           master   #8093    +/-   ##
=======================================
- Coverage      93%     88%    -5%     
=======================================
  Files         212     212            
  Lines       13595   13665    +70     
=======================================
- Hits        12635   12024   -611     
- Misses        960    1641   +681     

@carmocca added the "argparse" and "feature" labels Jun 23, 2021
@carmocca added this to the v1.4 milestone Jun 23, 2021
@mergify bot removed the "has conflicts" label Jun 25, 2021
@carmocca (Contributor) left a comment

cc @jlperla for his thoughts

@carmocca added the "ready" label Jun 25, 2021
@Borda requested a review from justusschock June 25, 2021 14:42
@jlperla commented Jun 25, 2021

It all sounds great!

I couldn't quite tell what the CLI would look like for choosing which optimizer and which LR schedulers (and their arguments), or how the defaults and values are set... but maybe I just can't read the diffs correctly.

The only other thing I would say is that, whatever that way is, I would double-check that it works fine with the grid CLI, escaping characters, etc., since trying different permutations of optimizers, LR schedulers, and their hyperparameters is a key use case. If grid has trouble with the escaping, it is important to know early.

mauvilsa and others added 2 commits June 25, 2021 20:26
- Changed instantiate_class so that it does not depend on jsonargparse.
- Added YAML snippets for more clarity.
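
The first commit suggests instantiate_class now relies only on the standard library. The following is a rough reconstruction of such a helper under that assumption, not necessarily the exact diff:

import importlib
from typing import Any, Dict, Tuple, Union

def instantiate_class(args: Union[Any, Tuple[Any, ...]], init: Dict[str, Any]) -> Any:
    """Instantiate the class named by init["class_path"], passing the given
    positional args plus the keyword arguments in init["init_args"]."""
    kwargs = init.get("init_args", {})
    if not isinstance(args, tuple):
        args = (args,)
    module_name, class_name = init["class_path"].rsplit(".", 1)
    cls = getattr(importlib.import_module(module_name), class_name)
    return cls(*args, **kwargs)

A model's configure_optimizers could then do, for example, instantiate_class(self.parameters(), {"class_path": "torch.optim.Adam", "init_args": {"lr": 0.01}}).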
@carmocca enabled auto-merge (squash) June 25, 2021 19:19
@awaelchli (Contributor) left a comment

your docs are always 100% super nice

@carmocca disabled auto-merge June 25, 2021 22:30
@jlperla commented Jun 26, 2021

> However, note that for SGD the learning rate by design is a required parameter and has no default. So in this case only giving the class path would not work. Someone could subclass SGD to give lr a default and make it work. But this is not recommended because for SGD there is no good default. The range of its optimal value depends on the task.
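
For illustration, the subclassing workaround mentioned in the quote would look roughly like this (SGDWithDefault and its 0.01 default are invented for this sketch):

import torch

class SGDWithDefault(torch.optim.SGD):
    # Give lr a default so that specifying only the class path works.
    # Not recommended: no single learning rate suits every task.
    def __init__(self, params, lr=0.01, **kwargs):
        super().__init__(params, lr=lr, **kwargs)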

Given the comments above, I am worried that this has a fatal design flaw that makes it unusable in practice.

This sounds like there is no way to set reasonable problem-specific defaults, and hence in practice an entire dictionary needs to be passed in each time. That is verbose, and "users" need to know the reasonable problem-dependent defaults for all of those as well. At that point I should be using a JSON file rather than the CLI.

If that is the design and I am not misunderstanding it, and there is no way to change the defaults in my code, I just don't think I can use this feature. Creating my own subclasses for all optimizers, etc. is not a solution.

Isn't there a design or tweak that can make it more like the nice, hierarchical CLI that comes up in the rest of this package?

@mauvilsa (Contributor Author) commented:

@jlperla I believe you have some expectations for the CLI that are not justified. What makes something a CLI is being able to run it from the command line; it does not mean that every minute detail should be grid-run-sweepable via individual command line arguments. Don't confuse current CLI limitations with limitations of grid, which should be resolved there. The focus of LightningCLI from the start has been configuration files. Config files make it possible to overcome limitations that individual command line arguments have. I am not sure what you mean by "json file rather than the cli"; that just sounds like a configuration file for a CLI.

Luckily the optimizer example is perfect to illustrate why this should be done via a config file instead of individual command line arguments. Imagine that for some task you want to try different optimizers. Each optimizer has its own parameters that could be varied. Even if multiple optimizers have a parameter with the same name, e.g. lr, in general it does not make sense to sweep the same range of values for all optimizers. If you are also going to try SGD, it does not make sense to try it with a single "default" learning rate; multiple values should be tried. That is simply how SGD works, not a consequence of the design of a CLI or a sweeper. It is not possible to know a priori what a good value will be, which is why PyTorch gives it no default.

I have no good idea how the example above could be easily done using individual command line arguments. But with a config file this can be defined quite clearly, for example:

optimizer:
  sweep:
    - class_path: torch.optim.SGD
      init_args:
        lr:
          sweep: [0.00001, 0.0001, 0.001, 0.01, 0.1, 1.0]
        momentum:
          sweep: [0, 0.1]
    - class_path: torch.optim.Adam
      init_args:
        lr:
          sweep: [0.001, 0.01]
        amsgrad:
          sweep: [True, False]
    - class_path: torch.optim.RMSprop

This would be for some hypothetical sweeper that takes care of trying out all the combinations defined in the config file. The CLI itself would not be aware of the sweeping; it would only receive config files with individual values and no sweep: keys.
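
Such a hypothetical sweeper could expand the sweep: markers into the cartesian product of concrete configs, roughly like the following Python sketch (expand and the inlined config are illustrative only, not part of this PR):

import itertools

def expand(node):
    # Recursively expand "sweep" markers into a list of concrete values.
    if isinstance(node, dict):
        if set(node) == {"sweep"}:
            # A sweep over values or over whole sub-configs (e.g. optimizers).
            return [v for option in node["sweep"] for v in expand(option)]
        keys = list(node)
        value_lists = [expand(node[k]) for k in keys]
        return [dict(zip(keys, combo)) for combo in itertools.product(*value_lists)]
    if isinstance(node, list):
        return [list(combo) for combo in itertools.product(*(expand(i) for i in node))]
    return [node]

config = {
    "optimizer": {
        "sweep": [
            {"class_path": "torch.optim.SGD",
             "init_args": {"lr": {"sweep": [0.01, 0.1]}, "momentum": {"sweep": [0, 0.9]}}},
            {"class_path": "torch.optim.RMSprop"},
        ]
    }
}

for concrete in expand(config):
    print(concrete)  # each entry is one plain config the CLI could consume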

mauvilsa and others added 4 commits June 28, 2021 08:30
@jlperla commented Jun 28, 2021

> @jlperla I believe you have some expectations for the CLI that are not justified. What makes something a CLI is being able to run it from the command line; it does not mean that every minute detail should be grid-run-sweepable via individual command line arguments.

To me, that is really the only reason I want to use the CLI. If I can't make it sweepable with grid and then see the results in the grid webview, then I need to use a different system. And to be clear, this is not narrowly about grid: I would do the identical thing on a cluster myself (come up with the cartesian product of these things, run a whole bunch of arguments with the CLI, etc., not modify individual config files). So this isn't any sort of "let's change PL to be more grid-friendly"; you would want the same thing even if you weren't on grid.

I should also point out that this is the only sane way to work when debugging in PL. In VS Code, you set the CLI arguments you want to work with in a new "launch" configuration and then hit F5.

> Don't confuse current CLI limitations with limitations of grid, which should be resolved there. The focus of LightningCLI from the start has been configuration files.

I don't see a limitation of grid, except that with grid I hope to one day be able to execute these things programmatically instead of literally calling from a shell; but for that you would need just as much control over default values for execution.

I see a limitation of the CLI. If config files are the first and foremost goal, then I missed that (and the feature should be renamed, since the name LightningCLI suggests otherwise). What is the point then vs. Hydra, which does an amazing job of handling config files?

> That is simply how SGD works, not a consequence of the design of a CLI or a sweeper. It is not possible to know a priori what a good value will be, which is why PyTorch gives it no default.

Exactly! That is why you need to find the value for your problem and then set it. This shows why managing default values is so essential.

> Luckily the optimizer example is perfect to illustrate why this should be done via a config file instead of individual command line arguments.

If that is the assumption and the starting point, then that is fine, but it means the feature is not useful for people like me, and for most low-tech users of grid.

> Even if multiple optimizers have a parameter with the same name, e.g. lr,

Exactly. That is why you need to tie CLI parameters as usual.

> Each optimizer has its own parameters that could be varied.

Exactly, that is why you need to be able to change the defaults for when you choose those parameters.

> I have no good idea how the example above could be easily done using individual command line arguments.

I think this is the problem: it hasn't been designed for the CLI. And to be clear, there is absolutely nothing specific to my problem here. Every single project has the same need: choosing optimizers/learning rate schedulers, tinkering around until you find a nice default, baking it in, and then moving on to the other parts of the project.

I have discussed this with @tchaton and @zippeurfou at various points and they might have some insights on this general topic, but I think the core of the problem is that (for me at least) I want LightningCLI as a very simple CLI solution for simple projects. I want to be able to write a bunch of calls to the function, either in grid or on my cluster, without manually messing around with YAML files. If I wanted a more complicated config-file solution, I would look to Hydra. But I don't.

@edenlightning removed the "ready" label Jun 28, 2021
@mauvilsa (Contributor Author) commented Jun 29, 2021

Being able to provide init arguments of a subclass as independent command line arguments (that is, avoiding the need to provide a JSON value) is a general feature that would be implemented in jsonargparse. Regarding defaults for subclasses, again this is a new feature for jsonargparse, probably with a small change in LightningArgumentParser to be able to specify them. I don't think these are difficult to do.

Neither of these features is specific to optimizers and learning rate schedulers, which are the purpose of this pull request. And what is currently implemented here is not incompatible with these additional features. For me it is fine if this pull request does not close #7576 and that issue is closed only when the rest is ready. But I don't see any more comments that require changes for this pull request.

@carmocca self-requested a review June 29, 2021 22:57
@carmocca self-requested a review June 29, 2021 23:00
@carmocca (Contributor) left a comment

This PR is good to go as it is and already large enough. Some recap notes:

@carmocca merged commit 3c74502 into Lightning-AI:master Jul 1, 2021
@mauvilsa deleted the cli_optimizers_and_lr_schedulers branch July 1, 2021 14:29
Labels
argparse, feature

Projects
None yet
Development

Successfully merging this pull request may close these issues.

Easier way to configure optimizers and schedulers in the CLI
9 participants