
[RFC] Enable Lightning Flash to support scheduler on step or monitor #752

Closed
tchaton opened this issue Sep 10, 2021 · 4 comments · Fixed by #777

Comments

@tchaton
Contributor

tchaton commented Sep 10, 2021

🐛 Bug

Currently, in Lightning Flash, optimizer and scheduler creation is automated as follows:

class Task(...):

    def configure_optimizers(self) -> Union[Optimizer, Tuple[List[Optimizer], List[_LRScheduler]]]:
        optimizer = self.optimizer
        if not isinstance(self.optimizer, Optimizer):
            self.optimizer_kwargs["lr"] = self.learning_rate
            optimizer = optimizer(filter(lambda p: p.requires_grad, self.parameters()), **self.optimizer_kwargs)
        if self.scheduler:
            return [optimizer], [self._instantiate_scheduler(optimizer)]
        return optimizer

However, in PyTorch Lightning, someone would have to do the following to enable stepping the scheduler on every step or monitoring a metric:

class Task(...):


    def configure_optimizers(self) -> Union[Optimizer, Tuple[List[Optimizer], List[_LRScheduler]]]:
        optimizer = self.optimizer
        if not isinstance(self.optimizer, Optimizer):
            self.optimizer_kwargs["lr"] = self.learning_rate
            optimizer = optimizer(filter(lambda p: p.requires_grad, self.parameters()), **self.optimizer_kwargs)
        if self.scheduler:
            # The scheduler must be wrapped in a lr_scheduler config dict to control the stepping interval / monitored metric.
            return [optimizer], [{"scheduler": self._instantiate_scheduler(optimizer), "interval": "step", "monitor": "val_loss"}]
        return optimizer

For reference, here is PyTorch Lightning's default lr scheduler config dict:

    return {
        "scheduler": None,
        "name": None,  # no custom name
        "interval": "epoch",  # after epoch is over
        "frequency": 1,  # every epoch/batch
        "reduce_on_plateau": False,  # most often not ReduceLROnPlateau scheduler
        "monitor": None,  # value to monitor for ReduceLROnPlateau
        "strict": True,  # enforce that the monitor exists for ReduceLROnPlateau
        "opt_idx": None,  # necessary to store opt_idx when optimizer frequencies are specified
    }

A possible API to support schedulers stepping per step (or monitoring a metric) would be to let the user provide any of the keys of the default dict when instantiating the task, as follows:

ImageClassifier(
    scheduler={
        "scheduler": torch.optim.lr_scheduler.StepLR,
        "interval": "step",
    },
    scheduler_kwargs={"step_size": 1},
)

or

ImageClassifier(
    scheduler={
        "scheduler": torch.optim.lr_scheduler.ReduceLROnPlateau,
        "monitor": "val_loss",
    }
)
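
A minimal sketch of how the task could merge such a user-provided dict over Lightning's defaults before handing it back from configure_optimizers (the resolve_scheduler helper name and its exact shape are assumptions for illustration, not the final Flash API):

from typing import Union

from torch.optim import Optimizer

# Lightning's default lr_scheduler config, reproduced from above.
_DEFAULT_SCHEDULER_CONFIG = {
    "scheduler": None,
    "name": None,
    "interval": "epoch",
    "frequency": 1,
    "reduce_on_plateau": False,
    "monitor": None,
    "strict": True,
    "opt_idx": None,
}


def resolve_scheduler(scheduler: Union[type, dict], optimizer: Optimizer, **scheduler_kwargs) -> dict:
    """Merge a scheduler class or partial config dict over the defaults and instantiate it."""
    config = dict(_DEFAULT_SCHEDULER_CONFIG)
    if isinstance(scheduler, dict):
        # User supplied a (partial) config dict: override only the given keys.
        config.update(scheduler)
    else:
        # User supplied a bare scheduler class.
        config["scheduler"] = scheduler
    # Instantiate the scheduler class against the optimizer with the user's kwargs.
    config["scheduler"] = config["scheduler"](optimizer, **scheduler_kwargs)
    return config

configure_optimizers could then return [optimizer], [resolve_scheduler(self.scheduler, optimizer, **self.scheduler_kwargs)].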


@tchaton added the bug / fix and help wanted labels on Sep 10, 2021
@tchaton
Contributor Author

tchaton commented Sep 10, 2021

@ethanwharris @karthikrangasai Any opinions?

@karthikrangasai
Contributor

@tchaton we should definitely make this change as it gives more control to the end user.

@ethanwharris
Collaborator

ethanwharris commented Sep 10, 2021

@tchaton I think support for this would be good. I spoke with @karthikrangasai recently about a revamp for the whole experience around optimizers / schedulers. A couple of ideas we came up with:

  • allow for callables to be passed, so users can do something like functools.partial(MultiStepLR, milestones=[100, 150]) rather than providing a kwargs dictionary (see the sketch after this list)
  • pre-register some good standard scheduler configurations in the scheduler registry, and allow (and document) extending this. E.g. you could do:
@ImageClassifier.schedulers
def multi_step_100_150(optimizer):
    return MultiStepLR(optimizer, [100, 150])

to register your scheduler recipes.
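
For the first bullet, here is a minimal sketch of what the callable-based flow could look like (the standalone usage and variable names are illustrative assumptions, not the current Flash API):

from functools import partial

from torch.nn import Linear
from torch.optim import SGD
from torch.optim.lr_scheduler import MultiStepLR

# The user binds the scheduler hyperparameters up front instead of passing scheduler_kwargs.
scheduler_fn = partial(MultiStepLR, milestones=[100, 150])

# Flash would then only need to supply the optimizer when instantiating:
optimizer = SGD(Linear(4, 2).parameters(), lr=0.1)
scheduler = scheduler_fn(optimizer)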

We could extend this further with the suggestion here (possibly using registry metadata) to something like this:

@ImageClassifier.schedulers
@ImageClassifier.schedulers("multi_step_100_150_on_step", interval="step")
def multi_step_100_150(optimizer):
    return MultiStepLR(optimizer, [100, 150])
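
A toy sketch of how a registry could keep per-recipe metadata alongside the factory function (this generic register_scheduler decorator is a stand-in for illustration, not Flash's actual FlashRegistry):

from torch.optim.lr_scheduler import MultiStepLR

# name -> (factory, metadata); the metadata would be merged into the lr_scheduler config dict.
SCHEDULER_REGISTRY = {}


def register_scheduler(name, **metadata):
    def decorator(fn):
        SCHEDULER_REGISTRY[name] = (fn, metadata)
        return fn
    return decorator


@register_scheduler("multi_step_100_150_on_step", interval="step")
def multi_step_100_150(optimizer):
    return MultiStepLR(optimizer, [100, 150])

# Later, the task could look up a recipe by name and merge its metadata into the
# scheduler config returned from configure_optimizers, e.g.:
# fn, metadata = SCHEDULER_REGISTRY["multi_step_100_150_on_step"]
# config = {"scheduler": fn(optimizer), **metadata}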

@tchaton
Contributor Author

tchaton commented Sep 10, 2021

Yes, I really like the registry approach.
