
[RFC] Enable Lightning Flash to support scheduler on step or monitor #752

Closed
tchaton opened this issue Sep 10, 2021 · 4 comments · Fixed by #777

Comments

@tchaton
Contributor

tchaton commented Sep 10, 2021

🐛 Bug

Currently, in Lightning Flash, optimizer and scheduler creation is automated as follows:

class Task(...):

    def configure_optimizers(self) -> Union[Optimizer, Tuple[List[Optimizer], List[_LRScheduler]]]:
        optimizer = self.optimizer
        if not isinstance(self.optimizer, Optimizer):
            self.optimizer_kwargs["lr"] = self.learning_rate
            optimizer = optimizer(filter(lambda p: p.requires_grad, self.parameters()), **self.optimizer_kwargs)
        if self.scheduler:
            return [optimizer], [self._instantiate_scheduler(optimizer)]
        return optimizer

However, in PyTorch Lightning, someone would have to do the following to enable stepping the scheduler on every step or monitoring a metric:

class Task(...):


    def configure_optimizers(self) -> Union[Optimizer, Tuple[List[Optimizer], List[_LRScheduler]]]:
        optimizer = self.optimizer
        if not isinstance(self.optimizer, Optimizer):
            self.optimizer_kwargs["lr"] = self.learning_rate
            optimizer = optimizer(filter(lambda p: p.requires_grad, self.parameters()), **self.optimizer_kwargs)
        if self.scheduler:
            # The scheduler must be wrapped in a lr_scheduler config dict to control the stepping interval / monitored metric.
            return [optimizer], [{"scheduler": self._instantiate_scheduler(optimizer), "interval": "step", "monitor": "val_loss"}]
        return optimizer

For reference, here is PyTorch Lightning's default lr scheduler config dict:

    return {
        "scheduler": None,
        "name": None,  # no custom name
        "interval": "epoch",  # after epoch is over
        "frequency": 1,  # every epoch/batch
        "reduce_on_plateau": False,  # most often not ReduceLROnPlateau scheduler
        "monitor": None,  # value to monitor for ReduceLROnPlateau
        "strict": True,  # enforce that the monitor exists for ReduceLROnPlateau
        "opt_idx": None,  # necessary to store opt_idx when optimizer frequencies are specified
    }

A possible API to support schedulers stepping per step (or monitoring a metric) would be to let the user provide any of the keys of the default dict when instantiating the task, as follows:

ImageClassifier(
    scheduler={
        "scheduler": torch.optim.lr_scheduler.StepLR,
        "interval": "step",
    },
    scheduler_kwargs={"step_size": 1},
)

or

ImageClassifier(
    scheduler={
        "scheduler": torch.optim.lr_scheduler.ReduceLROnPlateau,
        "monitor": "val_loss",
    }
)
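
A minimal sketch of how the task could merge such a user-provided dict over Lightning's defaults before handing it back from configure_optimizers (the resolve_scheduler helper name and its exact shape are assumptions for illustration, not the final Flash API):

from typing import Union

from torch.optim import Optimizer

# Lightning's default lr_scheduler config, reproduced from above.
_DEFAULT_SCHEDULER_CONFIG = {
    "scheduler": None,
    "name": None,
    "interval": "epoch",
    "frequency": 1,
    "reduce_on_plateau": False,
    "monitor": None,
    "strict": True,
    "opt_idx": None,
}


def resolve_scheduler(scheduler: Union[type, dict], optimizer: Optimizer, **scheduler_kwargs) -> dict:
    """Merge a scheduler class or partial config dict over the defaults and instantiate it."""
    config = dict(_DEFAULT_SCHEDULER_CONFIG)
    if isinstance(scheduler, dict):
        # User supplied a (partial) config dict: override only the given keys.
        config.update(scheduler)
    else:
        # User supplied a bare scheduler class.
        config["scheduler"] = scheduler
    # Instantiate the scheduler class against the optimizer with the user's kwargs.
    config["scheduler"] = config["scheduler"](optimizer, **scheduler_kwargs)
    return config

configure_optimizers could then return [optimizer], [resolve_scheduler(self.scheduler, optimizer, **self.scheduler_kwargs)].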


@tchaton added the bug / fix and help wanted labels on Sep 10, 2021
@tchaton
Contributor Author

tchaton commented Sep 10, 2021

@ethanwharris @karthikrangasai Any opinions?

@karthikrangasai
Contributor

@tchaton we should definitely make this change as it gives more control to the end user.

@ethanwharris
Collaborator

ethanwharris commented Sep 10, 2021

@tchaton I think support for this would be good. I spoke with @karthikrangasai recently about a revamp for the whole experience around optimizers / schedulers. A couple of ideas we came up with:

  • allow for callables to be passed, so users can do something like functools.partial(MultiStepLR, milestones=[100, 150]) rather than providing a kwargs dictionary (see the sketch after this list)
  • pre-register some good standard scheduler configurations in the scheduler registry, and allow (and document) extending this. E.g. you could do:
@ImageClassifier.schedulers
def multi_step_100_150(optimizer):
    return MultiStepLR(optimizer, [100, 150])

to register your scheduler recipes.
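
For the first bullet, here is a minimal sketch of what the callable-based flow could look like (the standalone usage and variable names are illustrative assumptions, not the current Flash API):

from functools import partial

from torch.nn import Linear
from torch.optim import SGD
from torch.optim.lr_scheduler import MultiStepLR

# The user binds the scheduler hyperparameters up front instead of passing scheduler_kwargs.
scheduler_fn = partial(MultiStepLR, milestones=[100, 150])

# Flash would then only need to supply the optimizer when instantiating:
optimizer = SGD(Linear(4, 2).parameters(), lr=0.1)
scheduler = scheduler_fn(optimizer)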

We could extend this further with the suggestion here (possibly using registry metadata) to something like this:

@ImageClassifier.schedulers
@ImageClassifier.schedulers("multi_step_100_150_on_step", interval="step")
def multi_step_100_150(optimizer):
    return MultiStepLR(optimizer, [100, 150])
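
A toy sketch of how a registry could keep per-recipe metadata alongside the factory function (this generic register_scheduler decorator is a stand-in for illustration, not Flash's actual FlashRegistry):

from torch.optim.lr_scheduler import MultiStepLR

# name -> (factory, metadata); the metadata would be merged into the lr_scheduler config dict.
SCHEDULER_REGISTRY = {}


def register_scheduler(name, **metadata):
    def decorator(fn):
        SCHEDULER_REGISTRY[name] = (fn, metadata)
        return fn
    return decorator


@register_scheduler("multi_step_100_150_on_step", interval="step")
def multi_step_100_150(optimizer):
    return MultiStepLR(optimizer, [100, 150])

# Later, the task could look up a recipe by name and merge its metadata into the
# scheduler config returned from configure_optimizers, e.g.:
# fn, metadata = SCHEDULER_REGISTRY["multi_step_100_150_on_step"]
# config = {"scheduler": fn(optimizer), **metadata}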

@tchaton
Contributor Author

tchaton commented Sep 10, 2021

Yes, I really like the registry approach.
