
Move reload_dataloaders_every_n_epochs to the DataHooks class #8738

Open
ananthsub opened this issue Aug 5, 2021 · 5 comments
Assignees
Labels
data handling Generic data-related topic deprecation Includes a deprecation design Includes a design discussion feature Is an improvement or enhancement help wanted Open to be worked on let's do it! approved to implement
Milestone

Comments

@ananthsub
Contributor

🚀 Feature

Motivation

We are auditing the Lightning components and APIs to assess opportunities for improvements:

reload_dataloaders_every_n_epochs is an argument to the Trainer constructor. However, it could instead be a property of the DataHooks class, since whether to reload the dataloaders every n epochs should be determined by the actor providing the dataloaders (e.g. the LightningModule or LightningDataModule).
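As a rough sketch of the idea (all names below are illustrative assumptions, not the actual Lightning API), the reload interval would live on the hooks class and the trainer would merely consult it:

```python
class DataHooks:
    # Hypothetical attribute: how often the dataloaders should be reloaded.
    # 0 means "never reload", matching the current Trainer default.
    reload_dataloaders_every_n_epochs: int = 0


class MyLightningModule(DataHooks):
    def __init__(self) -> None:
        # The actor providing the dataloaders decides the reload cadence.
        self.reload_dataloaders_every_n_epochs = 2


def should_reload(module: DataHooks, epoch: int) -> bool:
    """Trainer-side check: reload when the configured interval elapses."""
    n = module.reload_dataloaders_every_n_epochs
    return n > 0 and epoch % n == 0
```

With this shape, `should_reload(MyLightningModule(), 4)` is True while a plain `DataHooks()` (interval 0) never triggers a reload.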

This is very similar to #8733 and how automatic/manual optimization is a property of the LightningModule. That property also started out as a trainer argument before being migrated to the lightning module. Since this pattern keeps occurring, we should separately understand why it's so appealing to add things to the trainer constructor instead of a more specific component.

Moreover, this one setting controls dataloader behavior for both the train and val dataloaders. Do we need more granular control, i.e. two properties, one for training and one for validation? This could make sense, since features like val_check_interval can produce very different epoch counts (training epoch count != val epoch count). The property for validation would only apply during trainer.fit, as trainer.validate makes only a single pass through the data.
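If we went the granular route, a minimal sketch could split the setting into two hypothetical attributes (illustrative names, not an agreed API), each checked against its own counter since the train and val epoch counts can diverge:

```python
class DataHooks:
    # Hypothetical split of the single setting; 0 disables reloading
    # for that dataloader.
    reload_train_dataloader_every_n_epochs: int = 0
    reload_val_dataloader_every_n_epochs: int = 0


def should_reload_train(hooks: DataHooks, train_epoch: int) -> bool:
    n = hooks.reload_train_dataloader_every_n_epochs
    return n > 0 and train_epoch % n == 0


def should_reload_val(hooks: DataHooks, val_epoch: int) -> bool:
    # Only consulted during trainer.fit; trainer.validate makes a single pass.
    n = hooks.reload_val_dataloader_every_n_epochs
    return n > 0 and val_epoch % n == 0
```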

However, note the documentation for test_dataloader: https://github.com/PyTorchLightning/pytorch-lightning/blob/963c26764682fa4cf64c93c5a7572ae0040e9c32/pytorch_lightning/core/hooks.py#L535-L537
Is this a copy/paste issue?

Pitch

  • Add a property to the DataHooks class for this in v1.5
  • Deprecate the Trainer argument for this in v1.5
  • Remove the Trainer argument in v1.7

Benefits:

  • Simplify the Trainer constructor (one fewer argument)
  • Keep the data loader management in one place instead of two (at the DataHooks level)
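The deprecation step in the pitch could look roughly like the following sketch (the warning text and the fallback behavior are assumptions, not the actual implementation):

```python
import warnings
from typing import Optional


class Trainer:
    """Sketch of the proposed v1.5 deprecation path (not the real Trainer)."""

    def __init__(self, reload_dataloaders_every_n_epochs: Optional[int] = None) -> None:
        if reload_dataloaders_every_n_epochs is not None:
            warnings.warn(
                "`Trainer(reload_dataloaders_every_n_epochs=...)` is deprecated "
                "in v1.5 and will be removed in v1.7. Set the property on your "
                "LightningModule or LightningDataModule instead.",
                DeprecationWarning,
            )
        # Fall back to 0 ("never reload") when the deprecated argument is unset.
        self._reload_dataloaders_every_n_epochs = reload_dataloaders_every_n_epochs or 0
```

Passing the argument still works during the deprecation window but emits a DeprecationWarning; omitting it leaves the interval at 0.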

Alternatives

Keep as is?

Additional context



@ananthsub ananthsub added feature Is an improvement or enhancement help wanted Open to be worked on data handling Generic data-related topic design Includes a design discussion labels Aug 5, 2021
@ninginthecloud
Contributor

Hi @ananthsub, can I work on this issue? Thank you~

@ninginthecloud
Contributor

Hi @ananthsub, for this issue I will first migrate reload_dataloaders_every_n_epochs before adding two separate properties (train_*, val_*) to give more granular control.
There is some confusion about how reload_dataloaders_every_n_epochs works (see post), and sometimes people want to reload only the train_dataloader (post). I think it would be great to add these two properties to avoid the confusion and give more control.

@ninginthecloud
Contributor

One thing I'm not sure about is whether we still want to keep _should_reload_dl_epoch() in pytorch_lightning/trainer/properties.py. I'd lean toward moving it into fit_loop and evaluation_loop separately. Let me know what you think.

@ananthsub
Contributor Author

One thing I'm not sure about is whether we still want to keep _should_reload_dl_epoch() in pytorch_lightning/trainer/properties.py. I'd lean toward moving it into fit_loop and evaluation_loop separately. Let me know what you think.

I think it makes sense to move this into the loops directly. The property is already private, and this way we can continue whittling down what's exposed on the Trainer's properties. We could have a mini helper function shared across the fit and evaluation loops if necessary.

@awaelchli @tchaton @carmocca what do you think? it's somewhat related to #8946

@tchaton
Contributor

tchaton commented Aug 21, 2021

Hey @ananthsub,

I think it is fine to move _should_reload_dl_epoch to the loops.

And I believe it would be worth exploring a better scheduling mechanism owned by the DataModule, e.g.:

class DataModule:

    def should_reload_train_dataloader(self, epoch: int, total_batch_idx: int) -> bool:
        # e.g. reload the train dataloader on odd-numbered epochs
        return epoch % 2 == 1
