[fix] Attach train+val dataloaders to trainer in trainer loop #7207
Conversation
Codecov Report
@@            Coverage Diff            @@
##           master    #7207     +/-   ##
=========================================
- Coverage      91%      87%       -4%
=========================================
  Files         199      199
  Lines       12797    12797
=========================================
- Hits        11678    11145      -533
- Misses       1119     1652      +533
Hello @ananthsub! Thanks for updating this PR. There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻 Comment last updated at 2021-04-29 22:07:06 UTC
@awaelchli @justusschock @tchaton what do you think?
Looks good to me.
@tchaton @awaelchli @Borda mind taking a look?
Force-pushed from c4cbff9 to 7831688
What does this PR do?
Fixes #7208.
The issue occurs because the validation dataloader is never attached to the trainer by the time we reach this stage in the training loop: https://github.com/PyTorchLightning/pytorch-lightning/blob/44d775fccfb825561937f6fa03fe258af25c2b83/pytorch_lightning/trainer/training_loop.py#L558-L560
Currently, attaching the dataloaders is conditional on this logic, which runs only once at the start of training: https://github.com/PyTorchLightning/pytorch-lightning/blob/44d775fccfb825561937f6fa03fe258af25c2b83/pytorch_lightning/trainer/trainer.py#L604
Removing the reload_dataloaders_every_epoch check here gives me consistent behavior again. I don't think this is a fully proper fix: we should attach these dataloaders to the trainer more explicitly, and operate on them for reloads later in the training loop.
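To make the failure mode concrete, here is a minimal sketch of the pattern described above. The class and method names are hypothetical and the logic is deliberately simplified; this is not the actual Lightning source or this PR's diff, only an illustration.

```python
# Hypothetical, simplified sketch: attachment is gated on
# reload_dataloaders_every_epoch and evaluated once at the start of training,
# so the val dataloader can still be None when the training loop consults it.


class SketchTrainer:
    def __init__(self, reload_dataloaders_every_epoch: bool = False) -> None:
        self.reload_dataloaders_every_epoch = reload_dataloaders_every_epoch
        self.train_dataloader = None
        self.val_dataloaders = None

    def attach_dataloaders_gated(self, model) -> None:
        # Rough stand-in for the current start-of-training behavior: skip
        # attachment when per-epoch reloading is requested, which leaves
        # self.val_dataloaders unset here.
        if not self.reload_dataloaders_every_epoch:
            self.train_dataloader = model.train_dataloader()
            self.val_dataloaders = model.val_dataloader()

    def attach_dataloaders_always(self, model) -> None:
        # Direction this PR takes: drop the gate so the trainer always holds
        # both loaders; per-epoch reloading can then refresh what is already
        # attached later in the training loop.
        self.train_dataloader = model.train_dataloader()
        self.val_dataloaders = model.val_dataloader()
```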
This is a behavior change introduced by #6075. Before that change, run_evaluation ran first, which populated the dataloader settings on the trainer, and then the checkpoint callback ran. After it, the checkpoint callback runs before run_evaluation, but the check for whether we run evaluation depends on the val dataloader being present, which was only ever set inside run_evaluation.
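The ordering problem can be sketched like this. The function names are hypothetical and the flow is heavily simplified; this is not the actual #6075 diff, only an illustration of the dependency.

```python
def run_evaluation(trainer, model) -> None:
    # Side effect the old ordering relied on: running evaluation populated
    # the val dataloader settings on the trainer.
    if trainer.val_dataloaders is None:
        trainer.val_dataloaders = model.val_dataloader()
    # ... the actual validation loop would run here ...


def should_run_evaluation(trainer) -> bool:
    # Simplified stand-in for the training-loop check: evaluation is only
    # considered runnable when a val dataloader is attached to the trainer.
    return trainer.val_dataloaders is not None


def run_checkpoint_callback(trainer) -> None:
    # Stand-in for the checkpoint callback.
    print("checkpoint callback ran")


def epoch_end_before_6075(trainer, model) -> None:
    run_evaluation(trainer, model)    # attached the val dataloaders first,
    run_checkpoint_callback(trainer)  # then the checkpoint callback ran


def epoch_end_after_6075(trainer, model) -> None:
    run_checkpoint_callback(trainer)    # now runs before evaluation
    if should_run_evaluation(trainer):  # gate reads trainer.val_dataloaders,
        run_evaluation(trainer, model)  # which nothing has attached yet
```

With the train and val dataloaders attached to the trainer up front, as this PR moves toward, the gate no longer depends on run_evaluation's side effect.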
Before submitting
PR review
Anyone in the community is free to review the PR once the tests have passed.
Before you start reviewing, make sure you have read the Review guidelines. In short, see the following bullet list:
Did you have fun?
Make sure you had fun coding 🙃