[PoC] Add KFold - External Loop. #8715
Conversation
Codecov Report
@@            Coverage Diff             @@
##           master    #8715     +/-   ##
=========================================
- Coverage      93%      88%      -4%
=========================================
  Files         169      171       +2
  Lines       14071    14265     +194
=========================================
- Hits        13040    12579     -461
- Misses       1031     1686     +655
@@ -0,0 +1,203 @@
# Copyright The PyTorch Lightning team.
Note for reviewers: adding boring_model.py to utilities, as it is pretty fundamental and should be part of the codebase for debugging purposes.
I don't agree. We already have a boring model for our tests and another for bug reports and debugging. I vote for not including it in utilities.
@carmocca Any thoughts?
I ran into the same issue in #7614 (comment) (blocked for the same reason). Some pl_examples rely on our MNIST implementation, which is in the test directory.
We currently include everything in our distribution, which is what I try to avoid in the linked PR. However, if we do that, CI fails because the pl_examples no longer have access to the MNIST implementation.
So we have two real options:
1. Duplicate the BoringModel/MNIST implementations in the pl_examples directory.
2. Move BoringModel/MNIST to the source tree (what this PR does) to avoid code duplication.
I think I prefer (2). If you guys agree, I can do this change in a separate PR.
cc @Borda
Let's continue discussing in #8776
TBH, I don't think this is the right way to approach this. This isn't what loops are meant for, and it weakens the call hierarchy in a way we shouldn't allow. In the related issue this was discussed and approved from all sides as a standalone class.
Can we get a bigger picture of the PR?
Hey @justusschock. I strongly disagree there. I believe this is what Loops were meant for from the beginning. Users should have the choice to either build on top of Lightning or use an ExternalLoop.
The first approach should be used when multiple runs need to be orchestrated.
Furthermore, if users properly implement the ExternalLoop contract, Lightning can add checkpointing + fault tolerance to their loops while maintaining full customization.
Note: The goal is not to expose the Trainer internals, and we need to be clear about how this is meant to be used.
@awaelchli @williamFalcon Any thoughts there?
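To make the fault-tolerance argument above concrete, here is a minimal pure-Python sketch of what an "ExternalLoop contract" could look like: a loop that exposes `state_dict`/`load_state_dict` so a framework could checkpoint it mid-run and resume. All names and the shape of the contract are hypothetical; the actual API proposed in this PR may differ.

```python
class KFoldLoop:
    """Hypothetical sketch of a restartable k-fold loop.

    Exposing state_dict/load_state_dict is what would let a framework
    checkpoint the loop and resume it after a crash (fault tolerance).
    """

    def __init__(self, num_folds: int):
        self.num_folds = num_folds
        self.current_fold = 0
        self.scores = []

    def state_dict(self) -> dict:
        return {"current_fold": self.current_fold, "scores": list(self.scores)}

    def load_state_dict(self, state: dict) -> None:
        self.current_fold = state["current_fold"]
        self.scores = list(state["scores"])

    def run(self, fit_fn):
        # fit_fn stands in for one Trainer fit/validate cycle per fold.
        while self.current_fold < self.num_folds:
            self.scores.append(fit_fn(self.current_fold))
            self.current_fold += 1
        return self.scores
```

A resumed loop simply picks up at `current_fold`, skipping the folds already recorded in `scores`.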
# utilities for creating a hold
def process_dataset(self, stage: str, dataset: Dataset) -> Subset:
    kfold = KFold(self.num_folds, random_state=42, shuffle=True)
Is a dependency on sklearn worth it for just this?
Should we maybe have a more general abstract function create_splits that the user has to implement, since there are so many different ways to create data splits? We would then only iterate over the splits here.
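The sklearn dependency is indeed avoidable: the index bookkeeping KFold does here can be written in a few lines of plain Python. A sketch, without the shuffling that `random_state=42, shuffle=True` adds (the remainder handling matches sklearn's convention of giving the first `n % k` folds one extra sample):

```python
from typing import List, Tuple


def kfold_indices(n_samples: int, num_folds: int) -> List[Tuple[list, list]]:
    """Return (train_indices, val_indices) pairs for k-fold CV, no sklearn.

    The first n_samples % num_folds folds get one extra sample, so every
    index appears in exactly one validation fold.
    """
    fold_sizes = [
        n_samples // num_folds + (1 if i < n_samples % num_folds else 0)
        for i in range(num_folds)
    ]
    splits, start = [], 0
    for size in fold_sizes:
        val = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        splits.append((train, val))
        start += size
    return splits
```

An abstract `create_splits` hook could then default to this and let users swap in stratified, grouped, or time-series splitting.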
from pytorch_lightning.utilities import rank_zero_only
from pytorch_lightning.utilities.boring_model import BoringModel, RandomDataset
seed_everything(42)
I'd rather not seed anything globally.
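One way to honor that without losing reproducibility is to derive a local RNG per fold rather than mutating global state. A sketch using the stdlib `random` module (the same idea applies to a per-DataLoader `torch.Generator`); the seed-mixing scheme is ad hoc, any deterministic combination works:

```python
import random


def make_fold_rng(base_seed: int, fold: int) -> random.Random:
    # A private, reproducible RNG per fold: nothing global is touched,
    # unlike seed_everything(), which reseeds every consumer at once.
    return random.Random(base_seed * 100_003 + fold)
```

Each fold gets an independent stream, and re-running a single fold reproduces exactly the same draws.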
def loop_base_callback() -> Type[Callback]:
    class BaseKFoldCallback(Callback):
        @rank_zero_only
        def on_fold_start(self, trainer, pl_module, counter):
            """Override with your own logic"""

    return BaseKFoldCallback
Can't we define this outside this class, in the file namespace?
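For illustration, the module-level version the comment suggests would look like this; `Callback` is stubbed here so the sketch is self-contained, and `rank_zero_only` is omitted for brevity:

```python
class Callback:  # stub standing in for pytorch_lightning.Callback
    pass


# Defined once at module level instead of inside a factory function,
# so it can be imported, subclassed, and pickled like any normal class.
class BaseKFoldCallback(Callback):
    def on_fold_start(self, trainer, pl_module, counter):
        """Override with your own logic."""
```

A factory is only needed if each call must produce a distinct class object; otherwise the module-level definition is simpler.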
loop = KFoldLoop(5)
model = BoringModel()
datamodule = BoringDataModule()
loop.connect_trainer(max_epochs=10, callbacks=KFoldCallback())
Alternatively, these could be passed in through __init__ via a trainer_kwargs argument.
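A sketch of that alternative, with hypothetical names (the real KFoldLoop in this PR takes different arguments and does real work with them):

```python
from typing import Any, Dict, Optional


class KFoldLoop:
    # Trainer configuration is taken at construction time instead of
    # through a separate connect_trainer() call, so one object carries
    # everything needed to build the per-fold Trainer later.
    def __init__(self, num_folds: int,
                 trainer_kwargs: Optional[Dict[str, Any]] = None):
        self.num_folds = num_folds
        self.trainer_kwargs = dict(trainer_kwargs or {})
```

Usage would then be `KFoldLoop(5, trainer_kwargs={"max_epochs": 10, "callbacks": [...]})`, keeping construction to a single call.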
class BaseDataModule(LightningDataModule):
    def __init__(self):
        super().__init__()
        self.non_picklable = None
curious, what was the idea here? seems left over :)
def __init__(self):
    super().__init__()
    self.non_picklable = None
    self.checkpoint_state: Optional[str] = None
same question here :)
    "fit_loop": self.trainer.fit_loop.state_dict(),
    "validate_loop": self.trainer.validate_loop.state_dict(),
    "test_loop": self.trainer.test_loop.state_dict(),
    "predict_loop": self.trainer.predict_loop.state_dict(),
}
external_loop = getattr(self.trainer, "external_loop", None)
if external_loop:
    state_dict.update({"external_loop": external_loop.state_dict()})
Can there be more than one external loop? I mean, one external loop nested inside another?
It could, and Loop will automatically gather its children's states.
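The mechanics of that gathering can be sketched in a few lines. This is a minimal stand-in, not Lightning's actual Loop base class: `state_dict()` walks the instance's attributes and recurses into any that are themselves loops, so nesting one external loop inside another nests their states automatically.

```python
class Loop:
    """Minimal sketch of recursive loop-state collection."""

    def on_save_checkpoint(self) -> dict:
        # Subclasses override this to contribute their own state.
        return {}

    def state_dict(self) -> dict:
        state = {"state": self.on_save_checkpoint()}
        # Recurse into child loops stored as attributes; their states
        # end up nested under the attribute name.
        for name, attr in vars(self).items():
            if isinstance(attr, Loop):
                state[name] = attr.state_dict()
        return state
```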
    "fit_loop": self.trainer.fit_loop.state_dict(),
    "validate_loop": self.trainer.validate_loop.state_dict(),
    "test_loop": self.trainer.test_loop.state_dict(),
    "predict_loop": self.trainer.predict_loop.state_dict(),
}
external_loop = getattr(self.trainer, "external_loop", None)
Perhaps the trainer can have a property for this, like the ones we have for the other loops?
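The suggested property could look roughly like this (stub class, hypothetical; the real Trainer's loop properties carry more logic):

```python
class Trainer:
    # An explicit property with a None default lets call sites write
    # trainer.external_loop directly instead of
    # getattr(trainer, "external_loop", None) scattered everywhere.
    def __init__(self):
        self._external_loop = None

    @property
    def external_loop(self):
        return self._external_loop

    @external_loop.setter
    def external_loop(self, loop):
        self._external_loop = loop
```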
@@ -186,7 +186,7 @@ def restore_callbacks(self) -> None:
     )
     self.trainer.on_load_checkpoint(self._loaded_checkpoint)

-    def restore_loops(self) -> None:
+    def restore_loops(self, restore_external_loop: bool = False) -> None:
It's probably enough if this is controlled by the existence of an external loop, as checked below.
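A flag-free version of that restore logic might read as follows. This is a sketch with stub objects; the signature and surrounding CheckpointConnector code in the PR differ:

```python
def restore_loops(trainer, checkpoint: dict) -> None:
    # No restore_external_loop flag: restore only when the trainer
    # actually has an external loop AND the checkpoint carries its
    # state. Absent either condition, this is a no-op.
    external_loop = getattr(trainer, "external_loop", None)
    if external_loop is not None and "external_loop" in checkpoint:
        external_loop.load_state_dict(checkpoint["external_loop"])
```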
This pull request has been automatically marked as stale because it has not had recent activity. It will be closed in 7 days if no further activity occurs. If you need further help see our docs: https://pytorch-lightning.readthedocs.io/en/latest/generated/CONTRIBUTING.html#pull-request or ask the assistance of a core contributor here or on Slack. Thank you for your contributions.
This pull request is going to be closed. Please feel free to reopen it or create a new one from the current master.
@tchaton will this PR make it through?
Hey @turian, I am quite unsure. I see some advantages to an ExternalLoop, but I believe it may be too steep a learning curve for new users. Best,
@tchaton where is the quickstart version of how to use this?
Hey @turian, are you interested in ExternalLoop or KFold? Best,
@tchaton KFold. What are the use-cases for an external loop?
What does this PR do?
Fixes #839
Does your PR introduce any breaking changes? If yes, please list them.
Before submitting
PR review
Anyone in the community is welcome to review the PR.
Before you start reviewing make sure you have read Review guidelines. In short, see the following bullet-list:
Did you have fun?
Make sure you had fun coding 🙃