Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[PoC] Add KFold - External Loop. #8715

Closed
wants to merge 19 commits into from
Closed

Conversation

tchaton
Copy link
Contributor

@tchaton tchaton commented Aug 4, 2021

What does this PR do?

Fixes #839

Does your PR introduce any breaking changes? If yes, please list them.

Before submitting

  • Was this discussed/approved via a GitHub issue? (not for typos and docs)
  • Did you read the contributor guideline, Pull Request section?
  • Did you make sure your PR does only one thing, instead of bundling different changes together?
  • Did you make sure to update the documentation with your changes? (if necessary)
  • Did you write any new necessary tests? (not for typos and docs)
  • Did you verify new and existing tests pass locally with your changes?
  • Did you list all the breaking changes introduced by this pull request?
  • Did you update the CHANGELOG? (not for typos, docs, test updates, or internal minor changes/refactorings)

PR review

Anyone in the community is welcome to review the PR.
Before you start reviewing make sure you have read Review guidelines. In short, see the following bullet-list:

  • Is this pull request ready for review? (if not, please submit in draft mode)
  • Check that all items from Before submitting are resolved
  • Make sure the title is self-explanatory and the description concisely explains the PR
  • Add labels and milestones (and optionally projects) to the PR so it can be classified

Did you have fun?

Make sure you had fun coding 🙃

@pep8speaks
Copy link

pep8speaks commented Aug 4, 2021

Hello @tchaton! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻

Comment last updated at 2021-08-09 11:06:55 UTC

@codecov
Copy link

codecov bot commented Aug 4, 2021

Codecov Report

Merging #8715 (8d667f1) into master (aacd131) will decrease coverage by 4%.
The diff coverage is 60%.

@@           Coverage Diff           @@
##           master   #8715    +/-   ##
=======================================
- Coverage      93%     88%    -4%     
=======================================
  Files         169     171     +2     
  Lines       14071   14265   +194     
=======================================
- Hits        13040   12579   -461     
- Misses       1031    1686   +655     

@tchaton tchaton changed the title [PoC] Add External Loop to Lightning [PoC] Add KFold Aug 4, 2021
@@ -0,0 +1,203 @@
# Copyright The PyTorch Lightning team.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

note for reviewing. Adding boring_model.py to utilities as it is pretty fundamental and should be part of the codebase for debugging purposes.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think the same. We already have a boring model for our tests and a boring model for bug reports and debugging. I vote for not including it in utilities.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@carmocca Any thoughts ?

Copy link
Contributor

@carmocca carmocca Aug 5, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I ran into the same issue in #7614 (comment) (blocked for the same reason).

Some pl_examples rely on our MNIST implementation, which is in the test directory.

https://github.com/PyTorchLightning/pytorch-lightning/blob/69f287eb85c36412d5d4d6541bc25f6a75a977ea/pl_examples/basic_examples/mnist_datamodule.py#L40

We currently include everything in our distribution, which is what I try to avoid in the linked PR.

However, CI fails because then the pl_examples do not have access to the MNIST implementation if we do.

So we have two real options:

  1. Duplicate the BoringModel/MNIST implementations in the pl_examples directory
  2. Move BoringModel/MNIST to the source tree (what this PR does) to avoid code duplication.

I think I prefer (2). If you guys agree I can do this change in a separate PR

cc @Borda

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lets continue discussing in #8776

@tchaton tchaton marked this pull request as ready for review August 4, 2021 19:01
Copy link
Member

@justusschock justusschock left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TBH, I don't think this is the right way to approach this. This isn't what loops are meant for and this weakens the call hierarchy in a way we shouldn't allow. In the related issue this was discussed and approved from all sides to be a standalone class

pl_examples/loops_customisation/k_fold.py Outdated Show resolved Hide resolved
pl_examples/loops_customisation/k_fold.py Outdated Show resolved Hide resolved
pl_examples/loops_customisation/k_fold.py Outdated Show resolved Hide resolved
pytorch_lightning/loops/base.py Outdated Show resolved Hide resolved
@Borda
Copy link
Member

Borda commented Aug 4, 2021

can we get a bigger picture of the PR?

@tchaton
Copy link
Contributor Author

tchaton commented Aug 5, 2021

TBH, I don't think this is the right way to approach this. This isn't what loops are meant for and this weakens the call hierarchy in a way we shouldn't allow. In the related issue this was discussed and approved from all sides to be a standalone class

Hey @justusschock.

I strongly disagree there. I believe this is what Loop are meant from the beginning.

Users should have the choice to either built on top of Lightning fit, validate, test, predict primitives + some carefully selected utilities provided by Lightning to fully control their fitting flow or switch any existing Loop within Lightning (which is definitely much advanced).

The first approach should be used when multiple trainer.fit are sequentially used such as in Task based learning, Cross Validation, etc... This should reduce boilerplate while enabling users to quickly experiment with new ideas or built their own framework on top of Lightning with their own callbacks system + a fully encapsulated logic.

Furthermore, if users properly implements the ExternalLoop contract, Lightning can add checkpointing + fault tolerant to their loops while maintaining full customization.

Note: The goal is not to expose the Trainer internals and we need to be clear this is meant to be used with fit, validate, test, predict primitives + some carefully selected utilities provided by Lightning to control data scheduling.

@awaelchli @williamFalcon Any thoughts there ?
Best,
T.C

@@ -0,0 +1,203 @@
# Copyright The PyTorch Lightning team.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think the same. We already have a boring model for our tests and a boring model for bug reports and debugging. I vote for not including it in utilities.

pl_examples/loops_customisation/k_fold.py Outdated Show resolved Hide resolved
pl_examples/loops_customisation/k_fold.py Outdated Show resolved Hide resolved
pytorch_lightning/loops/base.py Outdated Show resolved Hide resolved
pytorch_lightning/loops/base.py Outdated Show resolved Hide resolved

# utilities for creating a hold
def process_dataset(self, stage: str, dataset: Dataset) -> Subset:
kfold = KFold(self.num_folds, random_state=42, shuffle=True)
Copy link
Member

@justusschock justusschock Aug 10, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is a dependency for sklearn worth it for just this?
Should we maybe have a more general abstract function create_splits the user has to implement? Since there are so many different ways to create data splits. And we then only iterate over the splits here.

from pytorch_lightning.utilities import rank_zero_only
from pytorch_lightning.utilities.boring_model import BoringModel, RandomDataset

seed_everything(42)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
seed_everything(42)

rather not seed anything globally

Comment on lines +148 to +154
def loop_base_callback() -> Type[Callback]:
class BaseKFoldCallback(Callback):
@rank_zero_only
def on_fold_start(self, trainer, pl_module, counter):
"""Override with your own logic"""

return BaseKFoldCallback
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can't we define this outside this class but in the file namespace?

loop = KFoldLoop(5)
model = BoringModel()
datamodule = BoringDataModule()
loop.connect_trainer(max_epochs=10, callbacks=KFoldCallback())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

alternatively these could be passed in through init via an argument trainer_kwargs.

class BaseDataModule(LightningDataModule):
def __init__(self):
super().__init__()
self.non_picklable = None
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

curious, what was the idea here? seems left over :)

def __init__(self):
super().__init__()
self.non_picklable = None
self.checkpoint_state: Optional[str] = None
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same question here :)

"fit_loop": self.trainer.fit_loop.state_dict(),
"validate_loop": self.trainer.validate_loop.state_dict(),
"test_loop": self.trainer.test_loop.state_dict(),
"predict_loop": self.trainer.predict_loop.state_dict(),
}
external_loop = getattr(self.trainer, "external_loop", None)
if external_loop:
state_dict.update({"external_loop": external_loop.state_dict()})
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can there be more than one external loop. I mean, one external loop nested inside another?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It could and Loop will automatically gather their children states.

"fit_loop": self.trainer.fit_loop.state_dict(),
"validate_loop": self.trainer.validate_loop.state_dict(),
"test_loop": self.trainer.test_loop.state_dict(),
"predict_loop": self.trainer.predict_loop.state_dict(),
}
external_loop = getattr(self.trainer, "external_loop", None)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

perhaps trainer can have a property like one we have for the other loops?

@@ -186,7 +186,7 @@ def restore_callbacks(self) -> None:
)
self.trainer.on_load_checkpoint(self._loaded_checkpoint)

def restore_loops(self) -> None:
def restore_loops(self, restore_external_loop: bool = False) -> None:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's probably enough if this is controlled below by the existence of an external loop as checked below.

@tchaton tchaton changed the title [PoC] Add KFold [PoC] Add KFold - External Loop. Aug 10, 2021
@turian turian mentioned this pull request Aug 12, 2021
@stale
Copy link

stale bot commented Sep 4, 2021

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed in 7 days if no further activity occurs. If you need further help see our docs: https://pytorch-lightning.readthedocs.io/en/latest/generated/CONTRIBUTING.html#pull-request or ask the assistance of a core contributor here or on Slack. Thank you for your contributions.

@stale stale bot added the won't fix This will not be worked on label Sep 4, 2021
@stale
Copy link

stale bot commented Sep 9, 2021

This pull request is going to be closed. Please feel free to reopen it create a new from the actual master.

@stale stale bot closed this Sep 9, 2021
@turian
Copy link
Contributor

turian commented Sep 9, 2021

@tchaton will this PR make this thru?

@tchaton
Copy link
Contributor Author

tchaton commented Sep 10, 2021

Hey @turian,

I am quite unsure. I see some advantages for an External Loop, but I believe it is maybe too much of a learning curve for new users.
What are your thoughts there ?

Best,
T.C

@turian
Copy link
Contributor

turian commented Sep 13, 2021

@tchaton where is the quickstart version of how to use this?

@tchaton
Copy link
Contributor Author

tchaton commented Sep 13, 2021

Hey @turian,

Are you interested in ExternalLoop or KFold ?

Best,
T.C

@turian
Copy link
Contributor

turian commented Sep 13, 2021

@tchaton KFold. What are the use-cases for external loop?

@Borda Borda deleted the poc_loop_customization branch March 29, 2022 05:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
has conflicts won't fix This will not be worked on
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Cross validation feature
9 participants