self.xxx_dataloader() broken from 1.4 -> 1.5 #10834
Comments
@jgibson2 before v1.5, calling self.test_dataloader() relied on behavior that has since been removed (see the next comment). How do you want to call or access dataloaders inside of your LightningModule? The example you provided has no implementation for test_dataloader().
This previously only worked because the Trainer patched the given dataloaders onto the LightningModule as if they were defined there. We classified this as unintended behavior with hard-to-debug side effects. It was removed in #9764. To access the dataloaders passed to the trainer, use the references on the trainer, e.g. self.trainer.test_dataloaders.
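For illustration, a minimal sketch (hypothetical names; assumes the 1.5.x API, where trainer.test_dataloaders is a list with one entry per test loader):

import pytorch_lightning as pl

class MyModel(pl.LightningModule):
    def on_test_epoch_start(self):
        # loaders passed to trainer.test(...) are stored on the trainer
        # and are no longer patched onto the module
        test_dl = self.trainer.test_dataloaders[0]
        # works for map-style datasets, which define __len__
        print(f"test set size: {len(test_dl.dataset)}")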
I believe that a reply to an earlier issue which I raised may also be useful here:
Thanks all -- I upgraded from 1.2 to 1.5, so I didn't see the deprecation warnings for this. Although this makes sense to prevent silent bugs, it does make the process of retrieving the dataloader (and thus the dataset) more complicated. Previously, calling self.test_dataloader() worked either way; now I need something like:

def get_test_dataloader(self):
    try:
        # defined on this module?
        test_dl = self.test_dataloader()
    except NotImplementedError:
        # fall back to the loader(s) passed to the trainer
        test_dl = self.trainer.test_dataloaders[0]
    return test_dl

Maybe this could be better reflected in the docs?
@jgibson2 I think that accessing trainer.x_dataloader should be equivalent to calling self.test_dataloader(), even in the case when a datamodule is used and the x_dataloader methods are defined over there. Furthermore, issue #10430 proposes to move the initialization of dataloaders even earlier, so that hooks like configure_optimizers() will also be able to access the dataloaders. Perhaps the access to dataloaders can be highlighted in the data section of our docs: https://pytorch-lightning.readthedocs.io/en/1.5.2/guides/data.html
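For instance (a minimal sketch with made-up names, assuming a standard 1.5.x setup), the trainer reference resolves the loader even when it is defined on a datamodule:

import torch
import pytorch_lightning as pl
from torch.utils.data import DataLoader, TensorDataset

class MyDataModule(pl.LightningDataModule):
    def test_dataloader(self):
        return DataLoader(TensorDataset(torch.randn(8, 2)), batch_size=4)

class MyModel(pl.LightningModule):
    def __init__(self):
        super().__init__()
        self.layer = torch.nn.Linear(2, 1)

    def test_step(self, batch, batch_idx):
        # self.test_dataloader() would raise NotImplementedError here,
        # since this module defines no override; the trainer reference
        # still reaches the datamodule's loader
        dl = self.trainer.test_dataloaders[0]
        (x,) = batch
        return self.layer(x).sum()

pl.Trainer(logger=False).test(MyModel(), datamodule=MyDataModule())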
Thanks @awaelchli -- I opened a PR to put an example in the documentation.
@jgibson2 @awaelchli - for my understanding: if the lightning module implementation has a dependence on the dataloader, and since the LightningModule API has the hooks to specify this to the trainer, shouldn't the lightning module be the one providing the data to the trainer? i.e.,

class MyLightningModule(LightningModule):
    def train_dataloader(self):
        # cache on a differently named attribute; assigning to
        # self.train_dataloader would shadow this very hook
        self._train_dataloader = DataLoader(...)
        return self._train_dataloader

    def training_step(self, batch, batch_idx):
        # use self._train_dataloader here
        ...
Yes, I know what you mean. The pattern you describe is not currently what we have in our docs or advertise as best practice, but it is certainly a way to deal with what the OP needed. The reason for my recommendation was mainly these points:

These three points to me motivate the use of the dataloader references on the trainer.
🐛 Bug

Calling self.test_dataloader() in a pl.LightningModule results in a NotImplementedError in 1.5.3, but works in 1.4.9. The docs still reflect the ability to get the dataloader using this family of functions.

To Reproduce
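A minimal sketch of the failure (hypothetical names; assumes pytorch-lightning==1.5.3):

import torch
import pytorch_lightning as pl
from torch.utils.data import DataLoader, TensorDataset

class ReproModel(pl.LightningModule):
    def __init__(self):
        super().__init__()
        self.layer = torch.nn.Linear(2, 1)

    def test_step(self, batch, batch_idx):
        # under 1.4.x the trainer patched the given loader onto the
        # module and this call returned it; under 1.5.x it raises
        # NotImplementedError because no override is defined here
        dl = self.test_dataloader()
        (x,) = batch
        return self.layer(x).sum()

loader = DataLoader(TensorDataset(torch.randn(8, 2)), batch_size=4)
pl.Trainer(logger=False).test(ReproModel(), dataloaders=loader)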
Expected behavior

self.test_dataloader() functions as in 1.4.

Environment
Additional context
cc @justusschock @awaelchli @ninginthecloud