self.xxx_dataloader() broken from 1.4 -> 1.5 #10834

Closed
jgibson2 opened this issue Nov 30, 2021 · 8 comments
Labels: bug, data handling, working as intended

Comments

jgibson2 (Contributor) commented Nov 30, 2021

🐛 Bug

Calling self.test_dataloader() in a pl.LightningModule results in a NotImplementedError in 1.5.3, but works in 1.4.9. The docs still describe getting the dataloader via this family of methods.

To Reproduce

import os

import torch
from torch.utils.data import DataLoader, Dataset

from pytorch_lightning import LightningModule, Trainer


class RandomDataset(Dataset):
    # Random dataset from the Lightning bug-report (BoringModel) template
    def __init__(self, size, length):
        self.len = length
        self.data = torch.randn(length, size)

    def __getitem__(self, index):
        return self.data[index]

    def __len__(self):
        return self.len


class BoringModel(LightningModule):
    def __init__(self):
        super().__init__()
        self.layer = torch.nn.Linear(32, 2)

    def forward(self, x):
        return self.layer(x)

    def training_step(self, batch, batch_idx):
        loss = self(batch).sum()
        self.log("train_loss", loss)
        return {"loss": loss}

    def validation_step(self, batch, batch_idx):
        loss = self(batch).sum()
        self.log("valid_loss", loss)

    def test_step(self, batch, batch_idx):
        self.test_dataloader()
        loss = self(batch).sum()
        self.log("test_loss", loss)

    def configure_optimizers(self):
        return torch.optim.SGD(self.layer.parameters(), lr=0.1)


train_data = DataLoader(RandomDataset(32, 64), batch_size=2)
val_data = DataLoader(RandomDataset(32, 64), batch_size=2)
test_data = DataLoader(RandomDataset(32, 64), batch_size=2)

model = BoringModel()
trainer = Trainer(
    default_root_dir=os.getcwd(),
    limit_train_batches=1,
    limit_val_batches=1,
    limit_test_batches=1,
    num_sanity_val_steps=0,
    max_epochs=1,
    enable_model_summary=False,
)
trainer.fit(model, train_dataloaders=train_data, val_dataloaders=val_data)
trainer.test(model, dataloaders=test_data)

Expected behavior

self.test_dataloader() works as it did in 1.4.

Environment

* CUDA:
	- GPU:
		- Tesla K80
	- available:         True
	- version:           11.1
* Packages:
	- numpy:             1.19.5
	- pyTorch_debug:     False
	- pyTorch_version:   1.10.0+cu111
	- pytorch-lightning: 1.5.3
	- tqdm:              4.62.3
* System:
	- OS:                Linux
	- architecture:
		- 64bit
		- 
	- processor:         x86_64
	- python:            3.7.12
	- version:           #1 SMP Sat Jun 5 09:50:34 PDT 2021

Additional context

cc @justusschock @awaelchli @ninginthecloud

jgibson2 added the bug label Nov 30, 2021
ananthsub (Contributor) commented

@jgibson2 before v1.5, calling self.{}_dataloader() would log a warning if the method wasn't implemented in your LightningModule. Now it raises a NotImplementedError to make it explicit that xxx_dataloader was not overridden.

How do you want to call or access dataloaders inside of your LightningModule? The example you provided has no implementation of test_dataloader, so I'm not sure how you intended to use this.

ananthsub added the data handling label Nov 30, 2021
awaelchli (Contributor) commented Nov 30, 2021

This previously only worked because the Trainer patched the given dataloaders onto the LightningModule as if they were defined there. We classified this as unintended behavior with hard-to-debug side effects. It was removed in #9764.

To access the dataloaders passed to the trainer, you now need to use self.trainer.test_dataloaders[0]. Also note that the xxx_dataloader() methods are considered hooks, and hooks are not meant to be called by the user (though, granted, this distinction may not be very clear for many of our hooks).
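For example, a minimal sketch of what this looks like inside the test_step of the BoringModel above (based on the trainer attribute mentioned here):

def test_step(self, batch, batch_idx):
    # read the dataloader that was passed to the Trainer instead of calling the hook
    test_dl = self.trainer.test_dataloaders[0]
    dataset = test_dl.dataset
    loss = self(batch).sum()
    self.log("test_loss", loss)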

awaelchli added the working as intended label Nov 30, 2021
AndresAlgaba (Contributor) commented

I believe a reply to an earlier issue I raised may also be useful here:
#10558 (comment)

jgibson2 (Contributor, Author) commented

Thanks all -- I upgraded from 1.2 to 1.5, so I didn't see the warnings for this. Although this change makes sense to prevent silent bugs, it does make retrieving the dataloader (and thus the dataset) more complicated. Previously, calling self.xxx_dataloader().dataset was a source-agnostic way of getting the dataset; now I believe something like this would work:

def get_test_dataloader(self):
    # prefer the hook if it is overridden, otherwise fall back to the
    # dataloader that was passed to the Trainer
    try:
        test_dl = self.test_dataloader()
    except NotImplementedError:
        test_dl = self.trainer.test_dataloaders[0]
    return test_dl
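
With a helper like this, the source-agnostic dataset access would look roughly like the following (a sketch):

dataset = self.get_test_dataloader().dataset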

Maybe this could be better reflected in the docs?

awaelchli (Contributor) commented

@jgibson2 I think that accessing trainer.x_dataloader should be equivalent to calling self.test_dataloader(), even when a datamodule is used and the x_dataloader methods are defined there.

Furthermore, issue #10430 proposes moving the initialization of dataloaders even earlier, so that hooks like configure_optimizers() will also be able to access the dataloaders.

Perhaps access to the dataloaders could be highlighted in the data section of our docs: https://pytorch-lightning.readthedocs.io/en/1.5.2/guides/data.html
What do you think?

jgibson2 (Contributor, Author) commented Dec 2, 2021

Thanks @awaelchli -- I opened a PR to add an example to the documentation.

ananthsub (Contributor) commented Dec 3, 2021

@jgibson2 @awaelchli - for my understanding: if the LightningModule implementation depends on the dataloader, and the LightningModule API already has hooks to provide the data to the trainer, shouldn't the LightningModule be the one providing it? I.e., xxx_dataloader(self) should be implemented in the LightningModule. That way there is no roundabout means of, or ambiguity about, how to access the dataloader.

class MyLightningModule(LightningModule):
    def train_dataloader(self):
        # build the loader and keep a reference on the module for later use
        # (stored under a separate name so the hook itself isn't shadowed)
        self._train_dl = DataLoader(...)
        return self._train_dl

    def training_step(self, batch, batch_idx):
        # use self._train_dl here
        ...

awaelchli (Contributor) commented Dec 4, 2021

Yes, I know what you mean. The pattern you describe is not currently what we show in our docs or advertise as best practice, but it is certainly a way to deal with what the OP needed.

The reason for my recommendation was mainly these points:

  • The dataloader() methods are hooks. Hooks are not meant to be called manually by the user (a general guiding principle to ensure correctness, not a strict rule).
  • The dataloader() hooks are methods. Methods may have side effects, and if a dataloader hook is meant to be called once yet the user calls it a second time, this can lead to confusion.
  • The dataloaders may be defined only in the DataModule.

To me, these three points motivate using the dataloader references on the trainer.
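
As a minimal sketch of that setup (reusing the BoringModel, RandomDataset, DataLoader, and trainer from the reproduction above, and the 1.5 trainer attribute mentioned earlier), the dataloaders can live in a LightningDataModule while the LightningModule reads them from the trainer:

from pytorch_lightning import LightningDataModule


class MyDataModule(LightningDataModule):
    def test_dataloader(self):
        return DataLoader(RandomDataset(32, 64), batch_size=2)


class MyModel(BoringModel):
    def test_step(self, batch, batch_idx):
        # the trainer holds the loader regardless of where it was defined
        dataset = self.trainer.test_dataloaders[0].dataset
        loss = self(batch).sum()
        self.log("test_loss", loss)


trainer.test(MyModel(), datamodule=MyDataModule())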
