self.xxx_dataloader() broken from 1.4 -> 1.5 #10834

Closed
jgibson2 opened this issue Nov 30, 2021 · 8 comments
Labels: bug, data handling, working as intended

Comments

jgibson2 (Contributor) commented Nov 30, 2021

🐛 Bug

Calling self.test_dataloader() in a pl.LightningModule results in a NotImplementedError in 1.5.3, but works in 1.4.9. The docs still describe getting the dataloader via this family of methods.

To Reproduce

import os

import torch
from torch.utils.data import DataLoader, Dataset

from pytorch_lightning import LightningModule, Trainer


class RandomDataset(Dataset):
    # Random dataset from the Lightning bug-report (BoringModel) template
    def __init__(self, size, length):
        self.len = length
        self.data = torch.randn(length, size)

    def __getitem__(self, index):
        return self.data[index]

    def __len__(self):
        return self.len


class BoringModel(LightningModule):
    def __init__(self):
        super().__init__()
        self.layer = torch.nn.Linear(32, 2)

    def forward(self, x):
        return self.layer(x)

    def training_step(self, batch, batch_idx):
        loss = self(batch).sum()
        self.log("train_loss", loss)
        return {"loss": loss}

    def validation_step(self, batch, batch_idx):
        loss = self(batch).sum()
        self.log("valid_loss", loss)

    def test_step(self, batch, batch_idx):
        self.test_dataloader()
        loss = self(batch).sum()
        self.log("test_loss", loss)

    def configure_optimizers(self):
        return torch.optim.SGD(self.layer.parameters(), lr=0.1)


train_data = DataLoader(RandomDataset(32, 64), batch_size=2)
val_data = DataLoader(RandomDataset(32, 64), batch_size=2)
test_data = DataLoader(RandomDataset(32, 64), batch_size=2)

model = BoringModel()
trainer = Trainer(
    default_root_dir=os.getcwd(),
    limit_train_batches=1,
    limit_val_batches=1,
    limit_test_batches=1,
    num_sanity_val_steps=0,
    max_epochs=1,
    enable_model_summary=False,
)
trainer.fit(model, train_dataloaders=train_data, val_dataloaders=val_data)
trainer.test(model, dataloaders=test_data)

Expected behavior

self.test_dataloader() works as it did in 1.4.

Environment

* CUDA:
	- GPU:
		- Tesla K80
	- available:         True
	- version:           11.1
* Packages:
	- numpy:             1.19.5
	- pyTorch_debug:     False
	- pyTorch_version:   1.10.0+cu111
	- pytorch-lightning: 1.5.3
	- tqdm:              4.62.3
* System:
	- OS:                Linux
	- architecture:
		- 64bit
		- 
	- processor:         x86_64
	- python:            3.7.12
	- version:           #1 SMP Sat Jun 5 09:50:34 PDT 2021

Additional context

cc @justusschock @awaelchli @ninginthecloud

jgibson2 added the bug label Nov 30, 2021
ananthsub (Contributor) commented

@jgibson2 before v1.5, calling self.{}_dataloader() would log a warning if the method wasn't implemented in your LightningModule. Now it raises a NotImplementedError to make it explicit that xxx_dataloader was not overridden.

How do you want to call or access dataloaders inside of your LightningModule? The example you provided has no implementation of test_dataloader, so I'm not sure how you intended to use this.

ananthsub added the data handling label Nov 30, 2021
awaelchli (Contributor) commented Nov 30, 2021

This previously only worked because the Trainer patched the given dataloaders onto the LightningModule as if they were defined there. We classified this as unintended behavior with hard-to-debug side effects. It was removed in #9764.

To access the dataloaders passed to the trainer, you now need to use self.trainer.test_dataloaders[0]. Also note that the xxx_dataloader() methods are considered hooks, and hooks are not meant to be called by the user (though, granted, this distinction may not be very clear for many of our hooks).
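For example, a minimal sketch of what this looks like inside the test_step of the BoringModel above (based on the trainer attribute mentioned here):

def test_step(self, batch, batch_idx):
    # read the dataloader that was passed to the Trainer instead of calling the hook
    test_dl = self.trainer.test_dataloaders[0]
    dataset = test_dl.dataset
    loss = self(batch).sum()
    self.log("test_loss", loss)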

awaelchli added the working as intended label Nov 30, 2021
AndresAlgaba (Contributor) commented

I believe a reply to an earlier issue I raised may also be useful here:
#10558 (comment)

jgibson2 (Contributor, Author) commented

Thanks all -- I upgraded from 1.2 to 1.5, so I didn't see the warnings for this. Although this change makes sense to prevent silent bugs, it does make retrieving the dataloader (and thus the dataset) more complicated. Previously, calling self.xxx_dataloader().dataset was a source-agnostic way of getting the dataset; now I believe something like this would work:

def get_test_dataloader(self):
    # prefer the hook if it is overridden, otherwise fall back to the
    # dataloader that was passed to the Trainer
    try:
        test_dl = self.test_dataloader()
    except NotImplementedError:
        test_dl = self.trainer.test_dataloaders[0]
    return test_dl
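
With a helper like this, the source-agnostic dataset access would look roughly like the following (a sketch):

dataset = self.get_test_dataloader().dataset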

Maybe this could be better reflected in the docs?

awaelchli (Contributor) commented

@jgibson2 I think that accessing trainer.x_dataloader should be equivalent to calling self.test_dataloader(), even when a datamodule is used and the x_dataloader methods are defined there.

Furthermore, issue #10430 proposes moving the initialization of dataloaders even earlier, so that hooks like configure_optimizers() will also be able to access the dataloaders.

Perhaps access to the dataloaders could be highlighted in the data section of our docs: https://pytorch-lightning.readthedocs.io/en/1.5.2/guides/data.html
What do you think?

jgibson2 (Contributor, Author) commented Dec 2, 2021

Thanks @awaelchli -- I opened a PR to add an example to the documentation.

ananthsub (Contributor) commented Dec 3, 2021

@jgibson2 @awaelchli - for my understanding: if the LightningModule implementation depends on the dataloader, and the LightningModule API already has hooks to provide the data to the trainer, shouldn't the LightningModule be the one providing it? I.e., xxx_dataloader(self) should be implemented in the LightningModule. That way there is no roundabout means of, or ambiguity about, how to access the dataloader.

class MyLightningModule(LightningModule):
    def train_dataloader(self):
        # build the loader and keep a reference on the module for later use
        # (stored under a separate name so the hook itself isn't shadowed)
        self._train_dl = DataLoader(...)
        return self._train_dl

    def training_step(self, batch, batch_idx):
        # use self._train_dl here
        ...

awaelchli (Contributor) commented Dec 4, 2021

Yes, I know what you mean. The pattern you describe is not currently what we show in our docs or advertise as best practice, but it is certainly a way to deal with what the OP needed.

The reason for my recommendation was mainly these points:

  • The dataloader() methods are hooks. Hooks are not meant to be called manually by the user (a general guiding principle to ensure correctness, not a strict rule).
  • The dataloader() hooks are methods. Methods may have side effects, and if a dataloader hook is meant to be called once yet the user calls it a second time, this can lead to confusion.
  • The dataloaders may be defined only in the DataModule.

To me, these three points motivate using the dataloader references on the trainer.
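
As a minimal sketch of that setup (reusing the BoringModel, RandomDataset, DataLoader, and trainer from the reproduction above, and the 1.5 trainer attribute mentioned earlier), the dataloaders can live in a LightningDataModule while the LightningModule reads them from the trainer:

from pytorch_lightning import LightningDataModule


class MyDataModule(LightningDataModule):
    def test_dataloader(self):
        return DataLoader(RandomDataset(32, 64), batch_size=2)


class MyModel(BoringModel):
    def test_step(self, batch, batch_idx):
        # the trainer holds the loader regardless of where it was defined
        dataset = self.trainer.test_dataloaders[0].dataset
        loss = self(batch).sum()
        self.log("test_loss", loss)


trainer.test(MyModel(), datamodule=MyDataModule())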
