
test fails with _lazy_train_dataloader #859

Closed
Borda opened this issue Feb 15, 2020 · 16 comments · Fixed by #917 or #926
Comments

@Borda
Member

Borda commented Feb 15, 2020

🐛 Bug

The tests randomly fail from time to time across all platforms (but mainly macOS and Windows) with the following message:

_____________________________ test_lbfgs_cpu_model _____________________________

self = LightningTestModel(
  (c_d1): Linear(in_features=784, out_features=1000, bias=True)
  (c_d1_bn): BatchNorm1d(1000, eps...ats=True)
  (c_d1_drop): Dropout(p=0.2, inplace=False)
  (c_d2): Linear(in_features=1000, out_features=10, bias=True)
)

    def _get_data_loader(self):
        try:
>           value = getattr(self, attr_name)

pytorch_lightning/core/decorators.py:16: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = LightningTestModel(
  (c_d1): Linear(in_features=784, out_features=1000, bias=True)
  (c_d1_bn): BatchNorm1d(1000, eps...ats=True)
  (c_d1_drop): Dropout(p=0.2, inplace=False)
  (c_d2): Linear(in_features=1000, out_features=10, bias=True)
)
name = '_lazy_train_dataloader'

    def __getattr__(self, name):
        if '_parameters' in self.__dict__:
            _parameters = self.__dict__['_parameters']
            if name in _parameters:
                return _parameters[name]
        if '_buffers' in self.__dict__:
            _buffers = self.__dict__['_buffers']
            if name in _buffers:
                return _buffers[name]
        if '_modules' in self.__dict__:
            modules = self.__dict__['_modules']
            if name in modules:
                return modules[name]
        raise AttributeError("'{}' object has no attribute '{}'".format(
>           type(self).__name__, name))
E       AttributeError: 'LightningTestModel' object has no attribute '_lazy_train_dataloader'

To Reproduce

https://github.com/Borda/pytorch-lightning/runs/448238658

@Borda Borda added the bug, help wanted, and good first issue labels Feb 15, 2020
@jeremyjordan
Contributor

jeremyjordan commented Feb 16, 2020

I think this is caused by a failed connection to yann.lecun.com to download the MNIST dataset.

Downloading code here: https://pytorch.org/docs/stable/_modules/torchvision/datasets/mnist.html#MNIST

During handling of the above exception, another exception occurred:

self = <urllib.request.HTTPHandler object at 0x11ef30350>
http_class = <class 'http.client.HTTPConnection'>
req = <urllib.request.Request object at 0x11ee39f90>, http_conn_args = {}
host = 'yann.lecun.com', h = <http.client.HTTPConnection object at 0x11eddc390>

@williamFalcon
Contributor

_lazy_train_dataloader

means something in the dataloader failed...
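
For reference, the traceback above comes from the lazy-caching decorator in pytorch_lightning/core/decorators.py. A simplified sketch of that pattern (names trimmed, not the exact source) shows why any failure inside the wrapped *_dataloader method ends up chained to the AttributeError:

```python
# Simplified sketch of the lazy-caching pattern behind _get_data_loader in
# pytorch_lightning/core/decorators.py (not the exact source code).
import functools


def data_loader(fn):
    """Cache the result of a *_dataloader method under a _lazy_* attribute."""
    attr_name = '_lazy_' + fn.__name__

    @functools.wraps(fn)
    def _get_data_loader(self):
        try:
            # On the first call the cached attribute does not exist yet,
            # so this getattr raises the AttributeError seen in the report.
            value = getattr(self, attr_name)
        except AttributeError:
            # The wrapped *_dataloader method runs here; if it fails
            # (e.g. the MNIST download times out), that error is raised
            # "during handling of" the AttributeError above.
            value = fn(self)
            setattr(self, attr_name, value)
        return value

    return _get_data_loader
```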

@jeremyjordan
Contributor

Yes, my understanding is:

dataloader construction attempt here --> calls _dataloader here --> constructs the TestingMNIST dataset with download=True --> the download from "http://yann.lecun.com/" fails intermittently.

@Borda
Member Author

Borda commented Feb 17, 2020

Do we have a better source for downloading the dataset?
Another option would be adding the downloaded dataset to the CI cache...
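
If we go the caching route, one option would be a small pre-fetch step that runs before the test job and retries the download a few times, so the cached directory can be restored on later runs. A rough sketch (paths and retry counts are arbitrary, not what was eventually merged):

```python
# Hypothetical pre-fetch script for CI: download MNIST once, with retries,
# into a directory that the CI cache step can persist between runs.
import time

from torchvision.datasets import MNIST


def prefetch_mnist(root='./datasets', retries=3, wait=5):
    """Retry the download so a flaky connection to yann.lecun.com
    does not fail the whole test suite."""
    for attempt in range(retries):
        try:
            MNIST(root=root, train=True, download=True)
            MNIST(root=root, train=False, download=True)
            return
        except Exception:
            if attempt == retries - 1:
                raise
            time.sleep(wait)


if __name__ == '__main__':
    prefetch_mnist()
```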

@williamFalcon
Contributor

nah. i’ll create a link on s3 and we can get from there

@djbyrne
Contributor

djbyrne commented Feb 18, 2020

I can take a look if the issue isn't already resolved

@Borda Borda added this to the 0.6.1 milestone Feb 18, 2020
@Borda
Member Author

Borda commented Feb 21, 2020

I think that having our own MNIST class would be great, because then we could drop the torchvision dependency even in tests...
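
A rough sketch of what a torchvision-free MNIST class could look like, assuming the tensors are hosted somewhere we control (e.g. the S3 link mentioned above); the URL and file names below are placeholders, not the final implementation:

```python
# Minimal MNIST dataset without torchvision. BASE_URL and the .pt file names
# are placeholders; each file is assumed to hold an (images, targets) tuple.
import os
import urllib.request

import torch
from torch.utils.data import Dataset


class MNIST(Dataset):
    BASE_URL = 'https://example.com/mnist/'  # placeholder host
    FILES = {True: 'training.pt', False: 'test.pt'}  # placeholder names

    def __init__(self, root, train=True, download=True):
        super().__init__()
        fname = self.FILES[train]
        fpath = os.path.join(root, fname)
        if download and not os.path.isfile(fpath):
            os.makedirs(root, exist_ok=True)
            urllib.request.urlretrieve(self.BASE_URL + fname, fpath)
        self.data, self.targets = torch.load(fpath)

    def __len__(self):
        return len(self.data)

    def __getitem__(self, idx):
        # scale uint8 images to [0, 1] and add a channel dimension
        img = self.data[idx].float().unsqueeze(0) / 255.0
        return img, int(self.targets[idx])
```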

@awaelchli awaelchli mentioned this issue Feb 22, 2020
This was referenced Feb 22, 2020
@Borda
Member Author

Borda commented Feb 23, 2020

@djbyrne how is it going, do you need some help with it? =)

@djbyrne
Contributor

djbyrne commented Feb 23, 2020

Hey @Borda, I have the caching of the MNIST dataset working now, just testing it. Not sure if I have done it the best way though. Maybe take a look if you have a sec: https://github.com/djbyrne/pytorch-lightning/blob/enhancement/lazy_dataloader/.github/workflows/ci-testing.yml

@Borda
Member Author

Borda commented Feb 23, 2020

Great, at first glance it looks good to me... not sure how to test that it is really caching; maybe add a simple (temporary) print in the code to mark whether it is using the cache or downloading a new dataset...

@Borda
Member Author

Borda commented Feb 23, 2020

Also, it seems that torchvision is not yet released for Python 3.8, so it would be great to have our own MNIST dataset class to be completely independent... #910 #915

@djbyrne
Contributor

djbyrne commented Feb 23, 2020

Yeah cool, I can add that. Will I put it in the one PR for #859, or do separate PRs?

@awaelchli
Contributor

it seems to only happen in circleci build macOS-10.15, 3.7, minimal. Then maybe it is not a problem with the download.

@Borda
Member Author

Borda commented Feb 23, 2020

> it seems to only happen in circleci build macOS-10.15, 3.7, minimal. Then maybe it is not a problem with the download.

I was observing it randomly on all tests...

@Borda
Member Author

Borda commented Feb 23, 2020

@awaelchli the caching was just one approach... We shall create our own MNIST dataset class so we can drop the dependency on torchvision completely...
Moreover, with our own test dataset we could simplify it to use e.g. only 5 digits, as sketched below.
@williamFalcon can you share the link to MNIST on S3?
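
For the 5-digit idea, the trimmed test dataset could simply filter the loaded tensors; something like this (digit selection and per-class counts are just examples):

```python
# Hypothetical helper to keep only a few digit classes for faster tests.
import torch


def select_digits(data, targets, digits=(0, 1, 2, 3, 4), num_per_class=100):
    """Return at most `num_per_class` examples of each requested digit."""
    keep_data, keep_targets = [], []
    for d in digits:
        idx = torch.nonzero(targets == d, as_tuple=False).flatten()[:num_per_class]
        keep_data.append(data[idx])
        keep_targets.append(targets[idx])
    return torch.cat(keep_data), torch.cat(keep_targets)
```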

@Borda
Member Author

Borda commented Feb 25, 2020

We need to create our own MNIST dataset...

@Borda Borda self-assigned this Feb 25, 2020
@Borda Borda modified the milestones: 0.6.1, 0.6.2 Feb 25, 2020
@Borda Borda modified the milestones: 0.7.1, 0.7.0 Feb 27, 2020