Bugfix/_has_len #2307

thschaaf · 2020-06-20T22:55:37Z

What does this PR do?

Enable tests for dataloaders where __len__ is defined but can raise NotImplementedError (e.g. in some torchtext cases).
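A minimal sketch of the kind of length check these tests exercise. The names `has_len` and `NoLenLoader` are hypothetical and chosen for illustration; the actual helper in the library may be named and behave differently. The idea is that a loader whose `__len__` raises is treated the same as one with no `__len__` at all:

```python
class NoLenLoader:
    """Hypothetical stand-in for a loader that defines __len__ but cannot compute it."""

    def __len__(self):
        raise NotImplementedError

    def __iter__(self):
        return iter(range(3))


def has_len(dataloader) -> bool:
    """Return True only if len(dataloader) can actually be evaluated."""
    try:
        len(dataloader)
    except (TypeError, NotImplementedError):
        # TypeError: the object has no __len__ at all.
        # NotImplementedError: __len__ exists but refuses to answer.
        return False
    return True
```

With a check like this, a loader such as `NoLenLoader` can be handled like an iterable of unknown size instead of crashing on `len()`.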

Continuation of PR #2293

Before submitting

Was this discussed/approved via a GitHub issue? (no need for typos and docs improvements)
Did you read the contributor guideline, Pull Request section?
Did you make sure your PR does only one thing, instead of bundling different changes together? Otherwise, we ask you create a separate PR for every change.
Did you make sure to update the documentation with your changes?
Did you write any new necessary tests?
Did you verify new and existing tests pass locally with your changes?
If you made a notable change (that affects users), did you update the CHANGELOG?

PR review

Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in GitHub issues, there's a high chance it will not be merged.

Did you have fun?

Make sure you had fun coding 🙃

pep8speaks · 2020-06-20T22:55:40Z

Hello @thschaaf! Thanks for updating this PR.

There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻

Comment last updated at 2020-06-23 14:56:23 UTC

codecov · 2020-06-21T07:40:37Z

Codecov Report

Merging #2307 into master will increase coverage by 0%.
The diff coverage is n/a.

@@          Coverage Diff           @@
##           master   #2307   +/-   ##
======================================
  Coverage      88%     88%           
======================================
  Files          70      70           
  Lines        5502    5502           
======================================
+ Hits         4835    4844    +9     
+ Misses        667     658    -9

Borda

remove all

@pytest.mark.skip('TODO: speed up this test')

thschaaf · 2020-06-22T04:45:03Z

@Borda I commented out all
@pytest.mark.skip('TODO: speed up this test')

The local tests ran successfully on my laptop.

thschaaf · 2020-06-22T15:36:11Z

@Borda For your information, the added test passed on all machines that showed "canceled" (e.g. https://github.com/PyTorchLightning/pytorch-lightning/runs/792966810?check_suite_focus=true). However, I assume that the overall runtime was too long.

Ideally, the allowed runtime for the tests should be increased.

What do you recommend to get the PR merged?

CHANGELOG.md

awaelchli · 2020-06-22T15:45:28Z

tests/trainer/test_dataloaders.py

@@ -374,7 +374,7 @@ def test_inf_train_dataloader(tmpdir, check_interval):


 @pytest.mark.parametrize('check_interval', [50, 1.0])
-@pytest.mark.skip('TODO: speed up this test')
+# @pytest.mark.skip('TODO: speed up this test')
 def test_not_implemented_error_train_dataloader(tmpdir, check_interval):


This test and the following one, in which you removed the skip, are taking too long because the model runs for the whole epoch, right? But the test name suggests that there should be a check for a raised error.
I think that's the problem.

we can also set just nb steps to 3 or so...

awaelchli · 2020-06-22T15:46:08Z

tests/trainer/test_dataloaders.py

-@pytest.mark.skip('TODO: speed up this test')
-def test_not_implemented_error_dataloader(tmpdir, check_interval):
+# @pytest.mark.skip('TODO: speed up this test')
+def test_not_implemented_error_val_dataloader(tmpdir, check_interval):


same here, no test for "not implemented error"

The name of the test refers to CustomNotImplementedErrorDataloader, which raises a NotImplementedError when __len__ is called. In this configuration, training should still take place. I am open to name suggestions.
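For illustration, a self-contained sketch of a loader along the lines described here: `len()` raises, yet iteration still works and stops after a small number of batches so that a test finishes quickly. The class name `NotImplementedLenLoader` and the `max_batches` parameter are assumptions for this sketch, not the code from the PR:

```python
class NotImplementedLenLoader:
    """Hypothetical wrapper: serves a few batches, but len() is unusable."""

    def __init__(self, batches, max_batches=2):
        self.batches = batches          # any re-iterable collection of batches
        self.max_batches = max_batches  # hard cap to keep tests short
        self.count = 0
        self.iterator = None

    def __len__(self):
        # Mimics loaders whose length is defined but not computable.
        raise NotImplementedError

    def __iter__(self):
        self.count = 0
        self.iterator = iter(self.batches)
        return self

    def __next__(self):
        if self.count >= self.max_batches:
            raise StopIteration
        self.count += 1
        try:
            return next(self.iterator)
        except StopIteration:
            # Restart the underlying data so only max_batches ends the loop.
            self.iterator = iter(self.batches)
            return next(self.iterator)


# Usage: iterate with a plain loop; len() is never needed.
for batch in NotImplementedLenLoader([10, 20, 30], max_batches=2):
    print(batch)
```

A test built on such a loader would check that training runs to completion rather than that an exception is raised, which matches the intent stated above.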

Co-authored-by: Adrian Wälchli <[email protected]>

Borda

is this ready to review? :]

CHANGELOG.md

Borda

trying to get some more descriptive names

tests/trainer/test_dataloaders.py

mergify · 2020-06-23T11:22:00Z

This pull request is now in conflict... :(

mergify · 2020-06-23T13:19:21Z

This pull request is now in conflict... :(

thschaaf · 2020-06-23T14:50:19Z

@Borda Did you have time to look at my previous questions (inlined below)? In particular, the one about creating a new issue.

Borda · 2020-06-23T14:56:01Z

@Borda Did you have time to look at my previous questions (inlined below)? In particular, the one about creating a new issue.

I hope I'll get to this today; maybe we can talk on Slack?

Borda · 2020-06-23T15:10:44Z

This implies the issue has nothing to do with my bug fix from #2293 and probably indicates a more serious issue in PyTorch Lightning, the testing framework, or their interaction.

Mind describing your suspicion in a new issue, so we can separate the PR fix from the other problems?

It could still be that there is a problem with the test that I copied, but it is not obvious why parallel data loading would make the test system so unstable just with the new test, especially since the run hung in very different tests.

@williamFalcon have you experienced something similar?

This does not seem like a good solution, and I am worried that so far nobody seems to have run into this issue, except maybe the author of the original inf_dataloader tests. Since these tests were skipped, I suspect that this problem could have existed undetected for some time.

but they were skipped by your PR, right?

I am not happy with the solution and hope that you or someone else can explain what is going on.

Maybe I am missing something; what are you unsure about in this test repair?

How do you feel about merging this and creating a new issue?

yes, that I would do...

cc: @awaelchli

thschaaf · 2020-06-23T17:29:26Z

@Borda Did you have time to look at my previous questions (inlined below)? In particular, the one about creating a new issue.

I hope I'll get to this today; maybe we can talk on Slack?

Sure. I have Slack installed for work. How do I contact you or join the right Slack channel or workspace?

thschaaf · 2020-06-23T17:47:07Z

but they were skipped by your PR, right?

Yes. I just enabled them locally, and the inf_dataloader tests don't take long at all.

awaelchli · 2020-06-24T23:05:20Z

Not so sure about the num_workers = 0 in the test. Sooner or later someone is going to change that and then have to debug these tests. I would rather not set it if we can't explain why it does not work. Maybe it has to do with raising the StopIteration?

awaelchli · 2020-06-26T16:47:17Z

@thschaaf This raises a NotImplementedError now on the master branch, could you have a look?

thschaaf · 2020-06-26T17:11:40Z

@thschaaf This raises a NotImplementedError now on the master branch, could you have a look?

@awaelchli will do.

Borda · 2020-06-26T17:18:52Z

addressing the issue in #2375, @thschaaf pls check it there

Thomas Schaaf added 7 commits June 19, 2020 00:55

deal with NotImplementedError raised by torchtext

0d0dcff

deal with NotImplementedError raised by torchtext

2cbaa3e

Merge branch 'bugfix/_has_len' of https://github.com/thschaaf/pytorch…

490b86a

…-lightning into bugfix/_has_len

Added tests for dataloader which raise NotImplementedError in __len__()

2f40577

Fixed some typos

1e7f096

enabled tests for dataloader raising NotImplementedError in __len__ a…

61c0a83

…nd corrected match string for raised exception

Merge remote-tracking branch 'origin/master' into bugfix/_has_len

8d7523f

mergify bot requested a review from a team June 20, 2020 22:56

Thomas Schaaf added 5 commits June 20, 2020 19:42

deleted empty line for style compliance

975b374

refactored CustomNotImplementedErrorDataloader to derive from CustomI…

89e6d5e

…nfDataloader

enabled reduced number of not_implemented_error dataloader test to re…

e5e68bb

…duce runtime for continuous integration

reduced test number of not_implemented_error dataloader test further …

4db26e6

…to reduce test time

reduced test number of not_implemented_error dataloader test to one t…

5b6413e

…o reduce test time

thschaaf mentioned this pull request Jun 21, 2020

Bugfix/_has_len #2293

Merged

7 tasks

disabled all not_implemented_error dataloader test to see if test pas…

8a04b7c

…s in time

Borda added the bug Something isn't working label Jun 21, 2020

Borda requested changes Jun 21, 2020

mergify bot requested a review from a team June 21, 2020 12:28

Thomas Schaaf added 3 commits June 21, 2020 10:24

added __next__ with a reduced number (5) of elements after which Cust…

19522fc

…omNotImplementedErrorDataloader stops to speedup test.

enabling all not_implemented_error dataloader test

f480072

added brief description of change and relation of torchtext

7e06de0

thschaaf changed the title ~~[wip] Bugfix/_has_len~~ Bugfix/_has_len Jun 22, 2020

CustomNotImplementedErrorDataloader reduced number of batches served …

d495334

…to 2.

awaelchli reviewed Jun 22, 2020

mergify bot requested a review from a team June 22, 2020 15:46

Update CHANGELOG.md

17bcc79

Co-authored-by: Adrian Wälchli <[email protected]>

thschaaf changed the title ~~[wip] Bugfix/_has_len~~ Bugfix/_has_len Jun 23, 2020

Borda reviewed Jun 23, 2020

CHANGELOG.md (outdated)

Borda approved these changes Jun 23, 2020

mergify bot requested a review from a team June 23, 2020 05:54

Borda reviewed Jun 23, 2020

Apply suggestions from code review

413b06d

mergify bot requested a review from a team June 23, 2020 06:09

Merge branch 'master' into bugfix/_has_len

907893e

Merge branch 'master' into bugfix/_has_len

c96cd9b

Apply suggestions from code review

fa695c4

Borda requested review from awaelchli, jeremyjordan, justusschock, SkafteNicki and williamFalcon June 24, 2020 19:15

Borda added the ready PRs ready to be merged label Jun 24, 2020

williamFalcon merged commit 7c0a3f4 into Lightning-AI:master Jun 26, 2020

Borda mentioned this pull request Jun 26, 2020

fix get dataloader size #2375

Merged

7 tasks

thschaaf mentioned this pull request Jun 29, 2020

testing gets stuck when num_workers is set to value >0 in tests/base/model_utilities.py #2421

Closed
