Add str method to datamodule #20301
Will add documentation once the code is fixed.
I have been waiting for quite some time. How do I proceed? Do I just wait some more until someone looks at this?
Hey @MrWhatZitToYaa, thanks for taking this up, and sorry for the wait. Can you add a bit more detail on what you think is worth printing from a datamodule? The original design in #9947 was a bit different, as was #9967, but I'm happy to consider alternatives once there has been some discussion of the rationale.
The idea originally came to me when I had to use the DataModule to implement some datasets we have in our lab. It was really annoying that you never really know what you are dealing with, which makes debugging really hard. I talked to some colleagues and they agreed that they had run into the same problem. That's why I would like to add a str functionality for the DataModule: primarily for debugging and logging.
First implementation sketch
Added alternative Boring Data Module implementations
Added test cases for all possible options
Added additional check for NotImplementedError in string function of DataModule
Made changes to comply with requested suggestions
Switched from hardcoded \n to more general os.linesep
Corrected the annotation for the internal function and the list that is supposed to store the information on the datasets
Thanks for the contribution. I aligned it with the latest master and added a comment for a fix that is needed before we can merge. It should be quick.
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@            Coverage Diff            @@
##           master   #20301     +/-   ##
=========================================
- Coverage      88%      79%      -9%
=========================================
  Files         267      264       -3
  Lines       23304    23328      +24
=========================================
- Hits        20407    18365    -2042
- Misses       2897     4963    +2066
Fixed type annotation issue
Reduced code size by using the Sized object from the collections.abc module
for more information, see https://pre-commit.ci
Switched from Dataset-based implementation to DataLoader-based implementation
Added missing size value to tuple in the error case instead of returning only a string
Adjusted test to match the new implementation requirements
Added necessary BoringModules for tests
Fixed bugs and annotation issues in the str method
Refactored code and made it more readable by implementing more abstract function methods
Adjusted tests
Removed debug statements
Removed TODO comments
Renamed variables to more sensible names to increase readability
The string method will now check whether any of the 4 dataloaders are available and try to print information on them. I hope this is how you imagined it to work.
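For readers following along, the check described above can be sketched in plain Python without importing Lightning. This is only an illustrative sketch, not the code from this PR: `probe_dataloader` and `ToyDataModule` are hypothetical names, and it assumes that a hook which was not overridden raises `NotImplementedError` (as the earlier commit message suggests) and that `collections.abc.Sized` is used to decide whether a length is known.

```python
from collections.abc import Sized
from typing import Callable, Optional, Tuple


def probe_dataloader(hook: Callable[[], object]) -> Tuple[bool, Optional[int]]:
    """Return (available, size) for one dataloader hook; size is None when unknown."""
    try:
        loader = hook()
    except NotImplementedError:
        # Assumption: a hook that was not overridden raises NotImplementedError.
        return False, None
    if loader is None:
        return False, None
    try:
        # Iterator-style loaders may not define __len__ at all.
        size = len(loader) if isinstance(loader, Sized) else None
    except TypeError:
        # Some objects define __len__ but raise TypeError when no length exists.
        size = None
    return True, size


class ToyDataModule:
    """Hypothetical stand-in for a data module with four dataloader hooks."""

    def train_dataloader(self):
        return iter(range(10))  # a plain iterator has no __len__

    def val_dataloader(self):
        return list(range(64))

    def test_dataloader(self):
        raise NotImplementedError

    def predict_dataloader(self):
        return list(range(64))


dm = ToyDataModule()
results = {
    name: probe_dataloader(getattr(dm, f"{name}_dataloader"))
    for name in ("train", "val", "test", "predict")
}
```

Here `results` maps each stage to an `(available, size)` tuple, so a missing hook and a loader of unknown length stay distinguishable.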
That's correct. We're almost there. I propose we update the strings from
Train dataset: available=yes, size=unknown
Validation dataset: available=yes, size=64
Test dataset: available=no, size=unknown
Prediction dataset: available=yes, size=64
to
Train dataloader: size=NA
Validation dataloader: size=64
Test dataloader: None
Predict dataloader: size=64
summary:
- dataset -> dataloader
- Prediction -> Predict
- remove "available"
- unknown -> NA
- "available=no" -> None
what do you think?
Switched name from dataset to dataloader
Switched name Prediction to Predict
Removed available keyword and instead write None if not available
Switched from unknown to NA
Alright, sounds good to me.
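The agreed output format can be illustrated with a small, self-contained formatter. This is a hedged sketch rather than the merged implementation: `format_summary`, `LABELS`, and the `(available, size)` input tuples are hypothetical, chosen only to reproduce the four proposed lines. Lines are joined with `os.linesep`, following the earlier commit note.

```python
import os
from typing import Dict, Optional, Tuple

# Hypothetical mapping from stage key to the label used in the proposed output.
LABELS = {"train": "Train", "val": "Validation", "test": "Test", "predict": "Predict"}


def format_summary(results: Dict[str, Tuple[bool, Optional[int]]]) -> str:
    """Render one line per stage: 'None' if unavailable, 'size=NA' if length unknown."""
    lines = []
    for key, label in LABELS.items():
        available, size = results[key]
        if not available:
            lines.append(f"{label} dataloader: None")
        else:
            shown = "NA" if size is None else size
            lines.append(f"{label} dataloader: size={shown}")
    return os.linesep.join(lines)


# Example input mirroring the case discussed in the review.
example = {
    "train": (True, None),
    "val": (True, 64),
    "test": (False, None),
    "predict": (True, 64),
}
summary = format_summary(example)
print(summary)
```

Keeping `None` (no dataloader) separate from `size=NA` (dataloader present, length unknown) preserves the distinction that the older `available=yes/no` wording carried.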
What does this PR do?
Added a str function to the DataModule so that printing it shows information on the datasets included in the module instead of a memory address.
Fixes #9947
📚 Documentation preview 📚: https://pytorch-lightning--20301.org.readthedocs.build/en/20301/