Caching MNIST dataset for testing #917
* Added MNIST dataset to the tests directory
* Caches dataset based off hash of the test.pt file
Is it not possible to download it once and reuse it? Otherwise the package will grow by 100+ MB just for the tests, and most users don't want that.
Thanks for the PR! We can't add a dataset to the framework. @Borda
We are not adding them to the package; it is only used at CI test time, e.g. when running PRs... Let's review the PR =)
I see binaries in the /test folder
OK, I see now: there is no need to add the dataset to git. torchvision checks whether it already has the dataset in the temp directory and downloads it only if it is missing...
This PR should contain only changes to the GitHub Actions config.
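To make the point above concrete, here is a minimal sketch of the check-then-download pattern being described. It is not torchvision's code; `ensure_dataset` and the `download` callable are hypothetical names used only for illustration.

```python
from pathlib import Path
from typing import Callable


def ensure_dataset(root: Path, filename: str, download: Callable[[Path], None]) -> bool:
    """Fetch `filename` into `root` only if it is not already there.

    Returns True if a download was triggered, False if the file was reused.
    """
    target = root / filename
    if target.exists():
        return False  # already present: reuse the file from a previous run
    root.mkdir(parents=True, exist_ok=True)
    download(target)  # missing: fetch it once
    return True
```

If the CI cache restores `root` before the tests start, the first branch is taken and nothing is downloaded, which is why only the CI config needs to change.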
Yeah, there was probably a misunderstanding when we talked about caching lol
@Borda apologies, I misunderstood the request. I'll remove the MNIST files from the directory.
It looks like the caching of the pip files is only being done for Linux. Should I add caching of the locations for Windows and macOS as well?
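For reference, a rough sketch of where pip keeps its cache by default on each OS, which is the set of paths a per-OS cache step would need to cover. `default_pip_cache_dir` is a hypothetical helper, and it deliberately ignores overrides such as `PIP_CACHE_DIR` or `XDG_CACHE_HOME`; running `pip cache dir` reports the location actually in use.

```python
import os
import sys
from pathlib import Path


def default_pip_cache_dir(platform: str = sys.platform) -> Path:
    """Return pip's usual default cache directory for the given platform string."""
    home = Path.home()
    if platform.startswith("win"):
        # Windows: %LocalAppData%\pip\Cache
        local = os.environ.get("LOCALAPPDATA")
        base = Path(local) if local else home / "AppData" / "Local"
        return base / "pip" / "Cache"
    if platform == "darwin":
        # macOS: ~/Library/Caches/pip
        return home / "Library" / "Caches" / "pip"
    # Linux and other Unix-like systems: ~/.cache/pip
    return home / ".cache" / "pip"
```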
@Borda good to go?
@djbyrne can you subclass the MNIST dataset for tests only and add the URLs we have?
@williamFalcon this is just to override the resources list of links to the dataset, right? Is this different from the request to have an MNIST class decoupled from torchvision that @Borda mentioned?
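As an illustration of the override being discussed, the sketch below replaces only a class-level `resources` list in a subclass and inherits everything else. `RemoteDataset` is a hypothetical stand-in, not torchvision's `MNIST`, and every URL and checksum is a placeholder; the exact attribute layout in torchvision may differ between versions.

```python
class RemoteDataset:
    """Hypothetical stand-in for a dataset class that downloads its files."""

    # (url, md5) pairs that the download logic iterates over
    resources = [
        ("https://upstream.example/train.gz", "md5-of-train"),
        ("https://upstream.example/test.gz", "md5-of-test"),
    ]

    def download_plan(self):
        """Return the URLs that would be fetched, in order."""
        return [url for url, _ in self.resources]


class MirroredDataset(RemoteDataset):
    # Only the class attribute changes; the download logic is inherited as-is.
    resources = [
        ("https://mirror.example/train.gz", "md5-of-train"),
        ("https://mirror.example/test.gz", "md5-of-test"),
    ]
```

A test-only subclass like this leaves the upstream class untouched, which is a smaller change than a dataset class fully decoupled from torchvision.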
Pls do not remove the blank lines, then it should be fine for the Linux tests
LGTM 🚀
* Caching MNIST dataset for testing
* Added MNIST dataset to the tests directory
* Caches dataset based off hash of the test.pt file
* Cleaned Up yml file
* Cleaned Up yml file
* Removed MNIST Data from framework
* Set cache key for dataset to 'mnist'
* Apply suggestions from code review
* Apply suggestions from code review

Co-authored-by: Jirka Borovec <[email protected]>
Before submitting
What does this PR do?
Fixes #859.
Caches the MNIST dataset in CI. This prevents the CI/CD tests from crashing when the link to the original dataset is unavailable, and it speeds up CI/CD because the pipeline no longer has to re-download the files on every run.
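The commit history mentions two ways of keying the cache: an earlier revision based the key on a hash of the test.pt file, and a later one set the key to the fixed string 'mnist'. The sketch below shows what a content-hash key could look like; `cache_key` is a hypothetical helper, and the choice of SHA-256 is an assumption, not something taken from the PR.

```python
import hashlib
from pathlib import Path


def cache_key(prefix: str, path: Path, chunk_size: int = 1 << 20) -> str:
    """Build a cache key of the form '<prefix>-<sha256 of file contents>'."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        # Read in chunks so large dataset files are not loaded into memory at once.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return f"{prefix}-{digest.hexdigest()}"
```

A hash-based key invalidates the cache whenever the file changes, but it requires the file to exist before the key can be computed. A fixed key such as 'mnist' avoids that at the cost of never refreshing automatically.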
PR review
Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in GitHub issues, there's a high chance it will not be merged.
Did you have fun?
Make sure you had fun coding 🙃