`S3Path` downloads every file it reads to a temp directory by default. This meant we were effectively downloading the entire bucket, and unsurprisingly we ran out of disk space! There was no configurable option for this in v0.7 of cloudpathlib, but later versions support cache clearing. Going with the `close_file` strategy ensures that we clean up each cached file as soon as it has been read into memory and closed. For more, see: https://cloudpathlib.drivendata.org/v0.17/caching/#setting-the-cache-clearing-method
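
For illustration, a minimal sketch of how the `close_file` mode can be selected, assuming a cloudpathlib version that supports cache clearing (as in the linked v0.17 docs). The helper name `read_object` and any bucket URI passed to it are hypothetical and not taken from this commit; the actual change may configure the client differently.

```python
import os

# Option 1 (assumption: set before any cloudpathlib client is created):
# select the cache clearing mode through the environment.
os.environ["CLOUDPATHLIB_FILE_CACHE_MODE"] = "close_file"


def read_object(uri: str) -> bytes:
    """Hypothetical helper: read one S3 object fully into memory.

    Option 2: pass the mode explicitly to the client. With close_file,
    the locally cached copy is removed when the file handle is closed.
    """
    # Imported lazily so the module loads even without cloudpathlib installed.
    from cloudpathlib import S3Client, S3Path
    from cloudpathlib.enums import FileCacheMode

    client = S3Client(file_cache_mode=FileCacheMode.close_file)
    path = S3Path(uri, client=client)
    with path.open("rb") as f:
        return f.read()
```

Either option alone should be enough; the explicit client argument keeps the behaviour visible at the call site, while the environment variable also covers code that relies on the default client.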