Connection error while half-downloading metadata #45
I've reduced the download code to the following:
I still see that, even with resume_download=True, it re-downloads the same files every time after an error.
Same here. Is there any solution?
A temporary workaround is to capture the URLs that would be downloaded and then download them manually. Change download_upstream.py
Find and change the file: find line 1245 and add the print and return statements
Finally, call the downloader
This gives you a list of ~24K URLs to download manually. Now you just need some sort of download utility that can batch-download URLs, and you have the metadata.
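For anyone without a batch-download utility at hand, here is a minimal sketch of one in plain Python. It is not part of download_upstream.py. The function names, the `.part` suffix, and the flat output directory are all assumptions made for illustration. It skips files that already exist, so rerunning it after a connection error only fetches what is still missing.

```python
import os
import time
import urllib.request
from urllib.parse import urlparse


def local_path(url, dest_dir):
    """Map a URL to a path inside dest_dir, using the last URL path component."""
    name = os.path.basename(urlparse(url).path) or "index"
    return os.path.join(dest_dir, name)


def pending(urls, dest_dir):
    """Return the URLs whose target file does not exist yet."""
    return [u for u in urls if not os.path.exists(local_path(u, dest_dir))]


def download_all(urls, dest_dir, retries=3, delay=2.0):
    """Download every pending URL; return the URLs that still failed."""
    os.makedirs(dest_dir, exist_ok=True)
    failed = []
    for url in pending(urls, dest_dir):
        target = local_path(url, dest_dir)
        tmp = target + ".part"  # hypothetical suffix for incomplete files
        for attempt in range(retries):
            try:
                urllib.request.urlretrieve(url, tmp)
                # Rename only after a complete download, so a half-written
                # file is never mistaken for a finished one on the next run.
                os.replace(tmp, target)
                break
            except OSError:  # urllib's URLError/HTTPError derive from OSError
                time.sleep(delay * (attempt + 1))
        else:
            failed.append(url)
    return failed
```

To use it, read the printed URLs into a list (one per line) and call `download_all(urls, "metadata")`, repeating until it returns an empty list. Note that the files are stored by base name only, so two URLs ending in the same file name would collide, and the resulting layout may need adjusting to match whatever directory structure download_upstream.py expects.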
Hello! I'm running the command:
python download_upstream.py --scale medium --data_dir medium --skip_shards
After downloading some files, it stops with the error:
As you can see, there is not much detail in the error message. Could this be caused by some files missing on the server, or is it just a connection problem? If the latter, how can I resume the download? The flag --overwrite_metadata does not seem suitable because it removes all already downloaded files.