-
Notifications
You must be signed in to change notification settings - Fork 2.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix streaming for servers not supporting HTTP range requests #3689
Conversation
Does it mean that huge files might end up being downloaded? It would go against the purpose of streaming, I think. At least, this fallback should be an option that could be disabled |
Yes, it is against the purpose of streaming, but streaming is not possible if the server does not allow HTTP range requests. We have two options: either we download the file or we throw an error. |
I think we simply cannot fallback to downloading the file if streaming fails without the user being aware of it. Some options:
|
IMO in general we should throw an exception and ask the user to not use streaming mode in that case. Your second point is also interesting but I feel like it could be confusing for users sometimes: it doesn't feel natural that the streaming-ability should depend on the size of the file. |
Sure, I think we should just throw an exception |
Current behavior is already throwing an Exception:
We could customize the exception class and/or the exception message. |
I'm not sure we really need to change anything. I opened the issue #3677 because discovery was streamable and is not anymore (according to my test suite in https://github.com/huggingface/datasets-preview-backend): I was not sure if it was due to some regression in the library, or to some change in the dataset itself. |
I'm wondering why it worked before and it is no longer working... |
Yup a message that says that the host doesn't support streaming because it doesn't support HTTP Range requests would be useful ! |
DONE, @lhoestq. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice thanks !
* Download file locally when range request not supported * Revert "Download file locally when range request not supported" This reverts commit 9f90bcb. * Raise custom error when range request not supported
Some servers do not support HTTP range requests, whereas this is required to stream some file formats (like ZIP).
This PR implements a workaround for those cases, by download the files locally in a temporary directory (cleaned up by the OS once the process is finished).This PR raises custom error explaining that streaming is not possible because data host server does not support HTTP range requests.
Fix #3677.