The HTTP chunk size is too small, causing a bottleneck in download speeds #3143
Are you sure it's not the download server that's limiting your download speed? I just tested a bit with large Danbooru files and couldn't reproduce a slowdown. The chunk size is rather small and there should definitely be an option to configure it, but I don't think that is what's primarily limiting your speeds.
Can't say that I've ever noticed gallery-dl bottlenecking my download rate. The site/server is usually the limiting factor (or a proxy, if used); on many occasions gallery-dl saturated my download bandwidth just fine.
Thanks for the reply, folks. I am having this issue on all sites that I have tested, such as Reddit, Imgur, Gfycat, etc. If it is allowed, I can share a link here. Changing the … PS: I'm using the latest version of the library (gallery-dl==1.23.5).
That might be the reason for your problem, given how "fast" Python is. Anyway, commit bca9f96 adds an option to configure the chunk size and doubles the previous default from 16384 (2**14) to 32768 (2**15).
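For reference, a minimal sketch of how such an option could be set in a gallery-dl config file. This assumes the option follows gallery-dl's usual `downloader.*` key scheme as `chunk-size`; the exact key name and accepted values come from the commit and documentation, not from this thread:

```json
{
    "downloader": {
        "http": {
            "chunk-size": 1048576
        }
    }
}
```

gallery-dl merges this from its standard config locations (e.g. `~/.config/gallery-dl/config.json`).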
@mikf Thanks for adding the option. Meanwhile I profiled the code and narrowed the issue down to this:

```python
>>> import time
>>> def do():
...     t1 = time.perf_counter()
...     for i in range(1000):
...         time.sleep(0.01)
...     t2 = time.perf_counter()
...     dt = t2 - t1
...     print(f'{dt=}', f'exp={0.01*1000}')
...
>>> do()
dt=15.758051600001636 exp=10.0
```

Turns out on my machine every `time.sleep` call overshoots by several milliseconds (a thousand 10 ms sleeps take ~15.8 s instead of 10 s), and with a 1 MiB chunk size the per-chunk sleep is long enough that this overhead barely matters. This is specific to my computer and probably not an issue on most modern machines.
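The measurement above also explains the ~1 MB/s ceiling reported in this issue. A back-of-the-envelope sketch (the 15.6 ms figure is the per-sleep overshoot measured in the snippet above, not a universal constant):

```python
# If every rate-limited loop iteration sleeps, and a sleep can never be
# shorter than one timer quantum, throughput is capped at one chunk per
# quantum regardless of the requested rate.
chunk_size = 16384        # old gallery-dl default, 2**14 bytes
timer_quantum = 0.0156    # seconds; sleep granularity measured above

max_rate = chunk_size / timer_quantum   # bytes-per-second ceiling
print(f"{max_rate / 2**20:.2f} MiB/s")  # prints "1.00 MiB/s"
```

So with 16 KiB chunks and a coarse timer, `-r 3M` can never exceed roughly 1 MiB/s, matching the reported throttling.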
@trodiz I think I found a solution for the inaccurate sleep timing:
```diff
diff --git a/gallery_dl/downloader/http.py b/gallery_dl/downloader/http.py
index 2e7e76e6..69dc4813 100644
--- a/gallery_dl/downloader/http.py
+++ b/gallery_dl/downloader/http.py
@@ -301,7 +301,7 @@ class HttpDownloader(DownloaderBase):
             if elapsed < expected:
                 # sleep if less time elapsed than expected
                 time.sleep(expected - elapsed)
-                t2 = time.time()
+                t2 += expected - elapsed
             t1 = t2
```
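In isolation, the idea of the patch can be sketched as a pacing loop (a hypothetical helper, not gallery-dl's actual code): after sleeping, the reference timestamp is advanced by the expected remainder instead of re-reading the clock, so each sleep's overshoot no longer accumulates into the next interval.

```python
import time

def paced_copy(read, write, chunk_size, rate):
    """Copy data while pacing writes to `rate` bytes/second.

    Hypothetical sketch of the patched timing logic: t2 is advanced
    by the expected remainder rather than set from the clock again,
    so per-sleep overshoot doesn't drift across iterations.
    """
    expected = chunk_size / rate       # seconds one chunk should take
    t1 = time.monotonic()
    total = 0
    while True:
        chunk = read(chunk_size)
        if not chunk:
            break
        write(chunk)
        total += len(chunk)
        t2 = time.monotonic()
        elapsed = t2 - t1
        if elapsed < expected:
            # sleep the remainder, then credit exactly that remainder
            time.sleep(expected - elapsed)
            t2 += expected - elapsed   # the fix from the diff above
        t1 = t2
    return total
```

With the old `t2 = time.time()` variant, a timer that overshoots each sleep by 5 ms would lose 5 ms of budget every single chunk; here the loss is absorbed into the next interval instead.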
Although it eliminated the inaccuracy in the sleep timing, the download rate is still below the configured limit. Here's an example of a file I'm trying to download: …
Here's a solution that should work on all Windows machines:

```python
import ctypes
import platform

if platform.system() == "Windows":
    # NtSetTimerResolution(DesiredResolution, SetResolution, &CurrentResolution)
    # takes the resolution in 100-ns units: 10000 = 1 ms.
    ntdll = ctypes.WinDLL("NTDLL.DLL")
    current = ctypes.c_ulong()
    ntdll.NtSetTimerResolution(10000, True, ctypes.byref(current))
```

This improves the timer resolution to 1 ms or lower, and everything runs as expected.
The variable in question: `gallery_dl/downloader/http.py`, line 30 (at commit 0f9dfb7).
When you set a rate limit, e.g. `-r 3M`, instead of getting a download rate close to the specified value, you get a much lower one. In my case it was throttled at 1 megabyte per second. The very small `chunk_size` creates overhead in each iteration of the for loop in the `_receive_rate` function. I'd recommend the much more commonly used chunk size of 1,048,576 (1 MiB).