Commit 26cbc1a

Fix memory exhaustion when downloading large files
Enable streaming for file downloads by passing stream=True to requests. This prevents loading entire files into memory when downloading datasets, competitions, models, and kernel outputs. Fixes #754
1 parent 3f8e54c commit 26cbc1a
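
As a rough illustration of what streaming buys on the consuming side, the sketch below writes a response body to disk one chunk at a time instead of reading it whole. It is not taken from this commit: `save_chunks` and `download_streaming` are hypothetical helper names, and the 1 MiB chunk size and 30-second timeout are arbitrary assumptions. `requests.get(..., stream=True)` and `Response.iter_content` are standard `requests` calls.

```python
from pathlib import Path
from typing import Iterable


def save_chunks(chunks: Iterable[bytes], dest: Path) -> int:
    """Write chunks to dest one at a time and return the total bytes written."""
    total = 0
    with open(dest, "wb") as out:
        for chunk in chunks:
            if chunk:  # skip empty chunks
                out.write(chunk)
                total += len(chunk)
    return total


def download_streaming(url: str, dest: Path, chunk_size: int = 1024 * 1024) -> int:
    """Hypothetical helper: download url to dest without buffering the whole body."""
    import requests  # imported here so save_chunks can be used without requests installed

    # stream=True defers reading the body until it is iterated.
    with requests.get(url, stream=True, timeout=30) as response:
        response.raise_for_status()
        return save_chunks(response.iter_content(chunk_size=chunk_size), dest)
```

Note that `stream=True` only defers reading the body. Memory use stays bounded only if the caller iterates over chunks like this rather than accessing `response.content`, which reads the full body into memory.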

1 file changed: +8 -0 lines changed

src/kagglesdk/kaggle_http_client.py

Lines changed: 8 additions & 0 deletions
@@ -16,6 +16,8 @@
     KaggleEnv,
 )
 from kagglesdk.kaggle_object import KaggleObject
+from kagglesdk.common.types.file_download import FileDownload
+from kagglesdk.common.types.http_redirect import HttpRedirect
 from typing import Type
 
 # TODO (http://b/354237483) Generate the client from the existing one.
@@ -81,6 +83,12 @@ def call(
 
         # Merge environment settings into session
         settings = self._session.merge_environment_settings(http_request.url, {}, None, None, None)
+
+        # Use stream=True for file downloads to avoid loading entire file into memory
+        # See: https://github.com/Kaggle/kaggle-api/issues/754
+        if response_type is not None and (response_type == FileDownload or response_type == HttpRedirect):
+            settings['stream'] = True
+
         http_response = self._session.send(http_request, **settings)
 
         response = self._prepare_response(response_type, http_response)
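
The condition added in the second hunk can be read as a small pure function. The sketch below is a restatement under assumptions, not the SDK's code: the two classes are empty stand-ins for the real `FileDownload` and `HttpRedirect` types, and `build_send_settings` and `STREAMED_RESPONSE_TYPES` are hypothetical names.

```python
from typing import Any, Dict, Optional


class FileDownload:
    """Empty stand-in for kagglesdk's FileDownload type (assumption)."""


class HttpRedirect:
    """Empty stand-in for kagglesdk's HttpRedirect type (assumption)."""


# Response types whose bodies may be large and should not be preloaded.
STREAMED_RESPONSE_TYPES = (FileDownload, HttpRedirect)


def build_send_settings(response_type: Optional[type], base: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of base with stream=True set for download-like response types."""
    settings = dict(base)  # copy so the caller's dict is not mutated
    # None is not in the tuple, so a missing response type never enables streaming.
    if response_type in STREAMED_RESPONSE_TYPES:
        settings["stream"] = True
    return settings
```

A membership test against a tuple behaves the same as the chained `==` comparisons in the diff, including the `None` case, and makes it easy to add another streamed type later.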