[doc build] use rayci.anyscale.dev to fetch doc build cache #57877
Conversation
Code Review
This pull request refactors the documentation build cache fetching mechanism to use a URL endpoint instead of direct S3 access via boto3. This is a good simplification. My review includes two main points: improving an error message to be more accurate and robust, and fixing a potential performance issue in the file download logic by streaming the response to avoid high memory usage. Overall, the changes are in the right direction.
doc/load_doc_cache.py (comment on an outdated diff)
```diff
-        except botocore.exceptions.ClientError as e:
-            print(f"Failed to download {s3_file_path} from S3: {str(e)}")
-            raise e
+        with requests.get(f"{DOC_BUILD_S3_URL}/{commit}.tgz", allow_redirects=True) as response:
```
When downloading files with requests, it's important to use the stream=True parameter in the get request. Without it, the entire file content is loaded into memory at once. For large cache files, this can lead to high memory consumption and potential performance issues. By using stream=True, the response content is streamed, and iter_content can process it in chunks, which is much more memory-efficient.
```diff
- with requests.get(f"{DOC_BUILD_S3_URL}/{commit}.tgz", allow_redirects=True) as response:
+ with requests.get(f"{DOC_BUILD_S3_URL}/{commit}.tgz", allow_redirects=True, stream=True) as response:
```
stop fetching from s3 directly
Signed-off-by: Lonnie Liu <[email protected]>
…ect#57877) so that we are not tied to using public s3 buckets
Signed-off-by: Lonnie Liu <[email protected]>
Signed-off-by: xgui <[email protected]>