-
Notifications
You must be signed in to change notification settings - Fork 7.2k
[5/n] IPv6 support: Add IPv6 localhost support and eliminate 0.0.0.0 server bindings #59852
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request is a significant and well-executed step towards full IPv6 support in Ray. It systematically replaces hardcoded IPv4 localhost addresses with a dynamic get_localhost_ip() function and eliminates insecure 0.0.0.0 bindings in favor of specific IP address bindings for gRPC servers. The changes are extensive, touching many parts of the codebase in both Python and C++, and appear to be correct and consistent. I have one suggestion for src/ray/util/network_util.cc to improve performance and code consistency by caching the result of get_localhost_ip. Overall, this is a high-quality contribution that improves Ray's networking capabilities and security posture.
89703b9 to
1d303e8
Compare
| rpc_server_(config.grpc_server_name, | ||
| config.grpc_server_port, | ||
| config.node_ip_address == "127.0.0.1", | ||
| config.node_ip_address, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Previously, this argument was listen_to_localhost_only. If it was true, we used 127.0.0.1; if not, we used 0.0.0.0.
After this PR, the argument is bind_address, so we now directly pass config.node_ip_address. If it is localhost, the behavior remains the same; if it is not, we change the binding from 0.0.0.0 to the node IP.
There are many similar changes in this PR following this pattern.
1d303e8 to
7cbcc28
Compare
|
@sampan-s-nayak can you help review pls? |
edoakes
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mostly LGTM, just some minor nits
python/ray/_common/network_utils.py
Outdated
| get_all_interfaces_ip as _get_all_interfaces_ip, | ||
| get_localhost_ip as _get_localhost_ip, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hm what is the point of this?
if you want to "alias" the imports, I believe you can simply do from ray._raylet import get_localhost_ip and others can do from network_utils import get_localhost_ip (it'll be in the global namespace of network_utils)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense, thank you! Fixed here: c8b4bb8
python/ray/_private/ray_constants.py
Outdated
| RAY_DASHBOARD_STARTUP_TIMEOUT_S = env_integer("RAY_DASHBOARD_STARTUP_TIMEOUT_S", 60) | ||
|
|
||
| DEFAULT_DASHBOARD_IP = "127.0.0.1" | ||
| DEFAULT_DASHBOARD_IP = get_localhost_ip() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this file is currently constants & env vars, so adding a dynamic lookup that interacts with the OS/network interfaces doesn't sit well with me. can we replace instances of DEFAULT_DASHBOARD_IP with a call to the util function directly instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Trade-off is losing a single point of change, especially since these are used as function parameter defaults. Same pattern exists for Serve's DEFAULT_HTTP_HOST. That said, fair point - replaced with direct calls:92799f1
efb52e9 to
c8b4bb8
Compare
|
there are some merge conflicts |
Signed-off-by: yicheng <yicheng@anyscale.com>
Signed-off-by: yicheng <yicheng@anyscale.com>
…ip() Signed-off-by: yicheng <yicheng@anyscale.com>
92799f1 to
d6413d4
Compare
edoakes
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Kicked off premerge, lmk when passing
….0.1 in test_reporter_dashboard_and_runtime_env_agent Signed-off-by: yicheng <yicheng@anyscale.com>
…0.0.0.0 server bindings (ray-project#59852)" This reverts commit 60bb749.
…0.0.0.0 server bindings (ray-project#59852)" This reverts commit 60bb749. Signed-off-by: zac <zac@anyscale.com>
…0.0.0.0 server bindings (ray-project#59852)" This reverts commit 60bb749. Signed-off-by: zac <zac@anyscale.com>
…0.0.0.0 server bindings (ray-project#59852)" This reverts commit 60bb749.
…0.0.0.0 server bindings (ray-project#59852)" This reverts commit 60bb749.
#60012) ## Description Reverting #59852 as it causes release test infra to fail. We need to update the infra to jive with the new port discovery settings properly. ## Related issues > Link related issues: "Fixes #1234", "Closes #1234", or "Related to #1234". ## Additional information Example failing build: https://buildkite.com/ray-project/release/builds/74681#
…server bindings (ray-project#59852) ## Description This PR adds IPv6 localhost support and improves server binding security by eliminating 0.0.0.0 bindings. ### goal - Avoid hardcoding 127.0.0.1, which breaks IPv6 support. - Avoid proactively using 0.0.0.0, which is insecure. ##### Server side - For local-only servers, bind to localhost (resolved via GetLocalhostIP()/ get_localhost_ip(); IPv4/IPv6). - For servers that need cross-node communication, bind to the node IP instead of 0.0.0.0. - If the user explicitly configures a bind address, always respect the user setting. ##### Client side - Use localhost when connecting to local-only servers (resolved via get_localhost_ip()). - Use the node IP when connecting to servers that require cross-node communication. #### Note: `0.0.0.0 → node_ip` related changes this PR made: - GCS Server: `0.0.0.0 → node_ip` - Raylet gRPC: `0.0.0.0 → node_ip` - Core Worker gRPC: `0.0.0.0 → node_ip` - Object Manager: `0.0.0.0 → node_ip` - Remote Python Debugger: `0.0.0.0 → node_ip` - Ray Client Server already passed the node IP before this PR, but its default `--host` was 0.0.0.0. This PR changed it to localhost. - Dashboard Server by default binds to localhost. This PR just updated the documentation to suggest using node IP instead of 0.0.0.0 for remote access. - Non-Ray-core components (e.g., Serve): This PR keeps them binding to all interfaces as before, but replaced hardcoded 0.0.0.0 with `get_all_interfaces_ip()` to handle IPv6 (returns :: for IPv6 environments). ## Related issues > Link related issues: "Fixes ray-project#1234", "Closes ray-project#1234", or "Related to ray-project#1234". ## Additional information > Optional: Add implementation details, API changes, usage examples, screenshots, etc. --------- Signed-off-by: yicheng <yicheng@anyscale.com> Co-authored-by: yicheng <yicheng@anyscale.com> Signed-off-by: jasonwrwang <jasonwrwang@tencent.com>
ray-project#60012) ## Description Reverting ray-project#59852 as it causes release test infra to fail. We need to update the infra to jive with the new port discovery settings properly. ## Related issues > Link related issues: "Fixes ray-project#1234", "Closes ray-project#1234", or "Related to ray-project#1234". ## Additional information Example failing build: https://buildkite.com/ray-project/release/builds/74681# Signed-off-by: jasonwrwang <jasonwrwang@tencent.com>
Description
This PR adds IPv6 localhost support and improves server binding security by eliminating 0.0.0.0 bindings.
goal
Server side
Client side
Note:
0.0.0.0 → node_iprelated changes this PR made:0.0.0.0 → node_ip0.0.0.0 → node_ip0.0.0.0 → node_ip0.0.0.0 → node_ip0.0.0.0 → node_ip--hostwas 0.0.0.0. This PR changed it to localhost.get_all_interfaces_ip()to handle IPv6 (returns :: for IPv6 environments).Related issues
Additional information