-
Notifications
You must be signed in to change notification settings - Fork 7k
[core] (cgroups 21/n) Do not move dashboard modules into the workers cgroup even if they are drivers #57955
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
cgroup even if they are drivers Signed-off-by: irabbani <[email protected]>
Signed-off-by: irabbani <[email protected]>
Signed-off-by: irabbani <[email protected]>
| def get_pid(self): | ||
| return os.getpid() | ||
|
|
||
| second_driver_proc = create_driver_in_internal_namespace() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I only added the test to ray start because in ray.init, the testing process is a driver that is already in the workers cgroup. Therefore, the newly spawned second driver will also be in the workers cgroup.
| runner.invoke(scripts.stop) | ||
| assert_process_in_not_moved_into_ray_cgroups( | ||
| node_id, resource_isolation_config, second_driver_proc.pid | ||
| ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bug: Test Assertion Fails Due to PID Type Mismatch
The assert_process_in_not_moved_into_ray_cgroups function expects a string PID but receives an integer from second_driver_proc.pid. This type mismatch causes the internal PID comparison to always fail, making the test assertion ineffective and always pass.
edoakes
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. nits
| ) | ||
|
|
||
|
|
||
| def assert_process_in_not_moved_into_ray_cgroups( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
did you forget a word in the name?
| node_id: used to construct the path of the cgroup subtree | ||
| resource_isolation_config: used to construct the path of the cgroup | ||
| subtree | ||
| pid: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
missing
| """ | ||
| import ray | ||
| import time | ||
| ray.init(namespace='_ray_internal_') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
there's a constant for it somewhere you could use, but hard coding is also fine (it shouldn't change)
…cgroup even if they are drivers (#57955) For more details about the resource isolation project see #54703. Driver processes that are registered in ray's internal namespace (such as ray dashboard's job and serve modules) are considered system processes. Therefore, they will not be moved into the workers cgroup when they register with the raylet. --------- Signed-off-by: irabbani <[email protected]>
Cherrypicking #57955 into v2.51.0 Signed-off-by: irabbani <[email protected]>
…cgroup even if they are drivers (ray-project#57955) For more details about the resource isolation project see ray-project#54703. Driver processes that are registered in ray's internal namespace (such as ray dashboard's job and serve modules) are considered system processes. Therefore, they will not be moved into the workers cgroup when they register with the raylet. --------- Signed-off-by: irabbani <[email protected]> Signed-off-by: xgui <[email protected]>
…cgroup even if they are drivers (ray-project#57955) For more details about the resource isolation project see ray-project#54703. Driver processes that are registered in ray's internal namespace (such as ray dashboard's job and serve modules) are considered system processes. Therefore, they will not be moved into the workers cgroup when they register with the raylet. --------- Signed-off-by: irabbani <[email protected]>
…cgroup even if they are drivers (ray-project#57955) For more details about the resource isolation project see ray-project#54703. Driver processes that are registered in ray's internal namespace (such as ray dashboard's job and serve modules) are considered system processes. Therefore, they will not be moved into the workers cgroup when they register with the raylet. --------- Signed-off-by: irabbani <[email protected]> Signed-off-by: Aydin Abiar <[email protected]>
…cgroup even if they are drivers (ray-project#57955) For more details about the resource isolation project see ray-project#54703. Driver processes that are registered in ray's internal namespace (such as ray dashboard's job and serve modules) are considered system processes. Therefore, they will not be moved into the workers cgroup when they register with the raylet. --------- Signed-off-by: irabbani <[email protected]> Signed-off-by: Future-Outlier <[email protected]>
For more details about the resource isolation project see #54703.
Driver processes that are registered in ray's internal namespace (such as ray dashboard's job and serve modules) are considered system processes. Therefore, they will not be moved into the workers cgroup when they register with the raylet.