Conversation

@israbbani (Contributor) commented Sep 15, 2025:

This PR stacks on #56352.

For more details about the resource isolation project see #54703.

This PR makes the raylet move system processes into the system cgroup on startup when resource isolation is enabled.

It introduces the following:

  • A new raylet CLI argument, --system-pids, a comma-separated string of PIDs of system processes that are started before the raylet (a sketch of how this string could be assembled is shown after this list). As of today, it contains:
    • On the head node: gcs_server, dashboard_api_server, ray client server, monitor (autoscaler)
    • On every node (including the head node): process subreaper, log monitor
  • End-to-end integration tests for resource isolation with the Ray SDK (ray.init) and the Ray CLI (ray start)
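
As a rough illustration, here is a short Python sketch of how that comma-separated string could be assembled from the processes a node has already started and then passed along to the raylet. The all_processes shape and the build_system_pids_arg helper are assumptions for illustration, not the actual node.py implementation:

```python
def build_system_pids_arg(all_processes):
    """Serialize the PIDs of already-started system processes into the
    comma-separated string passed to the raylet via --system-pids.

    `all_processes` is assumed to be a {component_name: [process_info, ...]}
    mapping where each entry exposes a subprocess.Popen-like `.process` handle.
    That shape is illustrative only, not Ray's exact data structure.
    """
    pids = []
    for _component, process_infos in all_processes.items():
        for info in process_infos:
            # Skip anything that already exited; a dead PID would be stale
            # (or wrong, if the OS reused it) by the time the raylet starts.
            if info.process.poll() is None:
                pids.append(info.process.pid)
    return ",".join(str(pid) for pid in pids)

# Hypothetical usage when building the raylet command line:
#   raylet_cmd.append(f"--system-pids={build_system_pids_arg(node.all_processes)}")
```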

There are a few rough edges (I've added a comment on the PR where relevant):

  1. The construction of ResourceIsolationConfig is spread across multiple call-sites (create the object, add the object store memory, add the system pids). The big upside of doing it this way is failing fast on invalid user input (in scripts.py and worker.py). I think it needs at least two components: the user input (cgroup_path, system_reserved_memory, ...) and the derived input (system_pids, total_system_reserved_memory). A rough sketch of this split follows the list below.
  2. How do we determine which processes should be moved? Right now I'm using self.all_processes in node.py. It should contain all processes started so far, but there's no guarantee.
  3. How intrusive should the integration test be? Should we count the number of pids inside the system cgroup? (This was answered in [core] (cgroups 12/n) Raylet will start worker processes in the application cgroup #56549.)
  4. How should a user set up multiple nodes on the same VM? I haven't written an integration test for it yet because there are multiple options for how to set this up.
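
Below is a minimal sketch of the two-component split proposed in item 1: eagerly validated user input on one side, values derived later during node startup on the other. The class and field names are hypothetical, not the current ResourceIsolationConfig API; they only illustrate the shape the refactor could take:

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class UserIsolationInput:
    # Validated up front (e.g. in scripts.py / worker.py) so bad input fails fast.
    cgroup_path: str
    system_reserved_cpu: float
    system_reserved_memory: int


@dataclass
class DerivedIsolationInput:
    # Filled in later during node startup, once these values are actually known.
    system_pids: List[int] = field(default_factory=list)
    object_store_memory: Optional[int] = None


@dataclass
class ResourceIsolationConfigSketch:
    user: UserIsolationInput
    derived: DerivedIsolationInput = field(default_factory=DerivedIsolationInput)

    @property
    def total_system_reserved_memory(self) -> int:
        # Derived value: the user's reservation plus the object store allocation.
        return self.user.system_reserved_memory + (self.derived.object_store_memory or 0)
```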

israbbani and others added 30 commits July 24, 2025 20:39
@israbbani (Contributor, Author) commented:

@edoakes I addressed all of your comments except for the RAY_CHECK. I'll leave that as a TODO. There are a few more in cgroups 12/n that need to be upgraded too.

@israbbani (Contributor, Author) commented:

I've kicked off MacOS and Windows tests to be extra super duper ultra sure that post-merge won't break. Let's wait for them to pass and I'll ping for merge.

@israbbani (Contributor, Author) commented:

CI post-merge is in bad shape. All failures are unrelated:

  • MacOS C++/Java tests failed due to an unrelated failure in "cpp_tests core c++ and java tests [g16_s8] RAY_INSTALL_JAVA=1 ./ci/ray_ci/macos/macos_ci.sh run_ray_cpp_and_java"
  • MacOS Python test failures are unrelated: "core flaky tests [g16_s9] ./ci/ray_ci/macos/macos_ci.sh run_flaky_tests"
    • //python/ray/tests:test_object_manager_fault_tolerance FAILED in 3 out of 3 in 41.4s
  • Windows CPP tests are failing because FakeRayClient does not compile
  • Python tests are failing due to TIMEOUT: //python/ray/tests:test_reference_counting

@edoakes this should be good to merge.

@edoakes (Collaborator) commented Sep 19, 2025:

CI is a little too red for me to be comfortable merging this; I don't want to get in the habit of force merging. Let's hold off until the test issues are resolved.

@israbbani (Contributor, Author) commented:

Premerge is green. Postmerge-macos is broken for other reasons (#56830). I've merged master and kicked off another premerge.

@edoakes can you merge this if successful?

edoakes enabled auto-merge (squash) September 23, 2025 18:13
github-actions bot disabled auto-merge September 23, 2025 18:29
edoakes merged commit b3860f7 into master Sep 23, 2025 (6 checks passed)
edoakes deleted the irabbani/cgroups-11 branch September 23, 2025 21:52
ZacAttack pushed a commit to ZacAttack/ray that referenced this pull request Sep 24, 2025
…n startup (ray-project#56522)

elliot-barn pushed a commit that referenced this pull request Sep 24, 2025
…n startup (#56522)

edoakes added a commit that referenced this pull request Sep 24, 2025
…cation cgroup (#56549)

This PR stacks on #56522 .

For more details about the resource isolation project see
#54703.

This PR makes the raylet move runtime_env and dashboard agents into
the system cgroup. Workers are now spawned inside the application
cgroup.

It introduces the following:
* I've added a new target `raylet_cgroup_types` which defines the type used by all functions that need to add a process to a cgroup.
* A new parameter is added to `NodeManager`, `WorkerPool`,
`AgentManager`, and `Process` constructors. The parameter is a callback
that will use the CgroupManager to add a process to the respective
cgroup.
* The callback is created in `main.cc`.
* `main.cc` owns CgroupManager because it needs to outlive the
`WorkerPool`.
* `process.c` calls the callback after fork() in the child process so nothing else can happen in the forked process before it's moved into the correct cgroup (see the sketch after this commit message).
* Integration tests in python for end-to-end testing of cgroups with
system and application processes moved into their respective cgroups.
The tests are inside
`python/ray/tests/resource_isolation/test_resource_isolation_integration.py`
and have similar setup/teardown to the C++ integration tests introduced
in #55063.

---------

Signed-off-by: Ibrahim Rabbani <[email protected]>
Co-authored-by: Edward Oakes <[email protected]>
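
To make the fork-then-move ordering above concrete, here is a minimal Python sketch of the idea. The real code path is the raylet's C++ process handling; the cgroup path and the spawn_in_cgroup helper below are assumptions for illustration only:

```python
import os


def spawn_in_cgroup(argv, cgroup_procs_path):
    """Fork, move the child into the given cgroup, then exec the target binary.

    `cgroup_procs_path` is a hypothetical path such as
    /sys/fs/cgroup/ray/application/cgroup.procs; under cgroup v2, writing a PID
    into that file moves the process into the cgroup.
    """
    pid = os.fork()
    if pid == 0:
        # Child: the very first thing we do is move ourselves into the target
        # cgroup, so nothing else runs in this process outside of it.
        with open(cgroup_procs_path, "w") as f:
            f.write(str(os.getpid()))
        os.execv(argv[0], argv)  # replace the child image with the worker binary
    return pid  # parent: return the child's PID


# Hypothetical usage:
#   spawn_in_cgroup(["/usr/bin/python3", "-u", "default_worker.py"],
#                   "/sys/fs/cgroup/ray_node_abc/application/cgroup.procs")
```

Writing the child's own PID into cgroup.procs before exec means the worker binary never executes outside its cgroup, which mirrors the guarantee the commit message describes.
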
marcostephan pushed a commit to marcostephan/ray that referenced this pull request Sep 24, 2025
…n startup (ray-project#56522)

marcostephan pushed a commit to marcostephan/ray that referenced this pull request Sep 24, 2025
…cation cgroup (ray-project#56549)

elliot-barn pushed a commit that referenced this pull request Sep 27, 2025
…n startup (#56522)

elliot-barn pushed a commit that referenced this pull request Sep 27, 2025
…cation cgroup (#56549)

dstrodtman pushed a commit that referenced this pull request Oct 6, 2025
…n startup (#56522)

dstrodtman pushed a commit to dstrodtman/ray that referenced this pull request Oct 6, 2025
…cation cgroup (ray-project#56549)

justinyeh1995 pushed a commit to justinyeh1995/ray that referenced this pull request Oct 20, 2025
…n startup (ray-project#56522)

justinyeh1995 pushed a commit to justinyeh1995/ray that referenced this pull request Oct 20, 2025
…cation cgroup (ray-project#56549)

landscapepainter pushed a commit to landscapepainter/ray that referenced this pull request Nov 17, 2025
…n startup (ray-project#56522)

landscapepainter pushed a commit to landscapepainter/ray that referenced this pull request Nov 17, 2025
…cation cgroup (ray-project#56549)

Labels: core, go
