[Core] Cannot detect number of cpus inside docker correctly with cpu bursting #27958

Open
jjyao opened this issue Aug 17, 2022 · 1 comment
Labels: bug, core, P3

Comments

jjyao (Collaborator) commented Aug 17, 2022

What happened + What you expected to happen

This is from https://discuss.ray.io/t/ray-init-fails-to-register-workers/6044

The problem is that Ray fails to detect the correct CPU count when CPU bursting is enabled on our OpenShift cluster. If I start a pod with 1 CPU and CPU bursting is not available, /sys/fs/cgroup/cpu/cpu.cfs_quota_us returns the correct quota. With CPU bursting enabled, however, it returns -1, meaning no quota is set.
In that case, I think /sys/fs/cgroup/cpu/cpu.shares contains the “guaranteed” millicores assigned to the pod, but Ray does not currently read that file.
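
To make the suggested fallback concrete, here is a minimal sketch of the idea, not Ray's actual detection code. It assumes cgroup v1 and the usual convention that 1024 shares correspond to one CPU (Kubernetes derives cpu.shares from the CPU request this way). The function name `cgroup_v1_cpu_count` and the constant `SHARES_PER_CPU` are hypothetical, introduced only for illustration.

```python
from pathlib import Path
from typing import Optional

# Default cgroup v1 CPU controller directory, as referenced in this issue.
CGROUP_V1_CPU_DIR = Path("/sys/fs/cgroup/cpu")

# Assumption: 1024 shares represent one full CPU (cgroup v1 convention).
SHARES_PER_CPU = 1024


def _read_int(path: Path) -> Optional[int]:
    """Return the integer stored in a cgroup file, or None if unreadable."""
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        return None


def cgroup_v1_cpu_count(cgroup_dir: Path = CGROUP_V1_CPU_DIR) -> Optional[float]:
    """Estimate the CPUs available to this container (hypothetical helper).

    Prefer the CFS quota. If no quota is set (cpu.cfs_quota_us is -1),
    fall back to cpu.shares. Return None if neither source is usable.
    """
    quota = _read_int(cgroup_dir / "cpu.cfs_quota_us")
    period = _read_int(cgroup_dir / "cpu.cfs_period_us")
    if quota is not None and quota > 0 and period is not None and period > 0:
        return quota / period

    # No hard limit: use the relative weight as an estimate of the
    # guaranteed share of CPU time.
    shares = _read_int(cgroup_dir / "cpu.shares")
    if shares is not None and shares > 0:
        return shares / SHARES_PER_CPU

    return None
```

One caveat with this approach: cpu.shares is a relative weight rather than a limit, and a container with no CPU request may still report a default value, so the result is only an estimate of the guaranteed capacity.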

Versions / Dependencies

master

Reproduction script

See above

Issue Severity

No response

jjyao added the bug, triage, core, and P2 labels and removed the triage label on Aug 17, 2022
zen-xu (Contributor) commented Jun 5, 2024

Is there any progress on this? It is already affecting our use of Ray in a k8s cluster.

jjyao added the P3 label and removed the P2 label on Oct 30, 2024
2 participants