[k8s] Large CPU requests cause ray to fail to start #3598

Closed
romilbhardwaj opened this issue May 26, 2024 · 3 comments

@romilbhardwaj
Collaborator

From a user:

Seeing this error consistently on one of the sky-launched k8s pods, but on other k8s clusters it works:

2024-05-25 21:38:32,524 INFO worker.py:1715 -- Connected to Ray cluster. View the dashboard at http://127.0.0.1:36821/
[2024-05-25 21:38:39,655 E 1262748 1262748] core_worker.cc:215: Failed to register worker 01000000ffffffffffffffffffffffffffffffffffffffffffffffff to Raylet. IOError: [RayletClient] Unable to register worker with raylet. No such file or directory

...
it has 256 cores

Looks related to ray-project/ray#30012 (comment)
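A minimal sketch for checking whether the core count is the trigger: start the Ray head with an explicit, smaller `--num-cpus` instead of letting Ray auto-detect all 256 cores. The helper name `ray_start_command` and the cap of 64 are hypothetical and chosen only for illustration; this is not a confirmed fix or a known threshold, just one way to narrow down the cause.

```python
import os
import shlex


def ray_start_command(detected_cpus, cpu_cap=64):
    """Build a `ray start --head` command that advertises at most cpu_cap CPUs.

    cpu_cap=64 is an arbitrary illustrative value (an assumption), not a
    threshold taken from the issue or from Ray's documentation.
    """
    if detected_cpus < 1:
        raise ValueError("detected_cpus must be >= 1")
    num_cpus = min(detected_cpus, cpu_cap)
    # --head and --num-cpus are existing `ray start` options.
    return ["ray", "start", "--head", f"--num-cpus={num_cpus}"]


if __name__ == "__main__":
    # os.cpu_count() may return None, so fall back to 1.
    cpus = os.cpu_count() or 1
    print(shlex.join(ray_start_command(cpus)))
```

If the worker registers successfully with the capped value but fails with the full core count, that would support the link to the upstream Ray issue.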

Contributor

This issue is stale because it has been open 120 days with no activity. Remove stale label or comment or this will be closed in 10 days.

Contributor

This issue is stale because it has been open 120 days with no activity. Remove stale label or comment or this will be closed in 10 days.

github-actions bot added the Stale label Jan 23, 2025
Contributor

github-actions bot commented Feb 2, 2025

This issue was closed because it has been stalled for 10 days with no activity.

github-actions bot closed this as not planned Feb 2, 2025