Private/onprem clusters always need explicit ssh_private_key in docker #10838

Open · AmeerHajAli opened this issue Sep 16, 2020 · 7 comments
Labels: bug (Something that is supposed to be working; but isn't), P2 (Important issue, but not time-critical)

AmeerHajAli (Contributor) commented Sep 16, 2020

When starting private clusters without docker, it is not necessary to provide the ssh_private_key, as the user already has authorized access to the nodes. But when docker is added, the head node process inside the container does not have the host's SSH credentials and therefore cannot ssh to the other nodes, so the ssh_private_key needs to be provided explicitly.

I think it is fine to leave it as is but just wanted to bring it up.
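For concreteness, a minimal sketch of the relevant sections of such a cluster YAML (field names follow the Ray cluster launcher schema; the IPs, user, and key path are placeholders):

    # Hypothetical on-prem cluster config; all values are placeholders.
    cluster_name: onprem-docker
    provider:
        type: local
        head_ip: 192.168.0.10
        worker_ips: [192.168.0.11, 192.168.0.12]
    auth:
        ssh_user: ubuntu
        # Often omittable without docker; with docker it must be set explicitly
        # so the head node process inside the container can reach the workers.
        ssh_private_key: ~/.ssh/id_rsa
    docker:
        image: rayproject/ray:latest
        container_name: ray_container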

AmeerHajAli added the bug (Something that is supposed to be working; but isn't) and triage (Needs triage (eg: priority, bug/not-bug, and owning component)) labels on Sep 16, 2020
ijrsvt (Contributor) commented Sep 17, 2020

If the key is available on the host at /path/to/key/key, a quick fix would be to add a docker file mount via docker.run_options, such as

run_options:
  - "-v /path/to/key/key:/root/key"

(note the absolute container path; docker requires absolute paths for bind mounts).

ericl added the P2 (Important issue, but not time-critical) label and removed the triage label on Sep 21, 2020
jmakov (Contributor) commented Sep 12, 2021

> When starting private clusters without docker, it is not necessary to provide the ssh_private_key, as the user already has authorized access to the nodes. But when docker is added, the head node process inside the container does not have the host's SSH credentials and therefore cannot ssh to the other nodes, so the ssh_private_key needs to be provided explicitly.
>
> I think it is fine to leave it as is but just wanted to bring it up.

This should really be documented...

AmeerHajAli (Contributor, Author) commented
@DmitriGekhtman , what do you suggest we do here?

DmitriGekhtman (Contributor) commented
@AmeerHajAli @ijrsvt why is this an issue for on-prem clusters but not for cloud clusters? In both cases, the head needs ssh access to workers.

DmitriGekhtman (Contributor) commented Sep 12, 2021

Oh I see, it's a documentation problem. I think we could add an example-docker.yaml or something with this info to the local cluster examples.

Or modify example-full to use docker. My main hesitation there is that variable-size clusters don't really work with docker right now; also, we currently aren't careful enough to clean up docker state when we're done using Ray on an on-prem node. See #17689.

ijrsvt (Contributor) commented Sep 13, 2021

> why is this an issue for on-prem clusters but not for cloud clusters?

For cloud clusters we auto-insert this field!
AWS:

config["auth"]["ssh_private_key"] = key_path

GCP:
config["auth"]["ssh_private_key"] = private_key_path

DmitriGekhtman (Contributor) commented Sep 13, 2021

Right, we try to auto-configure a key for cloud providers.

But for cloud users who supply a key manually, the situation should be the same as for on-prem users who supply a key?

DmitriGekhtman removed their assignment Nov 19, 2022