Nvidia GPU devices in systemd will be ignored after 1.1.3 #3671

Closed
panli889 opened this issue Nov 24, 2022 · 6 comments

panli889 commented Nov 24, 2022

I found PR https://github.com/opencontainers/runc/pull/3504/files, included in 1.1.3, which makes runc ignore devices that cannot be os.Stat'ed. However, the Nvidia GPU is that kind of device, though I do not know why.

Since that version, if we run a command like 'docker run --device /dev/nvidia0:/dev/nvidia0', the DeviceAllow list that systemd receives from runc differs from the device list runc itself applies. So if we then run systemctl daemon-reload, the container can no longer access the device.
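To illustrate the mismatch described above, here is a minimal Go sketch. It is not runc's actual code: `splitByStat`, `statFunc`, and `fakeStat` are hypothetical names, and the stat call is faked so the example runs on a machine without an Nvidia GPU. It only shows the general effect of skipping every device path that fails a stat check: the list handed on (standing in for DeviceAllow) ends up shorter than the list that was requested.

```go
package main

import (
	"fmt"
	"os"
)

// statFunc stands in for the part of os.Stat this sketch cares about:
// whether a path resolves. Hypothetical type, not from runc.
type statFunc func(path string) error

// splitByStat partitions the requested device paths into those that pass
// the stat check and those that fail it. Hypothetical helper, not runc code.
func splitByStat(paths []string, stat statFunc) (kept, skipped []string) {
	for _, p := range paths {
		if err := stat(p); err != nil {
			skipped = append(skipped, p)
			continue
		}
		kept = append(kept, p)
	}
	return kept, skipped
}

// fakeStat returns a statFunc that succeeds only for the listed paths,
// so the sketch behaves the same on any machine.
func fakeStat(existing ...string) statFunc {
	set := make(map[string]bool, len(existing))
	for _, p := range existing {
		set[p] = true
	}
	return func(path string) error {
		if set[path] {
			return nil
		}
		return fmt.Errorf("stat %s: %w", path, os.ErrNotExist)
	}
}

func main() {
	// Assumption for illustration: the stat check fails for the GPU device.
	requested := []string{"/dev/null", "/dev/nvidia0"}
	kept, skipped := splitByStat(requested, fakeStat("/dev/null"))
	fmt.Println("passed on (DeviceAllow stand-in):", kept)
	fmt.Println("dropped:", skipped)
}
```

If one side applies the full requested list and the other only the filtered one, the two views of allowed devices diverge, which would match the symptom reported after systemctl daemon-reload.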

So I'm wondering: is it possible to revert the PR, or is there any other solution?

@panli889 panli889 changed the title Nvidia GPU devices will be ignored after 1.13 Nvidia GPU devices in systemd will be ignored after 1.13 Nov 24, 2022
@AkihiroSuda AkihiroSuda changed the title Nvidia GPU devices in systemd will be ignored after 1.13 Nvidia GPU devices in systemd will be ignored after 1.1.3 Nov 25, 2022
@AkihiroSuda
Member

@cyphar PTAL

@panli889
Author

@cyphar Hi, any progress?

@kolyshkin
Contributor

@panli889 can you please check and let us know if the fix in PR #3620 also fixes your issue?

@panli889
Author

> @panli889 can you please check and let us know if the fix in PR #3620 also fixes your issue?

Thanks for the reply! I've tested it with that code, and it reports the same issue.

For a normal Nvidia GPU device, the device node is at /dev/nvidia0, and common.go now turns it into /dev/char/195:0. If I run os.Stat on that path, it still reports "no such file or directory".
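The check described above can be reproduced with a few lines of Go. This is a sketch for testing on an affected host, not the code from common.go: `charDevPath` and `exists` are hypothetical helpers, and the major/minor pair 195:0 is taken from the path quoted in this comment.

```go
package main

import (
	"fmt"
	"os"
)

// charDevPath builds the /dev/char/MAJOR:MINOR form of a character
// device path. Hypothetical helper for illustration.
func charDevPath(major, minor uint32) string {
	return fmt.Sprintf("/dev/char/%d:%d", major, minor)
}

// exists reports whether os.Stat succeeds on the given path.
func exists(path string) bool {
	_, err := os.Stat(path)
	return err == nil
}

func main() {
	// Compare the device node itself with its /dev/char/MAJOR:MINOR form.
	for _, p := range []string{"/dev/nvidia0", charDevPath(195, 0)} {
		fmt.Printf("%-18s stat ok: %v\n", p, exists(p))
	}
}
```

On a host showing this issue, the first path would be expected to stat successfully while the second does not; the output depends entirely on the machine it runs on.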

@kolyshkin
Contributor

This should be fixed in runc-1.1.7 (by PR #3845), provided systemd >= v240 is used. For older systemd versions, there's nothing we can do.

@panli889 can you please check that this fixes the issue?
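For anyone verifying the systemd >= v240 condition, the sketch below parses a version number from text shaped like the first line of `systemctl --version` output. The sample string and the helper name `parseSystemdVersion` are assumptions for illustration; the exact output format may differ between distributions.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseSystemdVersion extracts the major version from output whose first
// line looks like "systemd 245 (245.4-4ubuntu3.20)". Hypothetical helper;
// the line format is an assumption.
func parseSystemdVersion(out string) (int, error) {
	line, _, _ := strings.Cut(out, "\n")
	fields := strings.Fields(line)
	if len(fields) < 2 || fields[0] != "systemd" {
		return 0, fmt.Errorf("unexpected version line: %q", line)
	}
	// Keep only the leading digits, in case a suffix follows the number.
	digits := fields[1]
	for i, r := range digits {
		if r < '0' || r > '9' {
			digits = digits[:i]
			break
		}
	}
	return strconv.Atoi(digits)
}

func main() {
	// Sample text only; on a real host, feed in the command's actual output.
	sample := "systemd 245 (245.4-4ubuntu3.20)\n+PAM +AUDIT"
	v, err := parseSystemdVersion(sample)
	if err != nil {
		fmt.Println("could not parse:", err)
		return
	}
	fmt.Printf("systemd %d, meets >= 240: %v\n", v, v >= 240)
}
```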

@panli889
Author

panli889 commented May 4, 2023

> This should be fixed in runc-1.1.7 (by PR #3845), provided systemd >= v240 is used. For older systemd versions, there's nothing we can do.
>
> @panli889 can you please check that this fixes the issue?

Hi @kolyshkin, thanks for your efforts! It looks OK from my side; our systemd version is >= 240.
