-
Notifications
You must be signed in to change notification settings - Fork 2.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
cgroup/systemd: fix making CharDevice path in systemdProperties #3568
Conversation
4330574
to
b9ecab5
Compare
PTAL, thanks! cc @AkihiroSuda @thaJeztah |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left a comment about the implementation, but in general, this feels rather odd to have this exception for these devices, and I wonder if this should be included in runc, being the reference implementation of the OCI runtime spec; does the spec describe anything about this special case?
b9ecab5
to
e28355c
Compare
Thanks, I have modified the code. cat /run/systemd/system/cri-containerd-74e9a65ee73edfecdf345f477e4bfb44a39428243d4a64519cd860fcc0f6901b.scope.d/50-DeviceAllow.conf
[Scope]
DeviceAllow=
DeviceAllow=char-pts rwm
DeviceAllow=/dev/char/10:200 rwm
DeviceAllow=/dev/char/5:2 rwm
DeviceAllow=/dev/char/5:0 rwm
DeviceAllow=/dev/char/1:9 rwm
DeviceAllow=/dev/char/195:0 rw
DeviceAllow=/dev/char/195:1 rw The And if DevicePolicy.conf set cat /run/systemd/system/cri-containerd-74e9a65ee73edfecdf345f477e4bfb44a39428243d4a64519cd860fcc0f6901b.scope.d/50-DevicePolicy.conf
[Scope]
DevicePolicy=strict
cat /sys/fs/cgroup/devices/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-podeeccd2f8_7bef_4054_a659_6554b908432a.slice/cri-containerd-74e9a65ee73edfecdf345f477e4bfb44a39428243d4a64519cd860fcc0f6901b.scope/devices.list
c 136:* rwm
c 5:2 rwm
c 195:* m |
Indeed, not all character devices have /dev/char/MM:mm equivalent for some reason. Here's what I found on my machine (Fedora 36 laptop running kernel 5.17.14-300.fc36.x86_64): [kir@kir-rhat linux]$ ls -lR /dev | grep ^c | awk '{print $10, $5, $6}' | sed -e 's|, |:|' -e 's| | /dev/char/|' | awk '{printf "/dev/" $1 "\t"; system("ls -l " $2);}' 2>&1 | grep cannot
/dev/cuse ls: cannot access '/dev/char/10:203': No such file or directory
/dev/lp0 ls: cannot access '/dev/char/6:0': No such file or directory
/dev/lp1 ls: cannot access '/dev/char/6:1': No such file or directory
/dev/lp2 ls: cannot access '/dev/char/6:2': No such file or directory
/dev/lp3 ls: cannot access '/dev/char/6:3': No such file or directory
/dev/ppp ls: cannot access '/dev/char/108:0': No such file or directory
/dev/uhid ls: cannot access '/dev/char/10:239': No such file or directory
/dev/uinput ls: cannot access '/dev/char/10:223': No such file or directory
/dev/vhci ls: cannot access '/dev/char/10:137': No such file or directory
/dev/vhost-vsock ls: cannot access '/dev/char/10:241': No such file or directory (there might be some more char devices in subdirectories of /dev) For block devices, I haven't found any that does not have a symlink in /dev/block. Perhaps what we should do is to try using device path set in spec, in case /dev/char/MM:mm is not found. WDYT @cyphar |
func systemdProperties will set CharDevice path like /dev/char/0:0, but NVIDIA devices with major 195:* and minor 507:* can not be found in path /dev/char/x:x, getNVIDIAEntryPath will fix this problem. Signed-off-by: yangfeiyu20102011 <[email protected]>
e28355c
to
819cecc
Compare
Now the NVIDIA devices in DeviceAllow.conf are not as expected. This PR can solve some NVIDIA GPU rw problems and it is a improved method at least. |
@yangfeiyu20102011 can you please provide OCI spec example with NVidia devices added? |
cc @kolyshkin
systemd conf: |
@thaJeztah @kolyshkin Hi,is there a better solution for solving this problem? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In addition to being hardcoded, there is no reason to assume that these major and minor numbers will always be associated with NVIDIA devices. The kernel provides basically no guarantees as to what major and minor numbers will be associated with which driver.
At the very least this should be double-checked against /proc/devices
-- but even then, I'm not convinced. This also has the potential to be a security issue -- a container might allow 195:254
access for another driver but then this code would cause runc to allow access to the nvidia-related device files because of this hardcoded check.
Unfortunately the correct way to implement this would've been to have the systemd-style semantics (or some other non-major/minor-based semantics) in the runtime-spec so we could've avoided this whole issue. But obviously it's a bit late for that discussion...
Hmmm. This would work for most device configurations (though not all of them), but we should absolutely double-check that the path on the host has the same major/minor numbers as the rule that references it (otherwise we may end up with a security issue). |
@cyphar Thanks. Is there a plan for solving this issue? I can use this patch in my personal project, but I still hope this problem can be fixed in the latest runc. |
Checking is not an issue. The fact that Now I'm thinking about creating a device file and passing it to systemd; this might be easier and less error prone. |
Any better solution about this issue ? |
The same problem refers to NVIDIA/nvidia-docker#1730 and a fix will be present in the next patch release of all supported NVIDIA GPU drivers. |
This is now being fixed by #3842. |
cgroup/systemd: func systemdProperties will set CharDevice path like /dev/char/0:0,
but NVIDIA devices with major 195:* and minor 507:* can not be found in path /dev/char/x:x,
getNVIDIAEntryPath will fix this problem.
Signed-off-by: yangfeiyu20102011 [email protected]