rootless: support Google Container-Optimized OS (Fix Options:[rbind ro]}]: operation not permitted errors)
#3097
Interesting! This doesn't actually fix the issue on Bottlerocket; the It's great that it works on GKE and GCOS though. I wonder if it's because the backing directory for @AkihiroSuda any chance you could check your GCOS host (via
Dockerfile has `VOLUME /home/user/.local/share/buildkit` by default too, but the default VOLUME does not work with rootless on Google's Container-Optimized OS as it is mounted with `nosuid,nodev`.

So the volume has to be explicitly mounted as an `emptyDir` volume.

Tested with GKE Autopilot 1.24.3-gke.200 (kernel 5.10.123+, containerd 1.6.6).

Fix issue 879

Thanks to Andrew Grigorev (ei-grad) and Ben Cressey (bcressey).

Signed-off-by: Akihiro Suda <[email protected]>
Thanks for the info 👀, removed Bottlerocket from the PR description.
With emptyDir: /dev/sda1 on /home/user/.local/share/buildkit type ext4 (rw,relatime,commit=30)
$ kubectl exec buildkitd -- mount
W0909 17:28:16.661963 2250 gcp.go:119] WARNING: the gcp auth plugin is deprecated in v1.22+, unavailable in v1.26+; use gcloud instead.
To learn more, consult https://cloud.google.com/blog/products/containers-kubernetes/kubectl-auth-changes-in-gke
overlay on / type overlay (rw,relatime,lowerdir=/var/lib/containerd/io.containerd.snapshotter.v1.overlayfs/snapshots/208/fs:/var/lib/containerd/io.containerd.snapshotter.v1.overlayfs/snapshots/207/fs:/var/lib/containerd/io.containerd.snapshotter.v1.overlayfs/snapshots/206/fs:/var/lib/containerd/io.containerd.snapshotter.v1.overlayfs/snapshots/205/fs:/var/lib/containerd/io.containerd.snapshotter.v1.overlayfs/snapshots/204/fs:/var/lib/containerd/io.containerd.snapshotter.v1.overlayfs/snapshots/203/fs,upperdir=/var/lib/containerd/io.containerd.snapshotter.v1.overlayfs/snapshots/209/fs,workdir=/var/lib/containerd/io.containerd.snapshotter.v1.overlayfs/snapshots/209/work)
proc on /proc type proc (rw,nosuid,nodev,noexec,relatime)
tmpfs on /dev type tmpfs (rw,nosuid,size=65536k,mode=755)
devpts on /dev/pts type devpts (rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=666)
mqueue on /dev/mqueue type mqueue (rw,nosuid,nodev,noexec,relatime)
sysfs on /sys type sysfs (ro,nosuid,nodev,noexec,relatime)
tmpfs on /sys/fs/cgroup type tmpfs (rw,nosuid,nodev,noexec,relatime,mode=755)
cgroup on /sys/fs/cgroup/systemd type cgroup (ro,nosuid,nodev,noexec,relatime,xattr,name=systemd)
cgroup on /sys/fs/cgroup/blkio type cgroup (ro,nosuid,nodev,noexec,relatime,blkio)
cgroup on /sys/fs/cgroup/cpu,cpuacct type cgroup (ro,nosuid,nodev,noexec,relatime,cpu,cpuacct)
cgroup on /sys/fs/cgroup/perf_event type cgroup (ro,nosuid,nodev,noexec,relatime,perf_event)
cgroup on /sys/fs/cgroup/net_cls,net_prio type cgroup (ro,nosuid,nodev,noexec,relatime,net_cls,net_prio)
cgroup on /sys/fs/cgroup/pids type cgroup (ro,nosuid,nodev,noexec,relatime,pids)
cgroup on /sys/fs/cgroup/cpuset type cgroup (ro,nosuid,nodev,noexec,relatime,cpuset)
cgroup on /sys/fs/cgroup/freezer type cgroup (ro,nosuid,nodev,noexec,relatime,freezer)
cgroup on /sys/fs/cgroup/hugetlb type cgroup (ro,nosuid,nodev,noexec,relatime,hugetlb)
cgroup on /sys/fs/cgroup/memory type cgroup (ro,nosuid,nodev,noexec,relatime,memory)
cgroup on /sys/fs/cgroup/devices type cgroup (ro,nosuid,nodev,noexec,relatime,devices)
cgroup on /sys/fs/cgroup/rdma type cgroup (ro,nosuid,nodev,noexec,relatime,rdma)
shm on /dev/shm type tmpfs (rw,nosuid,nodev,noexec,relatime,size=65536k)
/dev/sda1 on /etc/hosts type ext4 (rw,relatime,commit=30)
/dev/sda1 on /dev/termination-log type ext4 (rw,relatime,commit=30)
/dev/sda1 on /etc/hostname type ext4 (rw,nosuid,nodev,relatime,commit=30)
/dev/sda1 on /etc/resolv.conf type ext4 (rw,nosuid,nodev,relatime,commit=30)
tmpfs on /run/secrets/kubernetes.io/serviceaccount type tmpfs (ro,relatime,size=2097152k)
/dev/sda1 on /home/user/.local/share/buildkit type ext4 (rw,relatime,commit=30)
proc on /proc/bus type proc (ro,nosuid,nodev,noexec,relatime)
proc on /proc/fs type proc (ro,nosuid,nodev,noexec,relatime)
proc on /proc/irq type proc (ro,nosuid,nodev,noexec,relatime)
proc on /proc/sys type proc (ro,nosuid,nodev,noexec,relatime)
proc on /proc/sysrq-trigger type proc (ro,nosuid,nodev,noexec,relatime)
tmpfs on /proc/acpi type tmpfs (ro,relatime)
tmpfs on /proc/kcore type tmpfs (rw,nosuid,size=65536k,mode=755)
tmpfs on /proc/keys type tmpfs (rw,nosuid,size=65536k,mode=755)
tmpfs on /proc/timer_list type tmpfs (rw,nosuid,size=65536k,mode=755)
tmpfs on /proc/scsi type tmpfs (ro,relatime)
tmpfs on /sys/firmware type tmpfs (ro,relatime)

Without emptyDir: /dev/sda1 on /home/user/.local/share/buildkit type ext4 (rw,nosuid,nodev,relatime,commit=30)
$ kubectl exec buildkitd-bad -- mount
W0909 17:31:03.192574 2257 gcp.go:119] WARNING: the gcp auth plugin is deprecated in v1.22+, unavailable in v1.26+; use gcloud instead.
To learn more, consult https://cloud.google.com/blog/products/containers-kubernetes/kubectl-auth-changes-in-gke
overlay on / type overlay (rw,relatime,lowerdir=/var/lib/containerd/io.containerd.snapshotter.v1.overlayfs/snapshots/210/fs:/var/lib/containerd/io.containerd.snapshotter.v1.overlayfs/snapshots/209/fs:/var/lib/containerd/io.containerd.snapshotter.v1.overlayfs/snapshots/208/fs:/var/lib/containerd/io.containerd.snapshotter.v1.overlayfs/snapshots/207/fs:/var/lib/containerd/io.containerd.snapshotter.v1.overlayfs/snapshots/206/fs:/var/lib/containerd/io.containerd.snapshotter.v1.overlayfs/snapshots/205/fs,upperdir=/var/lib/containerd/io.containerd.snapshotter.v1.overlayfs/snapshots/211/fs,workdir=/var/lib/containerd/io.containerd.snapshotter.v1.overlayfs/snapshots/211/work)
proc on /proc type proc (rw,nosuid,nodev,noexec,relatime)
tmpfs on /dev type tmpfs (rw,nosuid,size=65536k,mode=755)
devpts on /dev/pts type devpts (rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=666)
mqueue on /dev/mqueue type mqueue (rw,nosuid,nodev,noexec,relatime)
sysfs on /sys type sysfs (ro,nosuid,nodev,noexec,relatime)
tmpfs on /sys/fs/cgroup type tmpfs (rw,nosuid,nodev,noexec,relatime,mode=755)
cgroup on /sys/fs/cgroup/systemd type cgroup (ro,nosuid,nodev,noexec,relatime,xattr,name=systemd)
cgroup on /sys/fs/cgroup/freezer type cgroup (ro,nosuid,nodev,noexec,relatime,freezer)
cgroup on /sys/fs/cgroup/net_cls,net_prio type cgroup (ro,nosuid,nodev,noexec,relatime,net_cls,net_prio)
cgroup on /sys/fs/cgroup/cpu,cpuacct type cgroup (ro,nosuid,nodev,noexec,relatime,cpu,cpuacct)
cgroup on /sys/fs/cgroup/devices type cgroup (ro,nosuid,nodev,noexec,relatime,devices)
cgroup on /sys/fs/cgroup/rdma type cgroup (ro,nosuid,nodev,noexec,relatime,rdma)
cgroup on /sys/fs/cgroup/memory type cgroup (ro,nosuid,nodev,noexec,relatime,memory)
cgroup on /sys/fs/cgroup/cpuset type cgroup (ro,nosuid,nodev,noexec,relatime,cpuset)
cgroup on /sys/fs/cgroup/hugetlb type cgroup (ro,nosuid,nodev,noexec,relatime,hugetlb)
cgroup on /sys/fs/cgroup/blkio type cgroup (ro,nosuid,nodev,noexec,relatime,blkio)
cgroup on /sys/fs/cgroup/perf_event type cgroup (ro,nosuid,nodev,noexec,relatime,perf_event)
cgroup on /sys/fs/cgroup/pids type cgroup (ro,nosuid,nodev,noexec,relatime,pids)
/dev/sda1 on /etc/hosts type ext4 (rw,relatime,commit=30)
/dev/sda1 on /dev/termination-log type ext4 (rw,relatime,commit=30)
/dev/sda1 on /etc/hostname type ext4 (rw,nosuid,nodev,relatime,commit=30)
/dev/sda1 on /etc/resolv.conf type ext4 (rw,nosuid,nodev,relatime,commit=30)
shm on /dev/shm type tmpfs (rw,nosuid,nodev,noexec,relatime,size=65536k)
tmpfs on /run/secrets/kubernetes.io/serviceaccount type tmpfs (ro,relatime,size=2097152k)
/dev/sda1 on /home/user/.local/share/buildkit type ext4 (rw,nosuid,nodev,relatime,commit=30)
proc on /proc/bus type proc (ro,nosuid,nodev,noexec,relatime)
proc on /proc/fs type proc (ro,nosuid,nodev,noexec,relatime)
proc on /proc/irq type proc (ro,nosuid,nodev,noexec,relatime)
proc on /proc/sys type proc (ro,nosuid,nodev,noexec,relatime)
proc on /proc/sysrq-trigger type proc (ro,nosuid,nodev,noexec,relatime)
tmpfs on /proc/acpi type tmpfs (ro,relatime)
tmpfs on /proc/kcore type tmpfs (rw,nosuid,size=65536k,mode=755)
tmpfs on /proc/keys type tmpfs (rw,nosuid,size=65536k,mode=755)
tmpfs on /proc/timer_list type tmpfs (rw,nosuid,size=65536k,mode=755)
tmpfs on /proc/scsi type tmpfs (ro,relatime)
tmpfs on /sys/firmware type tmpfs (ro,relatime)
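
To make the difference between the two dumps above easier to check, here is a minimal Go sketch that pulls the options for one mount point out of `mount`-style output and reports whether `nosuid` or `nodev` is set. The helper names `mountOptions` and `restrictive` are hypothetical, and the parser assumes the plain `SRC on TARGET type FSTYPE (OPTS)` layout with no spaces in paths.

```go
package main

import (
	"fmt"
	"strings"
)

// mountOptions scans `mount`-style output ("SRC on TARGET type FSTYPE (OPTS)")
// and returns the options of the last entry mounted on mountPoint.
// Lines that do not match that layout (e.g. kubectl warnings) are skipped.
func mountOptions(output, mountPoint string) ([]string, bool) {
	var opts []string
	found := false
	for _, line := range strings.Split(output, "\n") {
		fields := strings.Fields(line)
		if len(fields) != 6 || fields[1] != "on" || fields[3] != "type" {
			continue
		}
		if fields[2] != mountPoint {
			continue
		}
		opts = strings.Split(strings.Trim(fields[5], "()"), ",")
		found = true
	}
	return opts, found
}

// restrictive returns the nosuid/nodev options present in opts,
// in the order they appear.
func restrictive(opts []string) []string {
	var out []string
	for _, o := range opts {
		if o == "nosuid" || o == "nodev" {
			out = append(out, o)
		}
	}
	return out
}

func main() {
	const target = "/home/user/.local/share/buildkit"
	samples := map[string]string{
		"with emptyDir":    "/dev/sda1 on " + target + " type ext4 (rw,relatime,commit=30)",
		"without emptyDir": "/dev/sda1 on " + target + " type ext4 (rw,nosuid,nodev,relatime,commit=30)",
	}
	for name, out := range samples {
		opts, ok := mountOptions(out, target)
		fmt.Println(name, ok, restrictive(opts))
	}
}
```

Feeding it the full output of `kubectl exec buildkitd -- mount` should work the same way, since non-matching lines are ignored.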
For the long-term solution, we will have to copy this to somewhere in containerd's mount pkg:
https://github.com/moby/moby/blob/v20.10.17/daemon/oci_linux.go#L420-L470

// Get the set of mount flags that are set on the mount that contains the given
// path and are locked by CL_UNPRIVILEGED. This is necessary to ensure that
// bind-mounting "with options" will not fail with user namespaces, due to
// kernel restrictions that require user namespace mounts to preserve
// CL_UNPRIVILEGED locked flags.
func getUnprivilegedMountFlags(path string) ([]string, error) {
Can we merge this? (w/ docker/buildx#1310)
Dockerfile has `VOLUME /home/user/.local/share/buildkit` by default too, but the default VOLUME does not work with rootless on Google's Container-Optimized OS as it is mounted with `nosuid,nodev`.

So the volume has to be explicitly mounted as an `emptyDir` volume.

Tested with GKE Autopilot 1.24.3-gke.200 (kernel 5.10.123+, containerd 1.6.6).

Fix #879

Thanks to Andrew Grigorev (@ei-grad) and Ben Cressey (@bcressey).