rootless: update docs and examples #5765
Fix issue 5763

- Discourage `--oci-worker-no-process-sandbox`, due to the leakage of the processes (by design). Instead, encourage setting `systempaths=unconfined` in `docker run`. This corresponds to `securityContext.procMount: Unmasked` in Kubernetes, however, the configuration is hard on Kubernetes, as it has to be used in conjunction with `hostUsers: false`.
- Remove `--device /dev/fuse`, as fuse-overlayfs is no longer used typically.
- Use the new Kubernetes struct for AppArmor
- Add a hint about `kernel.apparmor_restrict_unprivileged_userns`
- Remove `$` from command snippets for ease of copypasting
- Make `job.*.yaml` more practical
- Add `*.userns.yaml`. Needs `UserNamespaceSupport` feature gate to be enabled.

Signed-off-by: Akihiro Suda <[email protected]>
tonistiigi
left a comment
I'm ok with the updates but isn't there a way to fix the specific process leak case if we have a reproducer? Even if we can't make it 100% guaranteed for other cases.
Potentially we may use seccomp (or ptrace) to catch

That's an option, but maybe there is something simpler. What about cgroups? I don't know what the exact case is in here.
Not sure

    $ docker run -it --rm --security-opt seccomp=unconfined --security-opt apparmor=unconfined --user ubuntu ubuntu
    ubuntu@552de0932c50:/$ unshare -rmC
    # mount -t cgroup2 none /sys/fs/cgroup
    mount: /sys/fs/cgroup: none already mounted or mount point busy.
    dmesg(1) may have more information after failed mount system call.
    # mount -t tmpfs none /sys/fs/cgroup
    # mount -t cgroup2 none /sys/fs/cgroup
    # mkdir /sys/fs/cgroup/foo
    mkdir: cannot create directory '/sys/fs/cgroup/foo': Permission denied
In containerd v2.1 we will get writable cgroups though:

Can we merge?
Fix #5763

- Discourage `--oci-worker-no-process-sandbox`, due to the leakage of the processes (by design). Instead, encourage setting `systempaths=unconfined` in `docker run`. This corresponds to `securityContext.procMount: Unmasked` in Kubernetes, however, the configuration is hard on Kubernetes, as it has to be used in conjunction with `hostUsers: false`.
- Remove `--device /dev/fuse`, as fuse-overlayfs is no longer used typically.
- Use the new Kubernetes struct for AppArmor
- Add a hint about `kernel.apparmor_restrict_unprivileged_userns`
- Remove `$` from command snippets for ease of copypasting
- Make `job.*.yaml` more practical
- Add `*.userns.yaml`. Needs `UserNamespaceSupport` feature gate to be enabled.

TODO: update buildx to support UserNS mode too
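
For readers who have not opened the diff, the Kubernetes side of the description might look roughly like the sketch below. It is not the `*.userns.yaml` file from this PR. The pod name, image tag, and UID/GID are assumptions chosen for illustration, and the manifest presumes a cluster whose container runtime supports user namespaces and where the relevant feature gates are on (the gate the description calls `UserNamespaceSupport` is, as far as I know, spelled `UserNamespacesSupport` upstream, and `procMount` is additionally guarded by `ProcMountType`).

```yaml
# Illustrative sketch only; not the manifest added by this PR.
apiVersion: v1
kind: Pod
metadata:
  name: buildkitd-userns            # hypothetical name
spec:
  # Run the pod in its own user namespace.
  # The description notes procMount: Unmasked must be combined with this.
  hostUsers: false
  containers:
    - name: buildkitd
      image: moby/buildkit:master-rootless   # assumed image tag
      # Note: no --oci-worker-no-process-sandbox argument here.
      securityContext:
        runAsUser: 1000             # assumed UID of the rootless user
        runAsGroup: 1000
        # Kubernetes counterpart of systempaths=unconfined in docker run.
        procMount: Unmasked
        seccompProfile:
          type: Unconfined
        # Structured AppArmor field, replacing the older annotation.
        appArmorProfile:
          type: Unconfined
```

On plain Docker the corresponding setting would presumably be passed as `--security-opt systempaths=unconfined`, next to the `seccomp=unconfined` and `apparmor=unconfined` options that already appear in the conversation above.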