
Nested podman ignores error when mounting container root file system and requires --security-opt=seccomp=unconfined in addition to --privileged. #8849

Closed
mgoltzsche opened this issue Dec 28, 2020 · 12 comments · Fixed by #8863
Labels
kind/bug Categorizes issue or PR as related to a bug. locked - please file new issue/PR Assist humans wanting to comment on an old issue or PR with locked comments.

Comments

@mgoltzsche
Contributor

mgoltzsche commented Dec 28, 2020

Is this a BUG REPORT or FEATURE REQUEST? (leave only one on its own line)

/kind bug

Description
When running podman 2.2.1 nested within a podman 2.2.1 container on an Ubuntu 20.04 host, the file system of any inner container is empty and no error is logged.
The problem can be worked around by mounting the storage directory and setting --security-opt=seccomp=unconfined - though this differs from Docker's behaviour, where the --privileged option is sufficient.
More importantly, if not enough privileges have been granted, the mount should fail with a corresponding error instead of proceeding with an empty directory.

This is a follow-up issue of #4056 (comment).

Steps to reproduce the issue:

Run a podman container within a podman container (using this build on Ubuntu 20.04):

$ podman run --rm --privileged -u podman:podman quay.io/podman/stable podman run docker.io/alpine echo hello
Trying to pull docker.io/library/alpine:latest...
Getting image source signatures
Copying blob sha256:801bfaa63ef2094d770c809815b9e2b9c1194728e5e754ef7bc764030e140cea
Copying config sha256:389fef7118515c70fd6c0e0d50bb75669942ea722ccb976507d7b087e54d5a23
Writing manifest to image destination
Storing signatures
Error: executable file `echo` not found in $PATH: No such file or directory: OCI not found

Verify that the file system of any nested container is empty:

$ podman run --rm --privileged -u podman:podman quay.io/podman/stable /bin/sh -c 'ls -la $(podman mount $(podman create docker.io/alpine))'
Trying to pull docker.io/library/alpine:latest...
Getting image source signatures
Copying blob sha256:801bfaa63ef2094d770c809815b9e2b9c1194728e5e754ef7bc764030e140cea
Copying config sha256:389fef7118515c70fd6c0e0d50bb75669942ea722ccb976507d7b087e54d5a23
Writing manifest to image destination
Storing signatures
total 8
drwx------ 2 root root 4096 Dec 28 21:50 .
drwx------ 5 root root 4096 Dec 28 21:50 ..

When the storage directory is mounted into the outer podman container, it seems to get past the previous error and I get another one (in this case it also doesn't matter whether it is run with or without root) - this is related to #4131:

$ podman run -v `pwd`/test-storage:/home/podman/.local/share/containers/storage --rm --privileged -u podman:podman quay.io/podman/stable podman run docker.io/alpine echo hello
...
Error: cannot chown /home/podman/.local/share/containers/storage/overlay/70043fe8d63e411b1640c9b345800377286a26056f3724cde07feece4c27e2b6/merged to 0:0: chown /home/podman/.local/share/containers/storage/overlay/70043fe8d63e411b1640c9b345800377286a26056f3724cde07feece4c27e2b6/merged: operation not permitted

The last error can be solved by providing the option --security-opt seccomp=unconfined - this relates to #4056.

Describe the results you received:
While the outer container works fine, the nested container's file system is empty, but no error is logged about it.

Describe the results you expected:
The nested container's file system should be mounted properly, containing the alpine image's contents.
If that does not work due to system restrictions, podman should fail with a corresponding error, which it currently seems to ignore silently.
Also, it seems --privileged isn't granting enough privileges, since --security-opt seccomp=unconfined is additionally required, whereas with Docker specifying only the --privileged option is sufficient. Ideally this should work the same way with podman.

Additional information you deem important (e.g. issue happens only occasionally):

I am using a custom, static, alpine-based podman build.
The only difference I noticed compared to the binaries within the official podman image quay.io/podman/stable is the fusermount3 version, and changing mine to the version the official image uses (3.9.3) doesn't fix the problem either.

Output of podman version:

$ podman version
Version:      2.2.1
API Version:  2.1.0
Go Version:   go1.14.13
Built:        Thu Jan  1 01:00:00 1970
OS/Arch:      linux/amd64

Output of podman info --debug:

$ podman info --debug
host:
  arch: amd64
  buildahVersion: 1.18.0
  cgroupManager: cgroupfs
  cgroupVersion: v1
  conmon:
    package: Unknown
    path: /usr/libexec/podman/conmon
    version: 'conmon version 2.0.22, commit: 9c34a8663b85e479e0c083801e89a2b2835228ed'
  cpus: 6
  distribution:
    distribution: ubuntu
    version: "20.04"
  eventLogger: file
  hostname: max-desktop
  idMappings:
    gidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 200000
    uidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 200000
  kernel: 5.4.0-58-generic
  linkmode: dynamic
  memFree: 8202813440
  memTotal: 16790204416
  ociRuntime:
    name: runc
    package: 'containerd.io: /usr/bin/runc'
    path: /usr/bin/runc
    version: |-
      runc version 1.0.0-rc10
      commit: dc9208a3303feef5b3839f4323d9beb36df0a9dd
      spec: 1.0.1-dev
  os: linux
  remoteSocket:
    path: /run/user/1000/podman/podman.sock
  rootless: true
  slirp4netns:
    executable: /usr/local/bin/slirp4netns
    package: Unknown
    version: |-
      slirp4netns version 1.1.8
      commit: d361001f495417b880f20329121e3aa431a8f90f
      libslirp: 4.4.0
      SLIRP_CONFIG_VERSION_MAX: 3
      libseccomp: 2.4.3
  swapFree: 16249774080
  swapTotal: 16249774080
  uptime: 4h 4m 8.76s (Approximately 0.17 days)
registries:
  search:
  - docker.io
  - registry.fedoraproject.org
  - registry.access.redhat.com
store:
  configFile: /home/max/.config/containers/storage.conf
  containerStore:
    number: 16
    paused: 0
    running: 0
    stopped: 16
  graphDriverName: overlay
  graphOptions:
    overlay.ignore_chown_errors: "true"
    overlay.mount_program:
      Executable: /usr/local/bin/fuse-overlayfs
      Package: Unknown
      Version: |-
        fuse-overlayfs: version 1.3
        fusermount3 version: 3.9.3
        FUSE library version 3.9.3
        using FUSE kernel interface version 7.31
  graphRoot: /home/max/.local/share/containers/storage
  graphStatus:
    Backing Filesystem: extfs
    Native Overlay Diff: "false"
    Supports d_type: "true"
    Using metacopy: "false"
  imageStore:
    number: 90
  runRoot: /run/user/1000/containers
  volumePath: /home/max/.local/share/containers/storage/volumes
version:
  APIVersion: 2.1.0
  Built: 0
  BuiltTime: Thu Jan  1 01:00:00 1970
  GitCommit: ""
  GoVersion: go1.14.13
  OsArch: linux/amd64
  Version: 2.2.1

Package info (e.g. output of rpm -q podman or apt list podman):

Custom, static, alpine-based podman build

Have you tested with the latest version of Podman and have you checked the Podman Troubleshooting Guide?

Yes

Additional environment details (AWS, VirtualBox, physical, etc.):
Bare metal Ubuntu 20.04 host.

@openshift-ci-robot added the kind/bug label Dec 28, 2020
@mgoltzsche changed the title from "Container file system is empty due to ignored error when running podman within podman" to "Container file system is empty due to ignored error when running podman within podman container" Dec 28, 2020
@mgoltzsche changed the title from "Container file system is empty due to ignored error when running podman within podman container" to "Nested podman ignores error when mounting container root file system and requires --security-opt=seccomp=unconfined in addition to --privileged." Dec 28, 2020
@rhatdan
Member

rhatdan commented Dec 29, 2020

Could you attach the AVCs that you saw?

@mgoltzsche
Contributor Author

mgoltzsche commented Dec 29, 2020

I don't see any. I ran the following after the failing examples listed above:

$ sudo ausearch -m AVC,USER_AVC -ts today
<no matches>

Though on my Ubuntu 20.04 system auditd is configured with the defaults, which seem to log logins only.

However, sudo autrace podman run ... seems to do the trick, but I don't find any AVCs or failed syscalls in its log either, except for a failed prctl syscall that also occurs when podman runs successfully:

$ sudo ausearch -i -p 11679 | grep -E 'SECCOMP'
type=SYSCALL msg=audit(29.12.2020 22:02:11.241:93513) : arch=x86_64 syscall=prctl success=no exit=EFAULT(invalid address) a0=PR_SET_SECCOMP a1=0x2 a2=0x0 a3=0x0 items=0 ppid=11677 pid=11679 auid=max uid=root gid=root euid=root suid=root fsuid=root egid=root sgid=root fsgid=root tty=pts0 ses=3 comm=podman exe=/usr/local/bin/podman key=(null)

I am concluding that an error is skipped here, because it works if I add more options as described above. Also, the fact that --security-opt seccomp=unconfined solves the problem points pretty clearly in that direction.

Is there a particular command you want me to run or an auditd configuration I should apply that could provide more details?
Possibly I need to run the audit trace within the outer podman container, or does autrace follow child processes as well?

@rhatdan
Member

rhatdan commented Dec 30, 2020

I am sorry, I was not paying attention. Yes, it makes sense that seccomp would need to be disabled; we should do this automatically within the podman/stable container. You should not turn on seccomp when running within a container. But since you are running a privileged container, this looks like a bug: we should not be turning on seccomp filtering when running in --privileged mode.

@rhatdan
Member

rhatdan commented Dec 30, 2020

Looks like current podman works correctly.

$ podman -v
podman version 2.2.1
$ podman run -d --privileged fedora sleep 100
0c7ce184d57b5f31afcab18db53c4607fdd4d4822ada4ecec1e6e45a87f2b62b
$ podman top -l seccomp
SECCOMP
disabled

@rhatdan
Member

rhatdan commented Dec 30, 2020

$ podman run fedora grep -i Seccomp /proc/self/status
Seccomp:	2
Seccomp_filters:	1
$ podman run --privileged fedora grep -i Seccomp /proc/self/status
Seccomp:	0
Seccomp_filters:	0

@mgoltzsche
Contributor Author

mgoltzsche commented Dec 30, 2020

This behaves differently on my host:

$ podman -v
podman version 2.2.1
$ podman run -d --privileged fedora sleep 100
7e2136479b8c09433b8ccab1d242d15f8afa004027a60cb807616b18991148b0
$ podman top -l seccomp
SECCOMP
filter

Apparently the --privileged option has no impact on seccomp:

$ podman run fedora grep -i Seccomp /proc/self/status
Seccomp:	2
$ podman run --privileged fedora grep -i Seccomp /proc/self/status
Seccomp:	2

I'll check the tags I've built the podman binary with...

@rhatdan
Member

rhatdan commented Dec 30, 2020

Just to make sure, could you also check those fields on your terminal?

$ grep -i Seccomp /proc/self/status
Seccomp: 0
Seccomp_filters: 0

Just making sure your user account does not have seccomp rules applied.

@mgoltzsche
Contributor Author

mgoltzsche commented Dec 30, 2020

sure:

$ grep -i Seccomp /proc/self/status
Seccomp:	0

When I run the commands above as root I get the same behaviour.

I don't see any relevant difference regarding the build tags I've used.
What else could be causing this?

@rhatdan
Member

rhatdan commented Dec 31, 2020

Very strange.

@rhatdan
Member

rhatdan commented Dec 31, 2020

Here is the code. Did you change the seccomp path in your containers.conf?

	// Clear default Seccomp profile from Generator for unconfined containers
	// and privileged containers which do not specify a seccomp profile.
	if s.SeccompProfilePath == "unconfined" || (s.Privileged && (s.SeccompProfilePath == config.SeccompOverridePath || s.SeccompProfilePath == config.SeccompDefaultPath)) {
		configSpec.Linux.Seccomp = nil
	}

@mgoltzsche
Contributor Author

mgoltzsche commented Jan 1, 2021

hm, no, I don't specify a seccomp profile explicitly. My containers.conf looks as follows:

[engine]
cgroup_manager = "cgroupfs"
events_logger="file"

...and the podman debug log says Loading default seccomp profile.

Apparently s.SeccompProfilePath is empty. I created PR #8863 to fix it.

mgoltzsche added a commit to mgoltzsche/libpod that referenced this issue Jan 2, 2021
When running a privileged container and `SeccompProfilePath` is empty no seccomp profile should be applied.
(Previously this was the case only if `SeccompProfilePath` was set to a non-empty default path.)

Closes containers#8849

Signed-off-by: Max Goltzsche <[email protected]>
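Based on the commit message above, the amended condition might be sketched as follows. This is illustrative only, not the exact merged diff; the spec struct and path constants are stand-ins for podman's real types:

```go
package main

import "fmt"

// Stand-in constants and struct mirroring the snippet quoted earlier
// (hypothetical values, not podman's real config package).
const (
	seccompDefaultPath  = "/usr/share/containers/seccomp.json"
	seccompOverridePath = "/etc/containers/seccomp.json"
)

type spec struct {
	Privileged         bool
	SeccompProfilePath string
}

// clearSeccomp reports whether the default seccomp profile should be
// cleared from the generated OCI spec. The key change described in
// PR #8863: an empty profile path on a privileged container now also
// clears the profile, not just the non-empty default paths.
func clearSeccomp(s spec) bool {
	if s.SeccompProfilePath == "unconfined" {
		return true
	}
	return s.Privileged &&
		(s.SeccompProfilePath == "" || // the fix: empty path counts as default
			s.SeccompProfilePath == seccompOverridePath ||
			s.SeccompProfilePath == seccompDefaultPath)
}

func main() {
	fmt.Println(clearSeccomp(spec{Privileged: true}))                           // true after the fix
	fmt.Println(clearSeccomp(spec{Privileged: true, SeccompProfilePath: "/x"})) // false: custom profile kept
	fmt.Println(clearSeccomp(spec{SeccompProfilePath: "unconfined"}))           // true
}
```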
@mgoltzsche
Contributor Author

mgoltzsche commented Jan 4, 2021

While my PR fixes the seccomp behaviour for cases where the seccomp profile path is empty or set to a non-default path, as I understand the code it would still cause the same buggy behaviour described above if one explicitly specified the default seccomp profile path.

Also, I think an error is still being ignored here somewhere. It should be reproducible after this merge using a --privileged container with an explicitly specified seccomp profile that equals the default one but resides at a non-default path. When attempting podman mount within that container, it would succeed but the directory would be empty, as shown in the description of this issue...

@github-actions bot added the locked - please file new issue/PR label Sep 22, 2023
@github-actions bot locked as resolved and limited conversation to collaborators Sep 22, 2023