/dev/std* don't work with --userns=auto #1019

Closed
M1cha opened this issue Oct 2, 2022 · 5 comments · Fixed by #1020

Comments

@M1cha

M1cha commented Oct 2, 2022

Is this a BUG REPORT or FEATURE REQUEST? (leave only one on its own line)

/kind bug

Description
When using --userns=auto and NOT allocating a tty, podman uses pipes for stdout and stderr. These work fine when accessed through the file descriptors (1, 2) inherited from the parent, but opening them through /dev/stderr or /dev/stdout fails with an error.

The nginx container, for example, relies on this: its error.log is a symlink to /dev/stderr.
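
The difference between the two access paths can be sketched in Python (used here only for brevity; nothing in this report depends on it). Writing to an inherited descriptor reuses an already-open file, while opening /dev/stdout (which resolves to /proc/self/fd/1) is a fresh open() that is checked against the owner and mode of the pipe inode. The helper name `describe_pipe` is made up for this illustration, and the snippet assumes Linux with /proc mounted.

```python
import os
import stat


def describe_pipe(fd):
    """Return (is_fifo, owner_uid, permission_bits) for the inode behind fd."""
    st = os.fstat(fd)
    return stat.S_ISFIFO(st.st_mode), st.st_uid, stat.S_IMODE(st.st_mode)


read_fd, write_fd = os.pipe()

# Path 1: use the descriptor we already hold. No new permission check happens.
os.write(write_fd, b"hello\n")

# Path 2: reopen the same pipe by path, as `echo hello >> /dev/stdout` does.
# This is a new open() checked against the pipe's owner and mode, so it only
# succeeds when the caller has write permission on the inode.
reopened = os.open(f"/proc/self/fd/{write_fd}", os.O_WRONLY | os.O_APPEND)
os.write(reopened, b"again\n")

is_fifo, owner, mode = describe_pipe(write_fd)
data = os.read(read_fd, 64)
print(is_fifo, owner == os.geteuid(), oct(mode), data)
```

Here both paths succeed because the process that reopens the pipe is also its owner. In the failing case reported above, the process inside the container is not the owner of the pipe, so only path 1 works.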

Steps to reproduce the issue:

  1. podman run --rm --userns=auto:size=65534 alpine sh -c 'echo hello >> /dev/stdout'

Describe the results you received:
/bin/sh: can't create /dev/stdout: Permission denied

Describe the results you expected:
hello

Additional information you deem important (e.g. issue happens only occasionally):

Output of podman version:

Client:       Podman Engine
Version:      4.2.1
API Version:  4.2.1
Go Version:   go1.19
Git Commit:   62b324ddf718411b1d4d0ba8117c632f7f984a38-dirty
Built:        Thu Sep  8 08:52:54 2022
OS/Arch:      linux/amd64

Output of podman info:

host:
  arch: amd64
  buildahVersion: 1.27.0
  cgroupControllers:
  - memory
  - pids
  cgroupManager: systemd
  cgroupVersion: v2
  conmon:
    package: /usr/bin/conmon is owned by conmon 1:2.1.4-1
    path: /usr/bin/conmon
    version: 'conmon version 2.1.4, commit: bd1459a3ffbb13eb552cc9af213e1f56f31ba2ee'
  cpuUtilization:
    idlePercent: 95.2
    systemPercent: 2.1
    userPercent: 2.7
  cpus: 8
  distribution:
    distribution: arch
    version: unknown
  eventLogger: journald
  hostname: m1champc
  idMappings:
    gidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 1000000
      size: 65536
    uidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 1000000
      size: 65536
  kernel: 5.19.10-zen1-1-zen
  linkmode: dynamic
  logDriver: journald
  memFree: 689975296
  memTotal: 16161898496
  networkBackend: netavark
  ociRuntime:
    name: crun
    package: /usr/bin/crun is owned by crun 1.6-1
    path: /usr/bin/crun
    version: |-
      crun version 1.6
      commit: 18cf2efbb8feb2b2f20e316520e0fd0b6c41ef4d
      spec: 1.0.0
      +SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +CRIU +YAJL
  os: linux
  remoteSocket:
    exists: true
    path: /run/user/1000/podman/podman.sock
  security:
    apparmorEnabled: false
    capabilities: CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
    rootless: true
    seccompEnabled: true
    seccompProfilePath: /etc/containers/seccomp.json
    selinuxEnabled: false
  serviceIsRemote: false
  slirp4netns:
    executable: /usr/bin/slirp4netns
    package: /usr/bin/slirp4netns is owned by slirp4netns 1.2.0-1
    version: |-
      slirp4netns version 1.2.0
      commit: 656041d45cfca7a4176f6b7eed9e4fe6c11e8383
      libslirp: 4.7.0
      SLIRP_CONFIG_VERSION_MAX: 4
      libseccomp: 2.5.4
  swapFree: 0
  swapTotal: 0
  uptime: 229h 41m 9.00s (Approximately 9.54 days)
plugins:
  authorization: null
  log:
  - k8s-file
  - none
  - passthrough
  - journald
  network:
  - bridge
  - macvlan
  volume:
  - local
registries: {}
store:
  configFile: /home/m1cha/.config/containers/storage.conf
  containerStore:
    number: 8
    paused: 0
    running: 1
    stopped: 7
  graphDriverName: overlay
  graphOptions: {}
  graphRoot: /home/m1cha/.local/share/containers/storage
  graphRootAllocated: 965891801088
  graphRootUsed: 905064628224
  graphStatus:
    Backing Filesystem: extfs
    Native Overlay Diff: "true"
    Supports d_type: "true"
    Using metacopy: "false"
  imageCopyTmpDir: /var/tmp
  imageStore:
    number: 34
  runRoot: /run/user/1000/containers
  volumePath: /home/m1cha/.local/share/containers/storage/volumes
version:
  APIVersion: 4.2.1
  Built: 1662619974
  BuiltTime: Thu Sep  8 08:52:54 2022
  GitCommit: 62b324ddf718411b1d4d0ba8117c632f7f984a38-dirty
  GoVersion: go1.19
  Os: linux
  OsArch: linux/amd64
  Version: 4.2.1

Package info (e.g. output of rpm -q podman or apt list podman or brew info podman):

4.2.1-1

Have you tested with the latest version of Podman and have you checked the Podman Troubleshooting Guide? (https://github.com/containers/podman/blob/main/troubleshooting.md)

(arch uses the latest version already)
Yes

Additional environment details (AWS, VirtualBox, physical, etc.):

  • physical
  • happens both with and without root
@M1cha
Author

M1cha commented Oct 3, 2022

This is what the file looks like; it uses the special symlink dereference handling for pipes:

# nsenter -t 702369 -m /bin/sh -c '/bin/ls -lah /proc/1/fd/1'
l-wx------    1 1000000  1000000       64 Oct  3 06:23 /proc/1/fd/1 -> pipe:[5284174]

The proc man page also has some information about this:

Note that for file descriptors referring to inodes (pipes
and sockets, see above), those inodes still have
permission bits and ownership information distinct from
those of the /proc/[pid]/fd entry, and that the owner may
differ from the user and group IDs of the process.  An
unprivileged process may lack permissions to open them, as
in this example:

    $ echo test | sudo -u nobody cat
    test
    $ echo test | sudo -u nobody cat /proc/self/fd/0
    cat: /proc/self/fd/0: Permission denied

File descriptor 0 refers to the pipe created by the shell
and owned by that shell's user, which is not nobody, so
cat does not have permission to create a new file
descriptor to read from that inode, even though it can
still read from its existing file descriptor 0.

This leads me to think that whoever creates the pipe creates it with the wrong permissions because the user namespace isn't being considered. Am I right that the pipe gets created by the container runtime? For me that would be crun. If so, this issue should probably be reported to that project instead.

@M1cha
Author

M1cha commented Oct 3, 2022

OK, so I ran the following test (rootless):
podman run --rm --userns=auto:size=65534 busybox sh -c 'sleep infinity'
The PID of sleep was 707087, so I did:

# nsenter -t 707087 -a /bin/sh -c '/bin/echo hello > /proc/1/fd/1'
/bin/sh: can't create /proc/1/fd/1: Permission denied
# nsenter -t 707087 -m /bin/sh -c '/bin/stat -L /proc/1/fd/1'
  File: /proc/1/fd/1
  Size: 0               Blocks: 0          IO Block: 4096   fifo
Device: dh/13d  Inode: 5617919     Links: 1
Access: (0600/prw-------)  Uid: ( 1000/ UNKNOWN)   Gid: ( 1000/ UNKNOWN)
Access: 2022-10-03 07:44:12.110595051 +0000
Modify: 2022-10-03 07:44:12.110595051 +0000
Change: 2022-10-03 07:44:12.110595051 +0000
# nsenter -t 707087 -m /bin/sh -c '/bin/chown 1000000:1000000 /proc/1/fd/1'
# nsenter -t 707087 -a /bin/sh -c '/bin/echo hello > /proc/1/fd/1'

That seems to confirm that the pipe is simply owned by the wrong user. The weird part is that crun should already be doing this: #755
I'll keep looking 👀

@mheon
Member

mheon commented Oct 3, 2022

@giuseppe PTAL

@giuseppe
Member

giuseppe commented Oct 4, 2022

What happens is that the fchown performed by crun fails because the pipe is owned by a user that is not mapped inside the user namespace:

153644 fchown(0, 0, 0)                  = -1 EPERM (Operation not permitted)
153644 fchown(1, 0, 0)                  = -1 EPERM (Operation not permitted)

conmon creates the pipe, so it is a bit more complicated to get a clue about the user namespace there. Let me see if I can solve it in crun.
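
The "not mapped" condition can be illustrated with a small model of user namespace ID translation. The mapping below is an assumption inferred from the outputs earlier in the thread (container root showing up as host ID 1000000, and size=65534 from the reproducer); the authoritative values are in /proc/<pid>/uid_map of the container process. `IdMap` and `host_to_container` are hypothetical names, not part of crun or podman.

```python
from typing import NamedTuple, Optional, Sequence


class IdMap(NamedTuple):
    container_id: int  # first ID inside the namespace
    host_id: int       # first ID outside the namespace
    size: int          # number of consecutive IDs covered


def host_to_container(host_id: int, mappings: Sequence[IdMap]) -> Optional[int]:
    """Translate a host ID into the namespace, or None if it is unmapped."""
    for m in mappings:
        if m.host_id <= host_id < m.host_id + m.size:
            return m.container_id + (host_id - m.host_id)
    return None


# Assumed mapping for illustration only: container 0..65533 -> host 1000000..
auto_userns = [IdMap(container_id=0, host_id=1000000, size=65534)]

pipe_owner = 1000  # host UID that owns the pipe in the stat output above
print(host_to_container(pipe_owner, auto_userns))  # None: owner is unmapped
print(host_to_container(1000000, auto_userns))     # 0: container root
```

Under this assumed mapping the pipe's owner has no ID inside the namespace, which is the situation in which the kernel refuses the fchown with EPERM even for the container's root user.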

@giuseppe giuseppe transferred this issue from containers/podman Oct 4, 2022
giuseppe added a commit to giuseppe/crun that referenced this issue Oct 4, 2022
attempt to chown the std streams file descriptors before joining the
user namespace. The files' owner might not be mapped inside the user
namespace, thus causing the chown inside the user namespace to fail.

Closes: containers#1019

Signed-off-by: Giuseppe Scrivano <[email protected]>
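
The ordering described in this commit message can be sketched as follows. This is not crun's code (crun is written in C), and `chown_std_streams` is a hypothetical helper. The idea is that it would run while the process is still in the parent user namespace, where the pipe's owner is mapped, and would be given the host IDs that container root maps to. The demo uses a fresh pipe and the caller's own IDs so that it runs without privileges.

```python
import os


def chown_std_streams(uid, gid, fds=(0, 1, 2)):
    """Best-effort fchown of each descriptor; return the fds that were changed."""
    changed = []
    for fd in fds:
        try:
            os.fchown(fd, uid, gid)
            changed.append(fd)
        except PermissionError:
            # Not allowed to change this descriptor: leave it as it is, since
            # the stream is still usable through the inherited descriptor.
            pass
    return changed


# Step 1 (still in the parent user namespace): fix up ownership.
read_fd, write_fd = os.pipe()
changed = chown_std_streams(os.geteuid(), os.getegid(), fds=(read_fd, write_fd))

# Step 2 would be joining the container's user namespace; it is omitted here
# because it needs privileges and a real namespace to join.
print(changed, os.fstat(write_fd).st_uid)
```

Doing the chown after joining the namespace is the case shown in the strace output above, where the owner is unmapped and the call returns EPERM.
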
@giuseppe
Member

giuseppe commented Oct 4, 2022

Here is the proposed fix: #1020
