Skip to content
Closed
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
22 changes: 19 additions & 3 deletions libcontainer/nsenter/nsexec.c
Original file line number Diff line number Diff line change
Expand Up @@ -712,14 +712,24 @@ void nsexec(void)
/*
* Unshare all of the namespaces. Now, it should be noted that this
* ordering might break in the future (especially with rootless
* containers). But for now, it's not possible to split this into
* CLONE_NEWUSER + [the rest] because of some RHEL SELinux issues.
* containers).
*
* CLONE_NEWIPC needs special handling here when CLONE_NEWUSER is
* also specified to ensure that the user namespace root owns
* mqueue. For that case, we unshare CLONE_NEWIPC after
* switching to root user in the user namespace, so we can apply
* the SELinux label to mqueue.
*
* Note that we don't merge this with clone() because there were
* some old kernel versions where clone(CLONE_PARENT | CLONE_NEWPID)
* was broken, so we'll just do it the long way anyway.
*/
if (unshare(config.cloneflags) < 0)
uint32_t apply_cloneflags = config.cloneflags;
if ((config.cloneflags & CLONE_NEWUSER) && (config.cloneflags & CLONE_NEWIPC)) {
apply_cloneflags &= ~CLONE_NEWIPC;
}

if (unshare(apply_cloneflags) < 0)
bail("failed to unshare namespaces");

/*
Expand Down Expand Up @@ -841,6 +851,12 @@ void nsexec(void)
bail("setgroups failed");
}

/* Unshare CLONE_NEWIPC after becoming root in user namespace */
if ((config.cloneflags & CLONE_NEWUSER) && (config.cloneflags & CLONE_NEWIPC)) {
if (unshare(CLONE_NEWIPC) < 0)
bail("unshare ipc failed");
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we have to unshare ipc this late, Can't this be done in "runc:[1:CHILD]" process after unsharing other namespaces? We only need to fork to actually join pid namespace but not user namespace right?

Copy link
Member

@cyphar cyphar Aug 18, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that the reasoning is that you need to have this run after setuid(0) and setgid(0). #975 was meant to make it possible to have this section earlier, by doing setresuid and setresgid immediately after the necessary unshares.

@mrunalp @dqminh Do you mind if I carry this and #975 and make a new PR that combines both?


s = SYNC_CHILD_READY;
if (write(syncfd, &s, sizeof(s)) != sizeof(s))
bail("failed to sync with patent: write(SYNC_CHILD_READY)");
Expand Down