Open bind mount sources from the host userns #2576
Force-pushed from 8ec9e25 to 040f486.
This was not a bug but normal kernel behaviour: I had tested without mount flags.
Force-pushed from 040f486 to d5a10e2.
Unit tests seem to fail on the master branch too, so the failure does not seem related to this PR.
Can you please rebase it on top of current master? (The merged #2580 should fix CI.)
Force-pushed from d5a10e2 to 80a3299.
Left some comments.
In general, can we maybe use a single env var to pass all the mount fds at once (instead of introducing a variable for each fd)? That would probably be faster and more resource-efficient.
Force-pushed from 80a3299 to 7c0529f.
@kolyshkin Thanks for the reviews! I addressed all the comments except the one about using a single env var to pass all the mount fds; I added that to the TODO list. The unit tests on Travis still fail after a rebase, but I don't see the error message. Do you know why? If not, I'll continue to investigate...
Force-pushed from e14baed to 54eca43.
Added, with
Force-pushed from 70b28f5 to 165745b.
The unit tests now pass fine. Changes:
libcontainer/factory_linux.go (outdated)
envMountFileFds, err)
}
mountFile := os.NewFile(uintptr(mountFileFd), "mount-file")
defer mountFile.Close()
These defers will never be called because this function ends up calling Execve. There are two things we should do:
- Explicitly set O_CLOEXEC on all of the file descriptors (fcntl(fd, F_SETFD, FD_CLOEXEC) is what you want).
- Close them as soon as we can.
Thanks. I added a patch that does the following:
- O_CLOEXEC is now set in nsexec.c by dup3().
- The fds are closed earlier, in libcontainer/init_linux.go (for the initSetns case) and in libcontainer/standard_init_linux.go (for the initStandard case).
The "defer FOO.Close()" calls in this function follow the same pattern as the code that closes the other fds in case i.Init() somehow fails to run Execve.
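For readers following this thread, here is a minimal Go sketch of the close-on-exec point discussed above. It is not the code from this PR (which sets the flag in nsexec.c via dup3()); it only illustrates that a descriptor created without the flag survives an exec until FD_CLOEXEC is set with F_SETFD. The helper names isCloexec and demoCloexec are made up for this illustration, and the sketch assumes Linux.

```go
package main

import (
	"fmt"
	"syscall"
)

// isCloexec reports whether FD_CLOEXEC is set on fd, using fcntl(F_GETFD).
func isCloexec(fd int) (bool, error) {
	flags, _, errno := syscall.Syscall(syscall.SYS_FCNTL, uintptr(fd), syscall.F_GETFD, 0)
	if errno != 0 {
		return false, errno
	}
	return flags&syscall.FD_CLOEXEC != 0, nil
}

// demoCloexec creates a pipe without O_CLOEXEC, then sets the flag on the
// read end and reports the flag state before and after.
func demoCloexec() (before, after bool, err error) {
	var p [2]int
	if err = syscall.Pipe(p[:]); err != nil {
		return false, false, err
	}
	// Close both ends explicitly; a defer would not run across Execve.
	defer syscall.Close(p[0])
	defer syscall.Close(p[1])

	if before, err = isCloexec(p[0]); err != nil {
		return false, false, err
	}
	// Equivalent to fcntl(fd, F_SETFD, FD_CLOEXEC).
	syscall.CloseOnExec(p[0])
	after, err = isCloexec(p[0])
	return before, after, err
}

func main() {
	before, after, err := demoCloexec()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("close-on-exec before=%v after=%v\n", before, after)
}
```

Setting the flag is a safety net; closing the descriptors as soon as they are no longer needed is still the primary cleanup.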
Force-pushed from 6900793 to d7854e6.
Force-pushed from c87cbd6 to 1413f82.
@kolyshkin no problem, fixed! I pushed several times to see if the DCO check would run (it says "expected", but stays like that for a long time), but that didn't help. PTAL :)
Force-pushed from ebdd748 to 1efa5f2.
Pushing again as
if strings.HasPrefix(tempBase, path) {
// We can't safely change permissions if it is not below tempBase.
if stats.Mode()&0o5 == 0 {
Looks like the check is wrong. Consider the case when only r or x is set.
Haha, I'm embarrassed I missed it 🙈. I was unlucky that it worked in practice, because the dirs didn't have any permissions for others. But thanks for catching this, fixed now!
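To make the reviewer's point concrete, here is a small Go sketch comparing a "any of r/x is set" test with a "both r and x are set" test for the "others" permission bits. It is an illustration only and not necessarily the exact fix that landed in this PR; the function names anyOtherRX and bothOtherRX are made up for this example.

```go
package main

import (
	"fmt"
	"os"
)

// anyOtherRX is true when others have read OR execute permission.
// Using this as "the directory is usable by others" is the kind of
// mistake pointed out above: 0o704 and 0o701 both pass.
func anyOtherRX(mode os.FileMode) bool {
	return mode.Perm()&0o5 != 0
}

// bothOtherRX is true only when others have read AND execute permission.
func bothOtherRX(mode os.FileMode) bool {
	return mode.Perm()&0o5 == 0o5
}

func main() {
	for _, m := range []os.FileMode{0o700, 0o704, 0o701, 0o705} {
		fmt.Printf("%04o any=%v both=%v\n", uint32(m), anyOtherRX(m), bothOtherRX(m))
	}
}
```

With only r (0o704) or only x (0o701) set, the two checks disagree, which is the case the review comment asks to consider.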
continue
}
if stats.Mode()&0o5 != 0 {
ditto
Fixed too, thanks! 🙈
Force-pushed from 1efa5f2 to 04a4822.
@kolyshkin PTAL
Add a unit test to check that bind mounts whose source path has a component that is not accessible by others still work when using user namespaces. To do this, we also modify newRoot() to return rootfs directories that can be traversed by others, so the rootfs it creates works for all tests (whether running in a userns or not).
Signed-off-by: Mauricio Vásquez <[email protected]>
Signed-off-by: Rodrigo Campos <[email protected]>
Co-authored-by: Rodrigo Campos <[email protected]>
Force-pushed from 04a4822 to 8542322.
@kolyshkin Thanks, PTAL. Hopefully this is ready now 🤞
LGTM
Thanks all for the time and review! I think we never added release notes for this. Here they are: Fix using bind mounts when the user in the user namespace doesn't have permission to traverse the mount path (#2484)
runc run test_busybox
[ "$status" -eq 0 ]
}
Doesn't seem to work on Fedora 35: #3258
I mentioned it in several places already, but just in case someone is looking here, the PR fixing this is: #3260
The source of the bind mount might not be accessible in a different user namespace because a component of the source path might not be traversable by the users and groups mapped inside the user namespace. This caused errors such as the following:
To solve this problem, this patch performs the following:
Passing the fds with SCM_RIGHTS is necessary because once the child process is in the container mntns, it is already in the container userns so it cannot temporarily join the host mntns.
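As a rough illustration of the SCM_RIGHTS mechanism mentioned above, the following Go sketch sends a file descriptor over a Unix socket pair and reads through the received copy. It is a sketch under simplifying assumptions, not the code from this patch: both ends live in one process, a pipe stands in for a bind mount source, and the helper names sendFd, recvFd, and roundTrip are invented for the example.

```go
package main

import (
	"errors"
	"fmt"
	"syscall"
)

// sendFd sends fd over the Unix socket sock as SCM_RIGHTS ancillary data.
func sendFd(sock, fd int) error {
	// One byte of ordinary data accompanies the control message.
	return syscall.Sendmsg(sock, []byte{0}, syscall.UnixRights(fd), nil, 0)
}

// recvFd receives a single file descriptor sent with sendFd.
func recvFd(sock int) (int, error) {
	buf := make([]byte, 1)
	oob := make([]byte, syscall.CmsgSpace(4)) // room for one 32-bit fd
	_, oobn, _, _, err := syscall.Recvmsg(sock, buf, oob, 0)
	if err != nil {
		return -1, err
	}
	msgs, err := syscall.ParseSocketControlMessage(oob[:oobn])
	if err != nil {
		return -1, err
	}
	for i := range msgs {
		fds, err := syscall.ParseUnixRights(&msgs[i])
		if err == nil && len(fds) > 0 {
			return fds[0], nil
		}
	}
	return -1, errors.New("no file descriptor received")
}

// roundTrip passes the read end of a pipe through a socket pair and reads
// from the received descriptor.
func roundTrip() (string, error) {
	sp, err := syscall.Socketpair(syscall.AF_UNIX, syscall.SOCK_STREAM, 0)
	if err != nil {
		return "", err
	}
	defer syscall.Close(sp[0])
	defer syscall.Close(sp[1])

	var p [2]int
	if err := syscall.Pipe(p[:]); err != nil {
		return "", err
	}
	defer syscall.Close(p[0])
	defer syscall.Close(p[1])
	if _, err := syscall.Write(p[1], []byte("hello")); err != nil {
		return "", err
	}

	if err := sendFd(sp[0], p[0]); err != nil {
		return "", err
	}
	got, err := recvFd(sp[1])
	if err != nil {
		return "", err
	}
	defer syscall.Close(got)

	buf := make([]byte, 16)
	n, err := syscall.Read(got, buf)
	if err != nil {
		return "", err
	}
	return string(buf[:n]), nil
}

func main() {
	s, err := roundTrip()
	fmt.Println(s, err)
}
```

The received descriptor refers to the same open file description as the sender's, which is why a process that cannot open the path itself can still use it.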
This patch uses the existing mechanism with LIBCONTAINER* environment variables to pass the file descriptors from runc to runc init.
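The review suggestion earlier in this conversation, a single environment variable carrying all the mount fds, could look roughly like the Go sketch below. This is only an assumption about one possible encoding (a JSON array of fd numbers); the variable name _EXAMPLE_MOUNT_FDS and the helpers encodeFds and decodeFds are hypothetical and are not taken from this patch.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
)

// mountFdsEnv is a hypothetical variable name used only in this sketch.
const mountFdsEnv = "_EXAMPLE_MOUNT_FDS"

// encodeFds serializes a list of fd numbers into a single string value.
func encodeFds(fds []int) (string, error) {
	b, err := json.Marshal(fds)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

// decodeFds parses the value produced by encodeFds.
func decodeFds(val string) ([]int, error) {
	var fds []int
	if err := json.Unmarshal([]byte(val), &fds); err != nil {
		return nil, fmt.Errorf("unable to parse %s: %w", mountFdsEnv, err)
	}
	return fds, nil
}

func main() {
	// Parent side: publish the fd numbers the child will inherit.
	val, err := encodeFds([]int{3, 4, 5})
	if err != nil {
		fmt.Println("encode error:", err)
		return
	}
	os.Setenv(mountFdsEnv, val)

	// Child side: recover the list from the environment.
	fds, err := decodeFds(os.Getenv(mountFdsEnv))
	fmt.Println(fds, err)
}
```

One variable keeps the environment small regardless of how many bind mounts the container has, at the cost of a parse step on the receiving side.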
This patch uses the existing mechanism with the Netlink-style bootstrap to pass information about the list of source mounts to nsexec.c.
Fixes: #2484
TODO:
config.json
.