Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
nsexec: cloned_binary: remove bindfd logic entirely #3931
nsexec: cloned_binary: remove bindfd logic entirely #3931
Changes from all commits
b999376
0d890ad
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
😭 Maybe there is a bug for many years? In some old kernels, we will use
O_TMPFILE
ormkostemp
, but at this time, the runc state dir has not been created yet. So we will got an error in these old kernels:FATA[0000] nsexec[18089]: could not ensure we are a cloned binary: Permission denied
So, we should deal with it.
/tmp
directly;or
getenv("_LIBCONTAINER_STATEDIR")
, but we should create first, and delete it when got an error, I think it's very complex.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
BTW, I think we should split this function to 3 functions: make_memfd, make_o_tmpfile_fd, make_ostemp_fd, it's convenient to write unit test cases.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, that's pretty embarrassing 😅.
The logic for using the state directory is that we are guaranteed it is writable -- if we try to use some other directory it might be mounted as ro or the user might not have write permission. I guess it's okay to use
/tmp
though.It's very odd that this wasn't caught before -- I tested the
O_TMPFILE
logic pretty extensively when I first implemented this code (on some SLES versions we don't have memfds). Some aspect of the statedir code must've changed after we added bindfd (which masked the breakage).Permission denied
seems like a strange error for a missing directory -- I would expectENOENT
instead...There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All of this code is going to be deleted and moved to Go in #3953. I'd prefer to keep this patch as simple as possible, and we can talk about how to break up the implementation in that PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I quite agree with you.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This bug is not relate to this PR, I think this PR could be merged, and open a new PR to fix this error or waiting you fix it in Go code?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll fix the
O_TMPFILE
bug here. Let me take a look...There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@lifubang I can't reproduce the issue you mentioned -- if I delete the memfd code,
ensure_cloned_binary()
still works. If I delete theO_TMPFILE
code too, it also still works.Since you're getting
-EACCES
it seems likely this is an LSM-related issue. Can you open a separate bug report about that, since it seems unlikely this is a common problem. I think we can merge this as-is. WDYT?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, let's merge this and open an new issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Opened here: #3965
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seals seem added in Linux 3.17, but we still need to support kernel 3.10 at least until the EOL of CentOS 7.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(Slightly off-topic, but we have to have an official documentation to clarify the minimum supported kernel version)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
memfd_create()
was added in 3.17, the seals were part of the initial implementation. We handle older kernel versions withO_TMPFILE
(3.11 and later) ormkostemp(3)
(glibc) for ancient kernels.This PR doesn't change any of this behaviour, it just removes bindfd and passes
MFD_EXEC
as well as applying the two newer seals but ignores errors when adding them because they're not necessary. IOW the behaviour on older kernels is unchanged.