Avoid killing runc init early #2855
Conversation
Some repo (as configured by GitHub) appears to be broken:
Restarted; got a different but similar error:
Force-pushed from a59f526 to 6c408c9.
@AkihiroSuda @cyphar @lifubang PTAL (please ignore the failed CI -- it is fixed in #2860)
Could you rebase? LGTM then.
Add some minimal validation for cgroups. The following checks are implemented:
- cgroup name and/or prefix (or path) is set;
- for cgroup v1, unified resources are not set;
- for cgroup v2, if memorySwap is set, memory is also set, and memorySwap > memory.

This makes some invalid configurations fail earlier (before runc init is started), which is better.

Signed-off-by: Kir Kolyshkin <[email protected]>
The stars can be aligned in a way that results in runc leaving a stale bind mount in the container's state directory, which manifests itself later, while trying to remove the container, in an error like this:

> remove /run/runc/test2: unlinkat /run/runc/test2/runc.W24K2t: device or resource busy

The stale mount happens because runc start/run/exec kills runc init while it is inside ensure_cloned_binary(). One such scenario is when a unified cgroup resource is specified for cgroup v1, a cgroup manager's Apply returns an error (as of commit b006f4a), and (*initProcess).start() kills runc init just after it was started.

One solution is NOT to kill runc init too early. To achieve that, amend the libcontainer/nsenter code to send a \0 byte to signal that it is past the initial setup, and make start() (for both run/start and exec) wait for this byte before proceeding with kill on an error path.

While at it, improve some error messages.

Signed-off-by: Kir Kolyshkin <[email protected]>
Force-pushed from 6c408c9 to 4ecff8d.
Rebased to include #2860.
LGTM.
Two commits, each solving a problem of a stale bind mount
left by runc init after an unsuccessful container start.
See #2843 for more details
and the initial investigation.
Closes: #2843
start: don't kill runc init too early
The stars can be aligned in a way that results in runc leaving a stale
bind mount in the container's state directory, which manifests itself later,
while trying to remove the container, in an error like this:

> remove /run/runc/test2: unlinkat /run/runc/test2/runc.W24K2t: device or resource busy
The stale mount happens because runc start/run/exec kills runc init
while it is inside ensure_cloned_binary(). One such scenario is when
a unified cgroup resource is specified for cgroup v1, a cgroup manager's
Apply returns an error (as of commit b006f4a), and when
(*initProcess).start() kills runc init just after it was started.
One solution is NOT to kill runc init too early. To achieve that,
amend the libcontainer/nsenter code to send a \0 byte to signal
that it is past the initial setup, and make start() (for both
run/start and exec) wait for this byte before proceeding with
kill on an error path.
While at it, improve some error messages.
libct/configs/validator: add some cgroup support
Add some minimal validation for cgroups. The following checks
are implemented:
- cgroup name and/or prefix (or path) is set;
- for cgroup v1, unified resources are not set;
- for cgroup v2, if memorySwap is set, memory is also set,
  and memorySwap > memory.
This makes some invalid configurations fail earlier (before runc init
is started), which is better, as this should prevent killing runc init
in the middle of ensure_cloned_binary().