Better errors from `runc init` #4928

kolyshkin · 2025-10-12T01:11:20Z

~~This currently includes #4930 (and serves as a test for it). Draft until that one is merged.~~

Inspired by the discussion in #4905.

If an early stage of runc init (nsenter) fails for some reason, it
logs the error(s) at the FATAL log level, via bail().

The runc init log is read by a parent (runc create/run/exec) and is
logged via the normal logrus mechanism, which is all fine and dandy, except
that when runc init fails, we return the error from the parent, which is
usually not too helpful. For example:

runc run failed: unable to start container process: can't get final child's PID from pipe: EOF

Now, the actual underlying error is from runc init and it was logged
earlier; here's what the full runc output looks like:

FATA[0000] nsexec-1[3247792]: failed to unshare remaining namespaces: No space left on device
FATA[0000] nsexec-0[3247790]: failed to sync with stage-1: next state: Success
ERRO[0000] runc run failed: unable to start container process: can't get final child's PID from pipe: EOF

The problem is, upper-level runtimes tend to ignore everything except
the last line from runc, and thus the error reported by e.g. docker is not
very helpful.

This patch tries to improve the situation by collecting FATAL errors
from runc init and appending those to the error returned (instead of
logging). With it, the above error will look like this:

ERRO[0000] runc run failed: unable to start container process: can't get final child's PID from pipe: EOF; runc init error(s): nsexec-1[141549]: failed to unshare remaining namespaces: No space left on device; nsexec-0[141547]: failed to sync with stage-1: next state: Success

Yes, it is long and ugly, but at least the upper-level runtime will
report it.
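The collect-and-append idea can be sketched as follows. This is only an illustration in C: the parent side of runc is written in Go and the PR's actual code is not shown here. The function name `append_init_errors` and its signature are hypothetical; only the `; runc init error(s): ` prefix and the `; ` separator are taken from the example output above.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not runc code): append the FATAL messages collected
 * from runc init to the parent's error, so that everything ends up on the
 * one line that upper-level runtimes actually report.
 */
static void append_init_errors(char *out, size_t len, const char *base,
			       const char *const *fatal, size_t n)
{
	if (len == 0)
		return;

	/* Start with the parent's own error. */
	snprintf(out, len, "%s", base);

	for (size_t i = 0; i < n; i++) {
		size_t used = strlen(out);

		if (used + 1 >= len)
			break;	/* buffer is full; the result stays truncated */

		/* The first message gets the prefix, later ones a plain separator. */
		snprintf(out + used, len - used, "%s%s",
			 i == 0 ? "; runc init error(s): " : "; ", fatal[i]);
	}
}
```

With no collected messages the parent error is returned unchanged, so the output differs from today's only when runc init actually logged something fatal.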

In addition, this slightly improves error reporting in nsexec itself
(see commits prefixed with "libct/nsenter:" for details).

Since sane_kill is called after a failed read or write, but before the error from that read or write is reported, it may change the errno value in case kill(2) fails. Save and restore errno around the call to kill.

Signed-off-by: Kir Kolyshkin <[email protected]>
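A minimal sketch of the save/restore pattern this commit message describes. The helper name `kill_preserving_errno` is hypothetical and the body is an assumption about the general shape, not the actual sane_kill from nsexec.c.

```c
#include <errno.h>
#include <signal.h>
#include <sys/types.h>

/*
 * Hypothetical helper, loosely modeled on the sane_kill described above:
 * send a signal, but leave errno exactly as the caller had it.
 */
static void kill_preserving_errno(pid_t pid, int signum)
{
	int saved_errno = errno;	/* error from the failed read/write */

	if (pid > 0)
		kill(pid, signum);	/* may fail and overwrite errno */

	errno = saved_errno;		/* a later "%m" reports the original error */
}
```

Without the restore, a failing kill(2) could replace the errno of the failed read or write with its own, and the subsequent bail would report the wrong cause.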

We use bail to report fatal errors, and bail always appends %m (aka strerror(errno)). If an error condition did not set errno, the log message ends up with ": Success" or with an error from a stale errno value. Either case is confusing for users.

Introduce bailx, which is the same as bail except that it does not append %m, and use it where appropriate. The naming follows libc's err(3) and errx(3).

PS we still use bail in a few cases after read or write, even if that read/write did not return an error, because the code does not distinguish between a short read/write and an error (-1). This will be addressed by the next commit.

Signed-off-by: Kir Kolyshkin <[email protected]>
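To make the bail/bailx difference concrete, here is a small sketch that only formats the message instead of logging it and exiting. `format_fatal` is a hypothetical function written for this illustration; the real macros live in libcontainer/nsenter/log.h and use %m rather than strerror.

```c
#include <errno.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical formatter: builds the text that a bail-style call
 * (with_errno != 0) or a bailx-style call (with_errno == 0) would log.
 */
static void format_fatal(char *buf, size_t len, int with_errno,
			 const char *fmt, ...)
{
	int saved_errno = errno;	/* capture before libc calls can change it */
	va_list ap;
	int n;

	va_start(ap, fmt);
	n = vsnprintf(buf, len, fmt, ap);
	va_end(ap);

	/* bail-style: append the errno description, as ": %m" would. */
	if (with_errno && n >= 0 && (size_t)n < len)
		snprintf(buf + n, len - n, ": %s", strerror(saved_errno));
}
```

Calling it with `with_errno` set while errno is 0 produces a message ending in the C library's description of "no error" (": Success" on glibc), which is exactly the confusing suffix bailx avoids.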

Introduce and use CHECK_IO and CHECK_IO_KILL macros so that we can call either bail or bailx on error, depending on the read/write return value. This prevents the "Success" suffix in errors like:

failed to sync with stage-1: next state: Success

Signed-off-by: Kir Kolyshkin <[email protected]>
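The distinction these macros rely on can be sketched as a plain function. The names `classify_io` and `describe_io` and the wording of the short-I/O message are assumptions made for this example; the actual CHECK_IO macros are not reproduced here.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>

/* Outcome of comparing a read/write return value with the requested count. */
enum io_result { IO_OK, IO_ERROR, IO_SHORT };

static enum io_result classify_io(ssize_t ret, size_t count)
{
	if (ret < 0)
		return IO_ERROR;	/* real failure: errno is meaningful */
	if ((size_t)ret != count)
		return IO_SHORT;	/* short read/write: errno is not set */
	return IO_OK;
}

/* Hypothetical helper: build the message each case would report. */
static enum io_result describe_io(char *buf, size_t len, const char *what,
				  ssize_t ret, size_t count)
{
	int saved_errno = errno;
	enum io_result res = classify_io(ret, count);

	switch (res) {
	case IO_ERROR:	/* bail-style: include the errno description */
		snprintf(buf, len, "%s: %s", what, strerror(saved_errno));
		break;
	case IO_SHORT:	/* bailx-style: report byte counts, no errno */
		snprintf(buf, len, "%s: got %zd byte(s), expected %zu",
			 what, ret, count);
		break;
	case IO_OK:
		if (len > 0)
			buf[0] = '\0';
		break;
	}
	return res;
}
```

A short read of 0 bytes where 4 were expected is reported with byte counts only, so no stale or zero errno leaks into the message.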


lifubang · 2025-10-24T02:05:47Z

libcontainer/nsenter/log.h

 	} while(0)

+/* bailx is the same as bail, except it does not add ": %m" (errno). */
+#define bailx(fmt, ...)                                                     \


Very nice!

I think the first three commits are bug fixes; could you please move them to a separate PR?

lifubang · 2025-10-24T02:08:54Z

libcontainer/nsenter/log.h

 void write_log(int level, const char *format, ...) __attribute__((format(printf, 2, 3)));

 extern int logfd;
 #define bail(fmt, ...)                                               \


Maybe we can add an internal macro to reduce code duplication?
For example:

    /**
     * Write a fatal error message to stderr or logfd.
     *
     * This internal macro handles the common logic of output destination
     * selection based on the value of logfd.
     */
    #define __log_fatal(fmt, ...) \
        do { \
            if (logfd < 0) \
                fprintf(stderr, "FATAL: " fmt "\n", ##__VA_ARGS__); \
            else \
                write_log(FATAL, fmt, ##__VA_ARGS__); \
        } while (0)

    /**
     * Terminate the program with a fatal error message including errno.
     *
     * Use this macro when a system call fails and you want to include
     * the corresponding strerror(errno) message via %m.
     *
     * Example:
     *     if (fork() < 0)
     *         bail("failed to fork");
     */
    #define bail(fmt, ...) \
        do { \
            __log_fatal(fmt ": %m", ##__VA_ARGS__); \
            exit(1); \
        } while (0)

    /**
     * Terminate the program with a fatal error message without errno.
     *
     * Use this macro for configuration errors, programming errors, or any
     * condition not related to errno. This is the same as bail(), except
     * it does not append ": %m" (errno description).
     *
     * Example:
     *     if (!app)
     *         bailx("mapping tool not present");
     */
    #define bailx(fmt, ...) \
        do { \
            __log_fatal(fmt, ##__VA_ARGS__); \
            exit(1); \
        } while (0)

lifubang · 2025-10-24T02:30:40Z

libcontainer/nsenter/nsexec.c

+ * CHECK_IO_KILL is a variant of CHECK_IO that kills PIDs before bailing.
+ * Use this when you need to kill child process(es) on I/O failure.
+ */
+#define CHECK_IO_KILL(op, fd, buf, count, pid1, pid2, ...) \


In my opinion, it would be more readable to replace these two macros with a function. For example:

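    void check_io(ssize_t ret, size_t size, const char *err_msg, pid_t stage1_pid, pid_t stage2_pid)
    {
        if (ret < 0 || (size_t)ret != size) {
            /* Kill the children first; sane_kill is expected to preserve errno. */
            sane_kill(stage1_pid, SIGKILL);
            sane_kill(stage2_pid, SIGKILL);
            /* bail and bailx take a format string literal, so pass err_msg via "%s". */
            if (ret < 0)
                bail("%s", err_msg);
            bailx("%s: %zd byte(s), expected: %zu byte(s)", err_msg, ret, size);
        }
    }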

kolyshkin mentioned this pull request Oct 12, 2025

Error when starting the containers: "can't get final child's PID from pipe" #4905

Open

kolyshkin force-pushed the better-init-errors branch 2 times, most recently from 08fb065 to 0200b76 Compare October 13, 2025 19:01

kolyshkin marked this pull request as draft October 13, 2025 22:41

kolyshkin force-pushed the better-init-errors branch from 0200b76 to af1e5f2 Compare October 13, 2025 23:00

kolyshkin mentioned this pull request Oct 13, 2025

libct: close child fds on prepareCgroupFD error #4930

Merged

kolyshkin marked this pull request as ready for review October 14, 2025 00:05

kolyshkin force-pushed the better-init-errors branch from af1e5f2 to 0871366 Compare October 14, 2025 18:45

kolyshkin marked this pull request as draft October 14, 2025 18:47

cyphar mentioned this pull request Oct 15, 2025

[1.4] libct: close child fds on prepareCgroupFD error #4936

Merged

kolyshkin force-pushed the better-init-errors branch from 0871366 to 8d2e079 Compare October 15, 2025 23:01

kolyshkin marked this pull request as ready for review October 15, 2025 23:02

kolyshkin requested review from AkihiroSuda, cyphar, lifubang and rata and removed request for cyphar October 15, 2025 23:02

kolyshkin force-pushed the better-init-errors branch from 8d2e079 to c735358 Compare October 15, 2025 23:17

rst0git mentioned this pull request Oct 17, 2025

runc tests fail with criu-dev checkpoint-restore/criu#2781

Closed

kolyshkin force-pushed the better-init-errors branch from c735358 to abf4958 Compare October 18, 2025 22:21

kolyshkin added 4 commits October 23, 2025 18:48

kolyshkin force-pushed the better-init-errors branch from abf4958 to ef31851 Compare October 24, 2025 01:48

lifubang reviewed Oct 24, 2025
