std::process::unix: Command: Do not unwind past fork(), in child #80263

ijackson · 2020-12-21T15:03:02Z

Unwinding past fork() in the child causes whatever traps the unwind
to return twice. This is very strange and clearly not desirable.

With the default behaviour of the thread library, this can even result
in a panic in the child being transformed into zero exit status (ie,
success) as seen in the parent!

If unwinding reaches the fork point, the child should abort.

rust-highfive · 2020-12-21T15:03:05Z

r? @m-ou-se

(rust-highfive has picked a reviewer for you, use r? to override)

ijackson · 2020-12-21T15:03:43Z

@rustbot modify labels +T-libs +A-runtime

cuviper · 2020-12-21T18:17:02Z

I don't think the current behavior is strange, or at least no more than fork itself is strange. You've created a new process as a copy of the first, and it could certainly return from the fork point, so why wouldn't unwinding do that too? Think of a forking daemon, for example.

cuviper · 2020-12-21T18:21:10Z

Oh, but you're only changing the fork in Command -- that's much more specific than your PR description implies. I thought you meant all forks.

ijackson · 2020-12-21T18:22:39Z

Josh Stone writes ("Re: [rust-lang/rust] std::process::unix: Do not unwind past fork(), in child (#80263)"):

> I don't think the current behavior is strange, or at least no more than fork itself is strange. You've created a new process as a copy of the first, and it could certainly return from the fork point, so why wouldn't unwinding do that too? Think of a forking daemon, for example.

I think you must have misunderstood the context. The code I am changing here is part of the implementation of the (portable) `Command` facility. The user who is using Command should not expect panics (whether from some bug in Command, or some pre_exec hook) to have this behaviour. I agree that a user who was using raw libc::fork might well want to unwind past fork in the child. But `Command` is not `libc::fork`.

Considering your daemon example: `Command` can't (sensibly) be used for trad unix daemonisation. Someone who wants to do trad unix daemonisation is well-served by libc (and perhaps higher-level but still non-portable facilities in non-std crates).

Ian.

…

-- Ian Jackson <[email protected]> These opinions are my own. Pronouns: they/he. If I emailed you from @fyvzl.net or @evade.org.uk, that is a private address which bypasses my fierce spamfilter.

ijackson · 2020-12-21T18:23:17Z

Josh Stone writes ("Re: [rust-lang/rust] std::process::unix: Do not unwind past fork(), in child (#80263)"):

> Oh, but you're only changing the fork in Command -- that's much more specific than your PR description implies.

Ah, yes. Our messages crossed. There is no other call to libc::fork in std.

ijackson · 2020-12-21T18:29:41Z

I edited the title to clarify the scope. Thanks for your attention.

m-ou-se

Thanks for working on this!

library/std/src/sys/unix/process/process_unix/tests.rs

m-ou-se · 2020-12-21T19:50:46Z

library/std/src/sys/unix/process/process_unix.rs

@@ -53,7 +57,8 @@ impl Command {

        let pid = unsafe {
            match result {
-                0 => {
+                0 => (#[unwind(aborts)]


This would be the first usage of #[unwind(aborts)] outside a test, and the first usage of #[unwind] on a closure. Not sure if this is stable enough.

(I believe this attribute was originally only meant for extern functions.)

@Mark-Simulacrum Do you know, or do you know who would know?

ijackson · 2021-01-04

Thanks for the review.

In defence of using this here: my MR adds a test case which exercises this panic -> unwind -> abort path, and of course there is no non-abort return path because the closure ends with _exit (and returns `!`).

If #[unwind(aborts)] is not stable enough, should I open-code something with catch_unwind? It seemed to me that using a built-in library facility (even an unstable one) for this was better than an ad-hoc reimplementation of the same functionality.

ijackson · 2021-01-07T23:09:05Z

@rustbot modify labels -S-waiting-on-author +S-waiting-on-review

m-ou-se · 2021-01-25T18:18:41Z

r? @Mark-Simulacrum to validate this usage of #[unwind(aborts)]. See above.

Mark-Simulacrum · 2021-01-26T14:19:41Z

Yes, catch_unwind is probably the better option. I'm pretty sure the unwind attribute in theory should work, but I wouldn't want to rely on it in this context personally. catch_unwind should still compile to similar or identical assembly as the unwind attr.
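
As a rough illustration of the catch_unwind option discussed here, the following is a minimal sketch and not the code from this PR. The helper name `run_or_abort` is hypothetical. It runs a closure and aborts the process if the closure panics, so the unwind never travels past the wrapping frame.

```rust
use std::panic::{catch_unwind, UnwindSafe};
use std::process;

/// Runs `f`; if `f` panics, aborts the process instead of letting the
/// unwind continue into the caller. (`run_or_abort` is a hypothetical name.)
fn run_or_abort<F, R>(f: F) -> R
where
    F: FnOnce() -> R + UnwindSafe,
{
    match catch_unwind(f) {
        Ok(value) => value,
        // abort() never returns, so the unwind stops at this frame.
        Err(_payload) => process::abort(),
    }
}
```

Calling `run_or_abort(|| 6 * 7)` simply returns the closure's value; only a panic inside the closure triggers the abort. Note that this does nothing about allocations made by the panic machinery itself before the unwind is caught.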

Mark-Simulacrum · 2021-01-26T14:21:06Z

@ijackson it also looks like your git commits aren't associated with your github account, fwiw (fine for contributing, just saying in case it's unintentional).

If you want to squash commits on the next push that'll also avoid another runaround.

rust-log-analyzer · 2021-01-27T17:37:30Z

The job mingw-check failed! Check out the build log: (web) (plain)

Click to see the possible cause of the failure (guessed by this bot)

Successfully built 3b6c825b9799
Successfully tagged rust-ci:latest
Built container sha256:3b6c825b9799bc30f348d39c6ca4570b9feb1f8611b2087d36eab3c6fc2a1baa
Uploading finished image to https://ci-caches.rust-lang.org/docker/80afe504501370b4d310121e20e04a989f302196b07831c4375b96e05bc067556c2046e20ab2062b28a9dc9b2ae132b37d419cc55a065dfcd25501527e829ab9
upload failed: - to s3://rust-lang-ci-sccache2/docker/80afe504501370b4d310121e20e04a989f302196b07831c4375b96e05bc067556c2046e20ab2062b28a9dc9b2ae132b37d419cc55a065dfcd25501527e829ab9 Unable to locate credentials
[CI_JOB_NAME=mingw-check]
---
configure: rust.channel         := nightly
configure: rust.debug-assertions := True
configure: llvm.assertions      := True
configure: dist.missing-tools   := True
configure: build.configure-args := ['--enable-sccache', '--disable-manage-submodu ...
configure: writing `config.toml` in current directory
configure: 
configure: run `python /checkout/x.py --help`
configure: 
---
skip untracked path cpu-usage.csv during rustfmt invocations
skip untracked path src/doc/book/ during rustfmt invocations
skip untracked path src/doc/rust-by-example/ during rustfmt invocations
skip untracked path src/llvm-project/ during rustfmt invocations
Diff in /checkout/library/std/src/sys/unix/process/process_unix.rs at line 80:
                     libc::_exit(1)
                 })) {
                     Err(_) => crate::process::abort(),
+                },
                 n => n,
             }
         };
         };
Running `"/checkout/obj/build/x86_64-unknown-linux-gnu/stage0/bin/rustfmt" "--config-path" "/checkout" "--edition" "2018" "--unstable-features" "--skip-children" "--check" "/checkout/library/std/src/sys/unix/process/process_unix.rs"` failed.
If you're running `tidy`, try again with `--bless`. Or, if you just want to format code, run `./x.py fmt` instead.
Build completed unsuccessfully in 0:00:22

ijackson · 2021-01-28T02:04:56Z

The failure is this, from my new test case:

---- sys::unix::process::process_inner::tests::test_command_fork_no_unwind stdout ----
ExitStatus(ExitStatus(139))
got=Ok(ExitStatus(ExitStatus(139)))
thread 'sys::unix::process::process_inner::tests::test_command_fork_no_unwind' panicked at 'assertion failed: signal == libc::SIGABRT || signal == libc::SIGILL', library/std/src/sys/unix/process/process_unix/tests.rs:21:5

139 is SIGSEGV with a coredump.

I could add SIGSEGV to the permitted list. But, is this really right? It seems odd. I think std::process::abort ought not to produce SIGSEGV, at least unless I am missing something. If this is expected then, fine, I'll add it to the list. But if it is not expected then I think there may be a problem with catch_unwind or abort or something.

@Mark-Simulacrum do you know if this is an expected result from abort ?
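
To make the reading of 139 concrete, here is a small sketch that decodes a raw wait status with the standard library and applies the same "SIGABRT or SIGILL" rule as the test's assertion. The helper names `decode` and `looks_like_abort` are hypothetical, and the signal numbers are the Linux values written out by hand so that no `libc` crate is needed.

```rust
use std::os::unix::process::ExitStatusExt;
use std::process::ExitStatus;

// Linux signal numbers (assumption: other platforms may differ).
const SIGILL: i32 = 4;
const SIGABRT: i32 = 6;
const SIGSEGV: i32 = 11;

/// Returns (terminating signal, whether a core was dumped) for a raw wait status.
fn decode(raw: i32) -> (Option<i32>, bool) {
    let status = ExitStatus::from_raw(raw);
    (status.signal(), status.core_dumped())
}

/// Mirrors the test's assertion: only SIGABRT or SIGILL count as an abort.
fn looks_like_abort(raw: i32) -> bool {
    matches!(decode(raw).0, Some(SIGABRT) | Some(SIGILL))
}
```

With these helpers, `decode(139)` reports signal 11 with the core-dump bit set, and `looks_like_abort(139)` is false, which matches the failing assertion above.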

jonas-schievink · 2021-01-30T16:37:22Z

@bors r-

ijackson · 2021-02-01T17:36:34Z

I want to reproduce this failure locally. Can someone help me with a build/install problem, which is blocking me?

I think I need a cross rustc targeting i686-unknown-linux-musl. I think if I manage to make a working rustc with that target I will be able to run the resulting binaries on my x86_64 laptop, so I will be able to debug the situation properly and decide if the SIGSEGV is expected.

But I had trouble finding how to build that cross toolchain. Rust's ./x.py wants a "musl root". Where would I get one of those? My distro (Debian) has some musl packages but they are in multiarch paths so not under a single root.

I looked for instructions in various places including general search engines and the in-tree README.md.

ijackson · 2021-02-01T17:38:48Z

@rustbot modify labels +E-help-wanted +O-musl +A-rustbuild -T-libs -A-runtime

library/std/src/sys/unix/process/process_unix.rs

Mark-Simulacrum · 2021-02-02T14:44:19Z

My recommendation is to test via the Docker container (./src/ci/docker/run.sh dist-i586-gnu-i586-i686-musl). It's possible you can reproduce the environment outside it, though, by inspecting the setup scripts it calls.

ijackson · 2021-02-04T20:15:00Z

> My recommendation is to test via the Docker container (./src/ci/docker/run.sh dist-i586-gnu-i586-i686-musl) it's possible you can reproduce the environment outside it, though, by inspecting the setup scripts called.

I managed to get a musl i686 build by inspecting the rules and doing stuff by hand. It worked just fine both before and after my change. I'm now wrestling docker.

ijackson · 2021-02-05T17:33:48Z

Well, eventually my formal docker run with ./src/ci/docker/run.sh dist-i586-gnu-i586-i686-musl (as recommended by @Mark-Simulacrum) completed. Unfortunately, my new test case succeeded:

test sys::unix::process::process_inner::tests::test_command_fork_no_unwind ... ok

That's with 20e2172 which is tree-identical to the 73ae9d635b31311e980b2d8fad7277e384f6f2cb which bors built. (Strictly, I added one further change, to drop --rm from the arguments to docker in src/ci/docker/run.sh.)

So I apparently cannot reproduce this failure locally. I hesitate to suggest just trying it again since "random lossage" is really not a very convincing explanation. Is there some way to debug this in something more closely resembling the CI environment ?

ijackson · 2021-02-05T17:41:25Z

OTOH this does give me confidence that the test case is correct not to consider SIGSEGV a pass.

So the fact that this segfaulted in the CI suggests a real bug. I doubt that's in my patch to the stdlib (which doesn't even introduce any new unsafe). It also seems to me that there is nothing in my test case which ought to cause UB. The only unsafe is this:

        unsafe {
            c.pre_exec(|| panic!("crash now!"));
        }

So I am led to think there is a pre-existing bug in panic handling :-/

ghost · 2021-02-05T19:04:44Z

> I doubt that's in my patch to the stdlib (which doesn't even introduce any new unsafe). It also seems to me that there is nothing in my test case which ought to cause UB.

I think panicking from the child process is already UB (in a multithreaded program)?

The documentation of CommandExt::pre_exec() says:

> This closure will be run in the context of the child process after a fork.

The fork(2) manual page says:

> After a fork() in a multithreaded program, the child can safely call only async-signal-safe functions (see signal-safety(7)) until such time as it calls execve(2).

(EDIT: May not be relevant here - and the signal-safety(7) manual page explains why violating signal safety rules may cause UBs)

> An async-signal-safe function is one that can be safely called from within a signal handler. Many functions are not async-signal-safe. In particular, nonreentrant functions are generally unsafe to call from a signal handler.
>
> The kinds of issues that render a function unsafe can be quickly understood when one considers the implementation of the stdio library, all of whose functions are not async-signal-safe.
>
> When performing buffered I/O on a file, the stdio functions must maintain a statically allocated data buffer along with associated counters and indexes (or pointers) that record the amount of data and the current position in the buffer. Suppose that the main program is in the middle of a call to a stdio function such as printf(3) where the buffer and associated variables have been partially updated. If, at that moment, the program is interrupted by a signal handler that also calls printf(3), then the second call to printf(3) will operate on inconsistent data, with unpredictable results.

I believe panicking or catching an unwind is not async-signal-safe: std::panic::catch_unwind returns a Box when it catches a panic, which means memory must be allocated, and allocation is not async-signal-safe.
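
For contrast with panicking, a pre_exec closure can also report failure by returning an `io::Error`. The sketch below is only an illustration, not something proposed in this thread; `spawn_with_failing_hook` is a hypothetical helper. It builds the error from a raw errno, which only wraps an integer, and lets `spawn` surface it in the parent.

```rust
use std::io;
use std::os::unix::process::CommandExt;
use std::process::{Child, Command};

/// Spawns a command whose `pre_exec` closure fails by returning an error
/// built from a raw errno, and returns what `spawn` reports in the parent.
fn spawn_with_failing_hook(errno: i32) -> io::Result<Child> {
    // The program is never exec'd, because the hook fails first.
    let mut cmd = Command::new("true");
    unsafe {
        // `from_raw_os_error` only wraps the integer, so the closure
        // neither panics nor allocates in the forked child.
        cmd.pre_exec(move || Err(io::Error::from_raw_os_error(errno)));
    }
    cmd.spawn()
}
```

The parent sees `spawn` fail with the same OS error code, so no unwinding happens in the child at all. This only helps for failures the closure can anticipate; it does not cover unexpected panics such as a bounds-check failure.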

ijackson · 2021-02-05T19:56:59Z

Hmmm.

I don't think I really agree. In C, malloc after fork is not, in general, UB. The spec says:

> If a multi-threaded process calls fork(), the new process shall contain a replica of the calling thread and its entire address space, possibly including the states of mutexes and other resources. Consequently, to avoid errors, the child process may only execute async-signal-safe operations until such time as one of the exec functions is called.
>
> When the application calls fork() from a signal handler and any of the fork handlers registered by pthread_atfork() calls a function that is not async-signal-safe, the behavior is undefined.

"Errors" is quite vague but my interpretation is that eg trying to acquire a stdio lock might fail because it was locked by a thread in the parent at the time of the fork. But maybe musl is more restrictive about this.

What can be done about this? I don't think we can really have a programming environment where panicking is UB.

Perhaps the right answer is to install an aborting panic hook so that we never unwind. Even with std::panic::set_hook, panic! would still go wrong if you asked it to format. I bet there are things in the stdlib that do that (eg, array bounds check). So perhaps some kind of special case with a private (or unstable) raw panic hook that doesn't require panic! to allocate before aborting.
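
A minimal sketch of the "aborting panic hook" idea, using only the stable `std::panic::set_hook` API; the function name `install_aborting_panic_hook` is hypothetical. As noted above, this alone is not a complete answer, because work done by the panic machinery before the hook runs is outside the hook's control.

```rust
use std::panic;
use std::process;

/// Installs a panic hook that aborts immediately instead of printing a
/// message and unwinding. (Hypothetical helper name.)
fn install_aborting_panic_hook() {
    // The Box is allocated here, at install time, which can happen before
    // any fork. The hook body itself only calls abort().
    panic::set_hook(Box::new(|_info| process::abort()));
}
```

The hook would have to be installed before forking, since `set_hook` itself allocates. `std::panic::take_hook` restores the default hook if the aborting behaviour is only wanted temporarily.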

[edited to link to SuS]

ijackson · 2021-02-05T20:31:39Z

More digging found me these links:

https://lists.uclibc.org/pipermail/uclibc/2011-March/045130.html
https://wiki.strongswan.org/issues/990

The first one is from the musl authors. One striking statement there is that malloc after fork in a multithreaded program used to be specified to work but nowadays the libc is allowed to make it UB. I'm glad that I'm not going completely mad!

In practical terms it seems that at least for musl this is difficult to make work and not likely to be done any time soon, if at all. And the Rust stdlib should be conservative. So I think that means preventing any calls to malloc after fork. I'll see what I can do to ensure that at least in a plausible subset of cases (which I think has to include at least array bounds violations!).

Maybe I can also somehow nobble the global allocator to abort immediately. The footgun here is quite large...

ijackson · 2021-02-07T18:11:57Z

I think I have succeeded. The result doesn't look much like this MR and I think it is probably more sensible to start afresh. So I will close this one and make a new MR shortly.

Do not allocate or unwind after fork

### Objective scenarios

* Make (simple) panics safe in `Command::pre_exec_hook`, including most `panic!` calls, `Option::unwrap`, and array bounds check failures.
* Make it possible to `libc::fork` and then safely panic in the child (needed for the above, but this requirement means exposing the new raw hook API which the `Command` implementation needs).
* In singlethreaded programs, where panic in `pre_exec_hook` is already memory-safe, prevent the double-unwinding malfunction rust-lang#79740.

I think we want to make panic after fork safe even though the post-fork child environment is only experienced by users of `unsafe`, because the subset of Rust in which any panic is UB is really far too hazardous and unnatural.

#### Approach

* Provide a way for a program to, at runtime, switch to having panics abort. This makes it possible to panic without making *any* heap allocations, which is needed because on some platforms malloc is UB in a child forked from a multithreaded program (see rust-lang#80263 (comment), and maybe also the SuS [spec](https://pubs.opengroup.org/onlinepubs/9699919799/functions/fork.html)).
* Make that change in the child spawned by `Command`.
* Document the rules comprehensively enough that a programmer has a fighting chance of writing correct code.
* Test that this all works as expected (and in particular, that there aren't any heap allocations we missed)

Fixes rust-lang#79740

#### Rejected (or previously attempted) approaches

* Change the panic machinery to be able to unwind without allocating, at least when the payload and message are both `'static`. This seems like it would be even more subtle. Also that is a potentially-hot path which I don't want to mess with.
* Change the existing panic hook mechanism to not convert the message to a `String` before calling the hook. This would be a surprising change for existing code and would not be detected by the type system.
* Provide a `raw_panic_hook` function to intercept panics in a way that doesn't allocate. (That was an earlier version of this MR.)

### History

This MR could be considered a v2 of rust-lang#80263. Thanks to everyone who commented there. In particular, thanks to `@m-ou-se`, `@Mark-Simulacrum` and `@hyd-dev`. (Tagging you since I think you might be interested in this new MR.)

Compared to rust-lang#80263, this MR has very substantial changes and additions. Additionally, I have recently (2021-04-20) completely revised this series following very helpful comments from `@m-ou-se`.

r? `@m-ou-se`

rust-highfive assigned m-ou-se Dec 21, 2020

rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Dec 21, 2020

rustbot added A-runtime Area: std's runtime and "pre-main" init for handling backtraces, unwinds, stack overflows T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. labels Dec 21, 2020

ijackson changed the title ~~std::process::unix: Do not unwind past fork(), in child~~ std::process::unix: Command: Do not unwind past fork(), in child Dec 21, 2020

m-ou-se reviewed Dec 21, 2020

m-ou-se added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Dec 30, 2020

rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Jan 7, 2021

JohnCSimon added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jan 25, 2021

rust-highfive assigned Mark-Simulacrum and unassigned m-ou-se Jan 25, 2021

Mark-Simulacrum added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jan 26, 2021

ijackson force-pushed the fork-no-unwind branch from 26bcd81 to 4bafe7b Compare January 27, 2021 17:22

ijackson force-pushed the fork-no-unwind branch from 4bafe7b to a38d929 Compare January 27, 2021 17:38

bors added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Jan 28, 2021

bors added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jan 30, 2021

ijackson marked this pull request as draft February 1, 2021 17:30

tesuji reviewed Feb 2, 2021

ijackson closed this Feb 7, 2021

ijackson deleted the fork-no-unwind branch February 7, 2021 18:12

ijackson mentioned this pull request Feb 7, 2021

Do not allocate or unwind after fork #81858

Merged
