Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unix process spawn may trigger UB for pthread lock behavior. #82879

Closed
ehuss opened this issue Mar 7, 2021 · 4 comments
Closed

Unix process spawn may trigger UB for pthread lock behavior. #82879

ehuss opened this issue Mar 7, 2021 · 4 comments
Labels
C-bug Category: This is a bug.

Comments

@ehuss
Copy link
Contributor

ehuss commented Mar 7, 2021

When spawning a process in the case where posix_spawn cannot be used, the spawn code uses fork/execvp. It acquires the environment lock before the fork, with a drop handler that unlocks it. Unfortunately, this is done in both the parent and child processes, and unlocking a lock acquired from another thread is undefined behavior (see pthread_rwlock_unlock and pthread_mutex_unlock).

An example of where this can happen is rustdoc running doctests. It has N threads all spawning rustc processes. It was observed in #82221 that this was frequently causing deadlocks (i686, on a Docker image with glibc 2.23).

PR #82877 reverts the change from mutex to rwlock, but pthread_mutex_unlock is also undefined behavior, we just fortunately have not run into any problems. This should be fixed. This can probably be done by mem::forget'ing the guard.

https://stackoverflow.com/questions/61976745/why-does-rust-rwlock-behave-unexpectedly-with-fork also provides some insight into why unlocking a rwlock after a fork doesn't work.

rustc 1.52.0-nightly (caca212 2021-03-05)

@joshtriplett
Copy link
Member

Based on some discussion on Zulip, I think the right approach would be to reinitialize the lock in the child, immediately after the fork.

We can't just forget the lock; that will break usage of the environment within a pre_exec function. Reinitializing the lock in the child should work.

@ghost
Copy link

ghost commented Mar 8, 2021

Is this a duplicate of #64718?

(Also, as mentioned in #64718 (comment), it seems none of the exec*p* functions is signal-safe:)

$ man signal-safety | rg -F exec
       trant or because it is atomic with respect to signals (i.e., its execu‐
       execl(3)               Added in POSIX.1-2008; see notes below
       execle(3)              See notes below
       execv(3)               Added in POSIX.1-2008
       execve(2)
       fexecve(3)             Added in POSIX.1-2008
       *  If a signal handler interrupts the execution of an unsafe  function,
       *  Before glibc 2.24, execl(3) and execle(3) employed realloc(3) inter

@sfackler
Copy link
Member

sfackler commented Mar 8, 2021

We can't just forget the lock; that will break usage of the environment within a pre_exec function. Reinitializing the lock in the child should work.

But getenv and setenv are not async-signal-safe, so isn't usage of the environment after a multi-threaded fork inherently broken?

@ehuss
Copy link
Contributor Author

ehuss commented Mar 8, 2021

Indeed this is a duplicate of #64718, thanks for the link!

Closing, let's keep the discussion consolidated there.

@ehuss ehuss closed this as completed Mar 8, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-bug Category: This is a bug.
Projects
None yet
Development

No branches or pull requests

3 participants