test: Fix multi-threading + exception handling #189

dkl · 2020-12-08T21:28:27Z

If a unit test throws an exception, usually the test framework will catch it cleanly. However, for multi-threaded tests, the test code needs to catch exceptions explicitly, because apparently the Boost Unit Test Framework is not thread-safe.

Here I've seen the multithreaded_async_pipe unit test triggering multiple exceptions in parallel (because the async_pipe constructor failed), which resulted in multiple parallel std::terminate() calls. Then each std::terminate() triggered a SIGABRT for that thread, causing each thread to invoke the Boost Unit Test Framework's signal handler (boost_execution_monitor_jumping_signal_handler()), which however is not thread-safe (doing the same longjmp() in all threads, resulting in stack corruption, SIGSEGV, sometimes even hangs).

As solution, I think using std::future should work – it supports transporting the exceptions from the thread to the parent and will re-throw them there (but note that obviously we'll only see the 1st, not all exceptions from all threads).

Although I've only had problems with the multithreaded_async_pipe test, and that was essentially only triggered by coincidence (it tries to create N_cores * 100 pipes, which fails here due to hitting the maximum number of open files soft-limit (ulimit -n, 1024)), I still think it's worth it to fix the test suite behaviour in this case, especially because it sometimes ended up hanging to the stack smashing. Thus I've also converted all other unit tests from std::thread to std::future + std::async.

If a unit test throws an exception, usually the test framework will catch it cleanly. However, for multi-threaded tests, the test code needs to catch exceptions explicitly, because apparently the Boost Unit Test Framework is not thread-safe. Here I've seen the multithreaded_async_pipe unit test triggering multiple exceptions in parallel (because the async_pipe constructor failed), which resulted in multiple parallel std::terminate() calls. Then each std::terminate() triggered a SIGABRT for that thread, causing each thread to invoke the Boost Unit Test Framework's signal handler (boost_execution_monitor_jumping_signal_handler()), which however is not thread-safe (doing the same longjmp() in all threads, resulting in stack corruption, SIGSEGV, sometimes even hangs). As solution, I think using std::future should work – it supports transporting the exceptions from the thread to the parent and will re-throw them there (but note that obviously we'll only see the 1st, not all exceptions from all threads). Although I've only had problems with the multithreaded_async_pipe test, and that was essentially only triggered by coincidence (it tries to create N_cores * 100 pipes, which fails here due to hitting the maximum number of open files soft-limit (ulimit -n, 1024)), I still think it's worth it to fix the test suite behaviour in this case, especially because it sometimes ended up hanging to the stack smashing. Thus I've also converted all other unit tests from std::thread to std::future + std::async.

klemens-morgenstern · 2021-10-14T06:29:34Z

What is this addressing? A false negative or just a crash on test failure?

dkl · 2021-10-14T18:47:47Z

The main problem is undefined behaviour in the test suite, in case of multi-threading and exceptions. It could cause the test suite to randomly pass, fail, print garbage messages, crash, or even hang. Obviously that's not great for debugging, regardless of whether the actual error that has happened is a false-positive or real bug. In this patch I tried to fix all potential cases which could run into that problem, to ensure the test suite has consistent behaviour.

klemens-morgenstern · 2021-10-15T02:50:16Z

I think adding noexcept to the threads is better, causing a reliable std::terminate .

dkl · 2021-10-15T17:47:56Z

Interesting idea; if it works, it should be applied to all the threads in the test suite, not only in test/async.cpp. (thinking of test/async_pipe.cpp, which is where I saw the problem)

However I'm not sure about std::terminate. As mentioned in the description, multiple parallel std::terminate was part of the problem (something weird about it using abort()/SIGABRT internally, causing it to invoke the test framework's non-thread-safe signal handler...). I tried std::future with std::launch::async specifically to avoid the multiple parallel std::terminate calls. (std::thread::~thread calls std::terminate if the thread is still running during stack unwinding, while std::future from std::launch::async waits for the thread to finish)

Squash after invalid branch & merge conflict. * Fixed file_descriptor move assignment operator to return a reference to 'this'. Issue # 219 * Returning *this instead of erroneous *this. Issue # 219 * Removed unneeded WNOHANG. * Closes #190 * Closes #121 * Attempting to fix wchar_t build error on circle. * Closes #197. * Changed child(pid_t) signature. * Multiple fixes. * Closes #189. * Closes #191. * Added missing work guard on windows. * Trying to catch windows early complete. * Increased log level on windows. * Multiple windows test fixes * Removed overly constraint tests. * fix missing headers * Closes klemens-morgenstern/boost-process#218 * Update executor.hpp explicit cast to int to silence this: `error: non-constant-expression cannot be narrowed from type 'unsigned long' to 'int' in initializer list [-Wc++11-narrowing]` * Fix posix implementation of move constructor/assignment in file_descriptor * Adjust docs `@boost` relative paths * Fixed UB for large environment names. * Closes #207. * Drone setup * Added include for filesystem::fstream. * Disabled useless tests. * Fixed environment length checks. * Pipe test & warning fixes. * Disabled warnings & added windows include fix. * More test fixes. * Removed some tests from apple build. * Removed some tests from apple build. * Disabled OSX tests via build script & fixed windows examples. * TSA fix attempt. Co-authored-by: James Baker <[email protected]> Co-authored-by: silent <[email protected]> Co-authored-by: ikrijan <[email protected]> Co-authored-by: Shauren <[email protected]> Co-authored-by: alandefreitas <[email protected]>

dkl force-pushed the fix-test-suite-crashes branch from eb4f66b to 0b3346c Compare December 8, 2020 22:06

dkl mentioned this pull request Dec 14, 2020

Intermittent multithreaded_async_pipe test failure (SIGABRT) #40

Open

klemens-morgenstern closed this in klemens-morgenstern/boost-process@cd4ef69 Oct 15, 2021

klemens-morgenstern added a commit that referenced this pull request May 18, 2022

Closes #189.

1f464b3

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

test: Fix multi-threading + exception handling #189

test: Fix multi-threading + exception handling #189

Uh oh!

dkl commented Dec 8, 2020

Uh oh!

klemens-morgenstern commented Oct 14, 2021

Uh oh!

dkl commented Oct 14, 2021

Uh oh!

klemens-morgenstern commented Oct 15, 2021

Uh oh!

dkl commented Oct 15, 2021

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

test: Fix multi-threading + exception handling #189

test: Fix multi-threading + exception handling #189

Uh oh!

Conversation

dkl commented Dec 8, 2020

Uh oh!

klemens-morgenstern commented Oct 14, 2021

Uh oh!

dkl commented Oct 14, 2021

Uh oh!

klemens-morgenstern commented Oct 15, 2021

Uh oh!

dkl commented Oct 15, 2021

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants