From ac19e2a8bae2e1221e537455cac0678bbd9737c2 Mon Sep 17 00:00:00 2001
From: Saikrishna Arcot
Date: Wed, 5 Oct 2022 18:14:10 -0700
Subject: [PATCH] [docker-wait-any]: Exit worker thread if main thread is expected to exit (#12255)

There's an odd crash that intermittently happens after the teamd
container exits and the main thread is signalled to exit. The worker
thread watching teamd continues execution because it's in a
`while True` loop. The subsequent wait call on the teamd container very
likely returns immediately, and the thread then calls
`is_warm_restart_enabled` and `is_fast_reboot_enabled`. In either of
these calls, there is sometimes a crash in the transition from C code
back to Python code (after the function has executed). Python sees that
this thread got a signal to exit, because the main thread is exiting,
and tells pthread to exit the thread. However, during the stack
unwinding, _something_ is telling the unwinder to call
`std::terminate`. The reason is unknown.

This then results in a python3 SIGABRT, and systemd then doesn't call
the stop script to actually stop the container (possibly because the
main process exited with a SIGABRT, so it's a hard crash). This means
that the container doesn't actually get stopped or restarted, resulting
in an inconsistent state afterwards.

The workaround appears to be: if we know the main thread needs to exit,
just return here and don't continue execution. This at least tries to
keep the thread out of the problematic code path. However, it's still
feasible to get a SIGABRT, depending on thread/process timings (e.g.
teamd exits, signals the main thread to exit, and then syncd exits and
the thread watching syncd calls one of the two C functions, potentially
hitting the issue).
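
For illustration only, the pattern described above can be sketched as
follows. This is a simplified stand-in, not the actual script: `wait_fn`
and `wait_any` are hypothetical names, `wait_fn` takes the place of the
blocking Docker wait call, the warm-restart/fast-reboot check is left
out, and the use of daemon threads is an assumption.

```python
import threading


def watch_container(name, wait_fn, exit_event):
    """Worker loop; wait_fn is a hypothetical stand-in for the Docker wait call."""
    while True:
        wait_fn(name)  # blocks until the watched container stops

        # Signal the main thread to exit ...
        exit_event.set()
        # ... and leave the loop, so this thread makes no further calls
        # while the main thread is shutting down.
        return


def wait_any(names, wait_fn):
    """Return once any one of the watched containers has stopped."""
    exit_event = threading.Event()
    for name in names:
        threading.Thread(
            target=watch_container,
            args=(name, wait_fn, exit_event),
            daemon=True,  # assumption: idle watchers must not block process exit
        ).start()
    exit_event.wait()
```

Without the `return`, the teamd watcher would go around the loop again
and call `wait_fn` a second time while the main thread is already on its
way out, which is the window this patch tries to close.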

Signed-off-by: Saikrishna Arcot
---
 files/image_config/misc/docker-wait-any | 1 +
 1 file changed, 1 insertion(+)

diff --git a/files/image_config/misc/docker-wait-any b/files/image_config/misc/docker-wait-any
index 3a00a2c610d1..f4001d7e02c5 100755
--- a/files/image_config/misc/docker-wait-any
+++ b/files/image_config/misc/docker-wait-any
@@ -61,6 +61,7 @@ def wait_for_container(docker_client, container_name):
 
         # Signal the main thread to exit
         g_thread_exit_event.set()
+        return
 
 
 def main():