Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Drop in either atexit or std::signal causes core dumped. #198

Open
YuanYuYuan opened this issue Sep 3, 2024 · 3 comments
Open

Drop in either atexit or std::signal causes core dumped. #198

YuanYuYuan opened this issue Sep 3, 2024 · 3 comments
Labels
bug Something isn't working

Comments

@YuanYuYuan
Copy link
Contributor

YuanYuYuan commented Sep 3, 2024

Description

This issue occurs when a user tries to drop a zenoh entity in atexit or std::signal. This is also the case in rmw_zenoh_cpp. To resolve this issue, we might need to add a compilation flag to turn off the z_drop in the destructor.

To reproduce

  1. Apply the following patch
diff --git a/examples/universal/z_sub.cxx b/examples/universal/z_sub.cxx
index a53ae23..bf7a4e8 100644
--- a/examples/universal/z_sub.cxx
+++ b/examples/universal/z_sub.cxx
@@ -14,15 +14,27 @@
 #include <stdio.h>

 #include <chrono>
+#include <csignal>
+#include <cstdlib>
 #include <iostream>
+#include <optional>
 #include <thread>

 #include "../getargs.h"
 #include "zenoh.hxx"
+#include "zenoh/api/session.hxx"
+#include "zenoh/api/subscriber.hxx"

 using namespace zenoh;
 using namespace std::chrono_literals;

+typedef struct Context {
+    std::optional<Session> sess;
+    std::optional<Subscriber<void>> sub;
+} Context;
+
+static Context context;
+
 const char *kind_to_str(SampleKind kind) {
     switch (kind) {
         case SampleKind::Z_SAMPLE_KIND_PUT:
@@ -40,6 +52,14 @@ void data_handler(const Sample &sample) {
               << "')\n";
 }

+void handle_sigint(int signal) {
+    if (signal == SIGINT) {
+        std::cout << "Received SIGINT signal. Performing cleanup tasks..." << std::endl;
+        context.sub.reset();
+        exit(0);
+    }
+}
+
 int _main(int argc, char **argv) {
     const char *expr = "demo/example/**";
     const char *locator = nullptr;
@@ -79,14 +99,17 @@ int _main(int argc, char **argv) {
     KeyExpr keyexpr(expr);

     std::cout << "Opening session..." << std::endl;
-    auto session = Session::open(std::move(config));
+    context.sess = Session::open(std::move(config));

     std::cout << "Declaring Subscriber on '" << keyexpr.as_string_view() << "'..." << std::endl;
-    auto subscriber = session.declare_subscriber(keyexpr, &data_handler, closures::none);
+    context.sub = context.sess.value().declare_subscriber(keyexpr, &data_handler, closures::none);
 #ifdef ZENOHCXX_ZENOHC
-    std::cout << "Subscriber on '" << subscriber.get_keyexpr().as_string_view() << "' declared" << std::endl;
+    std::cout << "Subscriber on '" << context.sub.value().get_keyexpr().as_string_view() << "' declared" << std::endl;
 #endif

+    // Register the signal handler for SIGTERM
+    std::signal(SIGINT, handle_sigint);
+
     std::cout << "Press CTRL-C to quit...\n";
     while (true) {
         std::this_thread::sleep_for(1s);
  1. Build and run z_sub. And then press Ctrl-C to stop the program.

Error log

./build/examples/zenohc/z_sub
Opening session...
2024-09-03T06:58:51.977472Z  INFO ThreadId(02) zenoh::net::runtime: Using ZID: 41ae6c14329a3f858ac7e4d7843933d3
2024-09-03T06:58:51.978257Z  INFO ThreadId(02) zenoh::net::runtime::orchestrator: Zenoh can be reached at: tcp/[fe80::694a:ae45:bf2b:33a0]:33921
2024-09-03T06:58:51.978277Z  INFO ThreadId(02) zenoh::net::runtime::orchestrator: Zenoh can be reached at: tcp/192.168.1.140:33921
2024-09-03T06:58:51.978458Z  INFO ThreadId(02) zenoh::net::runtime::orchestrator: zenohd listening scout messages on 224.0.0.224:7446
Declaring Subscriber on 'demo/example/**'...
Subscriber on 'demo/example/**' declared
Press CTRL-C to quit...
^CReceived SIGINT signal. Performing cleanup tasks...
thread '<unnamed>' panicked at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/std/src/thread/local.rs:246:26:
cannot access a Thread Local Storage value during or after destruction: AccessError
stack backtrace:
   0:     0x7983211079fc - std::backtrace_rs::backtrace::libunwind::trace::ha637c64ce894333a
                               at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/std/src/../../backtrace/src/backtrace/libunwind.rs:104:5
   1:     0x7983211079fc - std::backtrace_rs::backtrace::trace_unsynchronized::h47f62dea28e0c88d
                               at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/std/src/../../backtrace/src/backtrace/mod.rs:66:5
   2:     0x7983211079fc - std::sys_common::backtrace::_print_fmt::h9eef0abe20ede486
                               at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/std/src/sys_common/backtrace.rs:67:5
   3:     0x7983211079fc - <std::sys_common::backtrace::_print::DisplayBacktrace as core::fmt::Display>::fmt::hed7f999df88cc644
                               at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/std/src/sys_common/backtrace.rs:44:22
   4:     0x798320e95710 - core::fmt::rt::Argument::fmt::h1539a9308b8d058d
                               at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/core/src/fmt/rt.rs:142:9
   5:     0x798320e95710 - core::fmt::write::h3a39390d8560d9c9
                               at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/core/src/fmt/mod.rs:1120:17
   6:     0x7983210d4ce2 - std::io::Write::write_fmt::h5fc9997dfe05f882
                               at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/std/src/io/mod.rs:1762:15
   7:     0x7983211095ae - std::sys_common::backtrace::_print::h894006fb5c6f3d45
                               at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/std/src/sys_common/backtrace.rs:47:5
   8:     0x7983211095ae - std::sys_common::backtrace::print::h23a2d212c6fff936
                               at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/std/src/sys_common/backtrace.rs:34:9
   9:     0x798321108d20 - std::panicking::default_hook::{{closure}}::h8a1d2ee00185001a
                               at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/std/src/panicking.rs:272:22
  10:     0x798321109f08 - std::panicking::default_hook::h6038f2eba384e475
                               at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/std/src/panicking.rs:292:9
  11:     0x798321109f08 - std::panicking::rust_panic_with_hook::h2b5517d590cab22e
                               at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/std/src/panicking.rs:779:13
  12:     0x79832110990c - std::panicking::begin_panic_handler::{{closure}}::h233112c06e0ef43e
                               at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/std/src/panicking.rs:657:13
  13:     0x798321109866 - std::sys_common::backtrace::__rust_end_short_backtrace::h6e893f24d7ebbff8
                               at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/std/src/sys_common/backtrace.rs:170:18
  14:     0x798321109851 - rust_begin_unwind
                               at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/std/src/panicking.rs:645:5
  15:     0x798320cd9044 - core::panicking::panic_fmt::hbf0e066aabfa482c
                               at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/core/src/panicking.rs:72:14
  16:     0x798320cd95b2 - core::result::unwrap_failed::hddb4fea594200c52
                               at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/core/src/result.rs:1653:5
  17:     0x798321184a67 - zenoh_runtime::ZRuntime::block_in_place::hf9959efffc8868ff
  18:     0x7983211c4763 - <zenoh::api::session::Session as core::ops::drop::Drop>::drop::h0e1d6bad44ae52c6
  19:     0x798320d4cbfc - core::ptr::drop_in_place<zenoh::api::session::Session>::h3d19805dd420053d
  20:     0x798320d55a4d - alloc::sync::Arc<T,A>::drop_slow::h32dc1c11ecc8f032
  21:     0x5f7945fcc4e2 - _Z6z_dropP17z_moved_session_t
                               at /home/user/workspace/src/zenoh-cpp/local/include/zenoh_macros.h:584:62
  22:     0x5f7945fcedd8 - _ZN5zenoh5OwnedI17z_owned_session_tED2Ev
                               at /home/user/workspace/src/zenoh-cpp/include/zenoh/api/base.hxx:81:24
  23:     0x5f7945fcd8e0 - _ZN5zenoh7SessionD2Ev
                               at /home/user/workspace/src/zenoh-cpp/include/zenoh/api/session.hxx:43:7
  24:     0x5f7945fd16d4 - _ZNSt22_Optional_payload_baseIN5zenoh7SessionEE10_M_destroyEv
                               at /usr/include/c++/14.2.1/optional:283:35
  25:     0x5f7945fd0944 - _ZNSt22_Optional_payload_baseIN5zenoh7SessionEE8_M_resetEv
                               at /usr/include/c++/14.2.1/optional:314:14
  26:     0x5f7945fcee64 - _ZNSt17_Optional_payloadIN5zenoh7SessionELb0ELb0ELb0EED2Ev
                               at /usr/include/c++/14.2.1/optional:437:65
  27:     0x5f7945fcd9bc - _ZNSt14_Optional_baseIN5zenoh7SessionELb0ELb0EED2Ev
                               at /usr/include/c++/14.2.1/optional:508:12
  28:     0x5f7945fcd9d8 - _ZNSt8optionalIN5zenoh7SessionEED2Ev
                               at /usr/include/c++/14.2.1/optional:703:11
  29:     0x5f7945fd342c - _ZN7ContextD2Ev
                               at /home/user/workspace/src/zenoh-cpp/examples/universal/z_sub.cxx:31:16
  30:     0x79832064e891 - <unknown>
  31:     0x79832064e95e - exit
  32:     0x5f7945fca947 - _Z13handle_siginti
                               at /home/user/workspace/src/zenoh-cpp/examples/universal/z_sub.cxx:64:13
  33:     0x79832064c1d0 - <unknown>
  34:     0x7983206f0733 - clock_nanosleep
  35:     0x7983206fc827 - nanosleep
  36:     0x5f7945fcfaa3 - _ZNSt11this_thread9sleep_forIlSt5ratioILl1ELl1EEEEvRKNSt6chrono8durationIT_T0_EE
                               at /usr/include/c++/14.2.1/bits/this_thread_sleep.h:80:20
  37:     0x5f7945fcb12f - _Z5_mainiPPc
                               at /home/user/workspace/src/zenoh-cpp/examples/universal/z_sub.cxx:120:36
  38:     0x5f7945fcb3ac - main
                               at /home/user/workspace/src/zenoh-cpp/examples/universal/z_sub.cxx:128:14
  39:     0x798320634e08 - <unknown>
  40:     0x798320634ecc - __libc_start_main
  41:     0x5f7945fca645 - _start
  42:                0x0 - <unknown>
Error: nu::shell::coredump_error

System Info

@YuanYuYuan YuanYuYuan added the bug Something isn't working label Sep 3, 2024
@DenisBiryukov91
Copy link
Contributor

DenisBiryukov91 commented Sep 3, 2024

Maybe instead of atexit, we could use a static variable that will call all drop logic in its destructor, at least this guaranteed by C++ standard to work correctly. No apparently it is the same as calling function registered with atexit.

@yellowhatter
Copy link
Contributor

There are some (a lot of) system APIs that should not be called in signal handlers, so it is really a bad idea to drop big objects in signal handlers. Moreover, signal handler can be executed concurrently, so thread safety also comes into place. We need to handle signals in a safe way, I can take this task

@YuanYuYuan
Copy link
Contributor Author

There are some (a lot of) system APIs that should not be called in signal handlers, so it is really a bad idea to drop big objects in signal handlers. Moreover, signal handler can be executed concurrently, so thread safety also comes into place. We need to handle signals in a safe way, I can take this task

Yes. I agree with your point. Dropping objects within signal handlers isn't a good pattern. Rust has put much effort into addressing this safely within the normal scope.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

3 participants