Shutdown OTLP HTTP Exporters may crash in async mode. #1976

Closed
owent opened this issue Feb 10, 2023 · 0 comments · Fixed by #1977
Labels
bug Something isn't working

owent commented Feb 10, 2023

Describe your environment

OS: Linux
Version 1.8.1
Compiler: gcc
Build system: cmake

Steps to reproduce

Shut down the exporter while some HTTP sessions are still finishing. There is a thread-safety problem, and the process may crash.

Additional context

Thread 1:

#0  Curl_dyn_free (s=s@entry=0x228) at /data/devops/workspace/p-4918a9c91c1b440da3baa2c3e8c3bc09/src/server/main/third_party/packages/libcurl-7.87.0/lib/dynbuf.c:60
#1  0x00007ff92215d255 in Curl_http2_done (data=data@entry=0x7ff87fc2c000, premature=premature@entry=true)
    at /data/devops/workspace/p-4918a9c91c1b440da3baa2c3e8c3bc09/src/server/main/third_party/packages/libcurl-7.87.0/lib/http2.c:1234
#2  0x00007ff922155f33 in Curl_http_done (data=0x7ff87fc2c000, status=CURLE_ABORTED_BY_CALLBACK, premature=<optimized out>)
    at /data/devops/workspace/p-4918a9c91c1b440da3baa2c3e8c3bc09/src/server/main/third_party/packages/libcurl-7.87.0/lib/http.c:1613
#3  0x00007ff92216b70e in multi_done (data=data@entry=0x7ff87fc2c000, status=CURLE_ABORTED_BY_CALLBACK, premature=true, premature@entry=false)
    at /data/devops/workspace/p-4918a9c91c1b440da3baa2c3e8c3bc09/src/server/main/third_party/packages/libcurl-7.87.0/lib/multi.c:646
#4  0x00007ff92216bf04 in multi_runsingle (multi=multi@entry=0x7ff9153031c0, nowp=nowp@entry=0x7ff911a7cfe0, data=data@entry=0x7ff87fc2c000)
    at /data/devops/workspace/p-4918a9c91c1b440da3baa2c3e8c3bc09/src/server/main/third_party/packages/libcurl-7.87.0/lib/multi.c:2544
#5  0x00007ff92216d26e in curl_multi_perform (multi=0x7ff9153031c0, running_handles=0x7ff911a7d0ec)
    at /data/devops/workspace/p-4918a9c91c1b440da3baa2c3e8c3bc09/src/server/main/third_party/packages/libcurl-7.87.0/lib/multi.c:2690
#6  0x00007ff91c3afbfa in operator() (__closure=<optimized out>, self=0x7ff915241618)
    at /data/devops/workspace/p-4918a9c91c1b440da3baa2c3e8c3bc09/src/server/main/third_party/packages/opentelemetry-cpp-v1.8.1/ext/src/http/client/curl/http_client_curl.cc:277
#7  _M_invoke<0ul> (this=<optimized out>) at /usr/include/c++/4.8.2/functional:1732
#8  operator() (this=<optimized out>) at /usr/include/c++/4.8.2/functional:1720
#9  std::thread::_Impl<std::_Bind_simple<opentelemetry::v1::ext::http::client::curl::HttpClient::MaybeSpawnBackgroundThread()::__lambda8(opentelemetry::v1::ext::http::client::curl::HttpClient*)> >::_M_run(void) (this=<optimized out>) at /usr/include/c++/4.8.2/thread:115
#10 0x00007ff91ec5e340 in ?? () from /lib64/libstdc++.so.6
#11 0x00007ff91ff31ea5 in start_thread () from /lib64/libpthread.so.0
#12 0x00007ff91e5d7b0d in clone () from /lib64/libc.so.6

Thread 2:

#0  0x00007ff929cf2dc2 in strcmp () from /lib64/ld-linux-x86-64.so.2
#1  0x00007ff929ce25b9 in check_match.9525 () from /lib64/ld-linux-x86-64.so.2
#2  0x00007ff929ce2dbb in do_lookup_x () from /lib64/ld-linux-x86-64.so.2
#3  0x00007ff929ce309f in _dl_lookup_symbol_x () from /lib64/ld-linux-x86-64.so.2
#4  0x00007ff929ce7dee in _dl_fixup () from /lib64/ld-linux-x86-64.so.2
#5  0x00007ff929cefaaa in _dl_runtime_resolve_xsavec () from /lib64/ld-linux-x86-64.so.2
#6  0x00007ff91c6525b8 in AdjustWaitForTimeout<long, std::ratio<1l, 1000000l> > (indefinite_value=..., timeout=...)
    at /data/devops/workspace/p-4918a9c91c1b440da3baa2c3e8c3bc09/src/server/main/third_party/packages/opentelemetry-cpp-v1.8.1/api/include/opentelemetry/common/timestamp.h:190
#7  opentelemetry::v1::exporter::otlp::OtlpHttpClient::ForceFlush (this=this@entry=0x7ff91522f600, timeout=timeout@entry=...)
    at /data/devops/workspace/p-4918a9c91c1b440da3baa2c3e8c3bc09/src/server/main/third_party/packages/opentelemetry-cpp-v1.8.1/exporters/otlp/src/otlp_http_client.cc:792
#8  0x00007ff91c6528e1 in opentelemetry::v1::exporter::otlp::OtlpHttpClient::Shutdown (this=0x7ff91522f600, timeout=...)
    at /data/devops/workspace/p-4918a9c91c1b440da3baa2c3e8c3bc09/src/server/main/third_party/packages/opentelemetry-cpp-v1.8.1/exporters/otlp/src/otlp_http_client.cc:837
#9  0x00007ff91b94899b in opentelemetry::v1::sdk::trace::BatchSpanProcessor::Shutdown (this=0x7ff914d89aa0, timeout=...)
    at /data/devops/workspace/p-4918a9c91c1b440da3baa2c3e8c3bc09/src/server/main/third_party/packages/opentelemetry-cpp-v1.8.1/sdk/src/trace/batch_span_processor.cc:278
#10 0x00007ff91b93e484 in opentelemetry::v1::sdk::trace::MultiSpanProcessor::Shutdown (this=<optimized out>, timeout=...)
    at /data/devops/workspace/p-4918a9c91c1b440da3baa2c3e8c3bc09/src/server/main/third_party/packages/opentelemetry-cpp-v1.8.1/sdk/include/opentelemetry/sdk/trace/multi_span_processor.h:131
#11 0x00007ff91b93e2b8 in opentelemetry::v1::sdk::trace::TracerContext::Shutdown (this=<optimized out>)
    at /data/devops/workspace/p-4918a9c91c1b440da3baa2c3e8c3bc09/src/server/main/third_party/packages/opentelemetry-cpp-v1.8.1/sdk/src/trace/tracer_context.cc:58
#12 0x00007ff9254e14a7 in rpc::telemetry::(anonymous namespace)::__lambda103::operator() (__closure=0x7ff915214010, provider=...)
    at /data/devops/workspace/p-4918a9c91c1b440da3baa2c3e8c3bc09/src/server/main/src/server_frame/rpc/telemetry/rpc_global_service.cpp:965
#13 0x00007ff9254e878d in std::_Function_handler<void(const opentelemetry::v1::nostd::shared_ptr<opentelemetry::v1::trace::TracerProvider>&), rpc::telemetry::(anonymous namespace)::_opentelemetry_create_trace_provider(std::vector<std::unique_ptr<opentelemetry::v1::sdk::trace::SpanProcessor> >&&, std::unique_ptr<opentelemetry::v1::sdk::trace::Sampler>&&, opentelemetry::v1::sdk::resource::Resource)::__lambda103>::_M_invoke(const std::_Any_data &, const opentelemetry::v1::nostd::shared_ptr<opentelemetry::v1::trace::TracerProvider> &) (__functor=..., __args#0=...) at /usr/include/c++/4.8.2/functional:2071
#14 0x00007ff9254fe701 in std::function<void (opentelemetry::v1::nostd::shared_ptr<opentelemetry::v1::trace::TracerProvider> const&)>::operator()(opentelemetry::v1::nostd::shared_ptr<opentelemetry::v1::trace::TracerProvider> const&) const (this=0x7ff9153e2280, __args#0=...) at /usr/include/c++/4.8.2/functional:2471
#15 0x00007ff9254e3c92 in rpc::telemetry::(anonymous namespace)::_opentelemetry_cleanup_local_caller_info_t (
    app_info_cache=std::shared_ptr (count 1, weak 0) 0x7ff9153e2018)
    at /data/devops/workspace/p-4918a9c91c1b440da3baa2c3e8c3bc09/src/server/main/src/server_frame/rpc/telemetry/rpc_global_service.cpp:1265
#16 0x00007ff9254e3fb1 in rpc::telemetry::(anonymous namespace)::_opentelemetry_cleanup_global_provider (app=...)
    at /data/devops/workspace/p-4918a9c91c1b440da3baa2c3e8c3bc09/src/server/main/src/server_frame/rpc/telemetry/rpc_global_service.cpp:1306
#17 0x00007ff9255086a1 in std::_Function_handler<void (atapp::app&), void (*)(atapp::app&)>::_M_invoke(std::_Any_data const&, atapp::app&) (__functor=..., __args#0=...)
    at /usr/include/c++/4.8.2/functional:2071
#18 0x00007ff923c3c7bb in std::function<void (atapp::app&)>::operator()(atapp::app&) const (this=0x7ff915210a60, __args#0=...) at /usr/include/c++/4.8.2/functional:2471
#19 0x00007ff923bfc11b in atapp::app::app_evt_on_finally (this=0x7fff2854d8a0)
    at /data/devops/workspace/p-4918a9c91c1b440da3baa2c3e8c3bc09/src/server/main/atframework/libatapp/src/atframe/atapp.cpp:4150
#20 0x00007ff923beda06 in atapp::app::run_inner (this=0x7fff2854d8a0, run_mode=0)
    at /data/devops/workspace/p-4918a9c91c1b440da3baa2c3e8c3bc09/src/server/main/atframework/libatapp/src/atframe/atapp.cpp:2330
#21 0x00007ff923be33d9 in atapp::app::run (this=0x7fff2854d8a0, ev_loop=0x7ff924039a40 <default_loop_struct>, argc=10, argv=0x7fff2854e258, priv_data=0x0)
    at /data/devops/workspace/p-4918a9c91c1b440da3baa2c3e8c3bc09/src/server/main/atframework/libatapp/src/atframe/atapp.cpp:304

During shutdown, an HTTP session may already have finished on another thread and be waiting to be removed. The thread that calls Shutdown may still call Session::FinishOperation from HttpClient::CleanupSession, which may reset the curl handle and cause the crash.
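
To illustrate the double-finish hazard described above, here is a minimal sketch of an exactly-once guard. It is not the opentelemetry-cpp code and not necessarily how the linked fix (#1977) works: FakeSession, release_count and RaceFinishOnce are hypothetical names, and a counter stands in for resetting the curl handle. The guard only prevents a second cleanup of the same session; it does not make it safe to use one curl handle from two threads at the same time.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical stand-in for an HTTP session that owns a curl handle.
// This is NOT the opentelemetry-cpp Session class.
class FakeSession
{
public:
  // May be called from several threads. Only the first caller performs the
  // release; later callers return false and touch nothing.
  bool FinishOperation()
  {
    bool expected = false;
    if (!finished_.compare_exchange_strong(expected, true, std::memory_order_acq_rel))
    {
      return false;  // another thread already finished this session
    }
    // Stand-in for resetting the curl handle (assumption for illustration).
    release_count_.fetch_add(1, std::memory_order_relaxed);
    return true;
  }

  int release_count() const { return release_count_.load(std::memory_order_relaxed); }

private:
  std::atomic<bool> finished_{false};
  std::atomic<int> release_count_{0};
};

// Simulates the background thread and the Shutdown caller both trying to
// finish the same session. Returns how many times the release actually ran.
inline int RaceFinishOnce()
{
  FakeSession session;
  std::thread background([&session] { session.FinishOperation(); });
  std::thread shutdown([&session] { session.FinishOperation(); });
  background.join();
  shutdown.join();
  assert(session.release_count() <= 1);  // the guard never allows a second release
  return session.release_count();
}
```

Calling RaceFinishOnce() repeatedly should always report a single release, whichever thread wins the race.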

owent added the bug label Feb 10, 2023