Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SDK] Valgrind errors on std::atomic variables #2243

Closed
marcalff opened this issue Jul 20, 2023 · 2 comments · Fixed by #2244
Closed

[SDK] Valgrind errors on std::atomic variables #2243

marcalff opened this issue Jul 20, 2023 · 2 comments · Fixed by #2244
Assignees
Labels
bug Something isn't working

Comments

@marcalff
Copy link
Member

Using opentelemetry-cpp 1.10.0, clang 15.0.7, valgrind 3.18.1

When exporting metrics using a periodic metric readers,
and running valgrind,
the following errors are reported:

==18626== Thread 42:
==18626== Conditional jump or move depends on uninitialised value(s)
==18626==    at 0x2F8ADE00: opentelemetry::v1::sdk::metrics::PeriodicExportingMetricReader::CollectAndExportOnce() (internal/extra/opentelemetry-cpp/opentelemetry-cpp-1.10.0/sdk/src/metrics/export/periodic_exporting_metric_reader.cc:102)
==18626==    by 0x2F8ADAAE: opentelemetry::v1::sdk::metrics::PeriodicExportingMetricReader::DoBackgroundWork() (internal/extra/opentelemetry-cpp/opentelemetry-cpp-1.10.0/sdk/src/metrics/export/periodic_exporting_metric_reader.cc:55)
==18626==    by 0x2F8B1E0D: void std::__invoke_impl<void, void (opentelemetry::v1::sdk::metrics::PeriodicExportingMetricReader::*)(), opentelemetry::v1::sdk::metrics::PeriodicExportingMetricReader*>(std::__invoke_memfun_deref, void (opentelemetry::v1::sdk::metrics::PeriodicExportingMetricReader::*&&)(), opentelemetry::v1::sdk::metrics::PeriodicExportingMetricReader*&&) (invoke.h:74)
==18626==    by 0x2F8B1D21: std::__invoke_result<void (opentelemetry::v1::sdk::metrics::PeriodicExportingMetricReader::*)(), opentelemetry::v1::sdk::metrics::PeriodicExportingMetricReader*>::type std::__invoke<void (opentelemetry::v1::sdk::metrics::PeriodicExportingMetricReader::*)(), opentelemetry::v1::sdk::metrics::PeriodicExportingMetricReader*>(void (opentelemetry::v1::sdk::metrics::PeriodicExportingMetricReader::*&&)(), opentelemetry::v1::sdk::metrics::PeriodicExportingMetricReader*&&) (invoke.h:96)
==18626==    by 0x2F8B1CE1: void std::thread::_Invoker<std::tuple<void (opentelemetry::v1::sdk::metrics::PeriodicExportingMetricReader::*)(), opentelemetry::v1::sdk::metrics::PeriodicExportingMetricReader*> >::_M_invoke<0ul, 1ul>(std::_Index_tuple<0ul, 1ul>) (std_thread.h:279)
==18626==    by 0x2F8B1C94: std::thread::_Invoker<std::tuple<void (opentelemetry::v1::sdk::metrics::PeriodicExportingMetricReader::*)(), opentelemetry::v1::sdk::metrics::PeriodicExportingMetricReader*> >::operator()() (std_thread.h:286)
==18626==    by 0x2F8B1AF8: std::thread::_State_impl<std::thread::_Invoker<std::tuple<void (opentelemetry::v1::sdk::metrics::PeriodicExportingMetricReader::*)(), opentelemetry::v1::sdk::metrics::PeriodicExportingMetricReader*> > >::_M_run() (std_thread.h:231)
==18626==    by 0x9E27AC2: ??? (in /usr/lib64/libstdc++.so.6.0.30)
==18626==    by 0x96686E9: start_thread (in /lib64/libpthread-2.31.so)
==18626==    by 0xA4A794E: clone (in /lib64/libc-2.31.so)
==18626== 
==18626== Conditional jump or move depends on uninitialised value(s)
==18626==    at 0x2F8AE737: opentelemetry::v1::sdk::metrics::PeriodicExportingMetricReader::DoBackgroundWork()::$_1::operator()() const (internal/extra/opentelemetry-cpp/opentelemetry-cpp-1.10.0/sdk/src/metrics/export/periodic_exporting_metric_reader.cc:64)
==18626==    by 0x2F8AE6A8: bool std::condition_variable::wait_until<std::chrono::_V2::steady_clock, std::chrono::duration<long, std::ratio<1l, 1000000000l> >, opentelemetry::v1::sdk::metrics::PeriodicExportingMetricReader::DoBackgroundWork()::$_1>(std::unique_lock<std::mutex>&, std::chrono::time_point<std::chrono::_V2::steady_clock, std::chrono::duration<long, std::ratio<1l, 1000000000l> > > const&, opentelemetry::v1::sdk::metrics::PeriodicExportingMetricReader::DoBackgroundWork()::$_1) (condition_variable:150)
==18626==    by 0x2F8ADEC3: bool std::condition_variable::wait_for<long, std::ratio<1l, 1000l>, opentelemetry::v1::sdk::metrics::PeriodicExportingMetricReader::DoBackgroundWork()::$_1>(std::unique_lock<std::mutex>&, std::chrono::duration<long, std::ratio<1l, 1000l> > const&, opentelemetry::v1::sdk::metrics::PeriodicExportingMetricReader::DoBackgroundWork()::$_1) (condition_variable:174)
==18626==    by 0x2F8ADCD8: opentelemetry::v1::sdk::metrics::PeriodicExportingMetricReader::DoBackgroundWork() (internal/extra/opentelemetry-cpp/opentelemetry-cpp-1.10.0/sdk/src/metrics/export/periodic_exporting_metric_reader.cc:63)
==18626==    by 0x2F8B1E0D: void std::__invoke_impl<void, void (opentelemetry::v1::sdk::metrics::PeriodicExportingMetricReader::*)(), opentelemetry::v1::sdk::metrics::PeriodicExportingMetricReader*>(std::__invoke_memfun_deref, void (opentelemetry::v1::sdk::metrics::PeriodicExportingMetricReader::*&&)(), opentelemetry::v1::sdk::metrics::PeriodicExportingMetricReader*&&) (invoke.h:74)
==18626==    by 0x2F8B1D21: std::__invoke_result<void (opentelemetry::v1::sdk::metrics::PeriodicExportingMetricReader::*)(), opentelemetry::v1::sdk::metrics::PeriodicExportingMetricReader*>::type std::__invoke<void (opentelemetry::v1::sdk::metrics::PeriodicExportingMetricReader::*)(), opentelemetry::v1::sdk::metrics::PeriodicExportingMetricReader*>(void (opentelemetry::v1::sdk::metrics::PeriodicExportingMetricReader::*&&)(), opentelemetry::v1::sdk::metrics::PeriodicExportingMetricReader*&&) (invoke.h:96)
==18626==    by 0x2F8B1CE1: void std::thread::_Invoker<std::tuple<void (opentelemetry::v1::sdk::metrics::PeriodicExportingMetricReader::*)(), opentelemetry::v1::sdk::metrics::PeriodicExportingMetricReader*> >::_M_invoke<0ul, 1ul>(std::_Index_tuple<0ul, 1ul>) (std_thread.h:279)
==18626==    by 0x2F8B1C94: std::thread::_Invoker<std::tuple<void (opentelemetry::v1::sdk::metrics::PeriodicExportingMetricReader::*)(), opentelemetry::v1::sdk::metrics::PeriodicExportingMetricReader*> >::operator()() (std_thread.h:286)
==18626==    by 0x2F8B1AF8: std::thread::_State_impl<std::thread::_Invoker<std::tuple<void (opentelemetry::v1::sdk::metrics::PeriodicExportingMetricReader::*)(), opentelemetry::v1::sdk::metrics::PeriodicExportingMetricReader*> > >::_M_run() (std_thread.h:231)
==18626==    by 0x9E27AC2: ??? (in /usr/lib64/libstdc++.so.6.0.30)
==18626==    by 0x96686E9: start_thread (in /lib64/libpthread-2.31.so)
==18626==    by 0xA4A794E: clone (in /lib64/libc-2.31.so)
==18626== 

The offending lines are:

  bool notify_force_flush = is_force_flush_pending_.exchange(false, std::memory_order_acq_rel);
  if (notify_force_flush) <---
      if (is_force_wakeup_background_worker_.load(std::memory_order_acquire)) <---
@marcalff marcalff added the bug Something isn't working label Jul 20, 2023
@marcalff marcalff self-assigned this Jul 20, 2023
@github-actions github-actions bot added the needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. label Jul 20, 2023
@marcalff
Copy link
Member Author

The issue is that std::atomic variables are only constructed using the default contructor.

According to cppreference,
this does not work, see the notes about LWG 2334.

A better way is to invoke a constructor that passes an initial value, instead of using the default constructor.

For example:

diff --git a/internal/extra/opentelemetry-cpp/opentelemetry-cpp-1.10.0/sdk/src/metrics/export/periodic_exporting_metric_reader.cc b/internal/extra/opentelemetry-cpp/opentelemetry-cpp-1.10.0/sdk/src/metrics/export/periodic_exporting_metric_reader.cc
index dfa4a5ee6b2..c471014a4fd 100644
--- a/internal/extra/opentelemetry-cpp/opentelemetry-cpp-1.10.0/sdk/src/metrics/export/periodic_exporting_metric_reader.cc
+++ b/internal/extra/opentelemetry-cpp/opentelemetry-cpp-1.10.0/sdk/src/metrics/export/periodic_exporting_metric_reader.cc
@@ -24,7 +24,10 @@ PeriodicExportingMetricReader::PeriodicExportingMetricReader(
     const PeriodicExportingMetricReaderOptions &option)
     : exporter_{std::move(exporter)},
       export_interval_millis_{option.export_interval_millis},
-      export_timeout_millis_{option.export_timeout_millis}
+      export_timeout_millis_{option.export_timeout_millis},
+      is_force_flush_pending_(false),
+      is_force_wakeup_background_worker_(false),
+      is_force_flush_notified_(false)
 {
   if (export_interval_millis_ <= export_timeout_millis_)
   {

Verified that this fixes the valgrind errors.

Please ignore the internal/extra/opentelemetry-cpp/opentelemetry-cpp-1.10.0/ path,
this comes from a super-project in CMake.

Every std::atomic used in the code base is potentially affected.

@marcalff
Copy link
Member Author

Verified with valgrind the following.

Default initialization

Ref: https://en.cppreference.com/w/cpp/language/default_initialization

class PeriodicExportingMetricReader {
  std::atomic<bool> is_force_flush_pending_;
};

This fails.

Value initialization

Ref: https://en.cppreference.com/w/cpp/language/value_initialization

class PeriodicExportingMetricReader {
  std::atomic<bool> is_force_flush_pending_{};
};

This seemed to work, not sure why.
Better avoid this form.

Direct initialization

Ref: https://en.cppreference.com/w/cpp/language/direct_initialization

class PeriodicExportingMetricReader {

  PeriodicExportingMetricReader::PeriodicExportingMetricReader()
  : is_force_flush_pending_(false) {}

  std::atomic<bool> is_force_flush_pending_;
};

This works.

Direct List initialization

Ref: https://en.cppreference.com/w/cpp/language/list_initialization

class PeriodicExportingMetricReader {

  std::atomic<bool> is_force_flush_pending_{false};
};

This works.

marcalff added a commit to marcalff/opentelemetry-cpp that referenced this issue Jul 22, 2023
@marcalff marcalff removed the needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. label Jul 26, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant