When using metrics, I've started seeing segmentation faults related to the InstrumentationScope unique_ptr created in GetMeter. Initially I was not able to capture the segfault in GDB, which suggests a synchronization issue, so I added debug prints of the pointer that InstrumentationScope::Create creates. When the segmentation fault occurs, that pointer differs from the raw pointer returned by Meter::GetInstrumentationScope in the SDK's MetricCollector::Collect function, and the latter points to an invalid address.
Eventually I was able to create a reproducible example, found below. It's a bit flaky, so you may want to play around with the num_meters parameter in this example to find a sweet spot where no error logs are generated but the segfault still occurs.
I noticed that when I use the same meter name for all instruments, I can no longer reproduce the problem, but that may just be because the loop in MetricCollector::Collect becomes very short and the synchronization issue doesn't manifest itself.
I based this off the commit of this morning: dd7e257b6de71eeaf9e3149530962301705b9a0d
I started looking into this a little closer today and noticed that the meter in the for loop of MetricCollector::Collect also sometimes becomes a nullptr.
The problem is with MeterContext::GetMeters, which returns a nostd::span viewing into the MeterContext meters_ field. While that span is being used by the background thread, the backing buffer of the vector can be reallocated as new meters are added, which leaves the span pointing at freed memory.
A simple fix is to construct and return a copy of meters_ in MeterContext::GetMeters while it is locked, so the background thread can work on the copied shared_ptrs to the meters while another thread can safely mutate the meters_ field at the same time.
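The copy-under-lock idea can be sketched roughly as follows. This is a minimal illustration, not the SDK's code: ToyMeter, ToyMeterContext and RunSnapshotDemo are hypothetical stand-ins, and a plain std::vector replaces the nostd::span return type.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <mutex>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical stand-in for a meter; not the real SDK Meter type.
struct ToyMeter {
  explicit ToyMeter(std::string n) : name(std::move(n)) {}
  std::string name;
};

// Hypothetical stand-in for MeterContext.
class ToyMeterContext {
 public:
  void AddMeter(std::shared_ptr<ToyMeter> meter) {
    std::lock_guard<std::mutex> guard(lock_);
    meters_.push_back(std::move(meter));  // may reallocate the backing buffer
  }

  // Returns a copy made while the lock is held. The caller owns its own
  // vector of shared_ptrs, so later reallocation of meters_ cannot
  // invalidate what the caller is iterating over.
  std::vector<std::shared_ptr<ToyMeter>> GetMeters() const {
    std::lock_guard<std::mutex> guard(lock_);
    return meters_;
  }

 private:
  mutable std::mutex lock_;
  std::vector<std::shared_ptr<ToyMeter>> meters_;
};

// Adds num_meters meters on one thread while the calling thread repeatedly
// iterates over snapshots. Returns the final number of meters.
std::size_t RunSnapshotDemo(std::size_t num_meters) {
  ToyMeterContext context;
  std::thread writer([&context, num_meters] {
    for (std::size_t i = 0; i < num_meters; ++i) {
      context.AddMeter(std::make_shared<ToyMeter>("meter_" + std::to_string(i)));
    }
  });

  for (int round = 0; round < 100; ++round) {
    for (const auto &meter : context.GetMeters()) {
      assert(meter != nullptr);      // every snapshot entry is a live shared_ptr
      assert(!meter->name.empty());  // and still points at a valid meter
    }
  }

  writer.join();
  return context.GetMeters().size();
}
```

Each call to GetMeters in this sketch copies the whole vector and bumps every refcount, which is the cost this approach trades for safety.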
I can imagine such a fix has performance implications, as the refcounts for all meters need to be atomically updated for the copy. Another approach would be to keep the span but hold the MeterContext storage_lock_ while iterating over the meters in MetricCollector::Collect.
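The lock-while-iterating alternative might look something like the sketch below. It is only an assumption about how this could be shaped: LockedMeterContext, SketchMeter, ForEachMeter and CountMetersUnderLock are hypothetical names, and the real MeterContext does not necessarily expose anything like this.

```cpp
#include <cstddef>
#include <functional>
#include <memory>
#include <mutex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a meter; not the real SDK Meter type.
struct SketchMeter {
  std::string name;
};

// Hypothetical stand-in for MeterContext that never hands out a view into
// meters_. Iteration happens inside the class, with the lock held.
class LockedMeterContext {
 public:
  void AddMeter(std::shared_ptr<SketchMeter> meter) {
    std::lock_guard<std::mutex> guard(storage_lock_);
    meters_.push_back(std::move(meter));
  }

  // Invokes callback for each meter while holding storage_lock_.
  // No copy and no refcount traffic, but AddMeter blocks until the loop
  // finishes, and the callback must not call AddMeter (it would deadlock).
  void ForEachMeter(const std::function<void(const SketchMeter &)> &callback) const {
    std::lock_guard<std::mutex> guard(storage_lock_);
    for (const auto &meter : meters_) {
      callback(*meter);
    }
  }

 private:
  mutable std::mutex storage_lock_;
  std::vector<std::shared_ptr<SketchMeter>> meters_;
};

// Adds num_meters meters, then counts them through ForEachMeter.
std::size_t CountMetersUnderLock(std::size_t num_meters) {
  LockedMeterContext context;
  for (std::size_t i = 0; i < num_meters; ++i) {
    context.AddMeter(
        std::make_shared<SketchMeter>(SketchMeter{"meter_" + std::to_string(i)}));
  }
  std::size_t visited = 0;
  context.ForEachMeter([&visited](const SketchMeter &) { ++visited; });
  return visited;
}
```

The trade-off is that the lock is held for the whole collection pass, so any thread adding meters waits for as long as a collection pass takes.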
I've found that for observables I had to RemoveCallback before shutting down at various places.
I will have a look at it, and raise an issue if I'm able to reproduce it. Otherwise, if you have reproducible code with Observables, please feel free to create a separate issue for it.
With the following backtrace: