-
Notifications
You must be signed in to change notification settings - Fork 446
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix #1632 - Occasional Segfault with LongCounter instrument #1638
Conversation
Codecov Report
Additional details and impacted files@@ Coverage Diff @@
## main #1638 +/- ##
==========================================
- Coverage 85.10% 85.09% -0.01%
==========================================
Files 159 159
Lines 4999 5001 +2
==========================================
+ Hits 4254 4255 +1
- Misses 745 746 +1
|
…etry-cpp into attribute-hashmap-crash
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good findings on the root cause.
Please see questions / comments on the fix.
sdk/include/opentelemetry/sdk/metrics/state/async_metric_storage.h
Outdated
Show resolved
Hide resolved
sdk/include/opentelemetry/sdk/metrics/state/sync_metric_storage.h
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks for the fixes.
Fixes #1632
Changes
The problem occurs when metric collection is happening while measurements are getting recorded in different threads ( PeriodicExportingMetricReader creates separate thread for collection)
Thread1:
Instrument::Record()
->SyncMetricStorage::Record()
->AttributeHashMap::GetOrSetDefault()
->mutex_lock
->map::insert()
Thread2:
SyncMetricStorage::Collect ()
->std::move(AttributeHashMap)
-> ...While thread1 is recording in attribute-hashmap, thread2 can sometime clear the attribute-hashmap (by move it to different hashmap), which cause thread1 to try recording in invalid memory.
The solution is to remove locks from attribute-hashmap, and add it to SyncMetricStorage/AsyncMetricStorage whereever it is reading/updating/moving attribute-hashmap
i.e.,
Thread1:
Instrument::Record()
->SyncMetricStorage::Record ()
->mutex_lock_
->AttributeHashMap::GetOrSetDefault()
->map::insert()
Thread2:
SyncMetricStorage::Collect ()
->mutex_lock
->std::move(AttributeHashMap)
-> ...For significant contributions please make sure you have completed the following items:
CHANGELOG.md
updated for non-trivial changes