-
Notifications
You must be signed in to change notification settings - Fork 858
Lock-free updates for Histogram #2951
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Lock-free updates for Histogram #2951
Conversation
Codecov Report
@@ Coverage Diff @@
## main #2951 +/- ##
==========================================
+ Coverage 83.98% 84.02% +0.04%
==========================================
Files 254 254
Lines 8942 8946 +4
==========================================
+ Hits 7510 7517 +7
+ Misses 1432 1429 -3
|
Great analysis! 👍 |
| this.histogramBuckets.RunningBucketCounts[i]++; | ||
| if (Interlocked.Exchange(ref this.histogramBuckets.UsingHistogram, 1) == 0) | ||
| { | ||
| this.runningValue.AsLong++; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I haven't look into the underlying code, wonder if this ++ might throw (e.g. integer overflow case). If that's the case, we might need to make sure we release the spinlock.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't acquire any lock in the first place to be released. SpinWait is used to just smartly apply context switch for the thread.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
one Ln335 we set this.histogramBuckets.UsingHistogram = 0, if we throw before this, all other threads would spin I guess?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think unchecked would solve the problem here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
https://github.com/open-telemetry/opentelemetry-dotnet/blob/main/src/OpenTelemetry/Internal/CircularBuffer.cs#L139-L143 We could explore if we need to have an exit plan like done for CircularBuffer.
reyang
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
|
Approach LGTM 🚀 |
cijothomas
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add to changelog and we are good to go.
…om/utpilla/opentelemetry-dotnet into utpilla/Lock-free-Histogram-Update
Co-authored-by: Cijo Thomas <[email protected]>
Changes
HistogramMetricPointstruct as we add a newintwhich would be used for synchronizationIf this approach seems good, I can implement it for
HistogramSumCountas well.Stress Test Results
To better understand the perf improvement, we need to look at two scenarios:
MetricPoint. This was tested by running the following in theRunmethod of stress test:MetricPoint. This was tested by running the following in theRunmethod of stress test:While there is a perf improvement in both the cases, there is a substantial improvement in the second case where there is high contention to update the same
MetricPoint.For the first scenario, Loops/ second go up from ~20M to ~23M (~15% increase)
For the second scenario, Loops/second go up from ~5.8M to ~7.6M (~31% increase)
Here are the numbers for the 1st and 2nd scenario respectively:
main branch
With this PR