Contributor

@utpilla utpilla commented Feb 25, 2022

Changes

  • Implement lock-free updates for Histogram
  • This increases the size of the MetricPoint struct, because we add a new int that is used for synchronization

If this approach seems good, I can implement it for HistogramSumCount as well.
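
For readers unfamiliar with the pattern, here is a minimal sketch of an update guarded by an int flag and `Interlocked.Exchange` instead of a `lock` statement. It is not the actual MetricPoint code: the type `GuardedHistogram` and the fields `isInUse`, `bounds`, `bucketCounts`, `count`, and `sum` are hypothetical names chosen for illustration, and the real change may differ in detail.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical illustration only; not the actual MetricPoint implementation.
public sealed class GuardedHistogram
{
    private readonly double[] bounds;      // upper bounds of the explicit buckets
    private readonly long[] bucketCounts;  // one extra slot for the overflow bucket
    private long count;
    private double sum;
    private int isInUse;                   // 0 = free, 1 = a thread is updating

    public GuardedHistogram(double[] bounds)
    {
        this.bounds = bounds;
        this.bucketCounts = new long[bounds.Length + 1];
    }

    public long Count => Volatile.Read(ref this.count);

    public void Record(double value)
    {
        // Find the first bucket whose upper bound covers the value.
        int i;
        for (i = 0; i < this.bounds.Length; i++)
        {
            if (value <= this.bounds[i])
            {
                break;
            }
        }

        var spinner = new SpinWait();
        while (true)
        {
            // Only the thread that flips the flag from 0 to 1 may update.
            if (Interlocked.Exchange(ref this.isInUse, 1) == 0)
            {
                this.count++;
                this.sum += value;
                this.bucketCounts[i]++;

                // Hand the histogram back to other writers.
                Interlocked.Exchange(ref this.isInUse, 0);
                return;
            }

            // Another thread holds the flag: spin briefly, then yield.
            spinner.SpinOnce();
        }
    }
}
```

Under these assumptions, calling `Record` from many threads (for example via `Parallel.For`) should leave `Count` equal to the number of calls, because only one thread at a time runs the update section.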

Stress Test Results

To better understand the perf improvement, we need to look at two scenarios:

  1. There is little contention on any single MetricPoint. This was tested by running the following in the Run method of the stress test:
TestHistogram.Record(
           random.Next(MaxHistogramMeasurement),
           new("DimName1", DimensionValues[random.Next(0, ArraySize)]),
           new("DimName2", DimensionValues[random.Next(0, ArraySize)]),
           new("DimName3", DimensionValues[random.Next(0, ArraySize)]));
  2. There is high contention to update the same MetricPoint. This was tested by running the following in the Run method of the stress test:
TestHistogram.Record(
           random.Next(MaxHistogramMeasurement),
           new("DimName1", "DimVal1"),
           new("DimName2", "DimVal2"),
           new("DimName3", "DimVal3"));

While there is a perf improvement in both cases, the improvement is substantial in the second case, where there is high contention to update the same MetricPoint.

For the first scenario, Loops/second go up from ~20M to ~23M (~15% increase).
For the second scenario, Loops/second go up from ~5.8M to ~7.6M (~31% increase).

Here are the numbers for the first and second scenarios, respectively:

main branch

image

With this PR

image

@utpilla utpilla requested a review from a team February 25, 2022 22:29
@utpilla utpilla changed the title Lock-free updates for Histogram [Proposal] Lock-free updates for Histogram Feb 25, 2022

codecov bot commented Feb 25, 2022

Codecov Report

Merging #2951 (5e9a98d) into main (6981795) will increase coverage by 0.04%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##             main    #2951      +/-   ##
==========================================
+ Coverage   83.98%   84.02%   +0.04%     
==========================================
  Files         254      254              
  Lines        8942     8946       +4     
==========================================
+ Hits         7510     7517       +7     
+ Misses       1432     1429       -3     

Impacted Files                                          Coverage Δ
src/OpenTelemetry/Metrics/HistogramBuckets.cs           100.00% <ø> (ø)
...ementation/HttpHandlerMetricsDiagnosticListener.cs   94.11% <100.00%> (ø)
src/OpenTelemetry/Metrics/MetricPoint.cs                86.02% <100.00%> (+0.42%) ⬆️
src/OpenTelemetry/BatchExportProcessor.cs               87.36% <0.00%> (+3.15%) ⬆️

Member

reyang commented Feb 25, 2022

While there is a perf improvement in both cases, the improvement is substantial in the second case, where there is high contention to update the same MetricPoint.

Great analysis! 👍

this.histogramBuckets.RunningBucketCounts[i]++;
if (Interlocked.Exchange(ref this.histogramBuckets.UsingHistogram, 1) == 0)
{
this.runningValue.AsLong++;
Member

I haven't looked into the underlying code, but I wonder whether this ++ might throw (e.g., on integer overflow). If so, we need to make sure we release the spinlock.

Contributor Author

We don't acquire a lock in the first place, so there is nothing to release. SpinWait is used only to apply context switches smartly for the waiting thread.

Member

On Ln335 we set this.histogramBuckets.UsingHistogram = 0. If we throw before this, all other threads would keep spinning, I guess?

Member

@reyang reyang Feb 25, 2022

I think unchecked would solve the problem here.
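
To illustrate the point about `unchecked`, here is a small sketch showing that an increment past `long.MaxValue` wraps around in an unchecked context and throws `OverflowException` in a checked one. The class and method names (`OverflowDemo`, `IncrementUnchecked`, `IncrementCheckedThrows`) are hypothetical and exist only for this example. Note that C# arithmetic on non-constant values is unchecked by default unless the project enables overflow checking, so whether the original `++` could throw depends on the build settings.

```csharp
using System;

// Hypothetical helper type for illustration; not part of the PR.
public static class OverflowDemo
{
    // In an unchecked context, incrementing past long.MaxValue wraps around
    // instead of throwing OverflowException.
    public static long IncrementUnchecked(long value)
    {
        unchecked
        {
            value++;
        }

        return value;
    }

    // In a checked context, the same increment throws OverflowException.
    public static bool IncrementCheckedThrows(long value)
    {
        try
        {
            checked
            {
                value++;
            }

            return false;
        }
        catch (OverflowException)
        {
            return true;
        }
    }
}
```

With an explicit `unchecked` block around the update, an overflow cannot throw between setting the flag to 1 and resetting it to 0, so other threads would not be left spinning for that reason.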

Member

@reyang reyang left a comment

@CodeBlanch
Member

Approach LGTM 🚀

Member

@cijothomas cijothomas left a comment

Please add to changelog and we are good to go.

@cijothomas cijothomas changed the title [Proposal] Lock-free updates for Histogram Lock-free updates for Histogram Mar 1, 2022