Merged
22 changes: 6 additions & 16 deletions docs/metrics/README.md
Original file line number Diff line number Diff line change
@@ -10,11 +10,13 @@
* [Instruments](#instruments)
* [MeterProvider Management](#meterprovider-management)
* [Memory Management](#memory-management)
* [Example](#example)
* [Pre-Aggregation](#pre-aggregation)
* [Cardinality Limits](#cardinality-limits)
* [Memory Preallocation](#memory-preallocation)
* [Metrics Correlation](#metrics-correlation)
* [Metrics Enrichment](#metrics-enrichment)
* [Common issues that lead to missing metrics](#common-issues-that-lead-to-missing-metrics)

</details>

@@ -386,22 +388,10 @@ and the `MetricStreamConfiguration.CardinalityLimit` setting. Refer to this
[doc](../../docs/metrics/customizing-the-sdk/README.md#changing-the-cardinality-limit-for-a-metric)
for more information.

Given a metric, once the cardinality limit is reached, any new measurement which
cannot be independently aggregated because of the limit will be dropped or
aggregated using the [overflow
attribute](https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/metrics/sdk.md#overflow-attribute)
(if enabled). When NOT using the overflow attribute feature a warning is written
to the [self-diagnostic log](../../src/OpenTelemetry/README.md#self-diagnostics)
the first time an overflow is detected for a given metric.

> [!NOTE]
> Overflow attribute was introduced in OpenTelemetry .NET
[1.6.0-rc.1](../../src/OpenTelemetry/CHANGELOG.md#160-rc1). It is currently an
experimental feature which can be turned on by setting the environment
variable `OTEL_DOTNET_EXPERIMENTAL_METRICS_EMIT_OVERFLOW_ATTRIBUTE=true`. Once
the [OpenTelemetry
Specification](https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/metrics/sdk.md#overflow-attribute)
become stable, this feature will be turned on by default.
Starting with version 1.10.0, given a metric, once the cardinality limit is
Member

I think it's best to mention the pre-1.10.0 behavior here too.
Versions prior to 1.10.0 behaved differently when the cardinality limit was hit: measurements were either dropped (the default), with an internal log emitted once, or aggregated using the overflow attribute (opt-in).

(We have some places in this repo where we use this style in documentation.)

Member

I feel it's better if the README reflects the current scenario. Adding prior behavior may increase the length of the document.

Member

I think the whole thing is a bit confusing 🤣

Here is my stab at improving it:

Given a metric, once the cardinality limit is reached, any new measurement which cannot be independently aggregated because of the limit will be dropped. Starting with version 1.10.0, when a measurement is dropped, the overflow attribute is updated automatically.

I think what is important is the first statement "Given a metric, once the cardinality limit is reached, any new measurement which cannot be independently aggregated because of the limit will be dropped." That is a hard stop thing. The way it is currently written, it seems this is new to 1.10.0 😄

Contributor Author

@xiang17 Oct 24, 2024

I feel this is tricky. One case is the previous behavior, with a default and an experimental opt-in overflow attribute. The other is the new behavior, with the always-on overflow attribute. The README will also get too long if we explain everything there.

I've changed the README to include minimal info mentioning that things have changed a few times, and put more details in the CHANGELOG in case anyone is really interested and wants to understand it better.

Contributor Author

@xiang17 Oct 24, 2024

> I think what is important is the first statement "Given a metric, once the cardinality limit is reached, any new measurement which cannot be independently aggregated because of the limit will be dropped." That is a hard stop thing. The way it is currently written, it seems this is new to 1.10.0 😄

That's a good point. I've updated it to make that first statement more prominent, followed by how our SDK handles this and how the approach has changed.

One small thing: the spec doesn't seem to treat it as a "drop" when the overflow attribute is used, but rather as a "synthetic aggregation". So it's still "aggregated", just not "independently aggregated".

> An overflow attribute set is defined, containing a single attribute `otel.metric.overflow` having (boolean) value `true`, which is used to report a synthetic aggregation of the Measurements that could not be independently aggregated because of the limit.

reached, any new measurement which cannot be independently aggregated because of
the limit will be aggregated using the [overflow
attribute](https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/metrics/sdk.md#overflow-attribute).
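
To make the behavior described above concrete, here is a minimal, self-contained sketch of how measurements could be routed once the limit is reached. It is a simplified model and not the SDK's implementation: the `Lookup` and `Record` helpers, the string-encoded tag sets, and the limit of 2 are assumptions made for illustration. Only the slot layout (index 0 for no tags, index 1 for overflow, hence `cardinalityLimit + 2` slots) follows `AggregatorStore` as shown in this PR.

```csharp
using System;
using System.Collections.Generic;

// Simplified, hypothetical model of one metric stream with a cardinality limit.
// Slot 0 is reserved for measurements without tags and slot 1 for overflow.
const int CardinalityLimit = 2;
var sums = new long[CardinalityLimit + 2];
var tagsToIndex = new Dictionary<string, int>();
var nextIndex = 2;

// Returns the slot for a tag set, or -1 once every regular slot is taken.
int Lookup(string tags)
{
    if (tags.Length == 0)
    {
        return 0;
    }

    if (tagsToIndex.TryGetValue(tags, out var index))
    {
        return index;
    }

    if (nextIndex >= sums.Length)
    {
        return -1;
    }

    tagsToIndex[tags] = nextIndex;
    return nextIndex++;
}

void Record(long value, string tags)
{
    var index = Lookup(tags);

    // A measurement that cannot be aggregated independently is folded
    // into the overflow slot instead of being lost.
    sums[index < 0 ? 1 : index] += value;
}

Record(1, "color=red");
Record(1, "color=green");
Record(1, "color=blue");   // limit reached: goes to overflow
Record(1, "color=yellow"); // limit reached: goes to overflow
Record(1, "color=red");    // existing tag set: still aggregated independently

Console.WriteLine($"red={sums[2]} green={sums[3]} otel.metric.overflow={sums[1]}");
// prints "red=2 green=1 otel.metric.overflow=2"
```

Tag sets seen before the limit was reached keep their own slot, so only new combinations end up in the overflow slot.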

When [Delta Aggregation
Temporality](https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/metrics/data-model.md#temporality)
7 changes: 7 additions & 0 deletions src/OpenTelemetry/CHANGELOG.md
@@ -6,6 +6,13 @@ Notes](../../RELEASENOTES.md).

## Unreleased

* Promote overflow attribute from experimental to stable and removed the
`OTEL_DOTNET_EXPERIMENTAL_METRICS_EMIT_OVERFLOW_ATTRIBUTE` environment variable.
The overflow attribute feature is now enabled by default.
No internal logs will be emitted when the cardinality limit is reached, as the
`otel.metric.overflow` attribute can be used to tell that the limit is reached.
([#5909](https://github.com/open-telemetry/opentelemetry-dotnet/pull/5909))
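
Since no internal log is emitted any more, a consumer of exported data has to look for the attribute itself. The sketch below shows one way that check could look. `IsOverflowPoint` is a hypothetical helper, not part of the SDK, and it assumes the tags of a metric point are available as key/value pairs.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: true when a tag set carries otel.metric.overflow
// with the boolean value true, i.e. it is the overflow attribute set.
bool IsOverflowPoint(IEnumerable<KeyValuePair<string, object>> tags)
{
    foreach (var tag in tags)
    {
        if (tag.Key == "otel.metric.overflow" && tag.Value is true)
        {
            return true;
        }
    }

    return false;
}

var regular = new[] { new KeyValuePair<string, object>("color", "red") };
var overflow = new[] { new KeyValuePair<string, object>("otel.metric.overflow", true) };

Console.WriteLine(IsOverflowPoint(regular));  // False
Console.WriteLine(IsOverflowPoint(overflow)); // True
```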

## 1.10.0-beta.1

Released 2024-Sep-30
47 changes: 10 additions & 37 deletions src/OpenTelemetry/Metrics/AggregatorStore.cs
@@ -22,13 +22,11 @@ internal sealed class AggregatorStore
internal readonly bool OutputDelta;
internal readonly bool OutputDeltaWithUnusedMetricPointReclaimEnabled;
internal readonly int NumberOfMetricPoints;
internal readonly bool EmitOverflowAttribute;
internal readonly ConcurrentDictionary<Tags, LookupData>? TagsToMetricPointIndexDictionaryDelta;
internal readonly Func<ExemplarReservoir?>? ExemplarReservoirFactory;
internal long DroppedMeasurements = 0;

private const ExemplarFilterType DefaultExemplarFilter = ExemplarFilterType.AlwaysOff;
private static readonly string MetricPointCapHitFixMessage = "Consider opting in for the experimental SDK feature to emit all the throttled metrics under the overflow attribute by setting env variable OTEL_DOTNET_EXPERIMENTAL_METRICS_EMIT_OVERFLOW_ATTRIBUTE = true. You could also modify instrumentation to reduce the number of unique key/value pair combinations. Or use Views to drop unwanted tags. Or use MeterProviderBuilder.SetMaxMetricPointsPerMetricStream to set higher limit.";
private static readonly Comparison<KeyValuePair<string, object?>> DimensionComparisonDelegate = (x, y) => x.Key.CompareTo(y.Key);

private readonly Lock lockZeroTags = new();
@@ -42,7 +40,6 @@ internal sealed class AggregatorStore
new();

private readonly string name;
private readonly string metricPointCapHitMessage;
private readonly MetricPoint[] metricPoints;
private readonly int[] currentMetricPointBatch;
private readonly AggregationType aggType;
@@ -56,7 +53,6 @@ internal sealed class AggregatorStore

private int metricPointIndex = 0;
private int batchSize = 0;
private int metricCapHitMessageLogged;
private bool zeroTagMetricPointInitialized;
private bool overflowTagMetricPointInitialized;

@@ -65,7 +61,6 @@ internal AggregatorStore(
AggregationType aggType,
AggregationTemporality temporality,
int cardinalityLimit,
bool emitOverflowAttribute,
bool shouldReclaimUnusedMetricPoints,
ExemplarFilterType? exemplarFilter = null,
Func<ExemplarReservoir?>? exemplarReservoirFactory = null)
@@ -77,7 +72,6 @@ internal AggregatorStore(
// Previously, these were included within the original cardinalityLimit, but now they are explicitly added to enhance clarity.
this.NumberOfMetricPoints = cardinalityLimit + 2;

this.metricPointCapHitMessage = $"Maximum MetricPoints limit reached for this Metric stream. Configured limit: {cardinalityLimit}";
this.metricPoints = new MetricPoint[this.NumberOfMetricPoints];
this.currentMetricPointBatch = new int[this.NumberOfMetricPoints];
this.aggType = aggType;
@@ -105,8 +99,6 @@ internal AggregatorStore(
this.tagsKeysInterestingCount = hs.Count;
}

this.EmitOverflowAttribute = emitOverflowAttribute;

this.exemplarFilter = exemplarFilter ?? DefaultExemplarFilter;
Debug.Assert(
this.exemplarFilter == ExemplarFilterType.AlwaysOff
@@ -245,17 +237,14 @@ internal void SnapshotDeltaWithMetricPointReclaim()
this.batchSize++;
}

if (this.EmitOverflowAttribute)
// TakeSnapshot for the MetricPoint for overflow
ref var metricPointForOverflow = ref this.metricPoints[1];
if (metricPointForOverflow.MetricPointStatus != MetricPointStatus.NoCollectPending)
{
// TakeSnapshot for the MetricPoint for overflow
ref var metricPointForOverflow = ref this.metricPoints[1];
if (metricPointForOverflow.MetricPointStatus != MetricPointStatus.NoCollectPending)
{
this.TakeMetricPointSnapshot(ref metricPointForOverflow, outputDelta: true);
this.TakeMetricPointSnapshot(ref metricPointForOverflow, outputDelta: true);

this.currentMetricPointBatch[this.batchSize] = 1;
this.batchSize++;
}
this.currentMetricPointBatch[this.batchSize] = 1;
this.batchSize++;
}

// Index 0 and 1 are reserved for no tags and overflow
@@ -994,16 +983,8 @@ private void UpdateLongMetricPoint(int metricPointIndex, long value, ReadOnlySpa
if (metricPointIndex < 0)
{
Interlocked.Increment(ref this.DroppedMeasurements);

if (this.EmitOverflowAttribute)
{
this.InitializeOverflowTagPointIfNotInitialized();
this.metricPoints[1].Update(value);
}
else if (Interlocked.CompareExchange(ref this.metricCapHitMessageLogged, 1, 0) == 0)
{
OpenTelemetrySdkEventSource.Log.MeasurementDropped(this.name, this.metricPointCapHitMessage, MetricPointCapHitFixMessage);
}
this.InitializeOverflowTagPointIfNotInitialized();
this.metricPoints[1].Update(value);

return;
}
@@ -1049,16 +1030,8 @@ private void UpdateDoubleMetricPoint(int metricPointIndex, double value, ReadOnl
if (metricPointIndex < 0)
{
Interlocked.Increment(ref this.DroppedMeasurements);

if (this.EmitOverflowAttribute)
{
this.InitializeOverflowTagPointIfNotInitialized();
this.metricPoints[1].Update(value);
}
else if (Interlocked.CompareExchange(ref this.metricCapHitMessageLogged, 1, 0) == 0)
{
OpenTelemetrySdkEventSource.Log.MeasurementDropped(this.name, this.metricPointCapHitMessage, MetricPointCapHitFixMessage);
}
this.InitializeOverflowTagPointIfNotInitialized();
this.metricPoints[1].Update(value);

return;
}
10 changes: 1 addition & 9 deletions src/OpenTelemetry/Metrics/MeterProviderSdk.cs
@@ -13,7 +13,6 @@ namespace OpenTelemetry.Metrics;

internal sealed class MeterProviderSdk : MeterProvider
{
internal const string EmitOverFlowAttributeConfigKey = "OTEL_DOTNET_EXPERIMENTAL_METRICS_EMIT_OVERFLOW_ATTRIBUTE";
internal const string ReclaimUnusedMetricPointsConfigKey = "OTEL_DOTNET_EXPERIMENTAL_METRICS_RECLAIM_UNUSED_METRIC_POINTS";
internal const string ExemplarFilterConfigKey = "OTEL_METRICS_EXEMPLAR_FILTER";
internal const string ExemplarFilterHistogramsConfigKey = "OTEL_DOTNET_EXPERIMENTAL_METRICS_EXEMPLAR_FILTER_HISTOGRAMS";
@@ -22,7 +21,6 @@ internal sealed class MeterProviderSdk : MeterProvider
internal readonly IDisposable? OwnedServiceProvider;
internal int ShutdownCount;
internal bool Disposed;
internal bool EmitOverflowAttribute;
internal bool ReclaimUnusedMetricPoints;
internal ExemplarFilterType? ExemplarFilter;
internal ExemplarFilterType? ExemplarFilterForHistograms;
@@ -75,7 +73,7 @@ internal MeterProviderSdk(
this.viewConfigs = state.ViewConfigs;

OpenTelemetrySdkEventSource.Log.MeterProviderSdkEvent(
$"MeterProvider configuration: {{MetricLimit={state.MetricLimit}, CardinalityLimit={state.CardinalityLimit}, EmitOverflowAttribute={this.EmitOverflowAttribute}, ReclaimUnusedMetricPoints={this.ReclaimUnusedMetricPoints}, ExemplarFilter={this.ExemplarFilter}, ExemplarFilterForHistograms={this.ExemplarFilterForHistograms}}}.");
$"MeterProvider configuration: {{MetricLimit={state.MetricLimit}, CardinalityLimit={state.CardinalityLimit}, ReclaimUnusedMetricPoints={this.ReclaimUnusedMetricPoints}, ExemplarFilter={this.ExemplarFilter}, ExemplarFilterForHistograms={this.ExemplarFilterForHistograms}}}.");

foreach (var reader in state.Readers)
{
@@ -86,7 +84,6 @@ internal MeterProviderSdk(
reader.ApplyParentProviderSettings(
state.MetricLimit,
state.CardinalityLimit,
this.EmitOverflowAttribute,
this.ReclaimUnusedMetricPoints,
this.ExemplarFilter,
this.ExemplarFilterForHistograms);
@@ -486,11 +483,6 @@ protected override void Dispose(bool disposing)

private void ApplySpecificationConfigurationKeys(IConfiguration configuration)
{
if (configuration.TryGetBoolValue(OpenTelemetrySdkEventSource.Log, EmitOverFlowAttributeConfigKey, out this.EmitOverflowAttribute))
{
OpenTelemetrySdkEventSource.Log.MeterProviderSdkEvent("Overflow attribute feature enabled via configuration.");
}

if (configuration.TryGetBoolValue(OpenTelemetrySdkEventSource.Log, ReclaimUnusedMetricPointsConfigKey, out this.ReclaimUnusedMetricPoints))
{
OpenTelemetrySdkEventSource.Log.MeterProviderSdkEvent("Reclaim unused metric point feature enabled via configuration.");
2 changes: 0 additions & 2 deletions src/OpenTelemetry/Metrics/Metric.cs
@@ -70,7 +70,6 @@ internal Metric(
MetricStreamIdentity instrumentIdentity,
AggregationTemporality temporality,
int cardinalityLimit,
bool emitOverflowAttribute,
bool shouldReclaimUnusedMetricPoints,
ExemplarFilterType? exemplarFilter = null,
Func<ExemplarReservoir?>? exemplarReservoirFactory = null)
@@ -193,7 +192,6 @@ internal Metric(
aggType,
temporality,
cardinalityLimit,
emitOverflowAttribute,
shouldReclaimUnusedMetricPoints,
exemplarFilter,
exemplarReservoirFactory);
5 changes: 0 additions & 5 deletions src/OpenTelemetry/Metrics/Reader/MetricReaderExt.cs
@@ -22,7 +22,6 @@ public abstract partial class MetricReader
private Metric?[]? metrics;
private Metric[]? metricsCurrentBatch;
private int metricIndex = -1;
private bool emitOverflowAttribute;
private bool reclaimUnusedMetricPoints;
private ExemplarFilterType? exemplarFilter;
private ExemplarFilterType? exemplarFilterForHistograms;
@@ -82,7 +81,6 @@ internal virtual List<Metric> AddMetricWithNoViews(Instrument instrument)
metricStreamIdentity,
this.GetAggregationTemporality(metricStreamIdentity.InstrumentType),
this.cardinalityLimit,
this.emitOverflowAttribute,
this.reclaimUnusedMetricPoints,
exemplarFilter);
}
@@ -164,7 +162,6 @@ internal virtual List<Metric> AddMetricWithViews(Instrument instrument, List<Met
metricStreamIdentity,
this.GetAggregationTemporality(metricStreamIdentity.InstrumentType),
metricStreamConfig?.CardinalityLimit ?? this.cardinalityLimit,
this.emitOverflowAttribute,
this.reclaimUnusedMetricPoints,
exemplarFilter,
metricStreamConfig?.ExemplarReservoirFactory);
@@ -184,7 +181,6 @@ internal virtual List<Metric> AddMetricWithViews(Instrument instrument, List<Met
internal void ApplyParentProviderSettings(
int metricLimit,
int cardinalityLimit,
bool emitOverflowAttribute,
bool reclaimUnusedMetricPoints,
ExemplarFilterType? exemplarFilter,
ExemplarFilterType? exemplarFilterForHistograms)
@@ -193,7 +189,6 @@ internal void ApplyParentProviderSettings(
this.metrics = new Metric[metricLimit];
this.metricsCurrentBatch = new Metric[metricLimit];
this.cardinalityLimit = cardinalityLimit;
this.emitOverflowAttribute = emitOverflowAttribute;
this.reclaimUnusedMetricPoints = reclaimUnusedMetricPoints;
this.exemplarFilter = exemplarFilter;
this.exemplarFilterForHistograms = exemplarFilterForHistograms;
29 changes: 4 additions & 25 deletions test/OpenTelemetry.Tests/Metrics/AggregatorTestsBase.cs
@@ -16,16 +14,14 @@ public abstract class AggregatorTestsBase
private static readonly ExplicitBucketHistogramConfiguration HistogramConfiguration = new() { Boundaries = Metric.DefaultHistogramBounds };
private static readonly MetricStreamIdentity MetricStreamIdentity = new(Instrument, HistogramConfiguration);

private readonly bool emitOverflowAttribute;
private readonly bool shouldReclaimUnusedMetricPoints;
private readonly AggregatorStore aggregatorStore;

protected AggregatorTestsBase(bool emitOverflowAttribute, bool shouldReclaimUnusedMetricPoints)
protected AggregatorTestsBase(bool shouldReclaimUnusedMetricPoints)
{
this.emitOverflowAttribute = emitOverflowAttribute;
this.shouldReclaimUnusedMetricPoints = shouldReclaimUnusedMetricPoints;

this.aggregatorStore = new(MetricStreamIdentity, AggregationType.HistogramWithBuckets, AggregationTemporality.Cumulative, 1024, emitOverflowAttribute, this.shouldReclaimUnusedMetricPoints);
this.aggregatorStore = new(MetricStreamIdentity, AggregationType.HistogramWithBuckets, AggregationTemporality.Cumulative, 1024, this.shouldReclaimUnusedMetricPoints);
}

[Fact]
@@ -253,7 +251,6 @@ public void HistogramBucketsDefaultUpdatesForSecondsTest(string meterName, strin
AggregationType.Histogram,
AggregationTemporality.Cumulative,
cardinalityLimit: 1024,
this.emitOverflowAttribute,
this.shouldReclaimUnusedMetricPoints);

KnownHistogramBuckets actualHistogramBounds = KnownHistogramBuckets.Default;
@@ -330,7 +327,6 @@ internal void ExponentialHistogramTests(AggregationType aggregationType, Aggrega
aggregationType,
aggregationTemporality,
cardinalityLimit: 1024,
this.emitOverflowAttribute,
this.shouldReclaimUnusedMetricPoints,
exemplarsEnabled ? ExemplarFilterType.AlwaysOn : null);

@@ -440,7 +436,6 @@ internal void ExponentialMaxScaleConfigWorks(int? maxScale)
AggregationType.Base2ExponentialHistogram,
AggregationTemporality.Cumulative,
cardinalityLimit: 1024,
this.emitOverflowAttribute,
this.shouldReclaimUnusedMetricPoints);

aggregatorStore.Update(10, Array.Empty<KeyValuePair<string, object?>>());
@@ -525,31 +520,15 @@ public ThreadArguments(MetricPoint histogramPoint, ManualResetEvent mreToEnsureA
public class AggregatorTests : AggregatorTestsBase
{
public AggregatorTests()
: base(emitOverflowAttribute: false, shouldReclaimUnusedMetricPoints: false)
{
}
}

public class AggregatorTestsWithOverflowAttribute : AggregatorTestsBase
{
public AggregatorTestsWithOverflowAttribute()
: base(emitOverflowAttribute: true, shouldReclaimUnusedMetricPoints: false)
: base(shouldReclaimUnusedMetricPoints: false)
{
}
}

public class AggregatorTestsWithReclaimAttribute : AggregatorTestsBase
{
public AggregatorTestsWithReclaimAttribute()
: base(emitOverflowAttribute: false, shouldReclaimUnusedMetricPoints: true)
{
}
}

public class AggregatorTestsWithBothReclaimAndOverflowAttributes : AggregatorTestsBase
{
public AggregatorTestsWithBothReclaimAndOverflowAttributes()
: base(emitOverflowAttribute: true, shouldReclaimUnusedMetricPoints: true)
: base(shouldReclaimUnusedMetricPoints: true)
{
}
}