Skip to content

[exporter/datadogexporter] Improve reset detection on cumulative metrics#6120

Merged
bogdandrutu merged 4 commits into
open-telemetry:mainfrom
DataDog:mx-psi/improve-reset
Nov 5, 2021
Merged

[exporter/datadogexporter] Improve reset detection on cumulative metrics#6120
bogdandrutu merged 4 commits into
open-telemetry:mainfrom
DataDog:mx-psi/improve-reset

Conversation

@mx-psi
Copy link
Copy Markdown
Member

@mx-psi mx-psi commented Nov 3, 2021

Description:

The Datadog exporter automatically calculates the diff between points on cumulative monotonic sums, cumulative histograms and the fields of summaries that grow cumulatively. Until now we didn't take the start timestamp into account, so we were not detecting certain resets. We were also using monotonic diffs on non-monotonic metrics (e.g. histogram sum or count), which would omit negative values.

This PR updates the reset-detection code to conform to the "Resets and gaps" section of the data model specification.

Note that we don't report the first value we get, to account for Collector restarts.

Link to tracking Issue: Relates to #5378 discussion (this can avoid the .sum metric being 0 in certain cases)

Testing: Added unit tests to test new behavior

- Take startTimestamp into account.
- Differentiate between monotonic and non-monotonic diffs
if delta {
as.InsertInterpolate(lowerBound, upperBound, uint(count))
} else if dx, ok := t.prevPts.putAndGetDiff(name, bucketTags, ts, float64(count)); ok {
} else if dx, ok := t.prevPts.Diff(name, bucketTags, startTs, ts, float64(count)); ok {
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could be MonotonicDiff, since the bucket count values should be increasing. However, in the case of a bug downstream, it could lead to reseting some bucket count values but not others, making the error harder to debug

if delta {
consumer.ConsumeTimeSeries(ctx, fullName, Count, ts, count, bucketTags, host)
} else if dx, ok := t.prevPts.putAndGetDiff(fullName, bucketTags, ts, count); ok {
} else if dx, ok := t.prevPts.Diff(fullName, bucketTags, startTs, ts, count); ok {
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment applies here (this case would be easier to debug but for consistency I would keep it as is)

Comment thread exporter/datadogexporter/internal/translator/metrics_translator.go
Comment thread exporter/datadogexporter/internal/translator/metrics_translator.go
@mx-psi mx-psi marked this pull request as ready for review November 3, 2021 10:53
@mx-psi mx-psi requested review from a team and jpkrohling November 3, 2021 10:53
@mx-psi mx-psi requested a review from KSerrania November 3, 2021 15:55
Comment thread exporter/datadogexporter/internal/translator/metrics_translator.go Outdated
@mx-psi mx-psi requested a review from KSerrania November 4, 2021 13:01
Copy link
Copy Markdown
Contributor

@KSerrania KSerrania left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left questions about tests, otherwise LGTM

Comment thread exporter/datadogexporter/internal/translator/ttlcache_test.go
Comment thread exporter/datadogexporter/internal/translator/ttlcache_test.go
@mx-psi mx-psi requested a review from KSerrania November 5, 2021 09:00
@mx-psi mx-psi added the ready to merge Code review completed; ready to merge by maintainers label Nov 5, 2021
@bogdandrutu bogdandrutu merged commit eb3b3c4 into open-telemetry:main Nov 5, 2021
@mx-psi mx-psi deleted the mx-psi/improve-reset branch November 8, 2021 11:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ready to merge Code review completed; ready to merge by maintainers

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants