explicitly add +inf bucket in withExemplarsMetric #1094
Signed-off-by: Arun Mahendra <[email protected]>
c3fe743
to
2d3b601
Compare
Thanks for this! Some comments, but generally I love this idea - thanks for troubleshooting this 💪🏽
prometheus/metric.go
Outdated
break
// end looping after creating +inf bucket and adding one exemplar.
// there could be other exemplars that are in the "inf" range but those will be ignored.
If we want those comments, let's make them full sentences, but I am also not sure this is safe. Nowhere in the interface/signature do we mention that exemplars will be sorted by anything. I think there is no harm in continuing the loop, unless we want to optimize this some day. WDYT?
Yes, you are right that there is no mention (or expectation) of sorted exemplars. If we leave the loop running, we pick the last exemplar in that range; if we leave it as is and terminate here, we pick the first one. Since the order of the exemplar values is arbitrary, either choice is arbitrary too. I would prefer terminating the loop, as it avoids iterating over the remaining exemplars that we are not going to use anyway. But please let me know if you think otherwise or see any safety concerns, and I can make the change.
Also, I made the comment into a sentence.
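To make the first-versus-last point concrete, here is a minimal, self-contained sketch. It does not use the real client_golang types; `pickOverflow`, its parameters, and the sample values are hypothetical and only model which exemplar value would land in the +Inf bucket when the loop breaks or keeps running.

```go
package main

import "fmt"

// pickOverflow is a hypothetical helper (not part of client_golang).
// It returns the exemplar value that would end up in the +Inf bucket for
// the given, possibly unsorted, exemplar values. maxBound is the largest
// explicit bucket bound. With stopAtFirst the loop breaks on the first
// overflow value; otherwise later overflow values overwrite earlier ones.
func pickOverflow(values []float64, maxBound float64, stopAtFirst bool) (float64, bool) {
	var picked float64
	found := false
	for _, v := range values {
		if v <= maxBound {
			continue // fits an explicit bucket, not relevant here
		}
		picked, found = v, true
		if stopAtFirst {
			break
		}
	}
	return picked, found
}

func main() {
	values := []float64{0.3, 42, 7, 19} // unsorted on purpose
	first, _ := pickOverflow(values, 5, true)
	last, _ := pickOverflow(values, 5, false)
	fmt.Println(first, last) // prints "42 19"
}
```

With unsorted input, neither result is the smallest or largest overflow value; which one is chosen depends only on the input order.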
This was not addressed completely. What do you think about the lack of a sorted-order invariant?
Nowhere in the interface/signature do we mention that exemplars will be sorted by anything. I think there is no harm in continuing the loop, unless we want to optimize this some day. WDYT?
For all the buckets, we compare the exemplar values to the bucket bounds and match each exemplar to the right bucket (this is the existing logic).
Specifically, this check finds the index of the right bucket for an exemplar:
client_golang/prometheus/metric.go
Line 182 in a528aff
return pb.Histogram.Bucket[i].GetUpperBound() >= e.GetValue()
And the exemplar is assigned to that bucket here:
client_golang/prometheus/metric.go
Line 185 in a528aff
pb.Histogram.Bucket[i].Exemplar = e
At a high level, the only change in this PR is that, instead of panicking in the else branch, we add the +Inf bucket, attach one exemplar that lies outside all the previous bucket ranges, and break out of the loop.
If there are multiple exemplars for the +Inf bucket, we could pick one that is more representative of the group, such as the median. That would be a future improvement and would require further discussion. Currently we just pick the first exemplar in the array that falls in the +Inf bucket range.
I hope that addresses your concern, if I understood your question correctly.
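The flow described above can be sketched in a simplified, self-contained form. This is not the code in metric.go: the `bucket` struct and `attachExemplars` function are hypothetical stand-ins for the protobuf types, and only the shape of the logic (binary search over the explicit buckets, then a +Inf fallback followed by a break) follows the description in this comment.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// bucket is a hypothetical stand-in for the protobuf bucket type.
type bucket struct {
	upperBound float64
	exemplar   *float64
}

// attachExemplars assigns each exemplar to the first bucket whose upper
// bound is >= the exemplar value. buckets must be sorted by upperBound.
// The first exemplar that fits no bucket triggers the creation of a +Inf
// bucket, after which the loop ends (mirroring the break in this PR).
func attachExemplars(buckets []bucket, exemplars []float64) []bucket {
	explicit := len(buckets)
	for idx := range exemplars {
		e := &exemplars[idx]
		i := sort.Search(explicit, func(i int) bool {
			return buckets[i].upperBound >= *e
		})
		if i < explicit {
			buckets[i].exemplar = e
			continue
		}
		// No explicit bucket is large enough: add +Inf instead of panicking.
		buckets = append(buckets, bucket{upperBound: math.Inf(1), exemplar: e})
		break
	}
	return buckets
}

func main() {
	buckets := []bucket{{upperBound: 1}, {upperBound: 5}, {upperBound: 10}}
	out := attachExemplars(buckets, []float64{0.5, 7, 25, 99})
	for _, b := range out {
		if b.exemplar != nil {
			fmt.Printf("le=%v exemplar=%v\n", b.upperBound, *b.exemplar)
		} else {
			fmt.Printf("le=%v exemplar=none\n", b.upperBound)
		}
	}
}
```

In this sketch the value 99 is never examined, because the loop stops once 25 has been placed in the +Inf bucket.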
@bwplotka waiting for your response :) Would be nice if we could include this in the upcoming release.
Missed this discussion, sorry.
So I like the idea of using the median, or otherwise tuning which exemplar we take from those belonging to +Inf. My only problem is that those inputs may not be sorted, that's it. Hope this PR #1100 makes sense to you.
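As an illustration of the median idea under the unsorted-input constraint, here is one hypothetical approach. It is not taken from this PR or from #1100; `medianOverflow` is an invented helper that collects every value above the largest explicit bound and sorts that copy, so the result does not depend on the caller's ordering.

```go
package main

import (
	"fmt"
	"sort"
)

// medianOverflow is a hypothetical helper (not part of client_golang).
// It returns the lower median of all values greater than maxBound, so the
// result is always one of the actual exemplar values. The input slice is
// not modified and may be in any order.
func medianOverflow(values []float64, maxBound float64) (float64, bool) {
	var overflow []float64
	for _, v := range values {
		if v > maxBound {
			overflow = append(overflow, v)
		}
	}
	if len(overflow) == 0 {
		return 0, false
	}
	sort.Float64s(overflow) // sorts the collected copy only
	return overflow[(len(overflow)-1)/2], true
}

func main() {
	a, _ := medianOverflow([]float64{0.3, 42, 7, 19}, 5)
	b, _ := medianOverflow([]float64{19, 7, 0.3, 42}, 5) // same values, different order
	fmt.Println(a, b) // prints "19 19"
}
```

The trade-off is that every exemplar has to be visited, so the early break discussed earlier in the thread no longer applies.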
Co-authored-by: Bartlomiej Plotka <[email protected]>
Co-authored-by: Bartlomiej Plotka <[email protected]>
Exemplar: e,
}
pb.Histogram.Bucket = append(pb.Histogram.Bucket, b)
break
Thanks for addressing this, but I think one thing is still not addressed: #1094 (comment)
ping @arun-shopify - otherwise this PR is rdy to merge (:
will fix that for you if you don't mind - I am preparing next release
merging and will fix in separate PR.
sure, also response in #1094 (comment)
Problem:
Currently, if there are exemplar values outside the maximum bucket bound of a histogram metric, client_golang will panic: https://github.com/prometheus/client_golang/blob/main/prometheus/metric.go#L186-L189
When this client is used in the OpenTelemetry Collector, the panic crashes the collector, so we are unable to use client_golang in the Prometheus exporter of the OpenTelemetry Collector. There is a discussion about this issue in our PR in that repository.
Proposed solution:
In this PR we propose to solve this issue by explicitly adding a +Inf bucket when there are exemplar values outside the max bucket bound, and picking one of the exemplars if there is more than one in that range.
@bwplotka @kakkoyun