Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add exemplar support for const histogram #986

Merged
merged 7 commits into from
Mar 17, 2022

Conversation

wperron
Copy link
Contributor

@wperron wperron commented Feb 10, 2022

Relates to #868. We're looking to add support for exemplars on OpenTelemetry's collector contrib's prometheusexporter which uses the constHistogram internally. This change solves the case where metrics coming from another system (in our case the otel collector) need to be converted to the Prometheus format and retain their exemplars. Right now we've only implemented exemplars on the const histogram, I'm more than happy to do all the necessary adjustments.

Copy link
Member

@bwplotka bwplotka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense to me, thanks! Just one nit.

@beorn7 @kakkoyun any objections?

@@ -655,6 +670,48 @@ func MustNewConstHistogram(
return m
}

func NewConstHistogramWithExemplar(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we use NewConstHistogramWithExemplar in NewConstHistogram for less code repetition?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to clarify, did you mean to use NewConstHistogram inside NewConstHistogramWithExemplar? My goal was to keep the existing API unchanged but whatever you folks prefer is fine by me.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no, opposite. I think you misunderstood, let me show in PR to your PR 🙃

@beorn7
Copy link
Member

beorn7 commented Feb 15, 2022

Counters can also have exemplars. Instead of also adding NewConstMetricWithExemplar, how about using the NewMetricWithExemplar approach described in this comment? This wrapper would detect if the metrics to be wrapped is a histogram or a counter and then act accordingly (and panic for other metric types).

@beorn7
Copy link
Member

beorn7 commented Feb 15, 2022

It would also avoid having to add the Must... version separately. Instead of NewConstHistogramWithExemplar, MustNewConstHistogramWithExemplar, NewConstMetricWithExemplar, and MustNewConstMetricWithExemplar, we would just need the one NewMetricWithExemplar wrapper.

prometheus/histogram.go Outdated Show resolved Hide resolved
@wperron
Copy link
Contributor Author

wperron commented Feb 15, 2022

Counters can also have exemplars. Instead of also adding NewConstMetricWithExemplar, how about using the NewMetricWithExemplar approach described in this comment? This wrapper would detect if the metrics to be wrapped is a histogram or a counter and then act accordingly (and panic for other metric types).

It would also avoid having to add the Must... version separately. Instead of NewConstHistogramWithExemplar, MustNewConstHistogramWithExemplar, NewConstMetricWithExemplar, and MustNewConstMetricWithExemplar, we would just need the one NewMetricWithExemplar wrapper.

So I started looking into this last Friday and it's less straight forward than that comment makes it seem. NewMetricWithTimestamp is pretty straight forward because every Metric can have a timestamp associated with it, whereas not every Metric can have exemplars. Also, exemplars on Histograms are pretty different than on simple values. I agree it'd be the best outcome, I just thought I'd at least open the PR to start discussion and go from there. I'll get cracking on this and hopefully have something up soon that addresses those comments.

@beorn7
Copy link
Member

beorn7 commented Feb 15, 2022

Indeed, a NewMetricWithExemplar would need to do something about metric types that cannot have exemplars (either panicking or silently not adding an exemplar or returning an error – I can see points for and against each of the three).

And yes, the Write implementation of the wrapper needed to take into account that histograms get their exemplar added differently than counters. But that should be easy (in my imagination – perhaps I'm missing something :-).

@wperron
Copy link
Contributor Author

wperron commented Feb 16, 2022

Hey folks, so I've been looking into a potential func WithExemplar() (Metric, error) function that could be a general case, sorta like WithTimestamp does but the more I look at it, the less I'm convinced that there's a good way to implement it.

From what I understand there’s 4 Metrics that can have exemplars: ExemplarAdder, ExemplarObserver, constMetric and constHistogram. For the first two, the layer of abstraction doesn’t add a lot, it ends up just doing what ppl are already doing right now, asserting that the metric implements the interface, then just call AddWithExemplar or ObserveWithExemplar. For the last two, the data they need is so different I’m struggling to find a way to make a function signature that would work with both of them that would also be clear enough. For reference, here are what the current signatures look like:

NewConstHistogram(desc *Desc, count uint64, sum float64, buckets map[float64]uint64, labelValues ...string)
NewConstMetric(desc *Desc, valueType ValueType, value float64, labelValues ...string) (Metric, error)

To add exemplars, the first one needs a []*dto.Exemplar because each bucket has it’s own exemplar, and the second one needs *dto.Exemplar because, well, there's only one, and I can't use ...*dto.Exemplar because the existing functions already use a variadic parameter for the LabelValues.

I can add a WithExemplar variant to NewConstMetric and MustNewConstMetric to make this PR more complete, or just leave it as is too. Any thoughts?

@beorn7
Copy link
Member

beorn7 commented Feb 16, 2022

Hmm, I increasingly believe there is no real good way of solving this in the current context.

I really don't like to expose dto.Exemplar to the user of the library directly. In ObserveWithExemplar and AddWithExemplar, we use (value float64, exemplar Labels) as the signature, not (exemplar *dto.Exemplar). If we let users construct their own dto.Exemplar, we are not that far away from letting them create their own dto.Metric (which is not meant sarcastically, there are cases where that's just the right thing to do).

I think, from a pure interface perspective, the least bad solution would be a func NewMetricWithExemplar(value float64, exemplar Labels, metric Metric) Metric. It would work like func NewMetricWithTimestamp(t time.Time, m Metric) Metric, i.e. we would have a metricWithExemplar wrapper type corresponding to timestampedMetric, and we would put the exemplar into the protobuf at Write time. NewMetricWithExemplar would work on any Metric implementation. You could use it repeatedly to add multiple exemplars. At Write time, the following happens:

  • In case the written protobuf is a counter and the value is not negative, the exemplar is added to it, overwriting any existing exemplar, i.e. the outermost exemplar wins (in case NewMetricWithExemplar was used multiple times or NewMetricWithExemplar was used on an ExemplarAdder implementation that already had an examplar).
  • In case the written protobuf is a histogram, the suitable bucket for the value is found and the exemplar is added to it, again overwriting existing exemplars.
  • In all other cases, the exemplar is silently ignored.

The last point is the most "dirty" aspect of this, but maybe it's OK (think of the function as "add an exemplar to this metric if it makes sense"). It's slightly annoying that you have to call NewMetricWithExemplar multiple times to add many exemplars (most commonly to add an exemplar for each bucket of a multi-bucket histogram). But you can do that in a loop, so it might not be that bad. However, that leads to the serious concern I have, that the implementation cannot be very efficient. If you have a histogram with 20 buckets, you'll end up with a 20-fold wrapped histogram. Maybe that's not that bad for most use cases (and for the few that need optimal performance, those can fall back to build their own dto.Metric).

So maybe, overall, the approach in this PR is already the least bad. I don't know. @bwplotka and @kakkoyun you should make the call here. 😸

@wperron
Copy link
Contributor Author

wperron commented Feb 25, 2022

Hey folks, any updates on this PR?

@beorn7
Copy link
Member

beorn7 commented Feb 25, 2022

@bwplotka and @kakkoyun, I think all the pros and cons of the various approaches have been laid out above. We need you to make a call what direction to take.

@bwplotka
Copy link
Member

Ack, we will sync on it with @kakkoyun on Monday. Thanks!

arun-shopify and others added 4 commits March 7, 2022 08:38
Co-authored-by: William Perron <[email protected]>
Signed-off-by: William Perron <[email protected]>
Signed-off-by: William Perron <[email protected]>
reduce repetition in constHistogram w/ exemplar

Signed-off-by: William Perron <[email protected]>
Signed-off-by: William Perron <[email protected]>
@wperron wperron force-pushed the feat/exemplar_const_metrics branch from 65e8cde to 47d3c54 Compare March 7, 2022 21:09
wperron and others added 2 commits March 11, 2022 16:07
Co-authored-by: Francis Bogsanyi <[email protected]>

Signed-off-by: William Perron <[email protected]>
Co-authored-by: Arun Mahendra <[email protected]>

Signed-off-by: William Perron <[email protected]>
@wperron
Copy link
Contributor Author

wperron commented Mar 11, 2022

Hey @bwplotka I hate to be that guy but, any chance we can get a set of eyes on this soon?

@bwplotka
Copy link
Member

Ack, will look on Tuesday, adding to my todo list.

@bwplotka
Copy link
Member

Looking now, sorry for lag

@kakkoyun kakkoyun added the wip label Mar 16, 2022
@bwplotka
Copy link
Member

bwplotka commented Mar 16, 2022

@wperron I proposed a PR to your PR. Tests are failing as I broke histogram upper bound check, fizzing now: Shopify#3 but shows the alternative usage (:

@bwplotka
Copy link
Member

All fixed PTAL (:

Copy link
Member

@bwplotka bwplotka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Blocking this to decide what's the best approach before merging 🙃

Changes:
* Make sure to not "leak" dto.Metric
* Reused upper bounds we already have for histogram
* Common code for all types.

Signed-off-by: Bartlomiej Plotka <[email protected]>
@wperron
Copy link
Contributor Author

wperron commented Mar 16, 2022

Merged in your suggested changes, let me know if there's anything you want me to change 👍

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants