Exemplar clarifications #2421

jack-berg · 2022-03-17T15:58:36Z

I did a pass on the Java exemplar implementation and found some places in the spec that need additional clarification, IMO:

More clearly define built-in exemplar filters
The spec says to "See Defaults and Configuration for built-in filters", which eventually leads to a list of "none", "all", and "with_sampled_trace". These should be defined in the metric SDK document for improved clarity.
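
As a rough illustration of what defining the three built-in filters might look like, here is a minimal Python sketch. The `Measurement` type, the `trace_sampled` flag, and the function names are hypothetical and not taken from any SDK; the semantics are inferred from the filter names only.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class Measurement:
    """Hypothetical measurement; trace_sampled marks a sampled active span."""
    value: float
    trace_sampled: bool = False


# An exemplar filter decides whether a measurement may be offered to a reservoir.
ExemplarFilter = Callable[[Measurement], bool]


def filter_none(measurement: Measurement) -> bool:
    # "none": no measurement is eligible to become an exemplar.
    return False


def filter_all(measurement: Measurement) -> bool:
    # "all": every measurement is eligible.
    return True


def filter_with_sampled_trace(measurement: Measurement) -> bool:
    # "with_sampled_trace": eligible only when recorded inside a sampled trace.
    return measurement.trace_sampled


# Lookup by the configuration names quoted above.
BUILT_IN_FILTERS: Dict[str, ExemplarFilter] = {
    "none": filter_none,
    "all": filter_all,
    "with_sampled_trace": filter_with_sampled_trace,
}
```

A filter only gates eligibility; which eligible measurements are actually kept is still up to the reservoir.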

Define how an ExemplarFilter gets configured in the SDK
The Java SDK currently allows ExemplarFilter to be configured on a meter provider instance, which is presumably the right place, but that isn't explicitly stated anywhere.

Clarify whether views can configure exemplar reservoirs
There's a TODO that states "after we release the initial Stable version of Metrics SDK specification, we will explore how to allow configuring custom ExemplarReservoirs with the View API.", yet the view section specifies that exemplar reservoirs can optionally be specified.

Is the intention that views can be configured with variations of the built-in reservoirs, but custom reservoirs are disallowed for the time being?

Define the default size of SimpleFixedSizeExemplarReservoir
If exemplars are enabled, SimpleFixedSizeExemplarReservoir is used by all aggregations except explicit bucket histogram, yet the default size of the reservoir is left unspecified.

cijothomas · 2023-03-07T01:41:53Z

@jack-berg #2919 has addressed the 1st part, that can be ~~cut~~ now.

MrAlias · 2023-03-08T21:00:14Z

Regarding points 2 and 3, this statement from the specification needs to be clarified:

A Metric SDK MUST provide a mechanism to sample Exemplars from measurements via the ExemplarFilter and ExemplarReservoir hooks.

MrAlias · 2023-03-08T21:01:26Z

The Java SDK currently allows ExemplarFilter to be configured on a meter provider instance, which is presumably the right place, but that isn't explicitly stated anywhere.

That sounds appropriate to me. In addition to this, I could also see views setting an override.

MrAlias · 2023-03-08T21:08:29Z

yet the view section specifies that exemplar reservoirs can optionally be specified.

That is indeed confusing given there are only two options and configuring them differently would lead to incorrect behavior.

Is the intention that views can be configured with variations of the built-in reservoirs, but custom reservoirs are disallowed for the time being?

If we can verify the reservoir definition is mostly stable and extensible in a backwards compatible manner I think allowing users to provide their own would be a big benefit. This space is open to a lot of complex and tailored algorithms that I expect users want to include.

MrAlias · 2023-04-23T16:29:16Z

The scope of the reservoir sampling needs to also be clarified.

It is not explicitly stated, but it seems safe to assume an exemplar reservoir is scoped to a single instrument and not the entire SDK. Right?

Within the scope of an instrument, is exemplar reservoir sampling scoped by attributes? For example, if an instrument records one measurement with attributes {"user" -> "Alice"} and another with attributes {"user" -> "Bob", "admin" -> true}, are both measurements sampled by the same reservoir, or is each sampled by its own reservoir?

Based on the Java implementation, I think the reservoir scope is all attribute sets across an instrument. However, the OTLP data model makes me wonder if that is correct. Messages like NumberDataPoint contain a set of exemplars that all apply to the same attribute set that scopes the NumberDataPoint.

jack-berg · 2023-04-23T21:36:30Z

Based on the Java implementation, I think the reservoir scope is all attribute sets across an instrument.

It's a bit hard to see in the source code, but the Java SDK does actually create a reservoir for each unique set of attributes.
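
To make the per-attribute-set scoping concrete, here is a minimal Python sketch. It is not the Java SDK's code: `FixedSizeReservoir` and `InstrumentExemplars` are hypothetical names, and uniform reservoir sampling is assumed as the selection strategy.

```python
import random
from typing import Dict, FrozenSet, List, Tuple

AttributeSet = FrozenSet[Tuple[str, object]]


class FixedSizeReservoir:
    """Keeps at most `size` values using uniform reservoir sampling."""

    def __init__(self, size: int, rng: random.Random) -> None:
        self._size = size
        self._rng = rng
        self._seen = 0
        self._samples: List[float] = []

    def offer(self, value: float) -> None:
        self._seen += 1
        if len(self._samples) < self._size:
            # Initial fill: keep everything until the reservoir is full.
            self._samples.append(value)
            return
        # Afterwards, keep the new value with probability size / seen.
        slot = self._rng.randrange(self._seen)
        if slot < self._size:
            self._samples[slot] = value

    def collect(self) -> List[float]:
        # Return the kept samples and reset for the next collection cycle.
        samples, self._samples, self._seen = self._samples, [], 0
        return samples


class InstrumentExemplars:
    """Hypothetical per-instrument state: one reservoir per attribute set."""

    def __init__(self, size: int, seed: int = 0) -> None:
        self._size = size
        self._rng = random.Random(seed)
        self._reservoirs: Dict[AttributeSet, FixedSizeReservoir] = {}

    def record(self, value: float, attributes: Dict[str, object]) -> None:
        key = frozenset(attributes.items())
        reservoir = self._reservoirs.get(key)
        if reservoir is None:
            # First measurement for this attribute set: create its reservoir.
            reservoir = FixedSizeReservoir(self._size, self._rng)
            self._reservoirs[key] = reservoir
        reservoir.offer(value)

    def collect(self) -> Dict[AttributeSet, List[float]]:
        return {key: r.collect() for key, r in self._reservoirs.items()}
```

With a size of 2 and two attribute sets that each receive more than two measurements, `collect()` returns two exemplars per attribute set, four in total.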

MrAlias · 2023-04-24T15:22:39Z

It's a bit hard to see in the source code, but the Java SDK does actually create a reservoir for each unique set of attributes.

Ah, gotcha.

So if you have a fixed size exemplar reservoir that samples N exemplars, each unique set of attributes will have N exemplars (assuming more than N measurements are made for each)?

Is this the behavior we want? I like this strategy because it is easier to implement, but I worry it is going to generate a lot of exemplar data, right?

jack-berg · 2023-04-24T15:39:05Z

So if you have a fixed size exemplar reservoir that samples N exemplars, each unique set of attributes will have N exemplars (assuming more than N measurements are made for each)?

Correct.

Is this the behavior we want? I like this strategy because it is easier to implement, but I worry it is going to generate a lot of exemplar data, right?

Depends on the size of N. In java, we choose N to be equal to the number of available processors. This decision isn't specified anywhere, but it was included (I believe by @jsuereth) in the initial exemplar implementation and has stuck.

This is typically smaller than the number of exemplars for histograms, which is one per bucket.

Not sure what exactly I would expect the default to be. I suppose I would expect each unique set of attributes to typically receive at least on the order of hundreds of measurements per collection period. (Given that the default interval for PMR is 30s, 100 measurements would be >= 3.3 / second.) I think something in the range of 1-10 example measurements seems appropriate.

MrAlias · 2023-04-24T15:56:26Z

I think something in the range of 1-10 example measurements seems appropriate.

Isn't this then multiplied by the number of unique attribute sets though? So if a user is using the random fixed-size reservoir with a size of 10, measures across N unique attribute sets, and makes more than 10 measurements per unique attribute set per collection cycle, they will have 10*N exemplars.

I could see users who ask for a random fixed-size reservoir with a size of 10 expecting to get at most 10 exemplars per collection cycle.

jack-berg · 2023-04-24T15:59:49Z

Isn't this then multiplied by the number of unique attribute sets though?

Yes. 1 seems more appropriate if the number of measurements is around 100-1000. 10 seems more appropriate if the number of measurements is much larger, say 10_000+.

Might be good to specify that the default size of the fixed size reservoir is 1.

MrAlias · 2023-04-24T16:16:13Z

Yes. 1 seems more appropriate if the number of measurements is around 100-1000. 10 seems more appropriate if the number of measurements is much larger, say 10_000+.

Might be good to specify that the default size of the fixed size reservoir is 1.

Yeah, this kind of cardinality issue was what led me to ask the original scope question. If we instead define the sampling scope to be the instrument (across all unique attributes), a user will have a better ability to set the output size they want.

MrAlias · 2023-04-24T16:19:28Z

If we want to stick with the sampling scope being "per unique attribute", my next question was if we want to keep passing the attributes to the Offer method? It seems like these will always be the same value and it won't be needed if the exemplar filter can continue to handle these.

This reservoir type is used for all aggregations other than a histogram with more than one bucket. Each attribute set the aggregation records will have a reservoir. Therefore, limiting this to a small value by default when enabled is preferable.

This does not address the way a user will configure this value. That is left for a future PR/Issue.

Part of open-telemetry#2421

jsuereth · 2023-11-03T17:42:29Z

Just commenting on:

If we want to stick with the sampling scope being "per unique attribute", my next question was if we want to keep passing the attributes to the Offer method? It seems like these will always be the same value and it won't be needed if the exemplar filter can continue to handle these.

For Java at least (and I suspect it may be true in Go), passing the full set of attributes can lead to more optimal overall throughput.

Specifically:

- You're not guaranteed to keep any particular exemplar. So if you delay diff-ing, you should diff O(m) attribute sets, where m = the size of the reservoir, vs O(n), where n = the number of measurements seen.
- You're likely using a "shared reference" to the attribute set, so you're only paying the cost of a pointer.
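
A minimal Python sketch of the delayed-diff idea described above, under stated assumptions: `LazyDiffReservoir`, `point_attributes`, and the `diffs_computed` counter are hypothetical names for illustration, not the Java or Go API, and uniform reservoir sampling is assumed. `offer` stores only a reference to the full attribute set, and the diff against the data point's attributes is computed at collection time for the m exemplars that survived.

```python
import random
from typing import Dict, List, Mapping, Tuple

Attributes = Mapping[str, object]


class LazyDiffReservoir:
    """Hypothetical reservoir that defers attribute diffing to collect()."""

    def __init__(self, size: int, point_attributes: Attributes, seed: int = 0) -> None:
        self._size = size
        self._point_attributes = point_attributes
        self._rng = random.Random(seed)
        self._seen = 0
        self._kept: List[Tuple[float, Attributes]] = []
        self.diffs_computed = 0  # instrumentation for this sketch only

    def offer(self, value: float, attributes: Attributes) -> None:
        self._seen += 1
        # Store a reference to the caller's attribute set: no copy, no diff.
        entry = (value, attributes)
        if len(self._kept) < self._size:
            self._kept.append(entry)
            return
        slot = self._rng.randrange(self._seen)
        if slot < self._size:
            self._kept[slot] = entry

    def collect(self) -> List[Tuple[float, Dict[str, object]]]:
        exemplars = []
        for value, attributes in self._kept:
            # Diff only the surviving exemplars: O(m) work instead of O(n).
            self.diffs_computed += 1
            filtered = {
                key: val
                for key, val in attributes.items()
                if key not in self._point_attributes
            }
            exemplars.append((value, filtered))
        self._kept, self._seen = [], 0
        return exemplars
```

Offering 1000 measurements to a reservoir of size 3 results in three diffs at collection time rather than 1000 at offer time.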

Partially fixes: open-telemetry#2421

…n-telemetry#3760)

Fixes open-telemetry#2205
Fixes open-telemetry#3674
Fixes open-telemetry#3669
Partially fixes open-telemetry#2421

## Changes

- Update example exemplar algorithm to account for initial reservoir fill
- Update fixed-size defaults to account for memory contention / optimization in Java impl
- Set a default for exponential histogram aggregation
- Clarify that ExemplarFilter should be configured on MeterProvider
- Make it clear that ONE reservoir is created PER timeseries datapoint (not one reservoir per view or metric name).
- Allow flexibility in Reservoir `offer` definition based on feedback from Go impl.

* Related issues open-telemetry#3756

---------

Co-authored-by: David Ashpole <[email protected]>
Co-authored-by: Joshua MacDonald <[email protected]>

jack-berg added the spec:metrics label Mar 17, 2022

github-actions bot assigned jmacd Mar 17, 2022

jack-berg mentioned this issue Mar 17, 2022

Limit exemplar functionality until stable open-telemetry/opentelemetry-java#4272

Closed

reyang added the area:sdk and release:allowed-for-ga labels Mar 18, 2022

MrAlias mentioned this issue Jun 2, 2023

Mark Metrics SDK env vars as Stable #3530

Closed

MrAlias mentioned this issue Jul 27, 2023

Revise the exemplar default reservoirs #3627

Merged

MrAlias mentioned this issue Aug 21, 2023

Recommend a default size of 1 for the SimpleFixedSizeExemplarReservoir #3670

Merged

jsuereth mentioned this issue Nov 3, 2023

Tracking: Exemplar Specification - Stabilization #3756

Closed

8 tasks

jsuereth added a commit to jsuereth/opentelemetry-specification that referenced this issue Nov 10, 2023

Clarify that ExemplarFilter should be configured on MeterProvider.

431a7e3

Partially fixes: open-telemetry#2421

jsuereth mentioned this issue Nov 10, 2023

Clarifications and "flexibility" fixes in Exemplar Specification #3760

Merged

jmacd closed this as completed in #3760 Dec 1, 2023

jmacd closed this as completed in bb3d0a0 Dec 1, 2023

MrAlias mentioned this issue Dec 15, 2023

Add exemplars to the metric SDK as an experimental feature open-telemetry/opentelemetry-go#4455

Closed
