Add observable (async) counter to the metrics API #1590

reyang · 2021-04-01T06:31:52Z

Related to #1578.

Changes

During the 03/25/2021 Metrics API/SDK SIG meeting, we agreed to tackle an async instrument after Counter is done.

After this PR, the only outstanding issues for the metrics API spec would be:

* Figure out the remaining instruments:
  * How many do we need?
  * What should the names be?
* Hint API (we discussed during the SIG meeting that we will move forward to the SDK spec and come back to this topic once the SDK spec is in shape).

This will be discussed during the upcoming (4/1) Metrics API/SDK SIG meeting.

Related OTEPs: OTEP146

jsuereth

A few non-blocking comments.

jsuereth · 2021-04-02T15:46:48Z

specification/metrics/new_api.md

@@ -319,6 +421,14 @@ for the interaction between the API and SDK.
 * A value
 * [`Attributes`](../common/common.md#attributes)

+## Compatibility
+
+All the metrics components SHOULD allow new APIs to be added to existing


+1. We may need to provide guidance here during implementations, but glad to see this called out.

jsuereth · 2021-04-02T15:50:29Z

specification/metrics/new_api.md

+
+Example uses for `ObservableCounter`:
+
+* [CPU time](https://wikipedia.org/wiki/CPU_time), which could be reported for


I think it'd be good to call out use cases where `ObservableCounter` is not a good fit for tracking CPU time.

E.g. in a multi-tenant scenario where you want to track CPU usage per user, you'd actually want some kind of synchronous instrument that can report request-level usage with an identified user.

Observable / Async metrics are really only useful for context-less metrics (those that cannot interact with other telemetry). I'd like to call that out here for folks deciding between Async/Sync, as I think OTel really shines in the Sync scenario with Context.
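To make the distinction concrete, here is a minimal Python sketch. It does not use the OpenTelemetry API: `SyncCounter`, `ObservableCounter`, `handle_request`, and the `user` attribute are hypothetical names invented for illustration. The synchronous instrument is called from request-handling code, where the user is known, while the observable callback runs at collection time and can only report a process-wide total.

```python
import time
from collections import defaultdict


class SyncCounter:
    """Hypothetical synchronous counter: the caller reports increments inline."""

    def __init__(self):
        self.totals = defaultdict(float)

    def add(self, value, attributes):
        # Attributes come from the calling context (here, the current user).
        key = tuple(sorted(attributes.items()))
        self.totals[key] += value


class ObservableCounter:
    """Hypothetical asynchronous counter: a callback runs at collection time."""

    def __init__(self, callback):
        self.callback = callback

    def collect(self):
        # No request context exists here; the callback returns an absolute value.
        return self.callback()


cpu_by_user = SyncCounter()


def handle_request(user, work):
    start = time.process_time()
    work()
    # Per-request CPU time, attributed to the user who made the request.
    cpu_by_user.add(time.process_time() - start, {"user": user})


# Process-wide CPU time is context-less, so an observable instrument fits.
process_cpu = ObservableCounter(time.process_time)

handle_request("alice", lambda: sum(range(100_000)))
handle_request("bob", lambda: sum(range(100_000)))
handle_request("alice", lambda: sum(range(100_000)))
```

After these calls, `cpu_by_user.totals` holds one entry per user, whereas `process_cpu.collect()` yields a single number with no way to split it by user.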

bogdandrutu · 2021-04-02T19:03:17Z

specification/metrics/new_api.md

+
+Example uses for `ObservableCounter`:
+
+* [CPU time](https://wikipedia.org/wiki/CPU_time), which could be reported for


Indeed, but this is tricky because you don't want users that report cpu_usage (not per request) to use the sync version and believe that is better :)

hdost

Are any of the open comments actually un-resolved at this point?

reyang · 2021-04-07T18:27:18Z

> Are any of the open comments actually un-resolved at this point?

So far all the outstanding/blocking comments are resolved.

There are some open discussions (e.g. it would be nice to have a high-level document describing why we want to distinguish monotonic vs. non-monotonic, and providing examples/suggestions on when to use sync vs. async instrumentation). These are non-blocking and can be addressed in separate PRs.

noahfalk · 2021-04-08T00:55:29Z

specification/metrics/new_api.md

+### CounterFunc
+
+`CounterFunc` is an asynchronous Instrument which reports
+[monotonically](https://wikipedia.org/wiki/Monotonic_function) increasing


Who do we believe has the onus to enforce that the values are monotonic? Is it undefined behavior if the callback returns a value that is smaller than the one it returned on the last invocation?

Alternatively, if the only purpose of saying the values are monotonic is to infer that the user prefers that the data is presented as a rate, I've yet to understand a problem with redefining the counter as a "uses rate presentation by default" counter. I can still make a mathematically well defined rate function that is sum_of_measurements / time and there is no requirement to only accept positive inputs.

(I'm fine accepting the PR as-is and resolving this as a follow up)
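The arithmetic behind the rate argument can be sketched in a few lines of Python. The `rate` helper and the sample numbers are made up for illustration; they are not part of the spec or of any SDK.

```python
def rate(measurements, elapsed_seconds):
    """Sum of the measurements divided by the elapsed time."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return sum(measurements) / elapsed_seconds


# Only non-negative increments, as a monotonic counter would require.
monotonic_rate = rate([5, 3, 2], elapsed_seconds=10)

# Mixed-sign increments: the rate is still well defined, and can be negative.
mixed_rate = rate([5, -8, 1], elapsed_seconds=10)
```

Nothing in the computation itself depends on the inputs being positive, which is the point of the question above: monotonicity would be a usage constraint imposed by the instrument, not something the rate calculation needs.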

jmacd · Apr 8, 2021

I would say something like: the default SDK SHOULD enforce monotonicity, and it is an error when a callback returns a value less than a previously returned value.

> I can still make a mathematically well defined rate function that is sum_of_measurements / time and there is no requirement to only accept positive inputs

I take it you are considering whether we could drop the monotonicity idea. Why should we use separate instruments for monotonic and non-monotonic streams? There are legacy arguments that this information should be preserved, but I don't like to use those arguments. One good reason is simply to help the user. If you have a monotonic instrument, we can detect when you use it incorrectly.
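One way an SDK could detect this kind of misuse, as a minimal Python sketch. `MonotonicObserver` and `MonotonicityError` are hypothetical names, not part of any OpenTelemetry API, and raising an exception is only one assumed way of reporting the error; the discussion here does not prescribe how the error is surfaced.

```python
class MonotonicityError(ValueError):
    """Raised when a callback reports a value lower than a previous one."""


class MonotonicObserver:
    """Hypothetical SDK-side wrapper around a CounterFunc-style callback."""

    def __init__(self, callback):
        self._callback = callback
        self._last = None  # last absolute value accepted from the callback

    def observe(self):
        value = self._callback()
        if self._last is not None and value < self._last:
            raise MonotonicityError(
                f"callback returned {value}, which is less than "
                f"the previous value {self._last}"
            )
        self._last = value
        return value


readings = iter([10, 15, 15, 12])
observer = MonotonicObserver(lambda: next(readings))

# 10, 15, 15 are accepted: equal values do not violate monotonicity.
accepted = [observer.observe() for _ in range(3)]

try:
    observer.observe()  # 12 < 15, so this observation is rejected
    rejected = False
except MonotonicityError:
    rejected = True
```

The check needs only the previously accepted value per instrument, which is why it is cheap in the asynchronous case, where the callback runs once per collection.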

noahfalk · Apr 8, 2021

> I take it you are considering whether we could drop the monotonicity idea

Yeah, but I'm very open to the notion that I may not have the full picture. What I have read so far in the conversation suggests that users don't care about monotonicity; they care whether the data is shown as a rate or as an absolute value. If it's true that rate vs. absolute value is all they care about, then I think the pros/cons of enforcing monotonicity anyway would be:

Pros:

* Some users may misuse the API, and we would correctly give them an error that helps them fix their code.

Cons:

* Some users might want to report a negative rate, and they will be frustrated if the SDK prevents that usage.
* If we add more instruments to have separate monotonic and non-monotonic rate options, this adds complexity for all users when choosing the appropriate instrument for their use case.
* There is a (tiny) perf cost to verify every measurement, probably only relevant in the synchronous instrument case.

reyang · Apr 8, 2021

We've covered this topic during the SIG meeting that monotonic vs. non-monotonic is important for the user experience, and it is also widely adopted by many well-established metrics systems/libraries.

So here goes my suggestion:

1. Keep the current approach: have dedicated monotonic instruments (e.g. Counter/CounterFunc).
2. Clarify in the SDK spec whether we want the SDK to enforce monotonicity or not (I think I agree with @jmacd that the default SDK SHOULD enforce monotonicity).
3. Use "Editorial change - call out that Counter is monotonic" (#1593) to track the further clarification. For example, we might want to explain these concepts and put some examples in the README file.

noahfalk · Apr 9, 2021

> We've covered this topic during the SIG meeting that monotonic vs. non-monotonic is important for the user experience

Perhaps I am being dense, but that wasn't my understanding of the conversation. To me it felt like I put forth a question and an idea. Nobody responded saying they loved it, nor did they respond saying it's bad. Perhaps others thought it wasn't a good idea and were being polite, or wanted time to think it over, or I simply didn't understand the feedback : )

@jmacd, I know at the end of the SIG meeting you said you felt like you didn't have adequate opportunity to respond and mentioned some concern that the conversation was calling into question too much from the original spec. I wasn't clear whether you were referring to my comments or other comments/questions that were in the vicinity. Certainly my aim is not to push you (or anyone else) into a position you disagree with, and I welcome your thoughts. So far I get the impression you aren't particularly excited about pivoting the definition of Counter to mean "displays as rate by default" with no enforcement of monotonicity. I don't know if anything I have said since was convincing, or whether you still feel that being able to report negative measurements as errors is the most compelling concern at play, so we should lay the issue to rest with that as the conclusion?

> So here goes my suggestion:

To me the issue appeared unresolved. I don't want to hold up the PR, and I am fine accepting that Counter defined to be monotonic is the presumptive outcome at this point. I am hoping there will be a little more discussion in #1593, ideally considering the alternatives, but if not that, then at least to clarify and document the rationale. Assuming that Counter is monotonic, having the SDK do the enforcement makes sense to me. Thanks all!

noahfalk

I added some suggestions, but I am also happy to accept the changes as-is and follow up on specific adjustments/questions in future targeted PRs.

jmacd · 2021-04-08T05:28:58Z

specification/metrics/new_api.md

+### CounterFunc
+
+`CounterFunc` is an asynchronous Instrument which reports
+[monotonically](https://wikipedia.org/wiki/Monotonic_function) increasing


I would say something like: the default SDK SHOULD enforce monotonicity, and it is an error when a callback returns a value less than a previously returned value.

> I can still make a mathematically well defined rate function that is sum_of_measurements / time and there is no requirement to only accept positive inputs

I take it you are considering whether we could drop the monotonicity idea. Why should we use separate instruments for monotonic and non-monotonic streams? There are legacy arguments that this information should be preserved, but I don't like to use those arguments. One good reason is simply to help the user. If you have a monotonic instrument, we can detect when you use it incorrectly.

cijothomas

LGTM

* add observable counter
* change the wording to allow language client to decide how to handle callback state
* update based on discussion during the SIG meeting
* add observer example
* clarify that it is the instrument not meter being observed
* clarify that duplicates are not allowed, and the sdk can decide how to handle it
* address review feedback
* fix typo
* specify that the callback ensures a single timestamp
* rename to CounterFunc
* address review comments
* clarify that the callback function should not take indefinite amount of time
* update the wording for callback based on PR comments
* add notes that CounterFunc callback returns absolute value instead of delta/rate
* update the example based on feedback

add observable counter

21e4501

reyang requested review from a team April 1, 2021 06:31

github-actions bot assigned yurishkuro Apr 1, 2021

reyang added the spec:metrics and area:api labels Apr 1, 2021

reyang commented Apr 1, 2021

reyang changed the title ~~Add observable counter~~ Add observable (async) counter to the metrics API Apr 1, 2021

noahfalk reviewed Apr 1, 2021

noahfalk reviewed Apr 1, 2021

change the wording to allow language client to decide how to handle callback state

1264dcb

reyang force-pushed the reyang/async-counter branch from 1fa59f9 to 1264dcb April 1, 2021 20:34

Merge branch 'main' into reyang/async-counter

be6b362

victlu reviewed Apr 2, 2021

victlu reviewed Apr 2, 2021

update based on discussion during the SIG meeting

3c2ffc6

reyang commented Apr 2, 2021

reyang commented Apr 2, 2021

reyang commented Apr 2, 2021

add observer example

05d9ce8

jsuereth approved these changes Apr 2, 2021

bogdandrutu reviewed Apr 2, 2021

jonatan-ivanov reviewed Apr 2, 2021

reyang mentioned this pull request Apr 2, 2021

Editorial change - call out that Counter is monotonic #1593

Closed

reyang added 4 commits April 2, 2021 14:38

clarify that it is the instrument not meter being observed

9f97e83

clarify that duplicates are not allowed, and the sdk can decide how to handle it

f72826f

address review feedback

a8f3b39

fix typo

dca5ddc

victlu reviewed Apr 5, 2021

reyang force-pushed the reyang/async-counter branch from f69e4c4 to b06a5d9 April 7, 2021 06:36

hdost approved these changes Apr 7, 2021

victlu reviewed Apr 7, 2021

reyang added 3 commits April 7, 2021 11:01

address review comments

be2c265

clarify that the callback function should not take indefinite amount of time

97a01e8

Merge branch 'main' into reyang/async-counter

09c4ec8

cijothomas reviewed Apr 7, 2021

victlu reviewed Apr 8, 2021

victlu reviewed Apr 8, 2021

victlu reviewed Apr 8, 2021

victlu reviewed Apr 8, 2021

victlu reviewed Apr 8, 2021

noahfalk reviewed Apr 8, 2021

noahfalk reviewed Apr 8, 2021

noahfalk approved these changes Apr 8, 2021

jmacd approved these changes Apr 8, 2021

update the wording for callback based on PR comments

138281d

reyang force-pushed the reyang/async-counter branch from b0f134b to 138281d April 8, 2021 20:31

reyang added 2 commits April 8, 2021 14:45

add notes that CounterFunc callback returns absolute value instead of delta/rate

1198549

update the example based on feedback

0afe381

cijothomas approved these changes Apr 8, 2021

SergeyKanzhelev merged commit 5e27e1a into open-telemetry:main Apr 9, 2021

reyang deleted the reyang/async-counter branch April 9, 2021 16:41

victlu mentioned this pull request Apr 9, 2021

REQUEST: New membership for victlu open-telemetry/community#710

Closed

reyang mentioned this pull request Apr 14, 2021

Skeleton for the remaining metrics instruments #1617

Merged

jonatan-ivanov mentioned this pull request Jun 15, 2023

REQUEST: New membership for jonatan-ivanov open-telemetry/community#1548

Closed


