Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add observable (async) counter to the metrics API #1590
Add observable (async) counter to the metrics API #1590
Changes from 4 commits
21e4501
1264dcb
be6b362
3c2ffc6
05d9ce8
9f97e83
f72826f
a8f3b39
dca5ddc
70d4247
eb5cd99
b06a5d9
be2c265
97a01e8
09c4ec8
138281d
1198549
0afe381
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Who do we believe has the onus to enforce that the values are monotonic? Is it undefined behavior if the callback returns a value that is smaller than the one it returned on the last invocation?
Alternatively, if the only purpose of saying the values are monotonic is to infer that the user prefers that the data is presented as a rate, I've yet to understand a problem with redefining the counter as a "uses rate presentation by default" counter. I can still make a mathematically well defined rate function that is sum_of_measurements / time and there is no requirement to only accept positive inputs.
(I'm fine accepting the PR as-is and resolving this as a follow up)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would say something like the default SDK SHOULD enforce monotonicity, and that it is an error when a callback returns a value less than a previously returned value.
I take it you are considering whether we could drop the monotonicity idea. Why should we use separate instruments for monotonic and non-monotonic streams? There are legacy arguments that this information should be preserved, but I don't like to use those arguments. One good reason is simply to help the user. If you have a monotonic instrument, we can detect when you use it incorrectly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, but very open to the notion that I may not have the full picture. What I have read so far in the conversation suggests that users don't care about monotonicity, they care whether the data is shown as a rate or as an absolute value. If its true that rate vs. absolute value is all they care about then I think pros/cons of enforcing monotonicity anyways would be:
pros:
cons:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We've covered this topic during the SIG meeting that monotonic vs. non-monotonic is important for the user experience, and it is also widely adopted by many well established metrics systems/libraries.
So here goes my suggestion:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps I am being dense but that wasn't my understanding of the conversation. To me it felt like I put forth a question and an idea. Nobody responded saying they loved it nor did they respond saying its bad. Perhaps others thought it wasn't a good idea and were being polite, or wanted time to think it over, or I simply didn't understand the feedback : )
@jmacd, I know at the end of the SIG meeting you said you felt like you didn't have adequate opportunity to respond and mentioned some concern that the conversation was calling into question too much from the original spec. I wasn't clear whether you were refering to my comments or other comments/questions that were in the vicinity. Certainly my aim is not to push you (or anyone else) into a position you disagree with and I welcome your thoughts. So far I get the impression you aren't particularly excited about pivoting the definition of Counter to mean "displays as rate by default" with no enforcement of monotonicity. I don't know if anything I have said since was convincing or you still feel that being able to report negative measurements as errors is the most compelling concern at play so we should lay the issue to rest with that as the conclusion?
To me the issue appeared unresolved. I don't want to hold up the PR and I am fine accepting that Counter defined to be monotonic is the presumptive outcome at this point. I am hoping there will be a little more discussion in #1593, ideally considering the alternatives, but if not that then at least to clarify and document the rationale. Assuming that Counter is monotonic, having the SDK be the enforcement makes sense to me. Thanks all!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think It''d be good to call out use cases for where "ObservableCounter" is not good for tracking CPU time.
E.g. in a multi-tenant scenario where you want to track CPU usage / user, you'd actually want some kind of sycnhronous instrument that can report on request-level usage with an identified user.
Observable / Async metrics are really only useful for context-less metrics (those that cannot interact with other telemetry). I'd like to call that out here for folks deciding between Async/Sync as I think Otel really shines in the Sync scenario w/ Context.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed, but this is tricky because you don't want users that report cpu_usage (not per request) to use the sync version and believe that is better :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1. We may need to provide guidance here during implementations, but glad to see this called out.