Support recording metrics with "churning" labels without "leaking" memory. #3605
Comments
I don't believe we need an explicit operation to forget synchronous instruments. The Lightstep SDK does this, for example, as did (roughly speaking) the early prototype written in this repo. It only works for delta temporality; I think we could extend the specification to support timestamp resets, enabling more aggressive removal of stale cumulative-temporality data.
That makes sense to me from a "keep my memory usage in check" perspective. From a user perspective, I may still want a way to explicitly say that something no longer exists (e.g. so that my graphs visually "end" in the right place), rather than having it implicitly linger until marked "stale". Edit: The fact that someone might want this doesn't mean we have to give it to them. If we implement this proposal, we could just direct them towards using asynchronous instruments, as Prometheus does.
Are there any workarounds for this at the moment for async observers?
Discussed this at the SIG meeting today: we will move forward with the proposal above. I will work on implementing it.
Problem Statement
I often want to record metrics about something which has unbounded (or bounded, but unreasonably high) cardinality across time, but bounded cardinality within a time period (e.g. 24 hours). For example, on a typical VM there is a very large number of PIDs available, but only a relatively small number of PIDs in use at any point in time.
I would like to be able to attach such information (e.g. a PID) to a metric as an attribute. An implementation which does not allow "forgetting" an attribute value (or set of values) will use O(cardinality across time) memory. I would like to keep the memory required by the SDK to O(current cardinality).
Non-Goal: Forgetting labels with Synchronous instruments
Delete() (or an equivalent) for Synchronous instruments is out-of-scope, as it would require a Spec change, and should be discussed in the Spec repo first.
Proposed Solution
An attribute set which is not observed during an asynchronous callback is immediately removed from any caches in the OTel SDK and is not sent to exporters.
As a user, I can define an asynchronous callback with a variable label but constant cardinality in each callback. Since memory associated with "forgotten" label sets is dropped after the next callback, the following won't produce a memory leak:
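The original inline code did not survive extraction; here is a stdlib-only sketch of the idea (the `observer` type, `runCallback`, and string-keyed attribute sets are hypothetical stand-ins for the SDK's asynchronous-instrument machinery, not real API). The cache is replaced wholesale on each callback, so only currently observed attribute sets occupy memory:

```go
package main

import "fmt"

// observer is a hypothetical stand-in for the SDK's asynchronous
// instrument state; cache maps an attribute set to its last value.
type observer struct {
	cache map[string]int64
}

// runCallback invokes the user callback, then replaces the cache with
// exactly the attribute sets observed this round. Sets that were not
// observed are forgotten, so memory stays O(current cardinality).
func (o *observer) runCallback(cb func(observe func(attrs string, v int64))) {
	fresh := map[string]int64{}
	cb(func(attrs string, v int64) { fresh[attrs] = v })
	o.cache = fresh
}

// simulate runs two callbacks with churning PIDs and reports the final
// cache size and whether the exited pid=100 is still cached.
func simulate() (size int, has100 bool) {
	o := &observer{}
	// First callback: PIDs 100 and 101 exist.
	o.runCallback(func(observe func(string, int64)) {
		observe("pid=100", 5)
		observe("pid=101", 7)
	})
	// Second callback: PID 100 exited, PID 102 appeared.
	o.runCallback(func(observe func(string, int64)) {
		observe("pid=101", 8)
		observe("pid=102", 1)
	})
	_, has100 = o.cache["pid=100"]
	return len(o.cache), has100
}

func main() {
	size, has100 := simulate()
	fmt.Println(size, has100) // 2 false: pid=100 was forgotten
}
```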
Subsequent callbacks (that would be passed to an exporter) produce:
Instead of (current state):
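The inline output examples were also lost; a hypothetical illustration (invented metric name, PIDs, and values) of the two behaviors after PID 100 exits:

```
Proposed: the callback observes only pid=101 and pid=102
  pid.memory{pid="101"} 8
  pid.memory{pid="102"} 1

Current: pid=100 remains cached and continues to be exported
  pid.memory{pid="100"} 5
  pid.memory{pid="101"} 8
  pid.memory{pid="102"} 1
```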
The above is an improvement for users (IMO), but raises a question: for async counters exported with cumulative temporality, how is the start time set if an attribute set reappears?
It would be incorrect to use the process start time, since the cumulative amount reported is assumed not to include the cumulative amounts from previous instantiations of that attribute set, but that start-time choice would indicate that the cumulative amount includes everything since the process started. Instead, for all "new" attribute sets added as part of a callback, the start time is set to the previous time that callback was invoked. "New" includes attribute sets which have disappeared and reappeared, as well as (for example) the first observation ever recorded. If this is the first time the callback is invoked, the process start time is used as the start time. This ensures that the start time is both before that attribute set began to exist and after any previous instantiation of that attribute set ceased to exist. I believe this start-time logic can and should be extended to synchronous instruments as well.
To the best of my knowledge, the behavior described here is left unspecified in the specification.
Alternatives
Keep the current behavior, and add a separate way of deleting a series.
Prior Art
In Prometheus, a user would solve this problem using a "custom collector" to support dynamic label sets. Such a collector implements
func Collect(chan<- Metric)
, where Metric is pre-aggregated metric data. This is similar to an asynchronous callback because it is invoked asynchronously at scrape time, and it allows the user to record, for example, the cumulative value of a counter. Here is an example from Kubernetes for recording metrics about persistent volumes attached to a node: https://github.com/kubernetes/kubernetes/blob/master/pkg/controller/volume/persistentvolume/metrics/metrics.go. The Prometheus client only exposes the metrics sent during Collect on its endpoint. It is the scraper's responsibility to notice that a metric is missing and mark it stale.
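The pattern can be modeled with the standard library alone (the real interface is prometheus.Collector in client_golang; the `Metric` struct and `scrape` helper here are hypothetical stand-ins). Only what the collector sends during Collect appears in the scrape, so series for exited PIDs simply stop being exported:

```go
package main

import "fmt"

// Metric is a hypothetical stand-in for pre-aggregated metric data.
type Metric struct {
	Labels string
	Value  float64
}

// pidCollector reports one counter value per currently live PID.
type pidCollector struct {
	current func() map[int]float64 // live PIDs -> counter value
}

// Collect sends exactly the currently live series, mirroring the
// Prometheus custom-collector pattern.
func (c pidCollector) Collect(ch chan<- Metric) {
	for pid, v := range c.current() {
		ch <- Metric{Labels: fmt.Sprintf("pid=%d", pid), Value: v}
	}
}

// scrape drains one Collect invocation, like a scrape handler would.
func scrape(c pidCollector) []Metric {
	ch := make(chan Metric)
	go func() { c.Collect(ch); close(ch) }()
	var out []Metric
	for m := range ch {
		out = append(out, m)
	}
	return out
}

func main() {
	live := map[int]float64{100: 5}
	c := pidCollector{current: func() map[int]float64 { return live }}
	fmt.Println(len(scrape(c))) // 1 series while only pid 100 is live
	live = map[int]float64{101: 2, 102: 3} // pid 100 exited
	fmt.Println(len(scrape(c))) // 2 series; pid=100 is no longer exported
}
```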
Additional Context
This initially came up in the context of handling filtered attributes from async counters: #3549 (review)
I've made a similar comment in the spec issue: open-telemetry/opentelemetry-specification#1891 (comment).