
Support recording metrics with "churning" labels without "leaking" memory. #3605

Closed
dashpole opened this issue Jan 19, 2023 · 5 comments · Fixed by #4290
Assignees
Labels
area:metrics Part of OpenTelemetry Metrics enhancement New feature or request pkg:SDK Related to an SDK package

Comments

@dashpole
Contributor

Problem Statement

I often want to record metrics about something which has unbounded (or bounded, but unreasonably high) cardinality across time, but bounded cardinality within a time period (e.g. 24 hours). For example, on a typical VM there is a very large number of possible PIDs, but only a relatively small number of PIDs in use at any point in time.

I would like to be able to attach such information (e.g. a PID) to a metric as an attribute. An implementation which does not allow "forgetting" an attribute value (or set of values) will use O(cardinality across time) memory. I would like to keep the amount of memory required by the SDK to O(current cardinality).
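For concreteness, the usage I have in mind looks roughly like the sketch below (same API shape as the snippet further down; currentPIDs and rssBytes are hypothetical helpers). Each callback only observes the PIDs that exist right now, so the attribute values are bounded at any instant even though they are unbounded over the life of the process:

memGauge, _ := meter.Int64ObservableGauge("process.memory.rss")
meter.RegisterCallback(func(_ context.Context, o api.Observer) error {
	// Only currently-running PIDs are observed; PIDs that have exited
	// are simply not reported in this callback.
	for _, pid := range currentPIDs() {
		o.ObserveInt64(memGauge, rssBytes(pid), attribute.Int("pid", pid))
	}
	return nil
}, memGauge)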

Non-Goal: Forgetting labels with Synchronous instruments

Delete() (or an equivalent) for Synchronous instruments is out-of-scope, as it would require a Spec change, and should be discussed in the Spec repo first.

Proposed Solution

An attribute set which is not observed during an asynchronous callback is immediately removed from any caches in the OTel SDK and is not sent to exporters.

As a user, I can define an asynchronous callback with a variable label but constant cardinality on each callback. Since the memory associated with "forgotten" label sets is dropped after the next callback, the following won't produce a memory leak:

i := 0
gauge, _ := meter.Float64ObservableGauge("bar")
meter.RegisterCallback(func(_ context.Context, o api.Observer) error {
	// The attribute set changes each callback. Used for illustration only; no one should do this.
	o.ObserveFloat64(gauge, 1.0, attribute.Key("foo").Int(2*i))
	i++
	return nil
}, gauge)

Subsequent callbacks produce the following (as it would be passed to an exporter):

Callback @ t0: attrs: {"foo": 0}
---
Callback @ t1: attrs: {"foo": 2}
---
Callback @ t2: attrs: {"foo": 4}

Instead of (current state):

Callback @ t0: attrs: {"foo": 0}
---
Callback @ t1: attrs: {"foo": 0}, {"foo": 2}
---
Callback @ t2: attrs: {"foo": 0}, {"foo": 2}, {"foo": 4}

The above is an improvement for users (IMO), but raises a question: For async counters which are exported as cumulative temporality, how is the start time set if the attribute set reappears?

It would be incorrect to use the process start time, since the cumulative amount reported is assumed to not include the cumulative amounts from previous instantiations of that attribute set, but that start time choice would indicate that the cumulative amount includes everything since the process started. Instead, for all "new" attribute sets added as part of a callback, the start time is set to the previous time that callback was invoked. "New" includes attribute sets which have disappeared and reappeared, as well as (for example) the first observation ever recorded. If this is the first time the callback is invoked, the process start time is used as the start time. This ensures that the start time is a time before that attribute set began to exist, and is after a previous instantiation of that attribute set ceased to exist. I believe this start time logic can and should be extended to synchronous instruments as well.
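As a sketch of that start-time rule (names here are illustrative only, not actual SDK identifiers), the selection for each attribute set observed during a callback would look roughly like:

// startTimeFor returns the start time for an attribute set observed during the
// current callback invocation. prev maps attribute sets observed last cycle to
// their existing start times; lastCollection is when the callback previously ran.
func startTimeFor(attrs attribute.Set, prev map[attribute.Distinct]time.Time,
	lastCollection, processStart time.Time) time.Time {
	if start, ok := prev[attrs.Equivalent()]; ok {
		return start // continuing series keeps its existing start time
	}
	if lastCollection.IsZero() {
		return processStart // first invocation of the callback ever
	}
	// New or reappearing attribute set: start at the previous collection time,
	// which is after any earlier instantiation of this set ceased to exist.
	return lastCollection
}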

To the best of my knowledge, the behavior described here is left unspecified in the specification.

Alternatives

Keep the current behavior, and add a separate way of deleting a series.

Prior Art

In Prometheus, a user would solve this problem by writing a "custom collector" to support dynamic label sets. Such a collector implements func Collect(chan<- Metric), where a Metric is pre-aggregated metric data. This is similar to an asynchronous callback because it is invoked asynchronously at scrape time, and it allows the user to record, for example, the cumulative value of a counter. Here is an example from Kubernetes for recording metrics about persistent volumes attached to a node: https://github.com/kubernetes/kubernetes/blob/master/pkg/controller/volume/persistentvolume/metrics/metrics.go.

The Prometheus client only exposes on its endpoint the metrics sent during Collect. It is the scraper's responsibility to notice that a metric is missing and mark it stale.
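For reference, the rough shape of such a collector with client_golang is below (a sketch; currentPIDs and rssBytes are the same hypothetical helpers as above, and registration via prometheus.MustRegister is omitted):

var pidRSSDesc = prometheus.NewDesc(
	"process_memory_rss_bytes", "Resident memory per PID.", []string{"pid"}, nil)

type pidCollector struct{}

func (pidCollector) Describe(ch chan<- *prometheus.Desc) { ch <- pidRSSDesc }

func (pidCollector) Collect(ch chan<- prometheus.Metric) {
	// Only the series sent on ch during this Collect appear on the scrape
	// endpoint; series not sent are simply absent from the exposition.
	for _, pid := range currentPIDs() {
		ch <- prometheus.MustNewConstMetric(
			pidRSSDesc, prometheus.GaugeValue, float64(rssBytes(pid)), strconv.Itoa(pid))
	}
}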

Additional Context

This initially came up in the context of handling filtered attributes from async counters: #3549 (review)

I've made a similar comment in the spec issue: open-telemetry/opentelemetry-specification#1891 (comment).

@jmacd
Contributor

jmacd commented Jan 19, 2023

Non-Goal: Forgetting labels with Synchronous instruments

I don't believe we need an explicit operation to forget synchronous instruments. The Lightstep SDK does this, for example, as did (roughly speaking) the early prototype written in this repo. It only works for delta temporality; I think we could extend the specification to support timestamp resets, allowing more aggressive removal of stale cumulative-temporality data.

@dashpole
Contributor Author

dashpole commented Jan 19, 2023

I think we could extend the specification to support timestamp resets, to support more aggressive removal of stale cumulative temporality data.

That makes sense to me from a "keep my memory usage in check" perspective. From a user perspective I may still want a way to explicitly say that something doesn't exist anymore (e.g. so that my graphs visually "end" in the right place), rather than having it implicitly linger until marked "stale".

Edit: The fact that someone might want this doesn't mean we have to give it to them. If we implement this proposal, we could just direct them towards using asynchronous instruments, as Prometheus does.

@owenhaynes

Are there any workarounds for this at the moment for async observers?

@dashpole
Contributor Author

dashpole commented Jul 6, 2023

Discussed this at the SIG meeting today:

We will move forward with the proposal above. I will work on implementing it.

