From 92800f4aa18216582201d68d0e22b0d50904e1b6 Mon Sep 17 00:00:00 2001
From: Reiley Yang
Date: Fri, 24 Sep 2021 13:38:23 -0700
Subject: [PATCH 01/16] add metrics spec supplementary guidelines

---
 .../metrics/supplementary-guidelines.md | 178 ++++++++++++++++++
 1 file changed, 178 insertions(+)
 create mode 100644 specification/metrics/supplementary-guidelines.md

diff --git a/specification/metrics/supplementary-guidelines.md b/specification/metrics/supplementary-guidelines.md
new file mode 100644
index 00000000000..24f3dc637bb
--- /dev/null
+++ b/specification/metrics/supplementary-guidelines.md
@@ -0,0 +1,178 @@
+# Supplementary Guidelines

Note: this document is NOT a spec; it is provided to support the Metrics
[API](./api.md) and [SDK](./sdk.md) specifications and does NOT add any extra
requirements to the existing specifications.

Table of Contents:

* [Guidelines for instrumentation library
  authors](#guidelines-for-instrumentation-library-authors)
* [Guidelines for SDK authors](#guidelines-for-sdk-authors)
  * [Memory efficiency](#memory-efficiency)

## Guidelines for instrumentation library authors

TBD

## Guidelines for SDK authors

### Memory efficiency

The OpenTelemetry Metrics [Data Model](./datamodel.md) and [SDK](./sdk.md) are
designed to support both Cumulative and Delta
[Temporality](./datamodel.md#temporality). It is important to understand that
temporality will impact how the SDK manages memory usage. Let's take the
following example:

* During the time range (T0, T1]:
  * HTTP verb = `GET`, status = `200`, duration = `50 (ms)`
  * HTTP verb = `GET`, status = `200`, duration = `100 (ms)`
  * HTTP verb = `GET`, status = `500`, duration = `1 (ms)`
* During the time range (T1, T2]:
  * no HTTP request has been received
* During the time range (T2, T3]:
  * HTTP verb = `GET`, status = `500`, duration = `5 (ms)`
  * HTTP verb = `GET`, status = `500`, duration = `2 (ms)`
* During the time range (T3, T4]:
  * HTTP verb = `GET`, status = `200`, duration = `100 (ms)`
* During the time range (T4, T5]:
  * HTTP verb = `GET`, status = `200`, duration = `100 (ms)`
  * HTTP verb = `GET`, status = `200`, duration = `30 (ms)`
  * HTTP verb = `GET`, status = `200`, duration = `50 (ms)`

Let's imagine we export the metrics as [Histogram](./datamodel.md#histogram),
and to simplify the story, we will only have one histogram bucket
`(-Inf, +Inf)`:

If we export the metrics using **Delta Temporality**:

* (T0, T1]
  * dimensions: {verb = `GET`, status = `200`}, count: `2`, min: `50 (ms)`, max:
    `100 (ms)`
  * dimensions: {verb = `GET`, status = `500`}, count: `1`, min: `1 (ms)`, max:
    `1 (ms)`
* (T1, T2]
  * nothing, since we didn't receive any Measurement
* (T2, T3]
  * dimensions: {verb = `GET`, status = `500`}, count: `2`, min: `2 (ms)`, max:
    `5 (ms)`
* (T3, T4]
  * dimensions: {verb = `GET`, status = `200`}, count: `1`, min: `100 (ms)`,
    max: `100 (ms)`
* (T4, T5]
  * dimensions: {verb = `GET`, status = `200`}, count: `3`, min: `30 (ms)`, max:
    `100 (ms)`

You can see that the SDK **only needs to track what has happened after the
latest collection/export cycle**. For example, when the SDK starts to process
measurements in (T1, T2], it can completely forget about
what has happened during (T0, T1].
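
To make this concrete, here is a minimal sketch of Delta aggregation storage.
It is written in Python purely for illustration; the class and method names are
assumptions made for this document, not part of any OpenTelemetry SDK:

```python
# Sketch only: shows why Delta Temporality lets the SDK forget old state.
from dataclasses import dataclass


@dataclass
class HistogramPoint:
    count: int = 0
    min: float = float("inf")
    max: float = float("-inf")


class DeltaHistogramStorage:
    def __init__(self):
        self._points: dict[tuple, HistogramPoint] = {}

    def record(self, value: float, **dimensions: str) -> None:
        key = tuple(sorted(dimensions.items()))
        point = self._points.setdefault(key, HistogramPoint())
        point.count += 1
        point.min = min(point.min, value)
        point.max = max(point.max, value)

    def collect(self) -> dict[tuple, HistogramPoint]:
        # Hand the accumulated points to the exporter and start over from an
        # empty state - nothing recorded before this collection needs to be
        # kept in memory.
        points, self._points = self._points, {}
        return points
```

After the collection at T1 exports the (T0, T1] points, the storage is empty
again, so the idle interval (T1, T2] costs no memory and produces no points.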

If we export the metrics using **Cumulative Temporality**:

* (T0, T1]
  * dimensions: {verb = `GET`, status = `200`}, count: `2`, min: `50 (ms)`, max:
    `100 (ms)`
  * dimensions: {verb = `GET`, status = `500`}, count: `1`, min: `1 (ms)`, max:
    `1 (ms)`
* (T0, T2]
  * dimensions: {verb = `GET`, status = `200`}, count: `2`, min: `50 (ms)`, max:
    `100 (ms)`
  * dimensions: {verb = `GET`, status = `500`}, count: `1`, min: `1 (ms)`, max:
    `1 (ms)`
* (T0, T3]
  * dimensions: {verb = `GET`, status = `200`}, count: `2`, min: `50 (ms)`, max:
    `100 (ms)`
  * dimensions: {verb = `GET`, status = `500`}, count: `3`, min: `1 (ms)`, max:
    `5 (ms)`
* (T0, T4]
  * dimensions: {verb = `GET`, status = `200`}, count: `3`, min: `50 (ms)`, max:
    `100 (ms)`
  * dimensions: {verb = `GET`, status = `500`}, count: `3`, min: `1 (ms)`, max:
    `5 (ms)`
* (T0, T5]
  * dimensions: {verb = `GET`, status = `200`}, count: `6`, min: `30 (ms)`, max:
    `100 (ms)`
  * dimensions: {verb = `GET`, status = `500`}, count: `3`, min: `1 (ms)`, max:
    `5 (ms)`

If we choose **Cumulative Temporality**, the SDK **has to track what has
happened prior to the latest collection/export cycle**, in this worst case, the
SDK **will remember what has happened since the ever beginning of the process**.

Imagine if we have a long running service and we collect metrics with 7
dimensions, and each dimension can have 30 different values. We might eventually
end up having to remember the complete set of all `21,870,000,000` permutations!
This **cardinality explosion** a well known challenge in the metrics space.

Making it even worse, if we export the permutations even if there is no recent
updates, the export batch could become huge and will be very costly. For
example, do we really need/want to export the same thing for (T0, T2] in the
above case?

So here are some suggestions that we encourage SDK implementers to consider:

* You want to control the memory usage rather than allow it to grow indefinitely
  / unbounded - regardless of what aggregation temporality is being used.
* You want to improve the memory efficiency by being able to **forget about
  things that are no longer needed**.
* You probably don't want to keep exporting the same thing over and over again
  if there are no updates. You might want to consider [Resets and
  Gaps](./datamodel.md#resets-and-gaps). For example, if a Cumulative metrics
  stream hasn't received any updates for a long period of time, would it be okay
  to reset the start time?

Now we can explore a more interesting topic, Delta->Cumulative and
Cumulative->Delta conversions.

In the above case, we have Measurements reported by a [Histogram
Instrument](./api.md#histogram). What if we collect measurements from an
[Asynchronous Counter](./api.md#asynchronous-counter)?
+ +* During the time range (T0, T1]: + * ProcessId = `1001`, ThreadId = `1`, PageFaults = `50` + * ProcessId = `1001`, ThreadId = `2`, PageFaults = `30` +* During the time range (T1, T2]: + * ProcessId = `1001`, ThreadId = `1`, PageFaults = `53` + * ProcessId = `1001`, ThreadId = `2`, PageFaults = `38` +* During the time range (T2, T3] + * ProcessId = `1001`, ThreadId = `1`, PageFaults = `56` + * ProcessId = `1001`, ThreadId = `2`, PageFaults = `42` +* During the time range (T3, T4]: + * ProcessId = `1001`, ThreadId = `1`, PageFaults = `60` + * ProcessId = `1001`, ThreadId = `2`, PageFaults = `47` +* During the time range (T4, T5]: + * thread 1 died + * ProcessId = `1001`, ThreadId = `2`, PageFaults = `53` + * ProcessId = `1001`, ThreadId = `3`, PageFaults = `5` + +If we export the metrics using **Cumulative Temporality**, it is actually quite +straightforward - we just take the data being reported from the asynchronous +instruments and send them. We might want to consider if [Resets and +Gaps](./datamodel.md#resets-and-gaps) should be used to denote the end of a +metric stream - e.g. thread 1 died, the thread ID might be reused by the +operating system, and we probably don't want to confuse the metrics backend. + +If we export the metrics using **Delta Temporality**, we will have to remember +the last value of **everything single permutation we've encountered so far**, +because if we don't, we won't be able to calculate the delta value using +`current value - last value`. And you can tell, this is super expensive. + +Making it more interesting, if we have min/max value, it is **mathematically +impossible** to reliably deduce the Delta temporality from Cumulative +temporality. For example, if the maximum value is 10 during (T0, +T2] and the maximum value is 20 during (T0, +T3], we know that the maximum value druing (T2, +T3] must be 20. But if the maximum value is 20 during (T0, +T2] and the maximum value is also 20 during (T0, +T3], we wouldn't know what is the maximum value during +(T2, T3], unless we know that there is no value (count = +0). + +So here are some suggestions that we encourage SDK implementers to consider: + +* You probably don't want to encourage your users to do Cumulative to Delta + conversion. Actually you might want to discourage them from doing this. +* If you have to do Cumulative to Delta conversion, and you encountered min/max, + rather than drop the data on the floor, you might want to convert to something + useful - e.g. [Gauge](./datamodel.md#gauge). From 71eab3e1b8c958c128e2f70e40285f9b77d90199 Mon Sep 17 00:00:00 2001 From: Reiley Yang Date: Fri, 24 Sep 2021 14:32:12 -0700 Subject: [PATCH 02/16] fix wording --- specification/metrics/supplementary-guidelines.md | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/specification/metrics/supplementary-guidelines.md b/specification/metrics/supplementary-guidelines.md index 24f3dc637bb..ade5fd9e8a0 100644 --- a/specification/metrics/supplementary-guidelines.md +++ b/specification/metrics/supplementary-guidelines.md @@ -122,8 +122,7 @@ So here are some suggestions that we encourage SDK implementers to consider: stream hasn't received any updates for a long period of time, would it be okay to reset the start time? -Now we can explore a more interesting topic, Delta->Cumulative and -Cumulative->Delta conversions. +Now we can explore a more interesting topic, Cumulative->Delta conversions. In the above case, we have Measurements reported by a [Histogram Instrument](./api.md#histogram). 
What if we collect measurements from an From 15da2231a604900c6f48f1ab5cf8625ceacc59f1 Mon Sep 17 00:00:00 2001 From: Reiley Yang Date: Fri, 24 Sep 2021 14:36:40 -0700 Subject: [PATCH 03/16] tweak wording --- specification/metrics/supplementary-guidelines.md | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/specification/metrics/supplementary-guidelines.md b/specification/metrics/supplementary-guidelines.md index ade5fd9e8a0..fabe8e8167b 100644 --- a/specification/metrics/supplementary-guidelines.md +++ b/specification/metrics/supplementary-guidelines.md @@ -99,6 +99,7 @@ If we export the metrics using **Cumulative Temporality**: If we choose **Cumulative Temporality**, the SDK **has to track what has happened prior to the latest collection/export cycle**, in this worst case, the SDK **will remember what has happened since the ever beginning of the process**. +This is known as Delta->Cumulative conversion. Imagine if we have a long running service and we collect metrics with 7 dimensions, and each dimension can have 30 different values. We might eventually @@ -122,12 +123,15 @@ So here are some suggestions that we encourage SDK implementers to consider: stream hasn't received any updates for a long period of time, would it be okay to reset the start time? -Now we can explore a more interesting topic, Cumulative->Delta conversions. +Now we can explore a more interesting topic, Cumulative->Delta conversion. In the above case, we have Measurements reported by a [Histogram Instrument](./api.md#histogram). What if we collect measurements from an [Asynchronous Counter](./api.md#asynchronous-counter)? +The following example shows the number page faults of each thread since the +thread ever started: + * During the time range (T0, T1]: * ProcessId = `1001`, ThreadId = `1`, PageFaults = `50` * ProcessId = `1001`, ThreadId = `2`, PageFaults = `30` @@ -141,7 +145,7 @@ Instrument](./api.md#histogram). What if we collect measurements from an * ProcessId = `1001`, ThreadId = `1`, PageFaults = `60` * ProcessId = `1001`, ThreadId = `2`, PageFaults = `47` * During the time range (T4, T5]: - * thread 1 died + * thread 1 died, thread 3 started * ProcessId = `1001`, ThreadId = `2`, PageFaults = `53` * ProcessId = `1001`, ThreadId = `3`, PageFaults = `5` From 3f29273823694cce99b5c054e88cc65bb24ebaed Mon Sep 17 00:00:00 2001 From: Reiley Yang Date: Fri, 24 Sep 2021 14:38:13 -0700 Subject: [PATCH 04/16] improve readability --- .../metrics/supplementary-guidelines.md | 17 +++++++++-------- 1 file changed, 9 insertions(+), 8 deletions(-) diff --git a/specification/metrics/supplementary-guidelines.md b/specification/metrics/supplementary-guidelines.md index fabe8e8167b..58ebdd9d891 100644 --- a/specification/metrics/supplementary-guidelines.md +++ b/specification/metrics/supplementary-guidelines.md @@ -163,14 +163,15 @@ because if we don't, we won't be able to calculate the delta value using Making it more interesting, if we have min/max value, it is **mathematically impossible** to reliably deduce the Delta temporality from Cumulative -temporality. For example, if the maximum value is 10 during (T0, -T2] and the maximum value is 20 during (T0, -T3], we know that the maximum value druing (T2, -T3] must be 20. But if the maximum value is 20 during (T0, -T2] and the maximum value is also 20 during (T0, -T3], we wouldn't know what is the maximum value during -(T2, T3], unless we know that there is no value (count = -0). +temporality. 
For example: + +* If the maximum value is 10 during (T0, T2] and the + maximum value is 20 during (T0, T3], we know that the + maximum value druing (T2, T3] must be 20. +* If the maximum value is 20 during (T0, T2] and the + maximum value is also 20 during (T0, T3], we wouldn't + know what is the maximum value during (T2, T3], unless + we know that there is no value (count = 0). So here are some suggestions that we encourage SDK implementers to consider: From 7dc9be28b65ef8a47a25d19947885c6a5bc72981 Mon Sep 17 00:00:00 2001 From: Reiley Yang Date: Fri, 24 Sep 2021 14:40:05 -0700 Subject: [PATCH 05/16] fix typo --- specification/metrics/supplementary-guidelines.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/specification/metrics/supplementary-guidelines.md b/specification/metrics/supplementary-guidelines.md index 58ebdd9d891..9981a996437 100644 --- a/specification/metrics/supplementary-guidelines.md +++ b/specification/metrics/supplementary-guidelines.md @@ -157,9 +157,9 @@ metric stream - e.g. thread 1 died, the thread ID might be reused by the operating system, and we probably don't want to confuse the metrics backend. If we export the metrics using **Delta Temporality**, we will have to remember -the last value of **everything single permutation we've encountered so far**, -because if we don't, we won't be able to calculate the delta value using -`current value - last value`. And you can tell, this is super expensive. +the last value of **every single permutation we've encountered so far**, because +if we don't, we won't be able to calculate the delta value using `current - +last`. And you can tell, this is super expensive. Making it more interesting, if we have min/max value, it is **mathematically impossible** to reliably deduce the Delta temporality from Cumulative From c8a1035fd3efa655e55f0fc8fa55bd726a481c2a Mon Sep 17 00:00:00 2001 From: Reiley Yang Date: Fri, 24 Sep 2021 14:48:14 -0700 Subject: [PATCH 06/16] spellcheck --- .../metrics/supplementary-guidelines.md | 16 ++++++++-------- 1 file changed, 8 insertions(+), 8 deletions(-) diff --git a/specification/metrics/supplementary-guidelines.md b/specification/metrics/supplementary-guidelines.md index 9981a996437..769079732cf 100644 --- a/specification/metrics/supplementary-guidelines.md +++ b/specification/metrics/supplementary-guidelines.md @@ -104,9 +104,9 @@ This is known as Delta->Cumulative conversion. Imagine if we have a long running service and we collect metrics with 7 dimensions, and each dimension can have 30 different values. We might eventually end up having to remember the complete set of all `21,870,000,000` permutations! -This **cardinality explosion** a well known challenge in the metrics space. +This **cardinality explosion** a well-known challenge in the metrics space. -Making it even worse, if we export the permutations even if there is no recent +Making it even worse, if we export the permutations even if there are no recent updates, the export batch could become huge and will be very costly. For example, do we really need/want to export the same thing for (T0, T2] in the above case? 
@@ -149,7 +149,7 @@ thread ever started: * ProcessId = `1001`, ThreadId = `2`, PageFaults = `53` * ProcessId = `1001`, ThreadId = `3`, PageFaults = `5` -If we export the metrics using **Cumulative Temporality**, it is actually quite +If we export the metrics using **Cumulative Temporality**, it is quite straightforward - we just take the data being reported from the asynchronous instruments and send them. We might want to consider if [Resets and Gaps](./datamodel.md#resets-and-gaps) should be used to denote the end of a @@ -167,16 +167,16 @@ temporality. For example: * If the maximum value is 10 during (T0, T2] and the maximum value is 20 during (T0, T3], we know that the - maximum value druing (T2, T3] must be 20. + maximum value during (T2, T3] must be 20. * If the maximum value is 20 during (T0, T2] and the maximum value is also 20 during (T0, T3], we wouldn't - know what is the maximum value during (T2, T3], unless + know what the maximum value is during (T2, T3], unless we know that there is no value (count = 0). So here are some suggestions that we encourage SDK implementers to consider: * You probably don't want to encourage your users to do Cumulative to Delta - conversion. Actually you might want to discourage them from doing this. + conversion. Actually, you might want to discourage them from doing this. * If you have to do Cumulative to Delta conversion, and you encountered min/max, - rather than drop the data on the floor, you might want to convert to something - useful - e.g. [Gauge](./datamodel.md#gauge). + rather than drop the data on the floor, you might want to convert them to + something useful - e.g. [Gauge](./datamodel.md#gauge). From d54c3da2b7ec98eff0783449455345a88b9a0f6e Mon Sep 17 00:00:00 2001 From: Reiley Yang Date: Fri, 24 Sep 2021 15:13:00 -0700 Subject: [PATCH 07/16] fix typo --- specification/metrics/supplementary-guidelines.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/specification/metrics/supplementary-guidelines.md b/specification/metrics/supplementary-guidelines.md index 769079732cf..6dc6a9a812e 100644 --- a/specification/metrics/supplementary-guidelines.md +++ b/specification/metrics/supplementary-guidelines.md @@ -98,8 +98,8 @@ If we export the metrics using **Cumulative Temporality**: If we choose **Cumulative Temporality**, the SDK **has to track what has happened prior to the latest collection/export cycle**, in this worst case, the -SDK **will remember what has happened since the ever beginning of the process**. -This is known as Delta->Cumulative conversion. +SDK **will have to remember what has happened since the ever beginning of the +process**. This is known as Delta->Cumulative conversion. Imagine if we have a long running service and we collect metrics with 7 dimensions, and each dimension can have 30 different values. 
We might eventually

From ccb8703234251426fda25c8a24ce03f469cdd5 Mon Sep 17 00:00:00 2001
From: Reiley Yang
Date: Fri, 24 Sep 2021 15:28:48 -0700
Subject: [PATCH 08/16] add more examples

---
 .../metrics/supplementary-guidelines.md       | 92 +++++++++++++------
 1 file changed, 64 insertions(+), 28 deletions(-)

diff --git a/specification/metrics/supplementary-guidelines.md b/specification/metrics/supplementary-guidelines.md
index 6dc6a9a812e..df39c7df3ea 100644
--- a/specification/metrics/supplementary-guidelines.md
+++ b/specification/metrics/supplementary-guidelines.md
@@ -23,23 +23,23 @@ The OpenTelemetry Metrics [Data Model](./datamodel.md) and [SDK](./sdk.md) are
 designed to support both Cumulative and Delta
 [Temporality](./datamodel.md#temporality). It is important to understand that
 temporality will impact how the SDK manages memory usage. Let's take the
 following HTTP requests example:

* During the time range (T0, T1]:
  * verb = `GET`, status = `200`, duration = `50 (ms)`
  * verb = `GET`, status = `200`, duration = `100 (ms)`
  * verb = `GET`, status = `500`, duration = `1 (ms)`
* During the time range (T1, T2]:
  * no HTTP request has been received
* During the time range (T2, T3]:
  * verb = `GET`, status = `500`, duration = `5 (ms)`
  * verb = `GET`, status = `500`, duration = `2 (ms)`
* During the time range (T3, T4]:
  * verb = `GET`, status = `200`, duration = `100 (ms)`
* During the time range (T4, T5]:
  * verb = `GET`, status = `200`, duration = `100 (ms)`
  * verb = `GET`, status = `200`, duration = `30 (ms)`
  * verb = `GET`, status = `200`, duration = `50 (ms)`

In the above case, we have Measurements reported by a [Histogram
Instrument](./api.md#histogram). What if we collect measurements from an
[Asynchronous Counter](./api.md#asynchronous-counter)?
-The following example shows the number page faults of each thread since the +The following example shows the number of [page +faults](https://en.wikipedia.org/wiki/Page_fault) of each thread since the thread ever started: * During the time range (T0, T1]: - * ProcessId = `1001`, ThreadId = `1`, PageFaults = `50` - * ProcessId = `1001`, ThreadId = `2`, PageFaults = `30` + * pid = `1001`, tid = `1`, #PF = `50` + * pid = `1001`, tid = `2`, #PF = `30` * During the time range (T1, T2]: - * ProcessId = `1001`, ThreadId = `1`, PageFaults = `53` - * ProcessId = `1001`, ThreadId = `2`, PageFaults = `38` + * pid = `1001`, tid = `1`, #PF = `53` + * pid = `1001`, tid = `2`, #PF = `38` * During the time range (T2, T3] - * ProcessId = `1001`, ThreadId = `1`, PageFaults = `56` - * ProcessId = `1001`, ThreadId = `2`, PageFaults = `42` + * pid = `1001`, tid = `1`, #PF = `56` + * pid = `1001`, tid = `2`, #PF = `42` * During the time range (T3, T4]: - * ProcessId = `1001`, ThreadId = `1`, PageFaults = `60` - * ProcessId = `1001`, ThreadId = `2`, PageFaults = `47` + * pid = `1001`, tid = `1`, #PF = `60` + * pid = `1001`, tid = `2`, #PF = `47` * During the time range (T4, T5]: * thread 1 died, thread 3 started - * ProcessId = `1001`, ThreadId = `2`, PageFaults = `53` - * ProcessId = `1001`, ThreadId = `3`, PageFaults = `5` + * pid = `1001`, tid = `2`, #PF = `53` + * pid = `1001`, tid = `3`, #PF = `5` -If we export the metrics using **Cumulative Temporality**, it is quite -straightforward - we just take the data being reported from the asynchronous -instruments and send them. We might want to consider if [Resets and +If we export the metrics using **Cumulative Temporality**: + +* (T0, T1] + * dimensions: {pid = `1001`, tid = `1`}, sum: `50` + * dimensions: {pid = `1001`, tid = `2`}, sum: `30` +* (T0, T2] + * dimensions: {pid = `1001`, tid = `1`}, sum: `53` + * dimensions: {pid = `1001`, tid = `2`}, sum: `38` +* (T0, T3] + * dimensions: {pid = `1001`, tid = `1`}, sum: `56` + * dimensions: {pid = `1001`, tid = `2`}, sum: `42` +* (T0, T4] + * dimensions: {pid = `1001`, tid = `1`}, sum: `60` + * dimensions: {pid = `1001`, tid = `2`}, sum: `47` +* (T0, T5] + * dimensions: {pid = `1001`, tid = `2`}, sum: `53` + * dimensions: {pid = `1001`, tid = `3`}, sum: `5` + +It is quite straightforward - we just take the data being reported from the +asynchronous instruments and send them. We might want to consider if [Resets and Gaps](./datamodel.md#resets-and-gaps) should be used to denote the end of a metric stream - e.g. thread 1 died, the thread ID might be reused by the operating system, and we probably don't want to confuse the metrics backend. -If we export the metrics using **Delta Temporality**, we will have to remember -the last value of **every single permutation we've encountered so far**, because -if we don't, we won't be able to calculate the delta value using `current - -last`. And you can tell, this is super expensive. 
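
As a concrete illustration of this cost, here is a sketch of such a
Cumulative->Delta converter for sums. It is written in Python with invented
names - it is not an API from any OpenTelemetry SDK - and its `_last` cache is
exactly the per-permutation state described above:

```python
class CumulativeToDeltaConverter:
    """Sketch only: derives delta values from cumulative values."""

    def __init__(self):
        # One cached value per dimension permutation ever observed, kept
        # forever unless some eviction policy is added - this is where the
        # memory cost comes from.
        self._last: dict[tuple, float] = {}

    def convert(self, dimensions: dict, cumulative: float) -> float:
        key = tuple(sorted(dimensions.items()))
        delta = cumulative - self._last.get(key, 0.0)
        self._last[key] = cumulative
        return delta
```

Feeding it the page fault numbers above, `convert({"pid": "1001", "tid": "1"},
53)` at T2 returns `3`, because the value `50` was cached at T1.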
+If we export the metrics using **Delta Temporality**: + +* (T0, T1] + * dimensions: {pid = `1001`, tid = `1`}, delta: `50` + * dimensions: {pid = `1001`, tid = `2`}, delta: `30` +* (T1, T2] + * dimensions: {pid = `1001`, tid = `1`}, delta: `3` + * dimensions: {pid = `1001`, tid = `2`}, delta: `8` +* (T2, T3] + * dimensions: {pid = `1001`, tid = `1`}, delta: `3` + * dimensions: {pid = `1001`, tid = `2`}, delta: `4` +* (T3, T4] + * dimensions: {pid = `1001`, tid = `1`}, delta: `4` + * dimensions: {pid = `1001`, tid = `2`}, delta: `5` +* (T4, T5] + * dimensions: {pid = `1001`, tid = `2`}, delta: `6` + * dimensions: {pid = `1001`, tid = `3`}, delta: `5` + +You can see that we will have to remember the last value of **every single +permutation we've encountered so far**, because if we don't, we won't be able to +calculate the delta value using `current value - last value`. And you can tell, +this is super expensive. Making it more interesting, if we have min/max value, it is **mathematically impossible** to reliably deduce the Delta temporality from Cumulative From 2c550014adc539ee2e5a5d3ccf96acba9d50706a Mon Sep 17 00:00:00 2001 From: Reiley Yang Date: Fri, 24 Sep 2021 16:58:33 -0700 Subject: [PATCH 09/16] try to clarify a bit --- specification/metrics/supplementary-guidelines.md | 6 ++---- 1 file changed, 2 insertions(+), 4 deletions(-) diff --git a/specification/metrics/supplementary-guidelines.md b/specification/metrics/supplementary-guidelines.md index df39c7df3ea..8f522b41a38 100644 --- a/specification/metrics/supplementary-guidelines.md +++ b/specification/metrics/supplementary-guidelines.md @@ -123,8 +123,6 @@ So here are some suggestions that we encourage SDK implementers to consider: stream hasn't received any updates for a long period of time, would it be okay to reset the start time? -Now we can explore a more interesting topic, Cumulative->Delta conversion. - In the above case, we have Measurements reported by a [Histogram Instrument](./api.md#histogram). What if we collect measurements from an [Asynchronous Counter](./api.md#asynchronous-counter)? @@ -211,8 +209,8 @@ temporality. For example: So here are some suggestions that we encourage SDK implementers to consider: -* You probably don't want to encourage your users to do Cumulative to Delta +* You probably don't want to encourage your users to do Cumulative->Delta conversion. Actually, you might want to discourage them from doing this. -* If you have to do Cumulative to Delta conversion, and you encountered min/max, +* If you have to do Cumulative->Delta conversion, and you encountered min/max, rather than drop the data on the floor, you might want to convert them to something useful - e.g. [Gauge](./datamodel.md#gauge). 
From 522ec997ba0afaa79664ff60c512bdf9380d10e2 Mon Sep 17 00:00:00 2001 From: Reiley Yang Date: Sat, 25 Sep 2021 12:33:20 -0700 Subject: [PATCH 10/16] improve wording --- .../metrics/supplementary-guidelines.md | 20 +++++++++---------- 1 file changed, 10 insertions(+), 10 deletions(-) diff --git a/specification/metrics/supplementary-guidelines.md b/specification/metrics/supplementary-guidelines.md index 8f522b41a38..daee1baae2f 100644 --- a/specification/metrics/supplementary-guidelines.md +++ b/specification/metrics/supplementary-guidelines.md @@ -96,15 +96,15 @@ If we export the metrics using **Cumulative Temporality**: * dimensions: {verb = `GET`, status = `500`}, count: `3`, min: `1 (ms)`, max: `5 (ms)` -If we choose **Cumulative Temporality**, the SDK **has to track what has -happened prior to the latest collection/export cycle**, in this worst case, the -SDK **will have to remember what has happened since the ever beginning of the -process**. This is known as Delta->Cumulative conversion. +You can see that we are performing Delta->Cumulative conversion, and the SDK +**has to track what has happened prior to the latest collection/export cycle**, +in the worst case, the SDK **will have to remember what has happened since the +ever beginning of the process**. Imagine if we have a long running service and we collect metrics with 7 -dimensions, and each dimension can have 30 different values. We might eventually +dimensions and each dimension can have 30 different values. We might eventually end up having to remember the complete set of all `21,870,000,000` permutations! -This **cardinality explosion** a well-known challenge in the metrics space. +This **cardinality explosion** is a well-known challenge in the metrics space. Making it even worse, if we export the permutations even if there are no recent updates, the export batch could become huge and will be very costly. For @@ -190,10 +190,10 @@ If we export the metrics using **Delta Temporality**: * dimensions: {pid = `1001`, tid = `2`}, delta: `6` * dimensions: {pid = `1001`, tid = `3`}, delta: `5` -You can see that we will have to remember the last value of **every single -permutation we've encountered so far**, because if we don't, we won't be able to -calculate the delta value using `current value - last value`. And you can tell, -this is super expensive. +You can see that we are performing Cumulative->Delta conversion, and it requires +us to remember the last value of **every single permutation we've encountered so +far**, because if we don't, we won't be able to calculate the delta value using +`current value - last value`. And as you can tell, this is super expensive. 
Making it more interesting, if we have min/max value, it is **mathematically impossible** to reliably deduce the Delta temporality from Cumulative From 2613d415d043ca2b95829cfb02ca061997cc235c Mon Sep 17 00:00:00 2001 From: Reiley Yang Date: Mon, 27 Sep 2021 10:35:20 -0700 Subject: [PATCH 11/16] cover more topics --- .../metrics/supplementary-guidelines.md | 35 +++++++++++++++++-- 1 file changed, 33 insertions(+), 2 deletions(-) diff --git a/specification/metrics/supplementary-guidelines.md b/specification/metrics/supplementary-guidelines.md index daee1baae2f..8255afe48ab 100644 --- a/specification/metrics/supplementary-guidelines.md +++ b/specification/metrics/supplementary-guidelines.md @@ -9,7 +9,8 @@ Table of Contents: * [Guidelines for instrumentation library authors](#guidelines-for-instrumentation-library-authors) * [Guidelines for SDK authors](#guidelines-for-sdk-authors) - * [Memory efficiency](#memory-efficiency) + * [Aggregation temporality](#aggregation-temporality) + * [Memory management](#memory-management) ## Guidelines for instrumentation library authors @@ -17,7 +18,7 @@ TBD ## Guidelines for SDK authors -### Memory efficiency +### Aggregation temporality The OpenTelemetry Metrics [Data Model](./datamodel.md) and [SDK](./sdk.md) are designed to support both Cumulative and Delta @@ -214,3 +215,33 @@ So here are some suggestions that we encourage SDK implementers to consider: * If you have to do Cumulative->Delta conversion, and you encountered min/max, rather than drop the data on the floor, you might want to convert them to something useful - e.g. [Gauge](./datamodel.md#gauge). + +### Memory management + +Memory management is a wide topic, here we will only cover some of the most +important things for the OpenTelemetry SDK: + +* Choose a better design so the SDK has less things to be memorized, and avoid + keeping things in memory unless there is a must need. One good example is the + [aggregation temporality](#aggregation-temporality). +* Design a better memory layout, so the storage is efficient and accessing the + storage can be fast. This is normally specific to the targeting programming + language and platform. For example, aliging the memory to the CPU cache line, + keeping the hot memories close to each other, keeping the memory close to the + hardware (e.g. non-paged pool, + [NUMA](https://en.wikipedia.org/wiki/Non-uniform_memory_access)). +* Pre-allocate and pool the memory, so the SDK doesn't have to allocate memory + on-the-fly. This is especially useful to language runtimes that have garbage + collectors, as it ensures the hot path in the code won't trigger garbage + colletion. +* Limit the memmory usage, and handle critical memory condition. In general the + expectation is that a telemetry SDK should not fail the application. This can + be done via some dimension-capping algorithm - e.g. start to combine/drop some + data points when the SDK hits the memory limit, and provide a mechanism to + report the data loss. +* Provide configurations to the application owner. The answer to _"what is an + efficient memory usage"_ is ultimately depending on goal of the application + owner. For example, the application owners might want to spend more memory in + order to keep more permutations of metrics dimensions, or they might want to + use memory aggressively for certain dimensions that are important, and keep a + conservative limit for dimensions that are less important. 
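
The dimension-capping idea mentioned above could be sketched as follows. The
cap value and the overflow attribute are illustrative choices made for this
document, not requirements:

```python
# Sketch only: fold new dimension permutations into one overflow series once
# a configured cap is reached, so memory stays bounded and the data loss is
# still visible to the backend.
OVERFLOW_KEY = (("otel.metric.overflow", "true"),)


class CappedSumStorage:
    def __init__(self, max_permutations: int = 2000):
        self._max = max_permutations
        self._sums: dict[tuple, float] = {}

    def add(self, value: float, **dimensions: str) -> None:
        key = tuple(sorted(dimensions.items()))
        if key not in self._sums and len(self._sums) >= self._max:
            # Cap reached: aggregate under the overflow series instead of
            # growing the map.
            key = OVERFLOW_KEY
        self._sums[key] = self._sums.get(key, 0.0) + value
```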
From 9ea88f531915d3f6e4ffdfc96ba325266c44d7d4 Mon Sep 17 00:00:00 2001 From: Reiley Yang Date: Mon, 27 Sep 2021 10:39:42 -0700 Subject: [PATCH 12/16] fix typo --- specification/metrics/supplementary-guidelines.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/specification/metrics/supplementary-guidelines.md b/specification/metrics/supplementary-guidelines.md index 8255afe48ab..094f984c2e3 100644 --- a/specification/metrics/supplementary-guidelines.md +++ b/specification/metrics/supplementary-guidelines.md @@ -234,7 +234,7 @@ important things for the OpenTelemetry SDK: on-the-fly. This is especially useful to language runtimes that have garbage collectors, as it ensures the hot path in the code won't trigger garbage colletion. -* Limit the memmory usage, and handle critical memory condition. In general the +* Limit the memory usage, and handle critical memory condition. The general expectation is that a telemetry SDK should not fail the application. This can be done via some dimension-capping algorithm - e.g. start to combine/drop some data points when the SDK hits the memory limit, and provide a mechanism to From ea922dba09606f7cfaa08389dacb2d37f4cd778c Mon Sep 17 00:00:00 2001 From: Reiley Yang Date: Mon, 27 Sep 2021 10:40:23 -0700 Subject: [PATCH 13/16] fix typo --- specification/metrics/supplementary-guidelines.md | 11 ++++++----- 1 file changed, 6 insertions(+), 5 deletions(-) diff --git a/specification/metrics/supplementary-guidelines.md b/specification/metrics/supplementary-guidelines.md index 094f984c2e3..244e98e6bc6 100644 --- a/specification/metrics/supplementary-guidelines.md +++ b/specification/metrics/supplementary-guidelines.md @@ -240,8 +240,9 @@ important things for the OpenTelemetry SDK: data points when the SDK hits the memory limit, and provide a mechanism to report the data loss. * Provide configurations to the application owner. The answer to _"what is an - efficient memory usage"_ is ultimately depending on goal of the application - owner. For example, the application owners might want to spend more memory in - order to keep more permutations of metrics dimensions, or they might want to - use memory aggressively for certain dimensions that are important, and keep a - conservative limit for dimensions that are less important. + efficient memory usage"_ is ultimately depending on the goal of the + application owner. For example, the application owners might want to spend + more memory in order to keep more permutations of metrics dimensions, or they + might want to use memory aggressively for certain dimensions that are + important, and keep a conservative limit for dimensions that are less + important. 
From a2e3bf768564bbbddae4ff7bfe91f4433c39703f Mon Sep 17 00:00:00 2001
From: Reiley Yang
Date: Mon, 27 Sep 2021 10:51:54 -0700
Subject: [PATCH 14/16] layout

---
 .../metrics/supplementary-guidelines.md       | 57 ++++++++++---------
 1 file changed, 30 insertions(+), 27 deletions(-)

diff --git a/specification/metrics/supplementary-guidelines.md b/specification/metrics/supplementary-guidelines.md
index 244e98e6bc6..2eb4fe3698c 100644
--- a/specification/metrics/supplementary-guidelines.md
+++ b/specification/metrics/supplementary-guidelines.md
@@ -219,30 +219,33 @@ So here are some suggestions that we encourage SDK implementers to consider:
 ### Memory management

Memory management is a wide topic; here we will only cover some of the most
important things for an OpenTelemetry SDK.

**Choose a better design so the SDK has fewer things to remember**, and avoid
keeping things in memory unless there is a real need. One good example is the
[aggregation temporality](#aggregation-temporality).

**Design a better memory layout**, so the storage is efficient and accessing the
storage can be fast. This is normally specific to the target programming
language and platform. For example, aliging the memory to the CPU cache line,
keeping the hot memories close to each other, keeping the memory close to the
hardware (e.g. non-paged pool,
[NUMA](https://en.wikipedia.org/wiki/Non-uniform_memory_access)).

**Pre-allocate and pool the memory**, so the SDK doesn't have to allocate memory
on-the-fly. This is especially useful for language runtimes that have garbage
collectors, as it ensures the hot path in the code won't trigger garbage
colletion.

**Limit the memory usage, and handle critical memory conditions.** The general
expectation is that a telemetry SDK should not fail the application.
This can be done
via some dimension-capping algorithm - e.g. start combining/dropping some data
points when the SDK hits the memory limit, and provide a mechanism to report the
data loss.

**Provide configurations to the application owner.** The answer to _"what is an
efficient memory usage"_ ultimately depends on the goal of the application
owner. For example, application owners might want to spend more memory in order
to keep more permutations of metrics dimensions, or they might want to use
memory aggressively for certain dimensions that are important, and keep a
conservative limit for dimensions that are less important.

From 3304a1a32c87b5b374d60b300096766d27329343 Mon Sep 17 00:00:00 2001
From: Reiley Yang
Date: Mon, 27 Sep 2021 10:56:37 -0700
Subject: [PATCH 15/16] fix typo

---
 specification/metrics/supplementary-guidelines.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/specification/metrics/supplementary-guidelines.md b/specification/metrics/supplementary-guidelines.md
index 2eb4fe3698c..11f0705d0c7 100644
--- a/specification/metrics/supplementary-guidelines.md
+++ b/specification/metrics/supplementary-guidelines.md
@@ -227,7 +227,7 @@ keeping things in memory unless there is a real need. One good example is the
 **Design a better memory layout**, so the storage is efficient and accessing the
 storage can be fast. This is normally specific to the target programming
-language and platform. For example, aliging the memory to the CPU cache line,
+language and platform. For example, aligning the memory to the CPU cache line,
 keeping the hot memories close to each other, keeping the memory close to the
 hardware (e.g. non-paged pool,
 [NUMA](https://en.wikipedia.org/wiki/Non-uniform_memory_access)).

**Pre-allocate and pool the memory**, so the SDK doesn't have to allocate memory
on-the-fly. This is especially useful for language runtimes that have garbage
collectors, as it ensures the hot path in the code won't trigger garbage
-colletion.
+collection.

From 7460f21afa9b6949b806a2b92c8a41791ae2aa7e Mon Sep 17 00:00:00 2001
From: Reiley Yang
Date: Tue, 28 Sep 2021 14:36:47 -0700
Subject: [PATCH 16/16] Update specification/metrics/supplementary-guidelines.md

Co-authored-by: Alan West <3676547+alanwest@users.noreply.github.com>
---
 specification/metrics/supplementary-guidelines.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/specification/metrics/supplementary-guidelines.md b/specification/metrics/supplementary-guidelines.md
index 11f0705d0c7..6e528704286 100644
--- a/specification/metrics/supplementary-guidelines.md
+++ b/specification/metrics/supplementary-guidelines.md
@@ -100,7 +100,7 @@ If we export the metrics using **Cumulative Temporality**:
 You can see that we are performing Delta->Cumulative conversion, and the SDK
 **has to track what has happened prior to the latest collection/export cycle**,
 in the worst case, the SDK **will have to remember what has happened since the
-ever beginning of the process**.
+beginning of the process**.
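
As a final illustration of the pre-allocate-and-pool guidance above, here is a
sketch of a measurement pool. The pool size and record layout are invented for
this example; real SDKs would shape this around their own hot path:

```python
# Sketch only: records are allocated up front so the hot path does not
# allocate, and therefore does not trigger garbage collection.
class MeasurementPool:
    def __init__(self, size: int = 1024):
        self._free = [{"value": 0.0, "dimensions": None} for _ in range(size)]

    def acquire(self) -> dict:
        if self._free:
            return self._free.pop()  # reuse a pre-allocated record
        return {"value": 0.0, "dimensions": None}  # pool exhausted: fall back

    def release(self, record: dict) -> None:
        record["value"] = 0.0
        record["dimensions"] = None
        self._free.append(record)
```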