Commit 7b52806

jmacd, reyang, and aabmass authored
Add ExponentialHistogram to Metrics data model (open-telemetry#1935)
* Add ExponentialHistogram to Metrics data model
* draft expectations
* toc
* remove one 'exponential'
* mention the use of logarithm and inexact computation
* manual edit TOC
* typo
* reduce precision
* from Yuke's feedback
* mapping methods
* several fixes from yzhuge
* lint
* update links
* Changelog
* mention min/max
* let consumers deal with overflow and underflow
* yzhuge's remarks
* whitespace
* Apply suggestions from code review

  Co-authored-by: Reiley Yang <[email protected]>
* revert TOC trouble etc
* upcase
* Update specification/metrics/datamodel.md

  Co-authored-by: Aaron Abbott <[email protected]>

Co-authored-by: Reiley Yang <[email protected]>
Co-authored-by: Aaron Abbott <[email protected]>
1 parent 548915c commit 7b52806

2 files changed

+273
-10
lines changed

CHANGELOG.md

+2
@@ -15,6 +15,8 @@ release.
1515

1616
- Add optional min / max fields to histogram data model.
1717
([#1915](https://github.com/open-telemetry/opentelemetry-specification/pull/1915))
18+
- Add exponential histogram to the metrics data model.
19+
([#1935](https://github.com/open-telemetry/opentelemetry-specification/pull/1935))
1820

1921
### Logs
2022

specification/metrics/datamodel.md

+271-10
@@ -19,6 +19,17 @@
1919
* [Sums](#sums)
2020
* [Gauge](#gauge)
2121
* [Histogram](#histogram)
22+
* [ExponentialHistogram](#exponentialhistogram)
23+
+ [Exponential Scale](#exponential-scale)
24+
+ [Exponential Buckets](#exponential-buckets)
25+
+ [Zero Count](#zero-count)
26+
+ [Producer Expectations](#producer-expectations)
27+
- [Scale Zero: Extract the Exponent](#scale-zero-extract-the-exponent)
28+
- [Negative Scale: Extract and Shift the Exponent](#negative-scale-extract-and-shift-the-exponent)
29+
- [All Scales: Use the Logarithm Function](#all-scales-use-the-logarithm-function)
30+
- [Positive Scale: Use a Lookup Table](#positive-scale-use-a-lookup-table)
31+
- [Producer Recommendations](#producer-recommendations)
32+
+ [Consumer Expectations](#consumer-expectations)
2233
* [Summary (Legacy)](#summary-legacy)
2334
- [Exemplars](#exemplars)
2435
- [Single-Writer](#single-writer)
@@ -223,12 +234,13 @@ consisting of several metadata properties:
223234
- Kind of point (integer, floating point, etc)
224235
- Unit of measurement
225236

226-
The primary data of each timeseries are ordered (timestamp, value) points, for
227-
three value types:
237+
The primary data of each timeseries are ordered (timestamp, value) points, with
238+
one of the following value types:
228239

229240
1. Counter (Monotonic, Cumulative)
230241
2. Gauge
231242
3. Histogram
243+
4. Exponential Histogram
232244

233245
This model may be viewed as an idealization of
234246
[Prometheus Remote Write](https://docs.google.com/document/d/1LPhVRSFkGNSuU1fBd81ulhsCPR4hkSZyyBj1SZ8fWOM/edit#heading=h.3p42p5s8n0ui).
@@ -267,9 +279,10 @@ same kind. <sup>[1](#otlpdatapointfn)</sup>
267279

268280
The basic point kinds are:
269281

270-
1. [Sum](https://github.com/open-telemetry/opentelemetry-proto/blob/v0.9.0/opentelemetry/proto/metrics/v1/metrics.proto#L230)
271-
2. [Gauge](https://github.com/open-telemetry/opentelemetry-proto/blob/v0.9.0/opentelemetry/proto/metrics/v1/metrics.proto#L200)
272-
3. [Histogram](https://github.com/open-telemetry/opentelemetry-proto/blob/v0.9.0/opentelemetry/proto/metrics/v1/metrics.proto#L258)
282+
1. [Sum](https://github.com/open-telemetry/opentelemetry-proto/blob/v0.10.x/opentelemetry/proto/metrics/v1/metrics.proto#L198)
283+
2. [Gauge](https://github.com/open-telemetry/opentelemetry-proto/blob/v0.10.x/opentelemetry/proto/metrics/v1/metrics.proto#L192)
284+
3. [Histogram](https://github.com/open-telemetry/opentelemetry-proto/blob/v0.10.x/opentelemetry/proto/metrics/v1/metrics.proto#L211)
285+
4. [Exponential Histogram](https://github.com/open-telemetry/opentelemetry-proto/blob/27a10cd70f63afdbddf460881969f9ad7ae4af5d/opentelemetry/proto/metrics/v1/metrics.proto#L239)
273286

274287
Comparing the OTLP Metric Data Stream and Timeseries data models, OTLP does
275288
not map 1:1 from its point types into timeseries points. In OTLP, a Sum point
@@ -422,14 +435,262 @@ Changing the inclusivity and exclusivity of bounds is an example of
422435
worst-case Histogram error; users should choose Histogram boundaries
423436
so that worst-case error is within their error tolerance.
424437

438+
### ExponentialHistogram
439+
440+
**Status**: [Experimental](../document-status.md)
441+
442+
[ExponentialHistogram](https://github.com/open-telemetry/opentelemetry-proto/blob/cfbf9357c03bf4ac150a3ab3bcbe4cc4ed087362/opentelemetry/proto/metrics/v1/metrics.proto#L222)
443+
data points are an alternate representation to the
444+
[Histogram](#histogram) data point, used to convey a population of
445+
recorded measurements in a compressed format. ExponentialHistogram
446+
compresses bucket boundaries using an exponential formula, making it
447+
suitable for conveying high dynamic range data with small relative
448+
error, compared with alternative representations of similar size.
449+
450+
Statements about `Histogram` that refer to aggregation temporality,
451+
attributes, and timestamps, as well as the `sum`, `count`, `min`, `max` and
452+
`exemplars` fields, are the same for `ExponentialHistogram`. These
453+
fields all share the same interpretation as for `Histogram`; only the
454+
bucket structure differs between these two types.
455+
456+
#### Exponential Scale
457+
458+
The resolution of the ExponentialHistogram is characterized by a
459+
parameter known as `scale`, with larger values of `scale` offering
460+
greater precision. Bucket boundaries of the ExponentialHistogram are
461+
located at integer powers of the `base`, also known as the "growth
462+
factor", where:
463+
464+
```
465+
base = 2**(2**(-scale))
466+
```
467+
468+
The symbol `**` in these formulas represents exponentiation, thus
469+
`2**x` is read "Two to the power of `x`", typically computed by an
470+
expression like `math.Pow(2.0, x)`. Calculated `base` values for
471+
selected scales are shown below:
472+
| Scale | Base | Expression |
| -- | -- | -- |
| 10 | 1.00068 | 2**(1/1024) |
| 9 | 1.00135 | 2**(1/512) |
| 8 | 1.00271 | 2**(1/256) |
| 7 | 1.00543 | 2**(1/128) |
| 6 | 1.01089 | 2**(1/64) |
| 5 | 1.02190 | 2**(1/32) |
| 4 | 1.04427 | 2**(1/16) |
| 3 | 1.09051 | 2**(1/8) |
| 2 | 1.18921 | 2**(1/4) |
| 1 | 1.41421 | 2**(1/2) |
| 0 | 2 | 2**1 |
| -1 | 4 | 2**2 |
| -2 | 16 | 2**4 |
| -3 | 256 | 2**8 |
| -4 | 65536 | 2**16 |
491+
An important property of this design is described as "perfect
492+
subsetting". Buckets of an exponential Histogram with a given scale
493+
map exactly into buckets of exponential Histograms with lesser scales,
494+
which allows consumers to lower the resolution of a histogram (i.e.,
495+
downscale) without introducing error.
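
Perfect subsetting means that lowering the scale by `n` merges each group of `2**n` adjacent buckets into one. The sketch below illustrates this under an assumed in-memory layout; the `Buckets` type and `Downscale` function are hypothetical names, not part of the data model.

```go
package main

import "fmt"

// Buckets is a hypothetical dense bucket range: Counts[i] holds the
// count for bucket index Offset+i.
type Buckets struct {
	Offset int32
	Counts []uint64
}

// Downscale returns the buckets re-expressed at a scale lower by `by`.
// An arithmetic right shift maps each index to the index of the
// enclosing lower-scale bucket, including for negative indices.
func Downscale(b Buckets, by uint) Buckets {
	if len(b.Counts) == 0 || by == 0 {
		return b
	}
	first := b.Offset >> by
	last := (b.Offset + int32(len(b.Counts)) - 1) >> by
	out := Buckets{Offset: first, Counts: make([]uint64, last-first+1)}
	for i, c := range b.Counts {
		idx := (b.Offset + int32(i)) >> by
		out.Counts[idx-first] += c
	}
	return out
}

func main() {
	// Indices -1, 0, 1, 2 at some scale s, merged into scale s-1.
	in := Buckets{Offset: -1, Counts: []uint64{1, 2, 3, 4}}
	fmt.Printf("%+v\n", Downscale(in, 1))
}
```

No count is split between output buckets, which is why downscaling introduces no additional error.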
496+
497+
#### Exponential Buckets
498+
499+
The ExponentialHistogram bucket identified by `index`, a signed
500+
integer, represents values in the population that are greater than or
501+
equal to `base**index` and less than `base**(index+1)`. Note that the
502+
ExponentialHistogram specifies a lower-inclusive bound while the
503+
explicit-boundary Histogram specifies an upper-inclusive bound.
504+
505+
The positive and negative ranges of the histogram are expressed
506+
separately. Negative values are mapped by their absolute value
507+
into the negative range using the same scale as the positive range.
508+
509+
Each range of the ExponentialHistogram data point uses a dense
510+
representation of the buckets, where a range of buckets is expressed
511+
as a single `offset` value, a signed integer, and an array of count
512+
values, where array element `i` represents the bucket count for bucket
513+
index `offset+i`.
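
As an illustration of the dense representation, the sketch below looks up the count for a bucket index from an offset and an array of counts. The `Buckets` type and `CountAt` method are assumed names for this example only.

```go
package main

import "fmt"

// Buckets is a hypothetical dense bucket range: Counts[i] holds the
// count for bucket index Offset+i.
type Buckets struct {
	Offset int32
	Counts []uint64
}

// CountAt returns the count stored for the given bucket index, or zero
// when the index lies outside the encoded range.
func (b Buckets) CountAt(index int32) uint64 {
	i := int64(index) - int64(b.Offset)
	if i < 0 || i >= int64(len(b.Counts)) {
		return 0
	}
	return b.Counts[i]
}

func main() {
	// Encodes bucket indices -2, -1, and 0.
	b := Buckets{Offset: -2, Counts: []uint64{5, 0, 7}}
	fmt.Println(b.CountAt(-2), b.CountAt(0), b.CountAt(1))
}
```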
514+
515+
For a given range, positive or negative:
516+
517+
- Bucket index `0` counts measurements in the range `[1, base)`
518+
- Positive indexes correspond with absolute values greater than or equal to `base`
519+
- Negative indexes correspond with absolute values less than 1
520+
- There are `2**scale` buckets between successive powers of 2.
521+
522+
For example, with `scale=3` there are `2**3` buckets between 1 and 2.
523+
Note that the lower boundary for bucket index 4 in a `scale=3`
524+
histogram maps into the lower boundary for bucket index 2 in a
525+
`scale=2` histogram and maps into the lower boundary for bucket index
526+
1 (i.e., the `base`) in a `scale=1` histogram—these are examples of
527+
perfect subsetting.
528+
| `scale=3` bucket index | lower boundary | equation |
| -- | -- | -- |
| 0 | 1 | 2**(0/8) |
| 1 | 1.09051 | 2**(1/8) |
| 2 | 1.18921 | 2**(2/8), 2**(1/4) |
| 3 | 1.29684 | 2**(3/8) |
| 4 | 1.41421 | 2**(4/8), 2**(2/4), 2**(1/2) |
| 5 | 1.54221 | 2**(5/8) |
| 6 | 1.68179 | 2**(6/8) |
| 7 | 1.83401 | 2**(7/8) |
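
The lower boundaries in the table follow from `base**index`, which equals `2**(index * 2**(-scale))`. A minimal sketch, assuming a helper named `LowerBoundary` that is not defined by the specification:

```go
package main

import (
	"fmt"
	"math"
)

// LowerBoundary returns the inclusive lower boundary of the bucket with
// the given index, computed as 2**(index * 2**(-scale)). The result is
// subject to the usual floating-point rounding of math.Exp2.
func LowerBoundary(index int32, scale int) float64 {
	return math.Exp2(math.Ldexp(float64(index), -scale))
}

func main() {
	// Reproduce the scale=3 table, one row per bucket index.
	for index := int32(0); index < 8; index++ {
		fmt.Printf("%d %.5f\n", index, LowerBoundary(index, 3))
	}
}
```

Evaluating the boundary for index 4 at scale 3, index 2 at scale 2, and index 1 at scale 1 gives the same value, which matches the perfect-subsetting examples in the text.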
539+
540+
#### Zero Count
541+
542+
The ExponentialHistogram contains a special `zero_count` field
543+
containing the count of values that are either exactly zero or within
544+
the region considered zero by the instrumentation at the tolerated
545+
level of precision. This bucket stores values that cannot be
546+
expressed using the standard exponential formula as well as values
547+
that have been rounded to zero.
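
The sketch below shows one way a producer might route measurements to the zero count, the positive range, or the negative range. It is an illustration under assumptions: the `Histogram` type is hypothetical, sparse maps stand in for the dense arrays, only exact zeros are counted in `ZeroCount`, and inputs are assumed to be finite.

```go
package main

import (
	"fmt"
	"math"
)

// Histogram is a hypothetical in-memory layout; sparse maps stand in
// for the dense bucket arrays to keep the sketch short.
type Histogram struct {
	Scale     int
	ZeroCount uint64
	Positive  map[int32]uint64
	Negative  map[int32]uint64
}

// NewHistogram returns an empty histogram with the given scale.
func NewHistogram(scale int) *Histogram {
	return &Histogram{
		Scale:    scale,
		Positive: map[int32]uint64{},
		Negative: map[int32]uint64{},
	}
}

// Record counts one finite measurement. Exact zeros go to ZeroCount;
// other values are mapped by absolute value using the logarithm method,
// and the sign selects the positive or negative range.
func (h *Histogram) Record(value float64) {
	if value == 0 {
		h.ZeroCount++
		return
	}
	scaleFactor := math.Ldexp(math.Log2E, h.Scale)
	index := int32(math.Floor(math.Log(math.Abs(value)) * scaleFactor))
	if value > 0 {
		h.Positive[index]++
	} else {
		h.Negative[index]++
	}
}

func main() {
	h := NewHistogram(0)
	for _, v := range []float64{0, 1.5, -1.5, 0, 5} {
		h.Record(v)
	}
	fmt.Println(h.ZeroCount, h.Positive, h.Negative)
}
```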
548+
549+
#### Producer Expectations
550+
551+
The ExponentialHistogram design makes it possible to express values
552+
that are too large or small to be represented in the 64 bit "double"
553+
floating point format. Certain values for `scale`, while meaningful,
554+
are not necessarily useful.
555+
556+
The range of data represented by an ExponentialHistogram determines
557+
which scales can be usefully applied. Regardless of scale, producers
558+
SHOULD ensure that the index of any encoded bucket falls within the
559+
range of a signed 32-bit integer. This recommendation is applied to
560+
limit the width of integers used in standard processing pipelines such
561+
as the OpenTelemetry collector. The wire-level protocol could be
562+
extended for 64-bit bucket indices in a future release.
563+
564+
Producers use a mapping function to compute bucket indices. Producers
565+
are presumed to support IEEE double-width floating-point numbers with
566+
11-bit exponent and 52-bit significand. The pseudo-code below for
567+
mapping values to exponents refers to the following constants:
568+
```golang
const (
	// SignificandWidth is the size of an IEEE 754 double-precision
	// floating-point significand.
	SignificandWidth = 52
	// ExponentWidth is the size of an IEEE 754 double-precision
	// floating-point exponent.
	ExponentWidth = 11

	// SignificandMask is the mask for the significand of an IEEE 754
	// double-precision floating-point value: 0xFFFFFFFFFFFFF.
	SignificandMask = (1 << SignificandWidth) - 1

	// ExponentBias is the exponent bias specified for encoding
	// the IEEE 754 double-precision floating-point exponent: 1023.
	ExponentBias = (1 << (ExponentWidth - 1)) - 1

	// ExponentMask has 1s in the exponent bits of an IEEE 754
	// double-precision floating-point value: 0x7FF0000000000000.
	ExponentMask = ((1 << ExponentWidth) - 1) << SignificandWidth
)
```
591+
592+
The following choices of mapping function have been validated through
593+
reference implementations.
594+
595+
##### Scale Zero: Extract the Exponent
596+
597+
For scale zero, the index of a value equals its normalized base-2
598+
exponent, meaning the value of _exponent_ in the base-2 fractional
599+
representation `1._significand_ * 2**_exponent_`. Normal IEEE 754
600+
double-width floating point values have indices in the range
601+
`[-1022, +1023]` and subnormal values have indices in the range
602+
`[-1074, -1023]`. This may be written as:
603+
```golang
// Requires the "math" and "math/bits" packages.

// GetExponent extracts the normalized base-2 fractional exponent.
// Let the value be represented as `1.significand x 2**exponent`,
// this returns `exponent`. Not defined for 0, Inf, or NaN values.
func GetExponent(value float64) int32 {
	rawBits := math.Float64bits(value)
	rawExponent := (int64(rawBits) & ExponentMask) >> SignificandWidth
	rawSignificand := rawBits & SignificandMask
	if rawExponent == 0 {
		// Handle subnormal values: rawSignificand cannot be zero
		// unless value is zero. The constant 12 is the number of
		// sign and exponent bits (64 - SignificandWidth).
		rawExponent -= int64(bits.LeadingZeros64(rawSignificand) - 12)
	}
	return int32(rawExponent - ExponentBias)
}
```
620+
621+
##### Negative Scale: Extract and Shift the Exponent
622+
623+
For negative scales, the index of a value equals the normalized
624+
base-2 exponent (as by `GetExponent()` above) shifted to the right
625+
by `-scale`. Note that because of sign extension, this shift performs
626+
correct rounding for the negative indices. This may be written as:
627+
```golang
return GetExponent(value) >> -scale
```
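
A self-contained sketch of the shift-based mapping follows. To keep it short it obtains the normalized exponent from `math.Frexp` rather than from the bit-level `GetExponent()` shown earlier; that substitution, and the names `exponentOf` and `MapToIndexNegativeScale`, are assumptions of this example. Inputs are assumed to be positive and finite.

```go
package main

import (
	"fmt"
	"math"
)

// exponentOf returns the normalized base-2 exponent of a positive,
// finite value. math.Frexp reports value = frac * 2**exp with frac in
// [0.5, 1), so the normalized exponent is exp-1.
func exponentOf(value float64) int32 {
	_, exp := math.Frexp(value)
	return int32(exp - 1)
}

// MapToIndexNegativeScale maps a value to its bucket index for a
// negative scale. The arithmetic right shift rounds toward negative
// infinity, which is the behavior needed for negative exponents.
func MapToIndexNegativeScale(value float64, scale int) int32 {
	return exponentOf(value) >> uint(-scale)
}

func main() {
	// With scale=-1 the base is 4, so buckets are [1/4, 1), [1, 4), [4, 16), ...
	for _, v := range []float64{0.2, 0.5, 1, 3.9, 4} {
		fmt.Println(v, MapToIndexNegativeScale(v, -1))
	}
}
```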
631+
632+
##### All Scales: Use the Logarithm Function
633+
For any scale, the bucket index can be computed using the built-in
natural logarithm function. A multiplicative factor equal to
`2**scale / ln(2)` proves useful (where `ln()` is the natural
logarithm), for example:

```golang
scaleFactor := math.Log2E * math.Exp2(float64(scale))
return int64(math.Floor(math.Log(value) * scaleFactor))
```

Note that in the example Golang code above, the built-in `math.Log2E`
is defined as `1 / ln(2)`.
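
The fragment above can be wrapped into a complete function, sketched below. The name `MapToIndexLog` and the `int32` return type are assumptions of this example; inputs are assumed to be positive and finite, and values that sit exactly on a bucket boundary may land in an adjacent bucket because of floating-point rounding.

```go
package main

import (
	"fmt"
	"math"
)

// MapToIndexLog maps a positive, finite value to a bucket index using
// the natural logarithm. math.Ldexp(math.Log2E, scale) computes
// 2**scale / ln(2) without rounding beyond that of math.Log2E itself.
func MapToIndexLog(value float64, scale int) int32 {
	scaleFactor := math.Ldexp(math.Log2E, scale)
	return int32(math.Floor(math.Log(value) * scaleFactor))
}

func main() {
	fmt.Println(MapToIndexLog(1.5, 3))  // inside the bucket starting at 2**(4/8)
	fmt.Println(MapToIndexLog(0.75, 3)) // inside the bucket starting at 2**(-4/8)
	fmt.Println(MapToIndexLog(10, -1))  // inside the bucket [4, 16)
}
```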
645+
646+
##### Positive Scale: Use a Lookup Table
647+
648+
For positive scales, lookup table methods have been demonstrated
649+
that are able to exactly compute the index in constant time from a
650+
lookup table with `O(2**scale)` entries.
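
The sketch below illustrates the general idea of a table with `2**scale` entries. It is a simplification and not one of the demonstrated methods: it locates the sub-bucket with a binary search instead of a constant-time lookup, and its table entries come from `math.Exp2`, so it is not guaranteed to be exact at bucket boundaries. The names `buildTable` and `MapToIndexTable` are assumptions, and inputs are assumed to be positive and finite.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// buildTable returns the 2**scale bucket lower boundaries that fall in
// the interval [1, 2), for a positive scale.
func buildTable(scale int) []float64 {
	table := make([]float64, 1<<uint(scale))
	for k := range table {
		table[k] = math.Exp2(float64(k) / float64(len(table)))
	}
	return table
}

// MapToIndexTable splits the value into a power of two and a mantissa
// in [1, 2), then searches the table for the mantissa's sub-bucket.
func MapToIndexTable(value float64, table []float64) int32 {
	frac, exp := math.Frexp(value) // value = frac * 2**exp, frac in [0.5, 1)
	mantissa := frac * 2           // exact; now in [1, 2)
	// Index of the first entry greater than mantissa, minus one, is the
	// last entry less than or equal to mantissa. table[0] is 1, so the
	// result is never negative.
	k := sort.Search(len(table), func(i int) bool { return table[i] > mantissa }) - 1
	return int32((exp-1)*len(table) + k)
}

func main() {
	table := buildTable(3)
	for _, v := range []float64{0.75, 1, 1.5, 3} {
		fmt.Println(v, MapToIndexTable(v, table))
	}
}
```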
651+
652+
##### Producer Recommendations
653+
654+
At the lowest or highest end of the 64-bit IEEE floating-point range, a
655+
bucket's range may only be partially representable by the floating
656+
point number format. When mapping a number in these buckets, a
657+
producer may correctly return the index of such a partially
658+
representable bucket. This is considered a normal condition.
659+
660+
For positive scales, the logarithm method is preferred because it
661+
requires very little code, is easy to validate and is nearly as fast
662+
and accurate as the lookup table approach. For zero scale and
663+
negative scales, directly calculating the index from the
664+
floating-point representation is more efficient.
665+
666+
The use of a built-in logarithm function could lead to results that
667+
differ from the bucket index that would be computed using arbitrary
668+
precision or a lookup table; however, producers are not required to
669+
perform an exact computation. As a result, ExponentialHistogram
670+
exemplars could map into buckets with zero count. We expect to find
671+
such values counted in the adjacent buckets.
672+
673+
#### Consumer Expectations
674+
675+
ExponentialHistogram bucket indices are expected to map into buckets
676+
where both the upper and lower boundaries can be represented
677+
using IEEE 754 double-width floating point values. Consumers MAY
678+
round the unrepresentable boundary of a partially representable bucket
679+
index to the nearest representable value.
680+
681+
Consumers SHOULD reject ExponentialHistogram data with `scale` and
682+
bucket indices that overflow or underflow this representation.
683+
Consumers that reject such data SHOULD warn the user through error
684+
logging that out-of-range data was received.
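
One possible reading of this check is sketched below. It assumes that a boundary `2**e` counts as representable when `e` lies in `[-1074, 1024)`, the exponent range of subnormal and normal doubles; that threshold and the name `BoundariesRepresentable` are assumptions of this example, not requirements of the data model. A consumer could use such a test to decide whether to reject a bucket, or to detect a partially representable bucket whose boundary it chooses to round.

```go
package main

import (
	"fmt"
	"math"
)

// BoundariesRepresentable reports whether both boundaries of the bucket
// with the given index and scale fall inside the exponent range of an
// IEEE 754 double. The lower boundary is 2**lo and the upper is 2**hi.
func BoundariesRepresentable(index int32, scale int) bool {
	lo := math.Ldexp(float64(index), -scale)   // index * 2**(-scale)
	hi := math.Ldexp(float64(index)+1, -scale) // (index+1) * 2**(-scale)
	return lo >= -1074 && hi < 1024
}

func main() {
	fmt.Println(BoundariesRepresentable(0, 0))     // bucket [1, 2)
	fmt.Println(BoundariesRepresentable(1023, 0))  // upper boundary 2**1024 overflows
	fmt.Println(BoundariesRepresentable(-1075, 0)) // lower boundary underflows
}
```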
685+
425686
### Summary (Legacy)
426687

427688
[Summary](https://github.com/open-telemetry/opentelemetry-proto/blob/v0.9.0/opentelemetry/proto/metrics/v1/metrics.proto#L268)
428-
metric data points convey quantile summaries, e.g. What is the 99-th percentile
429-
latency of my HTTP server. Unlike other point types in OpenTelemetry, Summary
430-
points cannot always be merged in a meaningful way. This point type is not
431-
recommended for new applications and exists for compatibility with other
432-
formats.
689+
metric data points convey quantile summaries, e.g. What is the 99-th
690+
percentile latency of my HTTP server. Unlike other point types in
691+
OpenTelemetry, Summary points cannot always be merged in a meaningful
692+
way. This point type is not recommended for new applications and
693+
exists for compatibility with other formats.
433694

434695
## Exemplars
435696
