Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add more details to SDK MeterProvider and View #1730

Merged
merged 29 commits into from
Jul 27, 2021
Merged
Changes from 11 commits
Commits
Show all changes
29 commits
Select commit Hold shift + click to select a range
4b866a8
Add View API
reyang May 28, 2021
b3259fc
fix link
reyang May 28, 2021
be9beb5
address comments
reyang Jun 24, 2021
c5c73aa
provide one example
reyang Jun 24, 2021
f82475e
update wording
reyang Jun 24, 2021
7359e9a
polish wording
reyang Jun 24, 2021
2df446b
update wording
reyang Jun 24, 2021
09501aa
mention histogram bucket in the view aggregation
reyang Jun 24, 2021
87e3bfe
resolve comments based on discussion during the SIG meeting
reyang Jun 27, 2021
0c32806
fix typo
reyang Jun 27, 2021
142a345
update the spec based on discussion during the SIG meeting
reyang Jun 29, 2021
8b9a7f0
add more diagrams
reyang Jul 10, 2021
d14d005
update the spec based on the discussion during Metrics SIG meeting
reyang Jul 10, 2021
bbe5b40
minor tweak diagram
reyang Jul 11, 2021
49bf1b1
address review comment
reyang Jul 13, 2021
7d8336d
update the default histogram buckets
reyang Jul 13, 2021
083c51c
remove pipeline from this PR
reyang Jul 20, 2021
4f00ac7
address comments regarding error handling
reyang Jul 20, 2021
3340239
Merge branch 'main' into reyang/metrics-sdk-view
reyang Jul 21, 2021
2d5a888
Update specification/metrics/sdk.md
reyang Jul 21, 2021
f1621e7
adjust wording
reyang Jul 22, 2021
e606fe5
Merge branch 'main' into reyang/metrics-sdk-view
reyang Jul 22, 2021
4d0913a
add instrument type and clarify the selection criteria
reyang Jul 26, 2021
6261507
update the view selection logic
reyang Jul 26, 2021
954102f
update the view selection logic
reyang Jul 26, 2021
5fc0500
fix broken link
reyang Jul 26, 2021
971e770
Merge branch 'main' into reyang/metrics-sdk-view
reyang Jul 26, 2021
4ab3465
update the TODO part by removing the details
reyang Jul 27, 2021
f9f47ed
Merge branch 'main' into reyang/metrics-sdk-view
reyang Jul 27, 2021
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
137 changes: 134 additions & 3 deletions specification/metrics/sdk.md
Original file line number Diff line number Diff line change
Expand Up @@ -33,10 +33,141 @@ Table of Contents

## MeterProvider

### Meter Creation

New `Meter` instances are always created through a `MeterProvider` (see
[API](./api.md#meterprovider)). The `name`, `version` (optional), and
`schema_url` (optional) arguments supplied to the `MeterProvider` MUST be used
to create an
[`InstrumentationLibrary`](https://github.com/open-telemetry/oteps/blob/main/text/0083-component.md)
instance which is stored on the created `Meter`.

Configuration (i.e., [MeasurementProcessors](#measurementprocessor),
[MetricProcessors](#metricprocessor), [MetricExporters](#metricexporter) and
[`Views`](#view)) MUST be managed solely by the `MeterProvider` and it MUST
reyang marked this conversation as resolved.
Show resolved Hide resolved
provide some way to configure all of them that are implemented in the SDK, at
least when creating or initializing it.
reyang marked this conversation as resolved.
Show resolved Hide resolved

The `MeterProvider` MAY provide methods to update the configuration. If
cijothomas marked this conversation as resolved.
Show resolved Hide resolved
configuration is updated (e.g., adding a `MetricProcessor`), the updated
configuration MUST also apply to all already returned `Meters` (i.e. it MUST NOT
matter whether a `Meter` was obtained from the `MeterProvider` before or after
the configuration change). Note: Implementation-wise, this could mean that
`Meter` instances have a reference to their `MeterProvider` and access
configuration only via this reference.

### Shutdown

TODO

### ForceFlush
reyang marked this conversation as resolved.
Show resolved Hide resolved

TODO

### View
reyang marked this conversation as resolved.
Show resolved Hide resolved

`View` gives the SDK users flexibility to customize the metrics they want. Here
are some examples when `View` is needed:
reyang marked this conversation as resolved.
Show resolved Hide resolved

* Customize which [Instrument](./api.md#instrument) to be processed/ignored. For
reyang marked this conversation as resolved.
Show resolved Hide resolved
example, an [instrumented library](../glossary.md#instrumented-library) can
provide both temperature and humidity, but the application developer only
wants temperature information.
reyang marked this conversation as resolved.
Show resolved Hide resolved
* Customize the aggregation - if the default aggregation associated with the
Instrument does not meet the expectation. For example, an HTTP client library
reyang marked this conversation as resolved.
Show resolved Hide resolved
might expose HTTP client request duration as [Histogram](./api.md#histogram)
by default, but the application developer only wants the total count of
reyang marked this conversation as resolved.
Show resolved Hide resolved
outgoing requests.
* Customize which attribute(s) to be reported as metrics dimension(s). For
reyang marked this conversation as resolved.
Show resolved Hide resolved
example, an HTTP server library might expose HTTP verb (e.g. GET, POST) and
HTTP status code (e.g. 200, 301, 404). The application developer might only
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if this belongs, but adjusting the sampling of Exemplars is likely an aggregator problem. I can write up my thoughts here but I investigated what Prometheus has been up to in this space and I think likely exemplar sampling should be tied to aggregator. If not, we should call it out as a thing users may want to change.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jsuereth Agree. One simple aggregator for exemplars would use an equal-probability reservoir sampler such as Algorithm R. Then, we can apply an adjusted_count attribute to each exemplar, and a downstream system can combine exemplars into a data set, which can be used to estimate histograms of the full-cardinality event stream.

care about HTTP status code (e.g. reporting the total count of HTTP requests
for each HTTP status code). There are also extreme scenario that the
application developer does not need any dimension (e.g. just get the total
reyang marked this conversation as resolved.
Show resolved Hide resolved
count of all incoming requests).
* Add additional dimension(s) from the [Context](../context/context.md). For
example, a [Baggage](../baggage/api.md) value might be available indicating
whether an HTTP request is coming from a bot/crawler or not. The application
developer might want this to be converted to a dimension for HTTP server
metrics (e.g. the request/second from bots vs. real users).
jmacd marked this conversation as resolved.
Show resolved Hide resolved

The SDK must provide the means to register Views with a MeterProvider. Here are
the inputs:

* The `name` of the View (optional). If not provided, the Instrument `name`
reyang marked this conversation as resolved.
Show resolved Hide resolved
would be used by default. This will be used as the name of the [metrics
stream](./datamodel.md#events--data-stream--timeseries).
* The Instrument selection criteria (required), which covers:
* The `name` of the Meter (optional).
* The `version` of the Meter (optional).
* The `schema_url` of the Meter (optional).
* The `name` of the Instrument (required).
* Individual language client MAY choose to support more criteria.
* The configuration for the resulting [metrics
stream](./datamodel.md#events--data-stream--timeseries):
* The `description`. If not provided, the Instrument `description` would be
jmacd marked this conversation as resolved.
Show resolved Hide resolved
used by default.
* The `attribute keys` (optional). If not provided, all the attribute keys
reyang marked this conversation as resolved.
Show resolved Hide resolved
will be used by default (TODO: once the Hint API is available, the default
behavior should respect the Hint if it is available).
* The `extra dimensions` which come from Baggage/Context (optional). If not
provided, no extra dimension will be used. Please note that this only
applies to [synchronous Instruments](./api.md#synchronous-instrument), any
`extra dimensions` configured on [asynchronous
Instruments](./api.md#asynchronous-instrument) will be considered as error.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI - regarding attribute key usage in the JVM prototype, I did the following: https://github.com/jsuereth/opentelemetry-java/blob/wip-metrics-api/sdk/metrics/src/main/java/io/opentelemetry/sdk/metrics/view/AttributesProcessors.java

There's an AttributesProcessor interface that's locked down and provides basic primitives and "join" operations to construct your behavior around altering attributes. Would this meet the specification?

Also what does "considered an error" mean in terms of how to handle this scenario? May want to link to error-handling-in-the-sdk section.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general, the metrics SDK spec would want to give some room to the language clients so they can shoot for higher performance. @reyang to follow up with @jsuereth to change the wording in the spec to give that room.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Filter attributes from the Measurements.
Take extra attributes from Baggage/Context.
Add extra attributes (most likely to be static per view, e.g. "aggregation=MAX").

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The error handling part is covered in the next section, I've removed the "... will be considered as error" part here to avoid duplication.

* The `aggregation` (optional) to be used. If not provided, a default
jmacd marked this conversation as resolved.
Show resolved Hide resolved
aggregation will be applied by the SDK. The default aggregation is a TODO
(e.g. first see if there is an explicitly specified aggregation from the
View, if not, see if there is a Hint, if not, fallback to the default
aggregation that is associated with the Instrument type), and for histogram,
the default buckets would be [0, 5.0], (5.0, 10.0], (10.0, 25.0], (25.0,
reyang marked this conversation as resolved.
Show resolved Hide resolved
50.0], (50.0, 75.0], (75.0, 100.0], (100.0, 250.0], (250.0, 500.0], (500.0,
1000.0], (1000.0, +∞).

The SDK SHOULD consider the following situation as user error and provide a way
to let the user know (e.g. expose [self-diagnostics
logs](../error-handling.md#self-diagnostics)):

* If there are more than one Instruments meeting the selection criteria.
* If the name of the View has conflict with other Views.

If there is no View registered, all the Instruments associated with the
reyang marked this conversation as resolved.
Show resolved Hide resolved
MeterProvider SHOULD be used.

If there is any View registered, only the registered View(s) SHOULD be used.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When/where?

reyang marked this conversation as resolved.
Show resolved Hide resolved

Here is one example:

```python
# Python
'''
+------------------+
| MeterProvider |
| Meter A |
| Counter X |
| Histogram Y |
| Meter B |
| Gauge Z |
+------------------+
'''

meter_provider.start_pipeline(
# metrics from X, Y and Z will be exported to console every 5 seconds (default)
pipeline: pipeline
.add_exporter(ConsoleExporter())
).start_pipeline(
# metrics from X and Y (reported as Foo and Bar) will be exported to Prometheus upon scraping
pipeline: pipeline
.add_view(name="X")
.add_view(name="Foo", instrument_name="Y")
.add_view(name="Bar", instrument_name="Y")
.add_exporter(PrometheusExporter())
)
```

TODO:

* Allow multiple pipelines (processors, exporters).
* Configure "Views".
* Support multiple pipelines (processors, exporters).
* Configure timing (related to [issue
1432](https://github.com/open-telemetry/opentelemetry-specification/issues/1432)).

Expand Down Expand Up @@ -108,7 +239,7 @@ in the SDK:

## MetricExporter

`MetricExporter` defines the interface that protocol-specific exporters must
`MetricExporter` defines the interface that protocol-specific exporters MUST
implement so that they can be plugged into OpenTelemetry SDK and support sending
of telemetry data.

Expand Down