[OTel] Setup OTel's metrics client #229696
      return response;
    })
    .finally(() => {
      originalExit(exitCode);
We shouldn't call process.exit here. Kibana has a graceful shutdown that could be affected by this if the OTel shutdown is completed faster.
WDYT of calling it only when process.exit was called?
According to the Node.js documentation, if you want to queue anything before exiting the process, you need to add it to the beforeExit listener.
That doesn't help when the process itself calls process.exit():

> The 'beforeExit' event is not emitted for conditions causing explicit termination, such as calling process.exit() or uncaught exceptions.

process.on('exit') only allows sync operations while beforeExit allows async work. Reacting to SIGINT and SIGTERM is probably sufficient though.
I added a listener to uncaughtExceptionMonitor as well. This way, we react to uncaught exceptions, which end up crashing the process and are typically a moment when we want to flush the events.
The reason for using uncaughtExceptionMonitor instead of uncaughtException is that it doesn't stop the process from crashing.
Regarding explicit calls to process.exit(): they only occur in bootstrap.ts when there's a fatal error (which is logged). Fatal errors are either config or migration errors.
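To make the outcome of this thread concrete, here is a minimal sketch of wiring a flush callback to SIGINT, SIGTERM and uncaughtExceptionMonitor without ever calling process.exit. The names `registerFlushHooks`, `ShutdownEventSource` and `FlushFn` are hypothetical and not taken from the PR; the flush callback stands in for whatever shuts down the OTel providers. The sketch only starts the flush; it does not guarantee that an asynchronous flush started from uncaughtExceptionMonitor completes before the process goes down.

```typescript
// Minimal structural type so the sketch does not depend on Node typings.
// Node's `process` object exposes a compatible `on` method.
interface ShutdownEventSource {
  on(eventName: string, listener: () => void): unknown;
}

// Hypothetical flush callback (assumption: it resolves once telemetry is flushed).
type FlushFn = () => Promise<void>;

// Hypothetical helper: registers the flush on the events discussed in the thread.
// It never calls process.exit, so the application's own graceful shutdown stays in control.
function registerFlushHooks(source: ShutdownEventSource, flush: FlushFn): () => Promise<void> {
  let inFlight: Promise<void> | undefined;

  // Several events can fire during a single shutdown; run the flush at most once.
  const flushOnce = (): Promise<void> => {
    if (inFlight === undefined) {
      // Swallow flush errors so they cannot turn into unhandled rejections.
      inFlight = flush().catch(() => {});
    }
    return inFlight;
  };

  for (const eventName of ['SIGINT', 'SIGTERM', 'uncaughtExceptionMonitor']) {
    source.on(eventName, () => {
      void flushOnce();
    });
  }

  return flushOnce;
}
```

In a Node.js program the first argument would be `process`; taking it as a parameter keeps the sketch easy to exercise with a fake event source.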
…to otel/instrument-metrics
      readers: meterReaders,
    });

    api.metrics.setGlobalMeterProvider(meterProvider);
This essentially makes our Prometheus endpoint incompatible with the Core OTel metrics exporters.
IMO, it's OK for now (I didn't want to create more changes in this PR). But we'll need a follow-up PR to address this incompatibility.
Can you help me understand the impact of this a bit better? AFAICT, if someone is exporting metrics to Prometheus, this call to setGlobalMeterProvider will override the one in @kbn/tracing. So it's kind of an edge case where someone configures opentelemetry.metrics.prometheus.enabled: true AND telemetry.metrics.enabled: true?
In that case, is it possible/preferable to throw an unhandled exception that says "global meter provider already registered... you cannot set both x and y"? Otherwise this approach LGTM.
Your understanding is correct! (The only nit: it will override the one in @kbn/metrics 😬.)
I've created this issue to address it: #230184

> In that case, is it possible/preferable to throw an unhandled exception that says "global meter provider already registered... you cannot set both x and y"?

I tried figuring out a way to do that, but I couldn't find any way to detect that a GlobalMeterProvider was already registered (OTel Metrics always has one, at least a NoopMeterProvider). We could check that the registered one is not an instance of the NoopMeterProvider, and warn/throw the error.
However, I wish that we could come up with a way to add the Prometheus exporter to the Core-registered MeterProvider (similar to what we did with tracing for the langfuse and phoenix exporters defined in the inference plugin).
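The "not an instance of the no-op provider" check suggested above could look roughly like the sketch below. It deliberately does not use the real OpenTelemetry API: `MeterProviderLike`, `NoopProviderStandIn` and `GlobalMeterProviderSlot` are hypothetical stand-ins that only model the idea of a global slot that always holds at least a no-op provider.

```typescript
// Stand-in types: these are NOT the @opentelemetry/api classes (assumption for illustration).
interface MeterProviderLike {
  readonly label: string;
}

// Models the default provider that is always present before anything is registered.
class NoopProviderStandIn implements MeterProviderLike {
  readonly label = 'noop';
}

// Models a global registration slot that refuses to replace a real provider.
class GlobalMeterProviderSlot {
  private current: MeterProviderLike = new NoopProviderStandIn();

  get(): MeterProviderLike {
    return this.current;
  }

  // Replacing the no-op default is fine; replacing anything else is reported as a conflict.
  set(next: MeterProviderLike, owner: string): void {
    if (!(this.current instanceof NoopProviderStandIn)) {
      throw new Error(
        `Global meter provider already registered (${this.current.label}); ` +
          `cannot also register one for ${owner}`
      );
    }
    this.current = next;
  }
}
```

Whether to throw or only warn is the open question from the thread; the sketch throws so that the conflicting configuration is visible at startup.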
jloleysens
left a comment
Nice work @afharo !
    @@ -0,0 +1,90 @@
    # @kbn/metrics

    This package includes the logic to initialize the OpenTelemetry Metrics client and its exporters.
Why is this a private package? @kbn/tracing isn't.
It's not used outside platform. I'm planning to reorg the packages and will move @kbn/tracing and @kbn/tracing-config to the private end as well. But I didn't want to increase this PR's scope.
What does "platform" even mean here? What does "private" mean? Why would something like a Discover plugin be able to use it, but APM wouldn't? What about scripts that want to setup tracing?
> What does "platform" even mean here? What does "private" mean?

Those are SKA concepts that have been discussed many times.
In my (oversimplified) mental model:
- "platform": not solution-specific code
- "private": code that cannot be imported outside "platform" (cannot be imported by solution-specific code).

> Why would something like a Discover plugin be able to use it, but APM wouldn't? What about scripts that want to setup tracing?

AFAIK, any scripts (solution's or not) setting up the OTel tracing should use @kbn/telemetry (which will remain as "shared") and not the internal @kbn/tracing or @kbn/metrics (the one that's currently private in this PR).
Re Discover plugin vs. APM: TBH, I would have loved to have a "core" group that would have restricted the "platform" plugins. But we discovered this during the migration and it was a risk to add it that late based on our tight deadlines. We might introduce that split in the future if we notice that "platform" plugins import core-internal packages.
NOTE: Claiming that a package is private is not set in stone. If we identify a use case where it's needed, we can move it to shared. However, starting it as private makes the public surface more manageable to everyone (maintainers and consumers).
Yes we are confusing core with Platform, which is my point. This setup is silly, and in general I don't see why we should default to private.
I agree with your point re SKA. Do you consider this a blocker in this PR? I don't see how initMetrics from @kbn/metrics would be used in isolation (it would replace the global MeterProvider potentially set up in initTelemetry), and that's why I consider it "private" (too bad that it's still exposed to all platform packages and plugins).
If you think that it should be "shared", happy to move it to unblock the PR.
No, it's not a blocker. I just think it's a useless concept, and I hope we fix it soon: either everything as public, which would be my preference, or "scoped" private packages (or maybe just sub-packages of packages).
    switch (variant.type) {
      case 'grpc': {
        const metadata = new Metadata();
        Object.entries(variant.value.headers || {}).forEach(([key, value]) => {
should we not call this metadata in the case of grpc?
The env var is called OTEL_EXPORTER_OTLP_HEADERS, and they look like headers Authorization=.... This is why I chose to keep headers in the config as well. The fact that it's passed as metadata is an implementation detail, IMO.
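For readers unfamiliar with the env var format referenced here: OTEL_EXPORTER_OTLP_HEADERS carries a comma-separated list of `key=value` pairs. The sketch below shows how such a string maps onto a plain headers record like the `headers` config setting. `parseOtlpHeaders` is a hypothetical helper, not code from this PR, and it is simplified: it does no URL-decoding and does not handle values that contain commas.

```typescript
// Hypothetical helper: turns "k1=v1,k2=v2" into a headers record.
function parseOtlpHeaders(raw: string): Record<string, string> {
  const headers: Record<string, string> = {};
  for (const pair of raw.split(',')) {
    // Split on the first "=" only, so values such as base64 strings keep their padding.
    const separator = pair.indexOf('=');
    if (separator <= 0) {
      continue; // skip empty or malformed entries
    }
    const key = pair.slice(0, separator).trim();
    const value = pair.slice(separator + 1).trim();
    if (key.length > 0) {
      headers[key] = value;
    }
  }
  return headers;
}
```

Whether the resulting record is later sent as HTTP headers or as gRPC metadata is then up to the exporter, which matches the "implementation detail" argument in the reply above.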
    /**
     * Global toggle for telemetry. It disables all form of telemetry: product analytics, OTel tracing and OTel metrics.
     */
    enabled: schema.boolean({ defaultValue: true }),
should this be true?
Oh, I got thrown off by the fact it's called telemetryTracingSchemaProps. Should it just be telemetrySchemaProps?
Yes. It is the current default. Otherwise, the "product telemetry" would be disabled by default.
but do you want to change the name of the var?
Oh! I missed the 2nd comment (network issues the other day). I'll rename the var.
Renamed in fa33d15
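As a reading aid for this thread, the sketch below shows how a global `telemetry.enabled` toggle can gate the per-signal toggles. The shape `TelemetryToggles` and the helper `effectiveToggles` are assumptions made for illustration; the actual schema in the PR is defined with config-schema and is not reproduced here.

```typescript
// Assumed shape, loosely mirroring telemetry.enabled, telemetry.tracing.enabled
// and telemetry.metrics.enabled.
interface TelemetryToggles {
  enabled: boolean;
  tracing: { enabled: boolean };
  metrics: { enabled: boolean };
}

// Hypothetical helper: a signal is only active when both the global toggle
// and its own toggle are on.
function effectiveToggles(config: TelemetryToggles): { tracing: boolean; metrics: boolean } {
  return {
    tracing: config.enabled && config.tracing.enabled,
    metrics: config.enabled && config.metrics.enabled,
  };
}
```

With this shape, keeping the global default at `true` leaves product telemetry on by default while each OTel signal can still be switched individually.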
      return async () => {};
    }
    if (telemetryConfig.tracing.enabled) {
      initTracing({ resource, tracingConfig: telemetryConfig.tracing });
is it safe to call this before registering instrumentations?
I think so.
If an instrumentation calls trace.getTracer(), it'll get the tracer in the TracerProvider registered by initTracing.
If registered before calling initTracing, trace.getTracer() returns the NoopTracerProvider, essentially discarding the traces.
💚 Build Succeeded
cc @afharo
dgieselaar
left a comment
Thanks!
pickypg
left a comment
LGTM as long as #230184 is a fast-follow.
## Summary

Resolves elastic#229933
Partially addresses elastic#224860
Partially addresses elastic#230002

Notable changes in this PR:

* Adds instrumentation for the OTel Metrics client, accepting 2 configurations for the exporters: gRPC and HTTP.
* Extends the OTel resource definition with additional properties, and makes them shared between tracing and metrics instrumentations.
* Improves the `telemetry` config schema definition.
* Applies validation to the tracing and metrics configuration so that config-schema-defined defaults can be used.
* Conditionally registers the metrics provider in the plugin `monitoring_collection` to stop it from replacing the newly registered global metrics provider if it has no exporters to set up.
* Registers cherry-picked metric-relevant EDOT-provided autoinstrumentations only when metric collection is enabled.

---

### Small demo of the collected metrics

I've created a small dashboard to demo the metrics that are automatically collected by the registered instrumentation:

<img width="2310" height="884" alt="image" src="https://github.com/user-attachments/assets/9b3ebea4-b45c-4f33-a05f-c9f2ac7c3175" />

<img width="2305" height="839" alt="image" src="https://github.com/user-attachments/assets/fcb77d72-38e9-494f-a164-736bbc5fef05" />

If you want to see it live, download [this file](https://github.com/user-attachments/files/21534687/export.ndjson.zip), unzip it, and import the resulting `export.ndjson` file in a Serverless Observability project.

Then, click on "Add data" > "Application" > "OpenTelemetry" > "Managed OTLP Endpoint", and copy the URL and API Key.

<img width="1727" height="991" alt="image" src="https://github.com/user-attachments/assets/60d94e92-ca6d-4002-a6cd-b4c6e3b85c3f" />

Then configure your local Kibana with the following:

```yaml
telemetry.metrics:
  enabled: true
  interval: 10s
  exporters:
    - grpc:
        url: <the URL you copied>
        headers:
          authorization: "ApiKey <the API key that was generated>"
```

---

### `@opentelemetry/exporter-metrics-otlp-http`

- [x] **Purpose:** What is this dependency used for? Briefly explain its role in your changes.

  This is used to set up the OTel Metrics OTLP exporter using the HTTP protocol. At the moment, we are only capable of shipping the OTel metrics using the gRPC protocol, but we'd like to be able to enable support for the HTTP exporter as well.

- [x] **Justification:** Why is adding this dependency the best approach?

  It's the official and recommended exporter. When using the OTel/EDOT SDKs, if the process is run with the env var `OTEL_EXPORTER_OTLP_ENDPOINT`, it automatically sets up this exporter. We need to programmatically import it because we want to use the settings coming in `kibana.yml` instead, and we're not allowed to use dependencies that are not listed in our `package.json` (but this library was already installed by the SDK, as can be seen in the [yarn.lock file](https://github.com/elastic/kibana/pull/229696/files#diff-51e4f558fae534656963876761c95b83b6ef5da5103c4adef6768219ed76c2deR9483)).

- [x] **Alternatives explored:** Were other options considered (e.g., using existing internal libraries/utilities, implementing the functionality directly)? If so, why was this dependency chosen over them?

  We didn't consider any other alternatives, since this is the official (and already installed by the SDK) OTel HTTP metrics exporter.

- [x] **Existing dependencies:** Does Kibana have a dependency providing similar functionality? If so, why is the new one preferred?

  Yes, this library is already installed by the SDK for auto-configuration in case the env var `OTEL_EXPORTER_OTLP_ENDPOINT` is attached to the process. Similarly, we already use OTel's **Traces** (as opposed to Metrics) OTLP exporter using HTTP ([link](https://github.com/elastic/kibana/blob/6d55439bd795b6c8f01084dc4ce8b0e2cb7eb0a1/package.json#L1135)). This is its Metrics counterpart.

---

### Checklist

Check that the PR satisfies the following conditions.

Reviewers should verify this PR satisfies this list as well.

- [x] [Documentation](https://www.elastic.co/guide/en/kibana/master/development-documentation.html) was added for features that require explanation or tutorials
- [x] [Unit or functional tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html) were updated or added to match the most common scenarios
- [x] If a plugin configuration key changed, check if it needs to be allowlisted in the cloud and added to the [docker list](https://github.com/elastic/kibana/blob/main/src/dev/build/tasks/os_packages/docker_generator/resources/base/bin/kibana-docker)
- [x] This was checked for breaking HTTP API changes, and any breaking changes have been approved by the breaking-change committee. The `release_note:breaking` label should be applied in these situations.
- [x] The PR description includes the appropriate Release Notes section, and the correct `release_note:*` label is applied per the [guidelines](https://www.elastic.co/guide/en/kibana/master/contributing.html#kibana-release-notes-process)
- [x] Review the [backport guidelines](https://docs.google.com/document/d/1VyN5k91e5OVumlc0Gb9RPa3h1ewuPE705nRtioPiTvY/edit?usp=sharing) and apply applicable `backport:*` labels.

### Identify risks

Does this PR introduce any risks? For example, consider risks like hard-to-test bugs, performance regression, potential of data loss.

Describe the risk, its severity, and mitigation for each identified risk. Invite stakeholders and evaluate how to proceed before merging.

- [ ] [See some risk examples](https://github.com/elastic/kibana/blob/main/RISK_MATRIX.mdx)
- [ ] ...

---------

Co-authored-by: kibanamachine <42973632+kibanamachine@users.noreply.github.com>
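To illustrate how an `exporters` list with `grpc` and `http` entries, as in the configuration example in this description, can be turned into the `variant.type` / `variant.value` shape seen in the reviewed diff, here is a minimal sketch. All type and function names (`ExporterSettings`, `ExporterEntry`, `ExporterVariant`, `toVariant`, `describeExporter`) are hypothetical, and the sketch builds descriptive strings rather than real OTel exporters.

```typescript
// Assumed settings shape for a single exporter entry.
interface ExporterSettings {
  url: string;
  headers?: Record<string, string>;
}

// One list entry as it would appear in the YAML: `- grpc: {...}` or `- http: {...}`.
type ExporterEntry = { grpc: ExporterSettings } | { http: ExporterSettings };

// Discriminated union that is convenient to branch on.
type ExporterVariant =
  | { type: 'grpc'; value: ExporterSettings }
  | { type: 'http'; value: ExporterSettings };

// Hypothetical helper: normalizes a config entry into a tagged variant.
function toVariant(entry: ExporterEntry): ExporterVariant {
  if ('grpc' in entry) {
    return { type: 'grpc', value: entry.grpc };
  }
  return { type: 'http', value: entry.http };
}

// Hypothetical helper: stands in for "construct the right exporter".
// For gRPC the configured headers would travel as metadata; for HTTP as headers.
function describeExporter(variant: ExporterVariant): string {
  const headerCount = Object.keys(variant.value.headers || {}).length;
  if (variant.type === 'grpc') {
    return `grpc ${variant.value.url} (${headerCount} metadata entries)`;
  }
  return `http ${variant.value.url} (${headerCount} headers)`;
}
```

Keeping a single `headers` setting for both protocols and translating it at construction time follows the reasoning given in the review thread about `OTEL_EXPORTER_OTLP_HEADERS`.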