feat(otel): add support for otel metrics api via protobuf and json #6783
- Add metrics.proto and metrics_service.proto (OTLP v1 spec)
- Update protobuf_loader to support metrics protos
- Rename protos/ -> otlp/ directory for better organization

- Create OtlpHttpExporterBase for shared HTTP export logic
- Create OtlpTransformerBase for shared transformation logic
- Refactor logs exporter/transformer to extend base classes
- Update test mocking paths
- Eliminates ~400 lines of duplication
packages/dd-trace/src/opentelemetry/metrics/periodic_metric_reader.js
  dataPoint.timeUnixNano = timestamp
}

#aggregateHistogram (metric, value, attributes, attrKey, timestamp, stateKey, cumulativeState) {
Benchmarks
Benchmark execution time: 2025-11-03 20:03:42
Comparing candidate commit 822e4dc in PR branch
Found 0 performance improvements and 0 performance regressions! Performance is the same for 1604 metrics, 66 unstable metrics.
Overall package size
Self size: 13.26 MB
Dependency sizes

| name | version | self size | total size |
|------|---------|-----------|------------|
| @datadog/libdatadog | 0.7.0 | 35.02 MB | 35.02 MB |
| @datadog/native-appsec | 10.3.0 | 20.73 MB | 20.74 MB |
| @datadog/native-iast-taint-tracking | 4.0.0 | 11.72 MB | 11.73 MB |
| @datadog/pprof | 5.12.0 | 11.19 MB | 11.57 MB |
| @opentelemetry/core | 1.30.1 | 908.66 kB | 7.16 MB |
| protobufjs | 7.5.4 | 2.95 MB | 5.82 MB |
| @datadog/wasm-js-rewriter | 4.0.1 | 2.85 MB | 3.58 MB |
| @opentelemetry/resources | 1.9.1 | 306.54 kB | 1.74 MB |
| @datadog/native-metrics | 3.1.1 | 1.02 MB | 1.43 MB |
| @opentelemetry/api-logs | 0.207.0 | 201.39 kB | 1.42 MB |
| @opentelemetry/api | 1.9.0 | 1.22 MB | 1.22 MB |
| jsonpath-plus | 10.3.0 | 617.18 kB | 1.08 MB |
| import-in-the-middle | 1.15.0 | 127.66 kB | 856.24 kB |
| lru-cache | 10.4.3 | 804.3 kB | 804.3 kB |
| @datadog/openfeature-node-server | 0.1.0-preview.13 | 106.46 kB | 424.36 kB |
| opentracing | 0.14.7 | 194.81 kB | 194.81 kB |
| source-map | 0.7.6 | 185.63 kB | 185.63 kB |
| pprof-format | 2.2.1 | 163.06 kB | 163.06 kB |
| @datadog/sketches-js | 2.1.1 | 109.9 kB | 109.9 kB |
| lodash.sortby | 4.7.0 | 75.76 kB | 75.76 kB |
| ignore | 7.0.5 | 63.38 kB | 63.38 kB |
| istanbul-lib-coverage | 3.2.2 | 34.37 kB | 34.37 kB |
| rfdc | 1.4.1 | 27.15 kB | 27.15 kB |
| dc-polyfill | 0.1.10 | 26.73 kB | 26.73 kB |
| @isaacs/ttlcache | 1.4.1 | 25.2 kB | 25.2 kB |
| tlhunter-sorted-set | 0.1.0 | 24.94 kB | 24.94 kB |
| shell-quote | 1.8.3 | 23.74 kB | 23.74 kB |
| limiter | 1.1.5 | 23.17 kB | 23.17 kB |
| retry | 0.13.1 | 18.85 kB | 18.85 kB |
| semifies | 1.0.0 | 15.84 kB | 15.84 kB |
| jest-docblock | 29.7.0 | 8.99 kB | 12.76 kB |
| crypto-randomuuid | 1.0.0 | 11.18 kB | 11.18 kB |
| ttl-set | 1.0.0 | 4.61 kB | 9.69 kB |
| mutexify | 1.4.0 | 5.71 kB | 8.74 kB |
| path-to-regexp | 0.1.12 | 6.6 kB | 6.6 kB |
| module-details-from-path | 1.0.4 | 3.96 kB | 3.96 kB |
| escape-string-regexp | 5.0.0 | 3.66 kB | 3.66 kB |

🤖 This report was automatically generated by heaviest-objects-in-the-universe
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #6783      +/-   ##
==========================================
+ Coverage   83.62%   83.93%   +0.31%
==========================================
  Files         506      514       +8
  Lines       21373    21709     +336
==========================================
+ Hits        17873    18222     +349
+ Misses       3500     3487      -13
Metrics are collected periodically and exported via OTLP over HTTP. The protocol can be configured using `OTEL_EXPORTER_OTLP_METRICS_PROTOCOL` or `OTEL_EXPORTER_OTLP_PROTOCOL` environment variables. Supported protocols are `http/protobuf` (default) and `http/json`. All metrics use delta aggregation temporality to match Datadog's data model. For complete OTLP exporter configuration options, see the [OpenTelemetry OTLP Exporter documentation](https://opentelemetry.io/docs/languages/sdk-configuration/otlp-exporter/).
The text is totally fine. I think it would just be more straightforward if each individual configuration entry listed its details right away, instead of having a separate section with a longer text that describes them.
I would therefore inline this content into the variables above, except for the parts that apply across multiple environment variables.
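For reference, the protocol selection described in the documentation text under review could be sketched roughly as follows. This is a minimal sketch, not the PR's code: `resolveMetricsProtocol` is a hypothetical helper, and falling back to the default for unsupported values is an assumption. The signal-specific variable taking precedence over the general one follows the OpenTelemetry exporter configuration convention.

```javascript
// Hypothetical helper, not the PR's actual code: resolves the OTLP metrics
// protocol from an environment-like object.
const SUPPORTED_PROTOCOLS = new Set(['http/protobuf', 'http/json'])
const DEFAULT_PROTOCOL = 'http/protobuf'

function resolveMetricsProtocol (env = process.env) {
  // The signal-specific variable takes precedence over the general one.
  const raw = env.OTEL_EXPORTER_OTLP_METRICS_PROTOCOL || env.OTEL_EXPORTER_OTLP_PROTOCOL
  if (!raw) return DEFAULT_PROTOCOL

  const protocol = raw.trim().toLowerCase()
  // Assumption: unsupported values (for example "grpc") fall back to the default.
  return SUPPORTED_PROTOCOLS.has(protocol) ? protocol : DEFAULT_PROTOCOL
}

console.log(resolveMetricsProtocol({})) // http/protobuf
console.log(resolveMetricsProtocol({ OTEL_EXPORTER_OTLP_PROTOCOL: 'http/json' })) // http/json
console.log(resolveMetricsProtocol({
  OTEL_EXPORTER_OTLP_PROTOCOL: 'http/json',
  OTEL_EXPORTER_OTLP_METRICS_PROTOCOL: 'http/protobuf'
})) // http/protobuf
```

Passing the environment in as a parameter keeps the helper easy to test without mutating `process.env`.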
target.otelLogsBatchTimeout = maybeInt(OTEL_BSP_SCHEDULE_DELAY)
target.otelLogsMaxExportBatchSize = maybeInt(OTEL_BSP_MAX_EXPORT_BATCH_SIZE)

const otelMetricsExporter = String(OTEL_METRICS_EXPORTER).toLowerCase() !== 'none'
Nit:
- const otelMetricsExporter = String(OTEL_METRICS_EXPORTER).toLowerCase() !== 'none'
+ const otelMetricsExporter = !OTEL_METRICS_EXPORTER || OTEL_METRICS_EXPORTER.toLowerCase() !== 'none'
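To compare the two forms, here is a small sketch. The helper names `viaString` and `viaGuard` are made up for illustration. Environment variables are either strings or `undefined`, and for those inputs both forms agree; the guarded form simply avoids comparing against the intermediate string `'undefined'`.

```javascript
// Original form: coerces any value to a string before comparing.
const viaString = (value) => String(value).toLowerCase() !== 'none'

// Suggested form: skips the coercion when the variable is unset or empty.
const viaGuard = (value) => !value || value.toLowerCase() !== 'none'

// Typical environment variable values: unset, empty, and a few strings.
for (const value of [undefined, '', 'none', 'NONE', 'otlp']) {
  console.log(JSON.stringify(value), viaString(value), viaGuard(value))
}
```

Only `'none'` in any letter case yields `false` in either form.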
target.otelLogsMaxExportBatchSize = maybeInt(OTEL_BSP_MAX_EXPORT_BATCH_SIZE)

const otelMetricsExporter = String(OTEL_METRICS_EXPORTER).toLowerCase() !== 'none'
this.#setBoolean(target, 'otelMetricsEnabled', DD_METRICS_OTEL_ENABLED && otelMetricsExporter)
Do we have some documentation about the `none` exporter deactivating the metrics?
target.otelMetricsTimeout = maybeInt(OTEL_EXPORTER_OTLP_METRICS_TIMEOUT) || target.otelTimeout
target.otelMetricsExportTimeout = maybeInt(OTEL_METRIC_EXPORT_TIMEOUT)
target.otelMetricsExportInterval = maybeInt(OTEL_METRIC_EXPORT_INTERVAL)
Is zero allowed for any of these values?
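The question matters because `||` treats a parsed `0` as missing. A minimal sketch of the difference, assuming a stand-in `maybeInt` (the real helper in dd-trace-js may behave differently) and an arbitrary fallback value:

```javascript
// Hypothetical stand-in for the `maybeInt` helper referenced in the diff;
// the real helper in dd-trace-js may behave differently.
function maybeInt (value) {
  if (value === undefined || value === null || value === '') return undefined
  const parsed = Number.parseInt(value, 10)
  return Number.isNaN(parsed) ? undefined : parsed
}

const fallbackTimeout = 10000 // assumed value, for illustration only

// `||` treats a parsed 0 as missing and silently applies the fallback.
const withOr = maybeInt('0') || fallbackTimeout

// `??` only falls back when the parsed value is undefined or null.
const withNullish = maybeInt('0') ?? fallbackTimeout

console.log(withOr) // 10000
console.log(withNullish) // 0
```

If zero should be rejected instead of honored, an explicit range check would make that intent visible.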
if (OTEL_EXPORTER_OTLP_ENDPOINT || OTEL_EXPORTER_OTLP_METRICS_ENDPOINT) {
  this.#setString(target, 'otelMetricsUrl', OTEL_EXPORTER_OTLP_METRICS_ENDPOINT || target.otelUrl)
}
this.#setString(target, 'otelMetricsHeaders', OTEL_EXPORTER_OTLP_METRICS_HEADERS || target.otelHeaders)
Note: these will mess with telemetry values for now. That is an issue in lots of places, and it will be resolved in another PR where we fix the telemetry. The issue is that the property can be defined either by another property or by the env, and telemetry cannot differentiate between the two when it is defined like that.
#startTime

constructor (temporalityPreference = TEMPORALITY.DELTA) {
  this.#temporalityPreference = temporalityPreference
  this.#startTime = Number(process.hrtime.bigint())
- #startTime
- constructor (temporalityPreference = TEMPORALITY.DELTA) {
-   this.#temporalityPreference = temporalityPreference
-   this.#startTime = Number(process.hrtime.bigint())
+ #startTime = Number(process.hrtime.bigint())
+ constructor (temporalityPreference = TEMPORALITY.DELTA) {
+   this.#temporalityPreference = temporalityPreference
if (!metricsMap.has(metricKey)) {
  metricsMap.set(metricKey, {
    name,
    description,
    unit,
    type,
    instrumentationScope,
    temporality: this.#getTemporality(type),
    data: [],
    dataPointMap: new Map()
  })
}

const metric = metricsMap.get(metricKey)
- if (!metricsMap.has(metricKey)) {
-   metricsMap.set(metricKey, {
-     name,
-     description,
-     unit,
-     type,
-     instrumentationScope,
-     temporality: this.#getTemporality(type),
-     data: [],
-     dataPointMap: new Map()
-   })
- }
- const metric = metricsMap.get(metricKey)
+ let metric = metricsMap.get(metricKey)
+ if (!metric) {
+   metric = {
+     name,
+     description,
+     unit,
+     type,
+     instrumentationScope,
+     temporality: this.#getTemporality(type),
+     data: [],
+     dataPointMap: new Map()
+   }
+   metricsMap.set(metricKey, metric)
+ }
this.#applyDeltaTemporality(metrics, lastExportedState)

return metrics
Is this a public return type or is it only used internally?
const scopeKey = this.#getScopeKey(instrumentationScope)
const metricKey = `${scopeKey}:${name}:${type}`
const attrKey = JSON.stringify(attributes)
Using the result of `JSON.stringify` on an object as a key is not stable, because the output depends on property insertion order.
E.g., `{ a: 1, b: 2 }` does not serialize to the same string as `{ b: 2, a: 1 }`.
You can use a library such as https://www.npmjs.com/package/safe-stable-stringify
This should also be checked in other parts of the code where we do similar things.
Please also add test cases for that.
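To illustrate the problem and one dependency-free way around it, here is a minimal sketch. `stableAttributesKey` is a hypothetical helper, not something in the PR, and it assumes attribute values are primitives or arrays of primitives (the value types OpenTelemetry attributes allow), so only top-level keys need sorting.

```javascript
// Hypothetical helper: builds a key that does not depend on insertion order.
// Assumption: attribute values are primitives or arrays of primitives, so
// sorting the top-level keys is enough.
function stableAttributesKey (attributes = {}) {
  const sortedEntries = Object.keys(attributes)
    .sort()
    .map((key) => [key, attributes[key]])
  return JSON.stringify(sortedEntries)
}

const first = { a: 1, b: 2 }
const second = { b: 2, a: 1 }

// Plain JSON.stringify follows insertion order, so the keys differ.
console.log(JSON.stringify(first) === JSON.stringify(second)) // false

// Sorting the attribute names first yields the same key for both objects.
console.log(stableAttributesKey(first) === stableAttributesKey(second)) // true
```

A library such as the one linked above handles nested objects and other edge cases that this sketch deliberately ignores.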
  }
}

delete metric.dataPointMap
I am uncertain why that property is removed. Could you add a comment? I would also likely just set it to undefined anyway.
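For context on the two options, a small sketch of how they differ in plain JavaScript. The object shapes are made up for illustration, and whether the PR's protobuf encoding path treats an `undefined` field the same way as a missing one is not something this sketch verifies.

```javascript
// Option 1: remove the property entirely.
const deleted = { name: 'requests', dataPointMap: new Map() }
delete deleted.dataPointMap

// Option 2: keep the property but clear its value.
const cleared = { name: 'requests', dataPointMap: new Map() }
cleared.dataPointMap = undefined

console.log('dataPointMap' in deleted) // false
console.log('dataPointMap' in cleared) // true
console.log(Object.keys(cleared)) // [ 'name', 'dataPointMap' ]

// JSON.stringify omits undefined-valued properties, so both serialize alike.
console.log(JSON.stringify(deleted) === JSON.stringify(cleared)) // true
```

For the JSON export path the two are therefore equivalent; the difference only shows up in code that enumerates keys or checks for the property's presence.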
What does this PR do?
Adds full OpenTelemetry Metrics support to dd-trace-js with a custom Meter Provider implementation. Enable with `DD_METRICS_OTEL_ENABLED=true` to export metrics via OTLP protocol.

Key Features:
- `http/protobuf` (default) or `http/json` protocols
- `OTEL_EXPORTER_OTLP_METRICS_*` environment variables

Configuration:
- `DD_METRICS_OTEL_ENABLED` - Enable OpenTelemetry metrics (default: `false`)
- `OTEL_EXPORTER_OTLP_METRICS_ENDPOINT` - Endpoint URL (default: `http://localhost:4318/v1/metrics`)
- `OTEL_EXPORTER_OTLP_METRICS_PROTOCOL` - Protocol: `http/protobuf` or `http/json`
- `OTEL_METRIC_EXPORT_INTERVAL` - Export interval in ms (default: `60000`)

Motivation
Enables customers to use OpenTelemetry Metrics API with dd-trace-js without adding the OpenTelemetry SDK as a dependency. Custom implementation provides better integration with dd-trace-js configurations, avoids vendoring grpc libraries and maintains flexibility.
Additional Notes