Receive OTLP HTTP metrics, forward to Prometheus, collision on metric names #21471

Closed
marcalff opened this issue Apr 17, 2023 · 11 comments

@marcalff (Member) commented Apr 17, 2023

Setup:

  • C++ application (test app, opentelemetry-cpp 1.9.0, OTLP HTTP exporter)
  • opentelemetry-collector 0.75.0
  • prometheus 2.42.0

In a test application, define meters and metrics as follows:

  • Meter my_meter_1
    • Metric my_metric_a
    • Metric my_metric_b
  • Meter my_meter_2
    • Metric my_metric_a
    • Metric my_metric_b
    • Metric my_metric_c

As far as I understand, this is valid, because what identifies a metric is the instrumentation scope + the metric name, so that my_meter_1.my_metric_a and my_meter_2.my_metric_a are two different metrics.
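For illustration, a minimal opentelemetry-cpp sketch of this instrumentation (API level only; the MeterProvider and OTLP HTTP exporter wiring is omitted, and the function name and recorded values are placeholders, not the exact test app):

#include <opentelemetry/metrics/provider.h>

namespace metrics_api = opentelemetry::metrics;

void instrument()
{
  auto provider = metrics_api::Provider::GetMeterProvider();

  // Two distinct instrumentation scopes (meters).
  auto meter_1 = provider->GetMeter("my_meter_1", "1.0");
  auto meter_2 = provider->GetMeter("my_meter_2", "1.0");

  // The same instrument names are reused across the two scopes,
  // which the API allows: scope + name identifies the metric.
  auto counter_1a = meter_1->CreateUInt64Counter("my_metric_a", "description a");
  auto counter_1b = meter_1->CreateUInt64Counter("my_metric_b", "description b");
  auto counter_2a = meter_2->CreateUInt64Counter("my_metric_a", "description a");
  auto counter_2b = meter_2->CreateUInt64Counter("my_metric_b", "description b");
  auto counter_2c = meter_2->CreateUInt64Counter("my_metric_c", "description c");

  counter_1a->Add(1);
  counter_1b->Add(1);
  counter_2a->Add(1);
  counter_2b->Add(1);
  counter_2c->Add(1);
}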

Export this data over OTLP HTTP to an opentelemetry-collector.

From the collector logs, the data received is:

2023-04-17T20:40:58.805+0200	info	MetricsExporter	{"kind": "exporter", "data_type": "metrics", "name": "logging", "#metrics": 5}
2023-04-17T20:40:58.805+0200	info	ResourceMetrics #0
Resource SchemaURL: 
Resource attributes:
     -> telemetry.sdk.version: Str(1.9.0)
     -> telemetry.sdk.language: Str(cpp)
     -> service.version: Str(xxx)
     -> telemetry.sdk.name: Str(opentelemetry)
     -> service.instance.id: Str(xxx)
     -> service.name: Str(xxx)
     -> service.namespace: Str(xxx)
ScopeMetrics #0
ScopeMetrics SchemaURL: 
InstrumentationScope my_meter_1 1.0
Metric #0
Descriptor:
     -> Name: my_metric_b
     -> Description: description b
     -> Unit: 
     -> DataType: Sum
     -> IsMonotonic: true
     -> AggregationTemporality: Cumulative
NumberDataPoints #0
StartTimestamp: 2023-04-17 18:40:37.784020265 +0000 UTC
Timestamp: 2023-04-17 18:40:58.697529476 +0000 UTC
Value: 110
Metric #1
Descriptor:
     -> Name: my_metric_a
     -> Description: description a
     -> Unit: 
     -> DataType: Sum
     -> IsMonotonic: true
     -> AggregationTemporality: Cumulative
NumberDataPoints #0
StartTimestamp: 2023-04-17 18:40:37.784020265 +0000 UTC
Timestamp: 2023-04-17 18:40:58.697529476 +0000 UTC
Value: 108
ScopeMetrics #1
ScopeMetrics SchemaURL: 
InstrumentationScope my_meter_2 1.0
Metric #0
Descriptor:
     -> Name: my_metric_b
     -> Description: description b
     -> Unit: 
     -> DataType: Sum
     -> IsMonotonic: true
     -> AggregationTemporality: Cumulative
NumberDataPoints #0
StartTimestamp: 2023-04-17 18:40:37.784020265 +0000 UTC
Timestamp: 2023-04-17 18:40:58.697764025 +0000 UTC
Value: 116
Metric #1
Descriptor:
     -> Name: my_metric_a
     -> Description: description a
     -> Unit: 
     -> DataType: Sum
     -> IsMonotonic: true
     -> AggregationTemporality: Cumulative
NumberDataPoints #0
StartTimestamp: 2023-04-17 18:40:37.784020265 +0000 UTC
Timestamp: 2023-04-17 18:40:58.697764025 +0000 UTC
Value: 114
Metric #2
Descriptor:
     -> Name: my_metric_c
     -> Description: description c
     -> Unit: 
     -> DataType: Sum
     -> IsMonotonic: true
     -> AggregationTemporality: Cumulative
NumberDataPoints #0
StartTimestamp: 2023-04-17 18:40:37.784020265 +0000 UTC
Timestamp: 2023-04-17 18:40:58.697764025 +0000 UTC
Value: 118
	{"kind": "exporter", "data_type": "metrics", "name": "logging"}

So far, this works as expected: the opentelemetry collector receives data that describe:

  • a measurement for instrumentation scope my_meter_1 and metric my_metric_a
  • another measurement for instrumentation scope my_meter_2 and metric my_metric_a

Now, in the opentelemetry collector, forward the metric data to prometheus:

receivers:
  otlp:
    protocols:
      http:

processors:
  batch:
  memory_limiter:
    # 75% of maximum memory up to 4G
    limit_mib: 1536
    # 25% of limit up to 2G
    spike_limit_mib: 512
    check_interval: 5s

exporters:
  logging:
    loglevel: debug

  # Forward metrics to Prometheus
  prometheusremotewrite:
    endpoint: "http://localhost:9090/api/v1/write"
    # For official Prometheus (e.g. running via Docker)
    # endpoint: 'http://prometheus:9090/api/v1/write'
    # tls:
    #   insecure: true

service:
  pipelines:
    metrics:
      receivers: [otlp]
      processors: [memory_limiter, batch]
      exporters: [logging, prometheusremotewrite]

Data is sent using prometheusremotewrite, and is visible in prometheus.

The issue is that prometheus only displays:

  • my_metric_a
  • my_metric_b
  • my_metric_c

The fully qualified name (my_meter_1.my_metric_a) is lost in the chain, so data from different meters (my_meter_1 and my_meter_2) is mixed whenever there is a collision on the metric name (my_metric_a).

Expected result:

  • my_meter_1_my_metric_a visible in prometheus (or something similar)
  • my_meter_2_my_metric_a visible in prometheus

In other words, there should be some way to preserve the fully qualified name of a metric when exporting it from OTLP to prometheus, so that metrics from different meters can be separated.
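To make the collision concrete, after translation both scopes collapse onto the same Prometheus series name, roughly like this (the label set is hypothetical and depends on the resource-to-label mapping and naming options):

my_metric_a{job="xxx", instance="xxx"}  108   <- from my_meter_1
my_metric_a{job="xxx", instance="xxx"}  114   <- from my_meter_2, written to the same series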

marcalff added the bug label Apr 17, 2023
@bogdandrutu (Member) commented:
Move to contrib, since this is an issue in translating otlp to prometheus remote write.

bogdandrutu transferred this issue from open-telemetry/opentelemetry-collector May 3, 2023
github-actions bot (Contributor) commented May 3, 2023

Pinging code owners for exporter/prometheusremotewrite: @Aneurysm9 @rapphil. See Adding Labels via Comments if you do not have permissions to add labels yourself.

@Aneurysm9 (Member) commented:
The PRW translation logic needs to be updated to conform to the spec for handling instrumentation scope in Prometheus exporters. The PRW exporter should also be given a configuration option to disable adding these labels.
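For illustration, with spec-compliant scope handling the two instruments would remain distinguishable through scope labels, roughly like this (a sketch; the scope label names follow the Prometheus compatibility spec, the other labels are assumed):

my_metric_a{otel_scope_name="my_meter_1", otel_scope_version="1.0", job="xxx", instance="xxx"}  108
my_metric_a{otel_scope_name="my_meter_2", otel_scope_version="1.0", job="xxx", instance="xxx"}  114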

@marcalff (Member Author) commented:
I am familiar with neither the code nor Go, so I can't provide a patch.

From reading the code, it seems the naming translation happens in:

  • File pkg/translator/prometheus/normalize_name.go
  • Function BuildPromCompliantName()
  • Function normalizeName()

github-actions bot (Contributor) commented:
This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

github-actions bot added the Stale label Jul 17, 2023
@marcalff (Member Author) commented:
@Aneurysm9 @rapphil @kovrus

Ping to avoid this issue being closed automatically; the problem is still relevant.

If this is tracked by another issue number, please link it.

github-actions bot removed the Stale label Jul 18, 2023
github-actions bot (Contributor) commented:
This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

github-actions bot added the Stale label Sep 18, 2023
@marcalff (Member Author) commented:
Again:

@Aneurysm9 @rapphil @kovrus

Ping to avoid this issue being closed automatically; the problem is still relevant.

If this is tracked by another issue number, please link it.

In light of:

I would like to know how an application is expected to instrument metrics here:

  • Should the metric name be globally unique, avoiding the collision?
  • Is it OK to have the same metric name provided by different meters, and if so, how does the prometheus exporter handle the collision?

github-actions bot removed the Stale label Sep 19, 2023
github-actions bot (Contributor) commented:
This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

github-actions bot added the Stale label Nov 20, 2023
github-actions bot (Contributor) commented:
This issue has been closed as inactive because it has been stale for 120 days with no activity.

github-actions bot closed this as not planned (won't fix, can't repro, duplicate, stale) Jan 19, 2024
@marcalff (Member Author) commented:
Ping @Aneurysm9 @rapphil @kovrus @bogdandrutu
CC @svrnm

This issue is still relevant, and it is still blocking deployment of an application using:
opentelemetry-cpp --> opentelemetry-collector --> prometheus

Per previous analysis from @Aneurysm9:

The PRW translation logic needs to be updated to conform to the spec for handling instrumentation scope in Prometheus exporters. The PRW exporter should also be given a configuration option to disable adding these labels.

This is a confirmed bug in the prometheus remote write exporter.

Why are verified bugs automatically closed by github actions? What purpose does that serve?
