OTel Collector compatibility of the metrics-generator #2970
Comments
Thanks for the issue!
Grafana Agent vendors OTel components directly, so it will be in line with the OTel Collector's behavior.
Yes! Currently we do have a lot of configuration options that allow the user to control the shape of the output metrics. I think a good path here is to make sure that Tempo span metrics can be configured to look like OTel Collector metrics by adding any required configuration. Then we can provide some example configurations to make the two equivalent. It is unfortunately not simple to just change the default metric names, since operators have built dashboards, alerts, etc. on top of them. This could be a very costly breaking change for some of our users. Another issue in play is that Grafana has some custom experiences built around these metrics in the Tempo Explore pane. So even if the user were able to adjust their config so that Tempo metrics looked like OTel, there would still be a gap where it would break some functionality in Grafana. Going to cc @grafana/observability-traces-and-profiling for thoughts. Also, @rlankfo has done some work in this area on our side and I would love to have his input.
Yeah, we sort of assume the names of the metrics in all the queries that use them from the Tempo side. We could make that configurable, which is not hard, although it is another layer of configuration that makes things harder for the user. I wonder if we could somehow autodetect the naming, if there is a reasonable pattern with just a few options.
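To make the autodetection idea concrete, here is a rough sketch, not anything Grafana or Tempo actually implements. It assumes the only two conventions in play are the ones named in the issue description (`traces_spanmetrics_latency_bucket` for Tempo and `[namespace_]duration_milliseconds_bucket` for the OTel connector); the function name `detect_naming` and the pattern table are hypothetical.

```python
import re

# Hypothetical pattern table built from the two metric names quoted in the
# issue description. Group 1 captures the prefix (including the trailing "_").
PATTERNS = {
    "tempo": re.compile(r"^(traces_spanmetrics_)latency_bucket$"),
    "otel": re.compile(r"^((?:.+_)?)duration_milliseconds_bucket$"),
}


def detect_naming(metric_names):
    """Return (style, prefix) for the first recognised span-metrics name.

    Returns (None, "") when no known convention is found.
    """
    for name in metric_names:
        for style, pattern in PATTERNS.items():
            match = pattern.match(name)
            if match:
                return style, match.group(1)
    return None, ""
```

A caller could feed this the list of metric names returned by the data source and pick the matching query templates, falling back to the current defaults when nothing is recognised.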
When generating metrics in Tempo, it's possible to use relabeling during remote write. This would allow you to do things like rename metrics, drop labels, etc. You should technically be able to align your metric names with semantic conventions in this way. Here's an example of a rename and label drop:

```yaml
metrics_generator:
  registry:
    external_labels:
      source: "tempo"
  storage:
    path: "/tmp/tempo/generator/wal"
    remote_write:
      - url: "${MIMIR_URL}/api/v1/push"
        send_exemplars: true
        write_relabel_configs:
          # Rename client-side service graph metrics for database connections.
          - source_labels: ["__name__", "connection_type"]
            target_label: "__name__"
            separator: "@"
            regex: "traces_service_graph_request_client_(.*)@database"
            replacement: 'db_client_duration_$1'
          # Drop the connection_type label from all series.
          - regex: "connection_type"
            action: "labeldrop"
```

In this example, I'm renaming `traces_service_graph_request_client_*` metrics to `db_client_duration_*` when `connection_type` is `database`, and then dropping the `connection_type` label.

This is a good article on how relabeling in Prometheus works: https://grafana.com/blog/2022/03/21/how-relabeling-in-prometheus-works/

I hope this helps!
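To see what the two relabel rules above would do to a single series, here is a small Python simulation. It is only a sketch of the documented Prometheus relabeling semantics (source label values joined with the separator, a fully anchored regex, `$1` expanded into the target label); the helper names `apply_replace` and `apply_labeldrop` are made up for illustration, and only the plain `$N` reference form is handled.

```python
import re


def apply_replace(labels, source_labels, target_label, regex, replacement, separator=";"):
    """Mimic a relabel rule with the default `replace` action."""
    # Join the source label values; a missing label counts as an empty string.
    joined = separator.join(labels.get(name, "") for name in source_labels)
    # Prometheus anchors the regex at both ends, so use fullmatch.
    match = re.fullmatch(regex, joined)
    if match is None:
        return dict(labels)  # no match: the series is left unchanged
    # Translate $1-style references into Python's \g<1> form before expanding.
    template = re.sub(r"\$(\d+)", r"\\g<\1>", replacement)
    updated = dict(labels)
    updated[target_label] = match.expand(template)
    return updated


def apply_labeldrop(labels, regex):
    """Mimic `action: labeldrop`: remove every label whose name matches."""
    return {k: v for k, v in labels.items() if re.fullmatch(regex, k) is None}


def relabel(series):
    """Apply the rename rule and then the label drop, in config order."""
    renamed = apply_replace(
        series,
        source_labels=["__name__", "connection_type"],
        target_label="__name__",
        regex="traces_service_graph_request_client_(.*)@database",
        replacement="db_client_duration_$1",
        separator="@",
    )
    return apply_labeldrop(renamed, "connection_type")
```

Under these assumptions, a series named `traces_service_graph_request_client_seconds_bucket` with `connection_type="database"` comes out as `db_client_duration_seconds_bucket` without the `connection_type` label, while series without that label value keep their original name.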
@aocenas This might be a nice feature to add to the list. If we can support Tempo and OTel, then we'd also get Grafana Agent (since it creates OTel metrics). Honestly, you might not even need to autodetect anything. We might be able to write some clever PromQL queries that sum both values up. Thanks for the example @rlankfo!
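One possible shape for such a query, sketched as a small Python helper that builds the PromQL string. This is an assumption about what "clever" could mean here, not an existing Grafana query: it uses PromQL's `or` operator so that whichever metric name exists supplies the result, with the first name taking precedence when both are present (so it is a fallback rather than a literal sum). The helper name `latency_rate_query` is hypothetical, and the metric names are the two quoted in the issue description.

```python
def latency_rate_query(names, window="5m"):
    """Build a PromQL expression that tries each histogram name in order.

    Each branch aggregates the bucket rate by `le`; the branches are chained
    with `or`, so series from later names only fill in label sets that the
    earlier names did not produce.
    """
    branches = [f"(sum by (le) (rate({name}[{window}])))" for name in names]
    return " or ".join(branches)


query = latency_rate_query(
    ["traces_spanmetrics_latency_bucket", "duration_milliseconds_bucket"]
)
```

Keeping the name list as a parameter means a namespaced OTel name could be added without touching the query logic.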
After some further testing, I came across some more findings, which are not fully related to the compatibility, as they also affect the specific OTel Connectors, but wanted to add them here for more context.
For these reasons, I am currently testing an alternative approach to create a Node Graph panel in Grafana based on the existing client metrics with some PromQL and Grafana transformation magic, but I still have to wrap my head around it.
Renaming the Prometheus metrics during export would probably break the Tempo panel in Grafana (node graphs, span RPS, error rate, etc.).
Is your feature request related to a problem? Please describe.
While trying to test the integration of the Span Metrics Connector, I found that there are some compatibility issues between the OTel `spanmetricsconnector` and the Tempo `spanmetrics` processor (I didn't have a look at the Grafana Agent):
- The metric names in Tempo seem to differ from the "OTel Semantic Conventions" (v1.21.0): `[namespace_]duration_milliseconds_bucket` vs. `traces_spanmetrics_latency_bucket`. From inspecting the Semantic Conventions for HTTP Metrics as well as other metrics (e.g. Promtail, Loki, OTel Automatic Instrumentations, ...), duration seems to be the correct and most commonly used name.
- The OTel `spanmetricsconnector` allows defining a namespace for the generated metrics. Grafana and Tempo use a hardcoded `traces_spanmetrics` namespace/prefix, which, it seems, cannot be changed.
Describe the solution you'd like
It would be really great if the OTel Connectors would also work seamlessly with Grafana and Tempo (including the OTel Service Graph Connector), to allow frictionless migration between different components, based on the individual use-case.
Namespace support would also be nice, but could be addressed alternatively with a default namespace in the Connector, or a note in the documentation.
Describe alternatives you've considered
To have Span Metrics and Service Graphs working with Grafana, the only viable options seem to be to use Tempo's metrics-generator, or to use a processor in the Collector to rename the metrics to be compatible with what Grafana needs.
Additional context
Change spanmetrics metric names and labels to match OTel conventions #1478
Merge span metrics and service graph into span metrics connector open-telemetry/opentelemetry-collector-contrib#26648