Trace and metrics exporter wrappers to append details to errors#8363

Merged
bonnici merged 16 commits into dev from njm/ROUTER-1283/otel-error-messages on Oct 13, 2025

Conversation

bonnici (Contributor) commented Oct 1, 2025

Adds a wrapper around MetricsExporters and SpanExporters so that the exporter type can be appended to error messages. This allows us to differentiate between e.g. "Apollo OTel" errors and "OTLP exporter" errors.

Error messages will now look like:

  • Apollo traces sent via OTLP
ERROR  OpenTelemetry trace error occurred: [apollo traces] Exporter otlp encountered the following error(s): the grpc server returns error (The service is currently unavailable): , detailed error message: dns error
  • Apollo traces sent via usage reports
ERROR  OpenTelemetry trace error occurred: [apollo traces] Exporter ApolloExporter encountered the following error(s): Apollo exporter unavailable error: error sending request for url (https://usage-reporting.api.dev0.c0.gql.zone/api/ingress/traces)
  • Zipkin traces
ERROR  OpenTelemetry trace error occurred: [zipkin traces] error sending request for url (http://0.0.0.1:4100/api/v2/spans)
  • OTLP traces
ERROR  OpenTelemetry trace error occurred: [otlp traces] Exporter otlp encountered the following error(s): the grpc server returns error (The service is currently unavailable): , detailed error message: tcp connect error
  • Apollo OTLP metrics
ERROR  OpenTelemetry metric error occurred: Metrics error: [apollo metrics] the grpc server returns error (The service is currently unavailable): , detailed error message: dns error
  • Apollo usage report metrics (unchanged)
ERROR  failed to submit Apollo report: Apollo exporter unavailable error: error sending request for url (https://usage-reporting.api.dev0.c0.gql.zone/api/ingress/traces)
  • Other OTLP metrics
OpenTelemetry metric error occurred: Metrics error: [otlp metrics] the grpc server returns error (The service is currently unavailable): , detailed error message: tcp connect error
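
The wrapping approach described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the `Exporter` trait, `ExportError`, `NamedExporter`, and `FailingExporter` are all hypothetical, simplified stand-ins (the real OpenTelemetry exporter traits are asynchronous and have different signatures). It only shows the idea of delegating to an inner exporter and prefixing any error with a label such as `otlp traces`.

```rust
use std::fmt;

/// Hypothetical error type standing in for whatever an exporter returns.
#[derive(Debug, Clone, PartialEq)]
pub struct ExportError(pub String);

impl fmt::Display for ExportError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str(&self.0)
    }
}

impl std::error::Error for ExportError {}

/// Hypothetical, simplified exporter trait used only for this sketch.
pub trait Exporter {
    fn export(&mut self, batch: &[String]) -> Result<(), ExportError>;
}

/// Wraps any exporter and prefixes its errors with a label,
/// for example "apollo traces" or "otlp metrics".
pub struct NamedExporter<E> {
    name: &'static str,
    inner: E,
}

impl<E: Exporter> NamedExporter<E> {
    pub fn new(name: &'static str, inner: E) -> Self {
        Self { name, inner }
    }
}

impl<E: Exporter> Exporter for NamedExporter<E> {
    fn export(&mut self, batch: &[String]) -> Result<(), ExportError> {
        // Delegate to the wrapped exporter; successes pass through untouched,
        // errors are rewritten to include the label in square brackets.
        self.inner
            .export(batch)
            .map_err(|err| ExportError(format!("[{}] {}", self.name, err)))
    }
}

/// Stand-in exporter that always fails, to demonstrate the prefixing.
pub struct FailingExporter;

impl Exporter for FailingExporter {
    fn export(&mut self, _batch: &[String]) -> Result<(), ExportError> {
        Err(ExportError("dns error".to_string()))
    }
}
```

With this sketch, wrapping `FailingExporter` as `NamedExporter::new("otlp traces", FailingExporter)` and calling `export` yields an error whose message is `[otlp traces] dns error`, which mirrors the bracketed prefix in the messages listed above.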

Checklist

Complete the checklist (and note appropriate exceptions) before the PR is marked ready-for-review.

  • PR description explains the motivation for the change and relevant context for reviewing
  • PR description links appropriate GitHub/Jira tickets (creating when necessary)
  • Changeset is included for user-facing changes
  • Changes are compatible [1]
  • Documentation [2] completed
  • Performance impact assessed and acceptable
  • Metrics and logs are added [3] and documented
  • Tests added and passing [4]
    • Unit tests
    • Integration tests
    • Manual tests, as necessary

Exceptions

Note any exceptions here

Notes

Footnotes

  1. It may be appropriate to bring upcoming changes to the attention of other (impacted) groups. Please endeavour to do this before seeking PR approval. The mechanism for doing this will vary considerably, so use your judgement as to how and when to do this.

  2. Configuration is an important part of many changes. Where applicable please try to document configuration examples.

  3. A lot of (if not most) features benefit from built-in observability and debug-level logs. Please read this guidance on metrics best-practices.

  4. Tick whichever testing boxes are applicable. If you are adding Manual Tests, please document the manual testing (extensively) in the Exceptions.

apollo-librarian bot (Contributor) commented Oct 1, 2025

✅ Docs preview has no changes

The preview was not built because there were no changes.

Build ID: b8fce3fb9722fa6379b82d87

@bonnici bonnici marked this pull request as ready for review October 2, 2025 02:13
@bonnici bonnici requested a review from a team October 2, 2025 02:13
@bonnici bonnici requested a review from a team as a code owner October 2, 2025 02:13
@bonnici bonnici force-pushed the njm/ROUTER-1283/otel-error-messages branch from 6d86e23 to d693ace October 2, 2025 02:42
@bonnici bonnici force-pushed the njm/ROUTER-1283/otel-error-messages branch from 461df35 to 62b639c October 3, 2025 01:07
@BrynCooke BrynCooke left a comment

Looks like this improves on and removes the need for NamedTokioRuntime? If so, let's remove it. If not, then we can go ahead and merge.

bonnici (Contributor, Author) commented Oct 9, 2025

> Looks like this improves on and removes the need for NamedTokioRuntime? If so, let's remove it. If not, then we can go ahead and merge.

I wasn't sure what NamedTokioRuntime was used for, but if it's just for this use-case I'll remove it.

@BrynCooke BrynCooke self-requested a review October 10, 2025 08:04
@@ -0,0 +1,5 @@
### Trace and metrics exporter wrappers to append details to errors ([PR #8363](https://github.com/apollographql/router/pull/8363))

Added a wrapper around `MetricsExporter`s and `SpanExporter`s so that the exporter type can be appended to error messages. This allows us to differentiate between e.g. "Apollo OTel" errors and "OTLP exporter" errors.
Member

Changesets should focus on end user impact. I'm guessing it would be something like this:

Suggested change
-Added a wrapper around `MetricsExporter`s and `SpanExporter`s so that the exporter type can be appended to error messages. This allows us to differentiate between e.g. "Apollo OTel" errors and "OTLP exporter" errors.
+Error messages raised during tracing and metrics exporting now indicate if the error occurred when exporting to Apollo Studio or to the user's configured OTLP endpoint.

but it would also be helpful to clarify exactly how and why this is useful for customers

@bonnici bonnici enabled auto-merge (squash) October 13, 2025 04:18
@bonnici bonnici merged commit 5bb86c7 into dev Oct 13, 2025
15 checks passed
@bonnici bonnici deleted the njm/ROUTER-1283/otel-error-messages branch October 13, 2025 04:36
@abernix abernix mentioned this pull request Oct 27, 2025