chore: Update to OpenTelemetry 0.31.0#8922
Merged
added 30 commits
February 19, 2026 19:40
Add comprehensive migration plan document covering:
- Dependency updates (OTel 0.31, datadog 0.19)
- Internal datadog exporter removal
- API changes (Resource, Key/KeyValue, instrument builders)
- SpanExporter trait lifetime changes with Arc<Mutex> pattern
- MetricExporter temporality configuration
- Observable gauge lifecycle management
- 18 phases with verified answers to all open questions
Update all OpenTelemetry crates to their latest compatible versions:
- opentelemetry/sdk/otlp/zipkin/prometheus/http/semantic-conventions: 0.31
- opentelemetry-aws: 0.19
- opentelemetry-datadog: 0.19
- tracing-opentelemetry: 0.32 (compatible with OTel 0.31)
This is Phase 1 of the OTel 0.31 migration.
Replace the forked internal datadog exporter with the external opentelemetry-datadog crate (version 0.19).
Key changes:
- Delete tracing/datadog_exporter/ directory with all internal exporter code
- Create tracing/datadog/propagator.rs preserving our custom propagator with full SamplingPriority support (UserReject, AutoReject, AutoKeep, UserKeep) that the external crate doesn't provide
- Update DatadogExporter usage to opentelemetry_datadog::DatadogExporter
- Update imports throughout to use the new locations
The external crate's DatadogTraceState::with_priority_sampling only takes a bool, so we keep our custom propagator implementation for full sampling priority control needed by the DatadogAgentSampling.
Remove NamedTokioRuntime and named_runtime_channel module which are no longer needed with OpenTelemetry SDK 0.31. The OTel 0.31 BatchSpanProcessor::builder() no longer takes a runtime parameter - it uses tokio internally. This simplifies all trace exporter configurations (Apollo, Datadog, OTLP, Zipkin).
OpenTelemetry SDK 0.31 replaces the Resource constructor API with a builder pattern:
- Resource::new(attrs) -> Resource::builder_empty().with_attributes(attrs).build()
- Resource::from_detectors(...) -> Resource::builder_empty().with_detectors(...).build()
- Resource::empty() -> Resource::builder_empty().build()
OpenTelemetry 0.31 removes the convenience methods Key::string(), Key::array(), etc. Update all usages to use KeyValue::new() with explicit Value types where needed.
OpenTelemetry SDK 0.31 renames the instrument builder finalization method from .init() to .build() for consistency with other builder patterns in the SDK.
OpenTelemetry SDK 0.31 adds a new required field to SpanData: parent_span_is_remote. Set to false since we construct SpanData internally from LightSpanData, not from actual OTel spans with remote parent detection.
Update SpanExporter implementations for OTel 0.31 API changes:
- export(&mut self, ...) -> export(&self, ...)
- shutdown(&mut self) -> shutdown(&self) returning ExportResult
- BoxFuture -> impl Future return type
Updated exporters:
- NamedSpanExporter (error_handler.rs)
- ExporterWrapper (datadog/mod.rs)
- FailingSpanExporter (test mock)
Note: apollo_telemetry::Exporter needs Arc<Mutex> refactoring for interior mutability which is a larger change.
- Enable semconv_experimental feature for semantic conventions
- Rename TracerProvider to SdkTracerProvider
- Rename Builder to TracerProviderBuilder in tracing reload
- Rewrite error_handler.rs for OTel 0.31 compatibility:
  - Remove global error handler (set_error_handler no longer exists)
  - Remove MetricsError usage (no longer exists in OTel 0.31)
  - Rename NamedMetricsExporter to NamedMetricExporter
  - Update SpanExporter and PushMetricExporter implementations
- Rename MetricsExporterBuilder to MetricExporterBuilder
- Fix SpanExporter trait implementation signatures
Remaining work: metrics aggregation module requires significant refactoring due to InstrumentProvider trait changes in OTel 0.31.
The opentelemetry_otlp 0.31 removed new_exporter() function in favor of direct builder patterns:
- TonicExporterBuilder::default() for gRPC transport
- HttpExporterBuilder::default() for HTTP transport
- SpanExporterBuilder::new().with_tonic() for span exporters
- MetricExporterBuilder::new().with_tonic() for metric exporters
Also updated build_metrics_exporter() to use with_temporality().build() as the aggregation/temporality selector API has been simplified.
The opentelemetry_sdk 0.31 simplified the ResourceDetector trait by removing the timeout parameter from detect(). Updated all three implementations (StaticResourceDetector, EnvServiceNameDetector, ConfigResourceDetector) to use the new signature.
The opentelemetry-datadog crate was only in dev-dependencies but is used in the main source code. Moved it to regular dependencies and removed the temporary comment noting it was disabled. Version 0.19 is compatible with opentelemetry 0.31.
The TemporalitySelector trait was removed in OpenTelemetry SDK 0.31. Temporality is now set directly on metric exporters via with_temporality().
Removed:
- CustomTemporalitySelector struct and TemporalitySelector impl
- Related temporality override tests
- Import of CustomTemporalitySelector in metrics/apollo/mod.rs
Added simple From<&Temporality> conversion for direct temporality setting.
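The `From<&Temporality>` conversion mentioned above can be sketched with standard-library types only. Both enums here are hypothetical stand-ins: `ConfigTemporality` models a configuration-side value and `SdkTemporality` models the SDK's `Temporality`, so the names and variants are assumptions rather than the Router's actual code.

```rust
/// Hypothetical configuration-side enum (for example, parsed from YAML).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ConfigTemporality {
    Cumulative,
    Delta,
}

/// Stand-in for the SDK's temporality type (assumption, simplified).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SdkTemporality {
    Cumulative,
    Delta,
}

impl From<&ConfigTemporality> for SdkTemporality {
    fn from(value: &ConfigTemporality) -> Self {
        // A direct one-to-one mapping; no selector object is involved.
        match value {
            ConfigTemporality::Cumulative => SdkTemporality::Cumulative,
            ConfigTemporality::Delta => SdkTemporality::Delta,
        }
    }
}
```

With a conversion like this, the configured value can be passed straight to an exporter builder's temporality setter instead of being wrapped in a selector.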
In OTel SDK 0.31, the `new_view(Instrument, Stream)` function was removed and replaced with closure-based views: `Fn(&Instrument) -> Option<Stream>`.
Changes:
- Update MetricsBuilder::with_view() to accept closure instead of Box<dyn View>
- Replace MetricView's TryInto<Box<dyn View>> with into_view_fn() method
- Update allocation views to use closure pattern
- Use Stream::builder() API instead of Stream::new() builder pattern
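The shape of a closure-based view can be illustrated without the SDK. In this sketch `InstrumentInfo`, `StreamConfig`, and `bucket_view` are hypothetical stand-ins for the SDK's `Instrument`, `Stream`, and a view factory; only the `Fn(&Instrument) -> Option<Stream>` shape comes from the commit message.

```rust
/// Stand-in for the SDK's instrument description (hypothetical, simplified).
#[derive(Debug)]
pub struct InstrumentInfo {
    pub name: String,
}

/// Stand-in for the SDK's stream configuration (hypothetical, simplified).
#[derive(Debug, PartialEq)]
pub struct StreamConfig {
    pub boundaries: Vec<f64>,
}

/// Builds a view closure that overrides histogram buckets for one instrument
/// name and returns `None` for every other instrument, leaving defaults intact.
pub fn bucket_view(
    target: String,
    boundaries: Vec<f64>,
) -> impl Fn(&InstrumentInfo) -> Option<StreamConfig> + Send + Sync + 'static {
    move |instrument: &InstrumentInfo| {
        if instrument.name == target {
            Some(StreamConfig {
                boundaries: boundaries.clone(),
            })
        } else {
            None
        }
    }
}
```

Returning `None` is what lets several views coexist: each closure only claims the instruments it recognizes.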
The global shutdown_tracer_provider() function was removed in OTel 0.31. Instead, set a new default tracer provider which causes the old one to be returned and dropped, triggering its shutdown.
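The replace-then-drop idea can be modeled with the standard library alone. `DemoProvider` and `GlobalSlot` are hypothetical stand-ins for a tracer provider and the global provider slot; the sketch only demonstrates that replacing the stored value hands back the old one, whose `Drop` then runs the shutdown logic.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::{Arc, Mutex};

/// Stand-in provider whose `Drop` performs shutdown (hypothetical type).
pub struct DemoProvider {
    pub shut_down: Arc<AtomicBool>,
}

impl Drop for DemoProvider {
    fn drop(&mut self) {
        // Record that shutdown ran when the provider is dropped.
        self.shut_down.store(true, Ordering::SeqCst);
    }
}

/// Stand-in for a global provider slot (hypothetical type).
pub struct GlobalSlot {
    current: Mutex<Option<DemoProvider>>,
}

impl GlobalSlot {
    pub fn new() -> Self {
        Self {
            current: Mutex::new(None),
        }
    }

    /// Installs `provider` and returns the previously installed one, if any.
    pub fn set(&self, provider: DemoProvider) -> Option<DemoProvider> {
        self.current.lock().unwrap().replace(provider)
    }
}
```

The old provider shuts down only once the returned value is dropped, so a caller that holds on to it also delays the shutdown.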
The new_pipeline() function was removed in opentelemetry-zipkin 0.31. Use ZipkinExporter::builder() with with_collector_endpoint() instead. Service name is now handled via the Resource on the TracerProvider rather than being set directly on the exporter.
OTel 0.31 changed SpanExporter trait methods from &mut self to &self:
- export(&mut self, ...) -> export(&self, ...)
- shutdown(&mut self) -> shutdown(&self) -> OTelSdkResult
- set_resource(&mut self, ...) -> set_resource(&self, ...)
Introduce ExporterInner struct wrapped in Mutex to provide interior mutability while keeping all existing method implementations intact. The outer Exporter delegates to inner.lock().*_impl() methods.
Also updates ApolloOtlpExporter to use &self and return OTelSdkResult, and replaces TraceError/ExportResult with OTelSdkError/OTelSdkResult.
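A minimal sketch of the ExporterInner pattern, using only the standard library. `BatchExporter` is a hypothetical, simplified stand-in for a `&self`-based exporter trait, and the string batches and error type are assumptions made to keep the example self-contained.

```rust
use std::sync::Mutex;

/// Stand-in for an exporter trait whose methods take `&self` (hypothetical).
pub trait BatchExporter {
    fn export(&self, batch: Vec<String>) -> Result<(), String>;
}

/// All mutable state lives here, so existing `&mut self` methods stay as-is.
#[derive(Default)]
pub struct ExporterInner {
    exported: Vec<String>,
}

impl ExporterInner {
    fn export_impl(&mut self, batch: Vec<String>) -> Result<(), String> {
        self.exported.extend(batch);
        Ok(())
    }
}

/// Outer type: satisfies the `&self` trait by locking and delegating.
#[derive(Default)]
pub struct Exporter {
    inner: Mutex<ExporterInner>,
}

impl BatchExporter for Exporter {
    fn export(&self, batch: Vec<String>) -> Result<(), String> {
        // A poisoned lock is reported as an export error instead of panicking.
        let mut inner = self.inner.lock().map_err(|e| e.to_string())?;
        inner.export_impl(batch)
    }
}

impl Exporter {
    /// Number of items exported so far (0 if the lock is poisoned).
    pub fn exported_count(&self) -> usize {
        self.inner
            .lock()
            .map(|inner| inner.exported.len())
            .unwrap_or(0)
    }
}
```

The appeal of this layout is that the method bodies move unchanged into `ExporterInner`; only the thin delegating layer is new.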
In OTel SDK 0.31, the tracer provider struct was renamed from TracerProvider to SdkTracerProvider. The builder type remains TracerProviderBuilder.
Updated all usages across:
- src/tracer.rs
- src/plugins/telemetry/reload/activation.rs
- src/plugins/telemetry/reload/otel.rs
- src/plugins/telemetry/reload/tracing.rs
- src/plugins/telemetry/otel/tracer.rs
- tests/common.rs
Temporality moved from opentelemetry_sdk::metrics::data::Temporality to opentelemetry_sdk::metrics::Temporality in OTel SDK 0.31.
Update the migration plan with detailed findings from implementation attempt:
- Add current status section tracking completed commits
- Add Phase 10A for tonic 0.14.5 upgrade (required first)
- Update Phase 10B/11 with exact SpanExporter/SpanProcessor signatures
- Rewrite Phase 18 with observable instrument API changes discovered:
  - ObservableCounter::new() takes 0 args in 0.31
  - with_inner() is pub(crate) only
  - observe() removed from observable types
  - Solution: store extra observables in keep_alive collection
- Add Phase 19 for trace Config API (builder methods removed)
- Add Phase 20/21 for remaining fixes and test updates
- Add summary of remaining commits in order
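The keep_alive idea from Phase 18 can be sketched as follows, under the assumption that a delegate's callback stays registered only while its handle is alive. `DelegateObservable` and `AggregateObservable` are hypothetical stand-ins, and the live counter exists purely so the effect of dropping can be observed.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;

/// Stand-in for a delegate observable handle (hypothetical). Dropping it
/// decrements the live count, modeling a callback that gets unregistered.
pub struct DelegateObservable {
    live: Arc<AtomicUsize>,
}

impl DelegateObservable {
    pub fn register(live: &Arc<AtomicUsize>) -> Self {
        live.fetch_add(1, Ordering::SeqCst);
        Self {
            live: Arc::clone(live),
        }
    }
}

impl Drop for DelegateObservable {
    fn drop(&mut self) {
        self.live.fetch_sub(1, Ordering::SeqCst);
    }
}

/// Aggregate handle: exposes one delegate and stores the others only so
/// that they are not dropped while the aggregate is still in use.
pub struct AggregateObservable {
    pub primary: DelegateObservable,
    keep_alive: Vec<DelegateObservable>,
}

impl AggregateObservable {
    /// Returns `None` when there are no delegates to aggregate.
    pub fn new(mut delegates: Vec<DelegateObservable>) -> Option<Self> {
        if delegates.is_empty() {
            return None;
        }
        let primary = delegates.remove(0);
        Some(Self {
            primary,
            keep_alive: delegates,
        })
    }

    pub fn delegate_count(&self) -> usize {
        1 + self.keep_alive.len()
    }
}
```

All delegates live exactly as long as the aggregate, which ties their lifetime to a single owner instead of leaking them.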
opentelemetry-otlp 0.31 depends on tonic 0.14.5. Update direct dependency to match and avoid version conflicts.
Feature names changed in tonic 0.14:
- tls → tls-ring
- tls-roots → tls-native-roots
Also update tonic-build to 0.14.5 for compatibility.
SpanProcessor trait changes in OTel 0.31:
- force_flush() now returns OTelSdkResult instead of TraceResult<()>
- shutdown() replaced by shutdown_with_timeout(Duration)
- Import path: opentelemetry_sdk::error::OTelSdkResult
Updated implementations:
- ApolloFilterSpanProcessor in tracing/mod.rs
- DatadogSpanProcessor in tracing/datadog/span_processor.rs
- MockSpanProcessor in test code
SpanExporter trait changes in OTel 0.31:
- export() returns impl Future instead of BoxFuture (remove #[async_trait])
- shutdown(&self) → shutdown(&mut self)
- set_resource(&self) → set_resource(&mut self)
- force_flush(&mut self) required (has default impl)
Updated implementations:
- Exporter in apollo_telemetry.rs
- ApolloOtlpExporter::shutdown in apollo_otlp_exporter.rs
- Call site in shutdown_impl now uses &mut self.otlp_exporter
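The move from BoxFuture to an `impl Future` return type can be shown with a small standard-library sketch (it needs Rust 1.75 or later). `AsyncExporter`, `CountingExporter`, and the tiny `block_on` helper are hypothetical stand-ins; the real trait has more methods and different types.

```rust
use std::future::Future;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Stand-in for an exporter trait whose `export` returns `impl Future`
/// (hypothetical, simplified).
pub trait AsyncExporter {
    fn export(&self, batch: Vec<String>) -> impl Future<Output = Result<usize, String>> + Send;
}

pub struct CountingExporter;

impl AsyncExporter for CountingExporter {
    // A plain `async fn` satisfies the `impl Future` signature, so neither
    // `#[async_trait]` nor a boxed future is needed.
    async fn export(&self, batch: Vec<String>) -> Result<usize, String> {
        Ok(batch.len())
    }
}

struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Minimal executor for futures that never wait on external events.
pub fn block_on<F: Future>(future: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut future = pin!(future);
    loop {
        if let Poll::Ready(output) = future.as_mut().poll(&mut cx) {
            return output;
        }
    }
}
```

The busy-polling `block_on` is only suitable for a demonstration like this one; production code would drive the future on a real runtime.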
In tonic 0.14, the prost-related build functionality was moved from tonic-build to a separate tonic-prost-build crate. This includes the configure() function used for protobuf compilation.
Changes:
- Add tonic-prost-build 0.14.0 as a build dependency
- Update studio.rs to use tonic_prost_build::configure()
- Fix compile_protos() call to pass PathBuf directly (API change)
ExportError was moved from opentelemetry to opentelemetry_sdk crate.
Aggregation enum moved from opentelemetry_sdk::metrics to opentelemetry_sdk::metrics::aggregation module.
Major API changes in the metrics module:
- MeterProvider::versioned_meter replaced with meter_with_scope
- SyncCounter/SyncHistogram/SyncGauge/SyncUpDownCounter traits replaced with unified SyncInstrument trait using measure() method
- AsyncInstrument::as_any() method removed
- InstrumentProvider methods now take builder types directly instead of individual parameters
- Observable instrument callbacks are now set via builder, not register_callback method
- opentelemetry::metrics::Result type removed
Updated AggregateInstrumentProvider macros to use new InstrumentBuilder, HistogramBuilder, and AsyncInstrumentBuilder parameter types.
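One consequence of a unified trait with a single measure() method is that an aggregate instrument needs only one fan-out implementation. The sketch below assumes that design; `Measure`, `SummingInstrument`, and `AggregateInstrument` are hypothetical stand-ins, and attributes are modeled as plain string pairs.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;

/// Stand-in for a unified synchronous-instrument trait (hypothetical).
pub trait Measure<T> {
    fn measure(&self, value: T, attributes: &[(&str, &str)]);
}

/// Delegate that sums every measurement it receives.
#[derive(Default)]
pub struct SummingInstrument {
    total: AtomicU64,
}

impl SummingInstrument {
    pub fn total(&self) -> u64 {
        self.total.load(Ordering::SeqCst)
    }
}

impl Measure<u64> for SummingInstrument {
    fn measure(&self, value: u64, _attributes: &[(&str, &str)]) {
        self.total.fetch_add(value, Ordering::SeqCst);
    }
}

/// Aggregate that forwards each measurement to every delegate.
pub struct AggregateInstrument<T> {
    pub delegates: Vec<Arc<dyn Measure<T>>>,
}

impl<T: Copy> Measure<T> for AggregateInstrument<T> {
    fn measure(&self, value: T, attributes: &[(&str, &str)]) {
        for delegate in &self.delegates {
            delegate.measure(value, attributes);
        }
    }
}
```

With separate per-kind traits, the same forwarding loop would have to be repeated for counters, histograms, gauges, and up-down counters.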
The AggregationSelector trait was removed in OpenTelemetry SDK 0.31. Histogram bucket boundaries are now configured using the Views API on the MeterProvider instead of passing an aggregation selector to exporters.
Changes:
- Remove CustomAggregationSelector from metrics/mod.rs
- Update OTLP exporter to use MetricExporter::builder() pattern
- Update Prometheus exporter to remove with_aggregation_selector()
- Add histogram bucket boundary views to both exporters
- Update Apollo metrics to use MetricExporter::builder()
- Add spec_unstable_metrics_views feature for Aggregation type access
- Fix Aggregation import path (now opentelemetry_sdk::metrics::Aggregation)
OpenTelemetry 0.31 introduced significant changes to the metrics API:
MeterProvider changes:
- `versioned_meter()` replaced by `meter_with_scope(InstrumentationScope)`
- `GlobalMeterProvider` removed, use `Arc<dyn MeterProvider + Send + Sync>`
- Added `public_dynamic()` for dynamic meter providers
InstrumentProvider changes:
- Methods now take builder types (InstrumentBuilder, HistogramBuilder, AsyncInstrumentBuilder) instead of individual parameters
- `register_callback()` and related types (Observer, CallbackRegistration) removed
Observable instrument changes:
- Observable instruments are now marker types without `observe()` method
- Observations happen through callbacks registered at build time
- Aggregate observables now leak delegate storage to keep registrations alive
Other changes:
- Remove duplicate Eq/Hash derives from prost (now included by default)
- AsyncInstrument trait now requires `T: Send + Sync` bounds
- StreamBuilder::with_allowed_attribute_keys() now takes impl IntoIterator
- Update test code to use new APIs
Member
i like the changeset, has all the important information and only the important information 👍
Member
goto-bus-stop
left a comment
I pretty much went thru everything and this is all my comments!
# This means including the rmp library
# opentelemetry-datadog = { version = "0.12.0", features = ["reqwest-client"] }
opentelemetry-aws = "0.19"
rmp = "0.8"
Member
Should we retain the TEMP DATADOG comments around this?
Contributor
Author
I mean, we don't have the dependency at all anymore, so maybe not?
// manually filter salsa logs because some of them run at the INFO level https://github.com/salsa-rs/salsa/issues/425
let log_level = format!("{log_level},salsa=error");
// filter opentelemetry internal logs to warn level (OTel 0.31 emits INFO logs for provider setup)
let log_level = format!("{log_level},salsa=error,opentelemetry=warn");
Member
Is it still possible for users to explicitly set the opentelemetry log level w/ RUST_LOG?
Contributor
Author
No. We're effectively saying that it will always be warn.
Co-authored-by: bryn <bryn@apollographql.com>
Member
I guess both the reintroduced tests are a bit flaky... Edit: oh, that one is just because of the name. Thanks for the fix, Rohan 😁
goto-bus-stop
approved these changes
Mar 13, 2026
rohan-b99
approved these changes
Mar 13, 2026
smyrick
pushed a commit
that referenced
this pull request
Mar 17, 2026
Co-authored-by: bryn <bryn@apollographql.com> Co-authored-by: Renée <renee.kooi@apollographql.com> Co-authored-by: rohan-b99 <43239788+rohan-b99@users.noreply.github.com>
smyrick
pushed a commit
that referenced
this pull request
Mar 20, 2026
Co-authored-by: bryn <bryn@apollographql.com> Co-authored-by: Renée <renee.kooi@apollographql.com> Co-authored-by: rohan-b99 <43239788+rohan-b99@users.noreply.github.com>
Updates the Router to OTel 0.31.0, the latest version at the time of writing.
Most of the work is adapting to changed upstream APIs, but a couple of areas were not compatible and required code changes.
Closes #7794
Closes #8368
Checklist
Complete the checklist (and note appropriate exceptions) before the PR is marked ready-for-review.
Exceptions
Note any exceptions here
Notes
Footnotes
It may be appropriate to bring upcoming changes to the attention of other (impacted) groups. Please endeavour to do this before seeking PR approval. The mechanism for doing this will vary considerably, so use your judgement as to how and when to do this. ↩
Configuration is an important part of many changes. Where applicable please try to document configuration examples. ↩
A lot of (if not most) features benefit from built-in observability and debug-level logs. Please read this guidance on metrics best-practices. ↩
Tick whichever testing boxes are applicable. If you are adding Manual Tests, please document the manual testing (extensively) in the Exceptions. ↩