@swcollard swcollard commented Sep 2, 2025

This change adds OpenTelemetry instrumentation to the Apollo MCP Server. With this feature, metrics and traces can be exported to any OpenTelemetry Protocol (OTLP) compatible collector and consumed by any observability backend that supports OTel, such as Grafana, Datadog, or Jaeger.

The documentation for configuring the feature is included in this PR, along with the list of reported metrics.

For reviewers, the main areas are:

  • Configuration for setting up metric and trace exporting.
  • Instrumentation of key code paths: MCP functions such as tool calls and initialization, and the generation of MCP tools from GraphQL operations.
  • A new prebuild step that generates an enum of metrics and attributes from a telemetry.toml file.
  • Propagation of OTel trace headers to the downstream API for distributed tracing of MCP tool calls.
  • Configuration to omit potentially high-cardinality attributes from the exported metrics and traces.
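
The prebuild step in the list above could look roughly like the following sketch. It is not the code from this PR: `render_enum` is a hypothetical helper, the variant names are made up, and the entries are hard-coded where the real step would read them from telemetry.toml with a TOML parser. Only the attribute keys `apollo.mcp.tool_name` and `apollo.mcp.request_id` come from the diff under review.

```rust
/// Render Rust source for an enum plus an `as_str` method.
/// Each entry pairs a variant name with the attribute key it stands for.
/// (Hypothetical helper; the real build step reads its entries from telemetry.toml.)
pub fn render_enum(enum_name: &str, entries: &[(&str, &str)]) -> String {
    let mut out = String::new();

    // Enum declaration: one variant per entry.
    out.push_str(&format!("pub enum {enum_name} {{\n"));
    for (variant, _) in entries {
        out.push_str(&format!("    {variant},\n"));
    }
    out.push_str("}\n\n");

    // `as_str` maps every variant back to its attribute key.
    out.push_str(&format!("impl {enum_name} {{\n"));
    out.push_str("    pub fn as_str(&self) -> &'static str {\n");
    out.push_str("        match self {\n");
    for (variant, key) in entries {
        out.push_str(&format!("            Self::{variant} => \"{key}\",\n"));
    }
    out.push_str("        }\n    }\n}\n");

    out
}
```

In a Cargo build script, the returned string would typically be written to a file under `OUT_DIR` and pulled into the crate with `include!`, so the enum and its `as_str` mapping always come from the same source.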

apollo-librarian bot commented Sep 2, 2025

✅ Docs preview ready

The preview is ready to be viewed. View the preview

File Changes

2 new, 4 changed, 0 removed
+ (developer-tools)/apollo-mcp-server/(latest)/cors.mdx
+ (developer-tools)/apollo-mcp-server/(latest)/telemetry.mdx
* (developer-tools)/apollo-mcp-server/(latest)/config-file.mdx
* (developer-tools)/apollo-mcp-server/(latest)/define-tools.mdx
* (developer-tools)/apollo-mcp-server/(latest)/guides/auth-auth0.mdx
* (developer-tools)/apollo-mcp-server/(latest)/_sidebar.yaml

Build ID: c512a9892f30671ee8a3a2d6
Build Logs: View logs

URL: https://www.apollographql.com/docs/deploy-preview/c512a9892f30671ee8a3a2d6

@swcollard swcollard force-pushed the feature/telemetry branch 2 times, most recently from fd83a1a to a2e0713 Compare September 23, 2025 17:30
@swcollard swcollard marked this pull request as ready for review September 23, 2025 18:08
@swcollard swcollard requested review from a team as code owners September 23, 2025 18:08

#### Observability platform integration

The MCP server works with any OTLP-compatible backend. Consult your provider's documentation for specific endpoint URLs and authentication:
Contributor

Love the joining-the-dots here!

Two links are throwing 404s though - DataDog & New Relic.

Contributor Author

Good catch, just updated those with new links

Contributor

@nicholascioli nicholascioli left a comment

This looks excellent! I've left a few nits and general questions.

Comment on lines +8 to +12
#[derive(Debug)]
pub struct FilteringExporter<E> {
    inner: E,
    omitted: HashSet<Key>,
}
Contributor

Nit: If most of the methods are going to be pass-through, you might want to just implement Deref for this.
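
A minimal sketch of that suggestion, assuming a simplified exporter: `StdoutExporter` and its methods are hypothetical stand-ins, and `String` keys replace the `Key` type from the diff. `Deref` forwards the pass-through calls, and only the method that filters attributes is written out on the wrapper.

```rust
use std::collections::HashSet;
use std::ops::Deref;

/// Simplified attribute: a key/value pair of strings (assumption for this sketch).
pub type Attribute = (String, String);

/// Hypothetical stand-in for a real span or metric exporter.
#[derive(Debug)]
pub struct StdoutExporter {
    pub name: String,
}

impl StdoutExporter {
    /// Pass-through style method that needs no filtering.
    pub fn name(&self) -> &str {
        &self.name
    }

    /// "Exports" the attributes by returning what it was given.
    pub fn export(&self, attributes: &[Attribute]) -> Vec<Attribute> {
        attributes.to_vec()
    }
}

#[derive(Debug)]
pub struct FilteringExporter<E> {
    pub inner: E,
    pub omitted: HashSet<String>,
}

/// Any method not defined on the wrapper resolves to the inner exporter.
impl<E> Deref for FilteringExporter<E> {
    type Target = E;

    fn deref(&self) -> &Self::Target {
        &self.inner
    }
}

impl FilteringExporter<StdoutExporter> {
    /// The one method with real logic: drop omitted keys, then delegate.
    pub fn export(&self, attributes: &[Attribute]) -> Vec<Attribute> {
        let kept: Vec<Attribute> = attributes
            .iter()
            .filter(|(key, _)| !self.omitted.contains(key))
            .cloned()
            .collect();
        self.inner.export(&kept)
    }
}
```

One caveat with this approach: `Deref` forwards method calls only. If `FilteringExporter` has to implement an exporter trait itself, the trait's methods still need explicit forwarding.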

Contributor Author

The remaining comments are about code cleanup rather than functional changes, so I am going to create an internal follow-up ticket to address them. That lets us get this merged and the release candidate build created for QA. Appreciate all the feedback!

Ok(self.get_info())
}

#[tracing::instrument(skip(self, context, request), fields(apollo.mcp.tool_name = request.name.as_ref(), apollo.mcp.request_id = %context.id.clone()))]
Contributor

Nit: Using the raw field name here instead of the constants generated by the build script might cause some confusion later if they ever diverge. While the macro might not let you use the constant, you should be able to run the code generated by this macro manually.

Contributor

Nit: This entire file can probably be generated with the generated code.

swcollard and others added 11 commits September 24, 2025 10:09
* Create a prototype of otel emitting traces to local Jaeger instance

* Add tracing annotations

* Combine logging and tracing

* copilot feedback

* Add changeset

* Instrument other tools

* Clippy fixes

* PR feedback

* Change default env name to development

Co-authored-by: Dale Seo <[email protected]>

* Remove some extraneous extensions

---------

Co-authored-by: Dale Seo <[email protected]>
* Implement metrics for mcp tool and operation counts and durations

* Changeset

* Unit test attribute setting in graphql.rs

* Add axum_otel_metrics for emitting basic http metrics about requests

* Lazy load singleton Meter for metrics

* Alphabetize

* Simplify result.is_error checking
…l headers downstream (#307)

* Fix sending OTLP trace headers downstream to GQL API

* Changeset
* Add basic config file options to otel telemetry

* Happy path unit test for telemetry config

* Changeset

* Taplo format

* Refactor to add a couple more tests

* Update unit test for clippy after rust upgrade

* Rename unit tests
* feat: adding ability to omit attributes for traces and metrics

* chore: adding changeset

* chore: updating changeset text

* chore: removing unused file

* chore: renaming exporter file

* chore: renaming module

* re-running checks

* chore: auto-gen the as_str function using enums instead of constants

* chore: updating changeset entry and fixing fields in instrument attribute

* chore: adding description as a doc string

* chore: updating changeset  entry

* chore: updating operation attribute descriptions

* chore: updating operation_type attribute to operation_source

* chore: skipping request from instrumentation

* chore: fixing clippy issues

* re-running nix build

* updating unit test snapshot

* chore: fixing toml formatting issues

* test: removing asserts on constants

* chore: format issues

* test: adjusting unit test to not use local envar

* revert: undoing accidental commit

* test: removing unnecessary assert
* feat: adding config option for trace sampling

* chore: adding changeset entry
Add status code to the http trace
Changeset

Unit tests for auth.rs

Add raw_operation as a telemetry attribute

Unit test starting a streamable http server

Rename self to operation_source
Add telemetry docs to sidebar

Cleaning up

Fix title on grafana guide

Update the overview

Update configuration section to be more complete

Add a quick start guide and info about prod

Update production section with more useful info

Apply suggestions from code review

Style edits

Co-authored-by: Joseph Caudle <[email protected]>

Remove custom from 'custom metrics'

Remove Grafana how-to guide

Move configuration reference to the config page

Add a note about cardinality control using sampling and attribute filtering

Fix typo: traaces -> traces
Fix links to external vendors OTLP docs

pr feedback: LazyLock meter, debug and todo removal, comments
@swcollard swcollard merged commit 9f421b1 into develop Sep 24, 2025
11 checks passed
@swcollard swcollard deleted the feature/telemetry branch September 24, 2025 15:56

Development

Successfully merging this pull request may close these issues.

How the Support of the Open telemetry is addressed with the Apollo MCP Server?.

6 participants