feat(observability): add OpenTelemetry tracing and metrics support #590

Merged: Cali0707 merged 7 commits into containers:main from nader-ziada:otel on Jan 21, 2026
Conversation

@nader-ziada

Add observability capabilities through OpenTelemetry, providing distributed tracing and metrics for MCP operations and HTTP requests.

  • Automatic tracing of all MCP tool calls via middleware
  • HTTP request/response tracing in server mode
  • New telemetry package with configurable OTLP tracing via environment variables
  • Metrics collection for tool execution statistics and HTTP request counters
  • Middleware-based tracing that follows MCP semantic conventions
  • Session-based span context propagation between tool calls
  • Enable via OTEL_EXPORTER_OTLP_ENDPOINT environment variable
  • Custom sampling rates via OTEL_TRACES_SAMPLER environment variables
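
A minimal sketch of the "enable via environment variable" behavior described above, using only the standard library. The `TelemetryConfig` struct and `configFromEnv` function are hypothetical names for illustration and are not taken from the PR's telemetry package. `OTEL_TRACES_SAMPLER_ARG` comes from the OpenTelemetry SDK environment variable specification and is not mentioned in the description, so treat its use here as an assumption.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// TelemetryConfig is a hypothetical struct; the PR's telemetry package may
// organize its settings differently.
type TelemetryConfig struct {
	Enabled     bool    // true when an OTLP endpoint is configured
	Endpoint    string  // value of OTEL_EXPORTER_OTLP_ENDPOINT
	Sampler     string  // value of OTEL_TRACES_SAMPLER, e.g. "traceidratio"
	SampleRatio float64 // parsed OTEL_TRACES_SAMPLER_ARG, defaults to 1.0
}

// configFromEnv reads the OpenTelemetry variables through getenv, so the
// function can be exercised without touching the process environment.
func configFromEnv(getenv func(string) string) (TelemetryConfig, error) {
	cfg := TelemetryConfig{
		Endpoint:    getenv("OTEL_EXPORTER_OTLP_ENDPOINT"),
		Sampler:     getenv("OTEL_TRACES_SAMPLER"),
		SampleRatio: 1.0,
	}
	// Telemetry stays off unless an endpoint is set.
	cfg.Enabled = cfg.Endpoint != ""

	if arg := getenv("OTEL_TRACES_SAMPLER_ARG"); arg != "" {
		ratio, err := strconv.ParseFloat(arg, 64)
		// The negated range check also rejects NaN.
		if err != nil || !(ratio >= 0 && ratio <= 1) {
			return cfg, fmt.Errorf("invalid OTEL_TRACES_SAMPLER_ARG %q: want a number in [0, 1]", arg)
		}
		cfg.SampleRatio = ratio
	}
	return cfg, nil
}

func main() {
	cfg, err := configFromEnv(os.Getenv)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Printf("tracing enabled: %v (sampler=%q ratio=%.2f)\n", cfg.Enabled, cfg.Sampler, cfg.SampleRatio)
}
```

Passing `getenv` as a parameter is only a testing convenience; production code would call `configFromEnv(os.Getenv)` as in `main`.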

Comment on lines 60 to 61
spanName := fmt.Sprintf("%s %s", method, path)
_, span := o.tracer.Start(ctx, spanName,

I'm a little worried we may be missing the traceparent/tracestate on these spans

In the current semconv proposal it seems like these should be stored in the req.Params.Meta of the MCP request. But, this runs before the request is deserialized into the MCP types.
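
One possible way around the ordering problem, sketched with the standard library only: peek at the raw JSON-RPC body and pull out the trace context before the full MCP deserialization happens. This assumes the metadata is serialized under `params._meta` with `traceparent` and `tracestate` keys, as the proposal seems to suggest. `rawRequest`, `TraceParent`, and `extractTraceParent` are hypothetical names, and the sketch only accepts version `00` of the W3C `traceparent` format.

```go
package main

import (
	"encoding/json"
	"fmt"
	"regexp"
	"strconv"
	"strings"
)

// traceparentRe matches version 00 of the W3C traceparent format:
// 32 hex chars of trace-id, 16 hex chars of parent-id, 2 hex chars of flags.
var traceparentRe = regexp.MustCompile(`^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$`)

// rawRequest decodes only the fields needed for propagation and ignores
// everything else in the JSON-RPC message.
type rawRequest struct {
	Params struct {
		Meta struct {
			Traceparent string `json:"traceparent"`
			Tracestate  string `json:"tracestate"`
		} `json:"_meta"` // assumed location of the MCP request metadata
	} `json:"params"`
}

// TraceParent is a hypothetical holder for the parsed remote span context.
type TraceParent struct {
	TraceID    string
	SpanID     string
	Sampled    bool
	Tracestate string
}

// extractTraceParent returns ok=false when the body is not valid JSON or
// carries no well-formed traceparent.
func extractTraceParent(body []byte) (TraceParent, bool) {
	var req rawRequest
	if err := json.Unmarshal(body, &req); err != nil {
		return TraceParent{}, false
	}
	m := traceparentRe.FindStringSubmatch(req.Params.Meta.Traceparent)
	if m == nil {
		return TraceParent{}, false
	}
	// All-zero IDs are invalid per the W3C Trace Context specification.
	if m[1] == strings.Repeat("0", 32) || m[2] == strings.Repeat("0", 16) {
		return TraceParent{}, false
	}
	flags, err := strconv.ParseUint(m[3], 16, 8)
	if err != nil {
		return TraceParent{}, false
	}
	return TraceParent{
		TraceID:    m[1],
		SpanID:     m[2],
		Sampled:    flags&1 == 1, // lowest bit is the sampled flag
		Tracestate: req.Params.Meta.Tracestate,
	}, true
}

func main() {
	body := []byte(`{"jsonrpc":"2.0","id":1,"method":"tools/call","params":{"name":"example_tool","_meta":{"traceparent":"00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01"}}}`)
	tp, ok := extractTraceParent(body)
	fmt.Println(tp.TraceID, tp.SpanID, tp.Sampled, ok)
}
```

In the real middleware the hand-rolled parsing would probably be replaced by the SDK's `propagation.TraceContext{}.Extract` with a `propagation.MapCarrier` built from the two `_meta` values; the sketch just shows that the values are reachable before the typed MCP request exists.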

@nader-ziada nader-ziada Dec 18, 2025

will add span propagation

@Cali0707 Cali0707 left a comment

Sorry it took me so long to review here @nader-ziada - thanks for fixing up all the spans!

Comment on lines 57 to 59
provider := sdkmetric.NewMeterProvider(
sdkmetric.WithReader(reader),
)

I think we need to somewhere hook the provider.Shutdown(ctx) into the server shutdown, so that it can try to flush the metrics

_, span := o.tracer.Start(ctx, spanName,
trace.WithSpanKind(trace.SpanKindServer),
trace.WithAttributes(
attribute.String("http.method", method),

From https://opentelemetry.io/docs/specs/semconv/http/http-spans/#http-server-span I think that they have changed the attribute names 😓

It looks like these should probably be:

  1. "http.request.method"
  2. "url.path" (although they also have the "http.route" still - not totally sure for this one)
  3. "http.response.status_code"

I think?
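
For reference, a small sketch of the attribute set using the stable names listed above. `httpServerSpanAttrs` is a hypothetical helper that returns a plain map instead of OpenTelemetry `attribute.KeyValue` values so it stays self-contained. On the `http.route` question, my understanding of the semantic conventions is that `url.path` carries the actual request path while `http.route` carries the matched route template, so the sketch only sets `http.route` when the caller knows the template.

```go
package main

import "fmt"

// httpServerSpanAttrs builds HTTP server span attributes with the stable
// semantic convention keys. route may be empty when no route template
// (for example "/users/{id}") is known.
func httpServerSpanAttrs(method, path, route string, status int) map[string]any {
	attrs := map[string]any{
		"http.request.method":       method, // previously "http.method"
		"url.path":                  path,
		"http.response.status_code": status, // previously "http.status_code"
	}
	if route != "" {
		attrs["http.route"] = route
	}
	return attrs
}

func main() {
	// "/mcp" is only an example path.
	fmt.Println(httpServerSpanAttrs("POST", "/mcp", "", 200))
}
```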

@nader-ziada

fixed them :)


// Create OTLP exporter
// Endpoint is configured via OTEL_EXPORTER_OTLP_ENDPOINT env var
exporter, err := otlptracegrpc.New(ctx)

Can we support allowing HTTP as well as grpc through the OTEL_EXPORTER_OTLP_PROTOCOL env var?
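
A sketch of how the protocol selection could work, using only the standard library. `otlpTracesProtocol` is a hypothetical helper. It checks the signal-specific `OTEL_EXPORTER_OTLP_TRACES_PROTOCOL` first and then the general `OTEL_EXPORTER_OTLP_PROTOCOL`, which matches my reading of the OTLP exporter configuration spec. Falling back to `grpc` when neither is set is an assumption made here to preserve the PR's current behavior.

```go
package main

import (
	"fmt"
	"os"
)

// otlpTracesProtocol decides which OTLP transport to use for traces.
// The signal-specific variable wins over the general one.
func otlpTracesProtocol(getenv func(string) string) (string, error) {
	proto := getenv("OTEL_EXPORTER_OTLP_TRACES_PROTOCOL")
	if proto == "" {
		proto = getenv("OTEL_EXPORTER_OTLP_PROTOCOL")
	}
	switch proto {
	case "":
		// Assumption: keep gRPC as the default, as in the current code.
		return "grpc", nil
	case "grpc", "http/protobuf":
		return proto, nil
	default:
		return "", fmt.Errorf("unsupported OTLP protocol %q: want \"grpc\" or \"http/protobuf\"", proto)
	}
}

func main() {
	proto, err := otlpTracesProtocol(os.Getenv)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	// The caller would pick the exporter constructor based on this value,
	// e.g. otlptracegrpc.New(ctx) for "grpc" and otlptracehttp.New(ctx)
	// for "http/protobuf".
	fmt.Println("OTLP traces protocol:", proto)
}
```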

@Cali0707 Cali0707 left a comment

LGTM

@matzew

matzew commented Jan 12, 2026

/assign

Will take a look 👁️

@nader-ziada

@matzew @Cali0707 in progress: adding env vars in toml file for easier config

@nader-ziada

@matzew @Cali0707 in progress: adding env vars in toml file for easier config

done now

Signed-off-by: Nader Ziada <nziada@redhat.com>
- mcp.tool.duration: Histogram tracking tool call latency in seconds,
  with tool.name label for per-tool performance analysis

- mcp.server.info: Gauge exposing version metadata with version and
  go_version labels for fleet tracking

Signed-off-by: Nader Ziada <nziada@redhat.com>
@Cali0707 Cali0707 left a comment

LGTM

But since this is pretty large I would feel better if @matzew or @manusa gave it a look before merging

If this is blocking anything, we can also do any cleanup as follow ups

Signed-off-by: Matthias Wessendorf <mwessend@redhat.com>
├─ network.transport: tcp
└─ Status: OK
```


how do we feel about including a screenshot ?

image: quay.io/containers/kubernetes_mcp_server:latest
env:
# OTLP endpoint (required to enable tracing)
- name: OTEL_EXPORTER_OTLP_ENDPOINT

(in a follow up PR) wanna add something to the helm chart for this subject?

Signed-off-by: Nader Ziada <nziada@redhat.com>
@matzew matzew left a comment

LGTM

@Cali0707 Cali0707 merged commit dd5724a into containers:main Jan 21, 2026
6 checks passed
@manusa manusa added this to the 0.1.0 milestone Jan 21, 2026
manusa added a commit to marcnuri-forks/kubernetes-mcp-server that referenced this pull request Jan 28, 2026
The toolScopedAuthorizationMiddleware was conceived for tool authorization
but the feature was dropped in favor of implementing this in an MCP gateway.

This removes:
- The middleware function and its registration
- TokenScopesContextKey constant and tokenScopesContextKeyType type
- Unused imports (fmt, slices) in middleware.go

This change was accidentally reverted in PR containers#590 and is now being
re-applied (originally from PR containers#633).

Signed-off-by: Marc Nuri <marc@marcnuri.com>
Cali0707 pushed a commit that referenced this pull request Jan 28, 2026