feat(observability): add OpenTelemetry tracing and metrics support#590
feat(observability): add OpenTelemetry tracing and metrics support#590Cali0707 merged 7 commits intocontainers:mainfrom
Conversation
pkg/metrics/otel_collector.go
Outdated
| spanName := fmt.Sprintf("%s %s", method, path) | ||
| _, span := o.tracer.Start(ctx, spanName, |
There was a problem hiding this comment.
I'm a little worried we may be missing the traceparent/tracestate on these spans
In the current semconv proposal it seems like these should be stored in the req.Params.Meta of the MCP request. But, this runs before the request is deserialized into the MCP types.
There was a problem hiding this comment.
will add span propagation
Cali0707
left a comment
There was a problem hiding this comment.
Sorry it took me so long to review here @nader-ziada - thanks for fixing up all the spans!
pkg/metrics/otel_stats_collector.go
Outdated
| provider := sdkmetric.NewMeterProvider( | ||
| sdkmetric.WithReader(reader), | ||
| ) |
There was a problem hiding this comment.
I think we need to somewhere hook the provider.Shutdown(ctx) into the server shutdown, so that it can try to flush the metrics
pkg/metrics/otel_collector.go
Outdated
| _, span := o.tracer.Start(ctx, spanName, | ||
| trace.WithSpanKind(trace.SpanKindServer), | ||
| trace.WithAttributes( | ||
| attribute.String("http.method", method), |
There was a problem hiding this comment.
From https://opentelemetry.io/docs/specs/semconv/http/http-spans/#http-server-span I think that they have changed the attribute names 😓
It looks like these should probably be:
- "http.request.method"
- "url.path" (although they also have the "http.route" still - not totally sure for this one)
- "http.response.status_code"
I think?
pkg/telemetry/telemetry.go
Outdated
|
|
||
| // Create OTLP exporter | ||
| // Endpoint is configured via OTEL_EXPORTER_OTLP_ENDPOINT env var | ||
| exporter, err := otlptracegrpc.New(ctx) |
There was a problem hiding this comment.
Can we support allowing HTTP as well as grpc through the OTEL_EXPORTER_OTLP_PROTOCOL env var?
|
/assign Will take a look 👁️ |
Add observability capabilities through OpenTelemetry, providing distributed tracing and metrics for MCP operations and HTTP requests. - Automatic tracing of all MCP tool calls via middleware - HTTP request/response tracing in server mode - New telemetry package with configurable OTLP tracing via environment variables - Metrics collection for tool execution statistics and HTTP request counters - Middleware-based tracing that follows MCP semantic conventions - Session-based span context propagation between tool calls - Enable via OTEL_EXPORTER_OTLP_ENDPOINT environment variable - Custom sampling rates via OTEL_TRACES_SAMPLER environment variables Signed-off-by: Nader Ziada <nziada@redhat.com>
Signed-off-by: Nader Ziada <nziada@redhat.com>
- mcp.tool.duration: Histogram tracking tool call latency in seconds, with tool.name label for per-tool performance analysis - mcp.server.info: Gauge exposing version metadata with version and go_version labels for fleet tracking Signed-off-by: Nader Ziada <nziada@redhat.com>
Signed-off-by: Matthias Wessendorf <mwessend@redhat.com>
Signed-off-by: Matthias Wessendorf <mwessend@redhat.com>
Fixes for Otel - avoid panic
| ├─ network.transport: tcp | ||
| └─ Status: OK | ||
| ``` | ||
|
|
There was a problem hiding this comment.
how do we feel about including a screenshot ?
| image: quay.io/containers/kubernetes_mcp_server:latest | ||
| env: | ||
| # OTLP endpoint (required to enable tracing) | ||
| - name: OTEL_EXPORTER_OTLP_ENDPOINT |
There was a problem hiding this comment.
(in a follow up PR) wanna add something to the helm chart for this subject?
Signed-off-by: Nader Ziada <nziada@redhat.com>
The toolScopedAuthorizationMiddleware was conceived for tool authorization but the feature was dropped in favor of implementing this in an MCP gateway. This removes: - The middleware function and its registration - TokenScopesContextKey constant and tokenScopesContextKeyType type - Unused imports (fmt, slices) in middleware.go This change was accidentally reverted in PR containers#590 and is now being re-applied (originally from PR containers#633). Signed-off-by: Marc Nuri <marc@marcnuri.com>
) The toolScopedAuthorizationMiddleware was conceived for tool authorization but the feature was dropped in favor of implementing this in an MCP gateway. This removes: - The middleware function and its registration - TokenScopesContextKey constant and tokenScopesContextKeyType type - Unused imports (fmt, slices) in middleware.go This change was accidentally reverted in PR #590 and is now being re-applied (originally from PR #633). Signed-off-by: Marc Nuri <marc@marcnuri.com>
Add observability capabilities through OpenTelemetry, providing distributed tracing and metrics for MCP operations and HTTP requests.