
.NET: [Breaking Change] Auto-wire ChatClient with OpenTelemetryChatClient in OpenTelemetryAgent #5750

Merged
rogerbarreto merged 9 commits into main from copilot/fix-maf-telemetry-flow on May 13, 2026

Contributor

Copilot AI commented May 11, 2026

Motivation and Context

OpenTelemetryAgent instrumented only the agent-level invoke_agent span; the underlying IChatClient was untouched, so model-level chat spans, usage metrics, and Azure Monitor traces never flowed unless callers manually wrapped every chat client. End result: telemetry silently absent in Foundry/hosted agent samples even with an exporter configured.

Description

OpenTelemetryAgent now auto-wraps the inner ChatClientAgent's IChatClient with OpenTelemetryChatClient on each invocation, so chat-level telemetry flows alongside invoke_agent.

  • OpenTelemetryAgent ctors — the original OpenTelemetryAgent(AIAgent innerAgent, string? sourceName = null) constructor signature is preserved (binary-compatible) and delegates with auto-wiring enabled. A new overload OpenTelemetryAgent(AIAgent innerAgent, string? sourceName, bool autoWireChatClient) adds the opt-out. The sourceName is normalized once in the constructor (empty string is treated as OpenTelemetryConsts.DefaultSourceName) and reused by both the outer OpenTelemetryChatClient and the auto-wired inner OpenTelemetryChatClient, so agent-level and chat-level spans always land on the same ActivitySource. The opt-out flag lives only on the constructor (not on the AIAgentBuilder extension), keeping chat-client terminology out of the abstract builder surface.
  • Per-invocation wiring: ForwardingChatClient.GetResponseAsync / GetStreamingResponseAsync route the AgentRunOptions through a new GetRunOptionsWithChatClientWiring helper that:
    1. Short-circuits when auto-wire is disabled.
    2. Resolves the nested ChatClientAgent via InnerAgent.GetService<ChatClientAgent>() (no type assumption on InnerAgent; wrapping agents that surface a nested ChatClientAgent through GetService are supported). No-ops when the result is null.
    3. No-ops when chatClientAgent.GetService<ChatClientAgentOptions>()?.UseProvidedChatClientAsIs is true (respects the user's explicit opt-out on the chat client pipeline; ChatClientAgent.GetService already exposes ChatClientAgentOptions).
    4. No-ops when chatClientAgent.GetService<IChatClient>()?.GetService<OpenTelemetryChatClient>() is already non-null (avoids double-wrap).
    5. Otherwise clones the caller's ChatClientAgentRunOptions (or, when the caller passes a plain AgentRunOptions, creates one and copies the base properties: ContinuationToken, AllowBackgroundResponses, AdditionalProperties, ResponseFormat) and sets ChatClientFactory to chain onto any user-supplied factory rather than replacing it. The factory step also inspects the post-user-factory result via GetService(typeof(OpenTelemetryChatClient)) and skips wrapping when the chat client is already instrumented, so a user factory that itself calls UseOpenTelemetry(...) does not produce duplicate chat spans.
  • UseOpenTelemetry extension — unchanged public surface; the existing extension simply constructs an OpenTelemetryAgent (with auto-wiring on by default), so telemetry now flows automatically for existing call sites.
  • Tests — full coverage of every wiring branch in GetRunOptionsWithChatClientWiring and WrapIfNeeded: default-on (sync + streaming), explicit opt-out via the constructor (sync + streaming), UseProvidedChatClientAsIs opt-out, non-ChatClientAgent no-op, no-double-wrap when the underlying chat client is pre-instrumented, chaining of a user-supplied ChatClientFactory, no-double-wrap when the user factory itself returns an OpenTelemetry-instrumented client, plain AgentRunOptions propagation of AllowBackgroundResponses / AdditionalProperties / ResponseFormat and (separately) ContinuationToken, ChatClientAgentRunOptions clone path with no user factory (asserts ChatOptions are preserved and the caller's instance is not mutated), and null / empty sourceName normalization (Theory: both produce spans on OpenTelemetryConsts.DefaultSourceName).
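
The five wiring steps listed above can be sketched with stand-in types. This is a minimal illustration of the decision order only, not the code from this PR: IFakeChatClient, PlainClient, TelemetryClient, FakeRunOptions, and WiringSketch are all hypothetical names, and the real helper resolves its inputs through GetService rather than taking them as parameters.

```csharp
#nullable enable
using System;

// Hypothetical stand-ins; these are NOT the Microsoft.Agents.AI types.
public interface IFakeChatClient { bool IsInstrumented { get; } }

public sealed class PlainClient : IFakeChatClient
{
    public bool IsInstrumented => false;
}

public sealed class TelemetryClient : IFakeChatClient
{
    public TelemetryClient(IFakeChatClient inner) => Inner = inner;
    public IFakeChatClient Inner { get; }
    public bool IsInstrumented => true;
}

public sealed class FakeRunOptions
{
    public Func<IFakeChatClient, IFakeChatClient>? ChatClientFactory { get; set; }
    public FakeRunOptions Clone() => new FakeRunOptions { ChatClientFactory = ChatClientFactory };
}

public static class WiringSketch
{
    // Wrap only when the client is not already instrumented (avoids duplicate chat spans).
    public static IFakeChatClient WrapIfNeeded(IFakeChatClient client) =>
        client.IsInstrumented ? client : new TelemetryClient(client);

    public static FakeRunOptions? GetRunOptionsWithWiring(
        FakeRunOptions? options,
        bool autoWire,
        IFakeChatClient? agentChatClient, // null models "no nested chat-client agent found"
        bool useProvidedChatClientAsIs)
    {
        if (!autoWire) return options;                      // 1. auto-wire disabled
        if (agentChatClient is null) return options;        // 2. nothing to wire
        if (useProvidedChatClientAsIs) return options;      // 3. explicit user opt-out
        if (agentChatClient.IsInstrumented) return options; // 4. already wrapped

        // 5. Clone so the caller's instance is never mutated, then chain the
        //    wrapping step onto any user-supplied factory instead of replacing it.
        var clone = options?.Clone() ?? new FakeRunOptions();
        var userFactory = clone.ChatClientFactory;
        clone.ChatClientFactory = client =>
            WrapIfNeeded(userFactory is null ? client : userFactory(client));
        return clone;
    }
}
```

In this sketch a user factory that already returns an instrumented client passes through WrapIfNeeded unchanged, which mirrors the no-double-wrap behavior described in step 5.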

Behavioral breaking change

This PR introduces a behavioral (not API) change. Existing call sites of new OpenTelemetryAgent(innerAgent) and AIAgentBuilder.UseOpenTelemetry(...) will now begin emitting an additional chat span per invocation when the inner agent is (or surfaces) a ChatClientAgent whose IChatClient is not already wrapped with OpenTelemetryChatClient.

Impact is limited to a specific subset of users:

  • Users who previously wrapped both the agent and the chat client themselves are unaffected: the chat-client OTel wrapper is detected and the auto-wire path no-ops.
  • Users who wrapped only the agent (the original silent-telemetry case this PR fixes) will start receiving the missing chat-level spans, metrics, and Azure Monitor traces. This is the intended new default.
  • Users who do not want the new chat-level spans can opt out either:
    • On the constructor: new OpenTelemetryAgent(innerAgent, sourceName: null, autoWireChatClient: false)
    • On the ChatClientAgent itself: new ChatClientAgent(chatClient, new ChatClientAgentOptions { UseProvidedChatClientAsIs = true })

Source compatibility, binary compatibility, and the UseOpenTelemetry extension surface are all preserved. Only the runtime telemetry shape changes, and only in the direction of emitting strictly more (and previously missing) signal.

Opt-out (constructor-only)

var otelAgent = new OpenTelemetryAgent(innerAgent, sourceName: null, autoWireChatClient: false);

Or via ChatClientAgentOptions:

new ChatClientAgent(chatClient, new ChatClientAgentOptions { UseProvidedChatClientAsIs = true });

Contribution Checklist

  • The code builds clean without any errors or warnings
  • The PR follows the Contribution Guidelines
  • All unit tests pass, and I have added new tests where possible
  • Is this a breaking change? Behavioral breaking change documented above. Not source/binary breaking.

Copilot AI self-assigned this May 11, 2026
Copilot AI review requested due to automatic review settings May 11, 2026 18:27
Copilot AI requested review from Copilot and removed request for Copilot May 11, 2026 18:36
Copilot AI changed the title from "[WIP] Fix MAF telemetry flow by default" to ".NET: Auto-wire ChatClient with OpenTelemetryChatClient in OpenTelemetryAgent" May 11, 2026
Copilot AI requested a review from rogerbarreto May 11, 2026 18:38
Comment thread dotnet/src/Microsoft.Agents.AI/OpenTelemetryAgent.cs
Comment thread dotnet/src/Microsoft.Agents.AI/OpenTelemetryAgentBuilderExtensions.cs Outdated
Copilot AI requested review from Copilot and removed request for Copilot May 11, 2026 18:57
Copilot AI requested a review from rogerbarreto May 11, 2026 18:58
Comment thread dotnet/src/Microsoft.Agents.AI/OpenTelemetryAgent.cs Outdated
Copilot AI requested review from Copilot and removed request for Copilot May 12, 2026 07:36
Comment thread dotnet/src/Microsoft.Agents.AI/OpenTelemetryAgent.cs Outdated
Copilot AI requested a review from rogerbarreto May 12, 2026 07:37
Copilot AI requested review from Copilot and removed request for Copilot May 12, 2026 07:39
Copilot AI review requested due to automatic review settings May 12, 2026 07:42
Comment thread dotnet/src/Microsoft.Agents.AI/OpenTelemetryAgent.cs
Copilot AI and others added 9 commits May 13, 2026 12:45
…tryAgent

Agent-Logs-Url: https://github.com/microsoft/agent-framework/sessions/96dd033a-0c48-4d3f-9148-324bfd436b75

Co-authored-by: rogerbarreto <19890735+rogerbarreto@users.noreply.github.com>
…tAsIs; drop redundant check

Agent-Logs-Url: https://github.com/microsoft/agent-framework/sessions/6ac3f75d-eeb7-4811-8043-9a27511b0a8b

Co-authored-by: rogerbarreto <19890735+rogerbarreto@users.noreply.github.com>
…lient

Agent-Logs-Url: https://github.com/microsoft/agent-framework/sessions/008d914d-8cbb-4e9f-81b6-f8c3c8bd8d04

Co-authored-by: rogerbarreto <19890735+rogerbarreto@users.noreply.github.com>
…eName) signature

Agent-Logs-Url: https://github.com/microsoft/agent-framework/sessions/a890c9a7-0b77-40ab-802c-dfbf09f8c260

Co-authored-by: rogerbarreto <19890735+rogerbarreto@users.noreply.github.com>
…r factory

Agent-Logs-Url: https://github.com/microsoft/agent-framework/sessions/3afbf18c-de22-4236-a2f2-02ca1e98ae21

Co-authored-by: rogerbarreto <19890735+rogerbarreto@users.noreply.github.com>
…g path coverage

Normalize the configured source name once in the constructor so the outer OpenTelemetryChatClient and the auto-wired inner OpenTelemetryChatClient always emit spans on the same ActivitySource. A caller passing an empty string previously produced agent-level spans on DefaultSourceName but auto-wired chat spans on the empty source, causing the chat spans to be silently dropped by exporters subscribed to the default source.
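
The dropped-span effect described above can be reproduced with System.Diagnostics alone. The sketch below is an illustration under assumptions, not code from this PR: SourceNameDemo and the "Example.Default" source name are hypothetical, and an ActivityListener stands in for an exporter subscribed to a single source.

```csharp
using System.Collections.Generic;
using System.Diagnostics;

public static class SourceNameDemo
{
    // Hypothetical name; stands in for the default source an exporter subscribes to.
    public const string DefaultSourceName = "Example.Default";

    // Starts an agent-level and a chat-level activity on the given sources and
    // returns the operation names seen by a listener subscribed only to the default source.
    public static List<string> Record(string agentSourceName, string chatSourceName)
    {
        var seen = new List<string>();
        using var listener = new ActivityListener
        {
            ShouldListenTo = source => source.Name == DefaultSourceName,
            Sample = (ref ActivityCreationOptions<ActivityContext> options) =>
                ActivitySamplingResult.AllData,
            ActivityStopped = activity => seen.Add(activity.OperationName),
        };
        ActivitySource.AddActivityListener(listener);

        using var agentSource = new ActivitySource(agentSourceName);
        using var chatSource = new ActivitySource(chatSourceName);

        // StartActivity returns null when no listener is interested in the source,
        // so an activity on an unsubscribed source is silently never created.
        using (agentSource.StartActivity("invoke_agent"))
        using (chatSource.StartActivity("chat"))
        {
        }

        return seen;
    }
}
```

Calling Record with the default name for both sources records both activities, while passing an empty string for the chat source records only invoke_agent. That gap is what normalizing the name once in the constructor closes.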

Tests added to cover the previously unexercised OTEL wiring branches:

- Ctor_NullOrEmptySourceName_AutoWiredChatClientUsesDefaultSource_Async (Theory: null and empty)

- AutoWireChatClient_PlainAgentRunOptions_PreservesContinuationToken_Async

- AutoWireChatClient_ChatClientAgentRunOptions_NoUserFactory_PreservesChatOptions_Async

- AutoWireChatClient_StreamingDisabled_DoesNotEmitChatSpan_Async

Annotate the new 3-arg OpenTelemetryAgent(AIAgent, string?, bool) constructor with [Experimental(DiagnosticIds.Experiments.AgentsAIExperiments)] (MAAI001) so callers must explicitly opt in to the auto-wire toggle. The original 2-arg constructor stays non-experimental and delegates with autoWireChatClient: true; the delegating call is locally suppressed so the existing source compatibility surface is preserved.
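
A rough sketch of that constructor shape, assuming .NET 8 or later for System.Diagnostics.CodeAnalysis.ExperimentalAttribute. SketchTelemetryAgent is a hypothetical stand-in that takes object instead of AIAgent; only the MAAI001 diagnostic ID comes from the commit message above.

```csharp
#nullable enable
using System.Diagnostics.CodeAnalysis;

// Hypothetical stand-in for the agent type; not the real OpenTelemetryAgent.
public sealed class SketchTelemetryAgent
{
    // Stable 2-arg constructor: delegates with auto-wiring on. The experimental
    // diagnostic is suppressed locally so existing callers compile unchanged.
#pragma warning disable MAAI001
    public SketchTelemetryAgent(object innerAgent, string? sourceName = null)
        : this(innerAgent, sourceName, autoWireChatClient: true)
    {
    }
#pragma warning restore MAAI001

    // Experimental 3-arg constructor: callers must acknowledge MAAI001
    // before they can use the opt-out flag.
    [Experimental("MAAI001")]
    public SketchTelemetryAgent(object innerAgent, string? sourceName, bool autoWireChatClient)
    {
        InnerAgent = innerAgent;
        SourceName = sourceName;
        AutoWireChatClient = autoWireChatClient;
    }

    public object InnerAgent { get; }
    public string? SourceName { get; }
    public bool AutoWireChatClient { get; }
}
```

With this shape, new SketchTelemetryAgent(inner) compiles without any suppression, while a call to the 3-arg overload reports MAAI001 until the caller suppresses it.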
- Use string.IsNullOrWhiteSpace (not IsNullOrEmpty) when normalizing the constructor sourceName, so callers passing whitespace-only strings still land on OpenTelemetryConsts.DefaultSourceName instead of an unsubscribed ActivitySource.

- Fix the misleading pragma comment on the 2-arg ctor delegating call: auto-wiring is the new default, it does not preserve the original (pre-PR) behavior.

- Expand the GetRunOptionsWithChatClientWiring XML doc to spell out that a base AgentRunOptions (not ChatClientAgentRunOptions) is also accepted: it is converted to ChatClientAgentRunOptions with the auto-wire factory installed and base properties copied.

- Tests: extend the source-name normalization Theory with whitespace cases ('   ' and '\t'); add end-to-end coverage for plain AgentRunOptions over a real ChatClientAgent (sync + streaming) asserting the inner chat client is invoked and both invoke_agent + chat spans are emitted.
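
The normalization rule from the first bullet of this commit message can be illustrated as below. This is a minimal sketch: SourceNameNormalizer and the "Example.Default" constant are hypothetical placeholders, and the real value of OpenTelemetryConsts.DefaultSourceName is not shown in this PR text.

```csharp
#nullable enable

public static class SourceNameNormalizer
{
    // Hypothetical placeholder for the framework's default source name.
    public const string DefaultSourceName = "Example.Default";

    // Null, empty, and whitespace-only names all fall back to the default source,
    // so spans never land on an ActivitySource that nothing subscribes to.
    public static string Normalize(string? sourceName) =>
        string.IsNullOrWhiteSpace(sourceName) ? DefaultSourceName : sourceName!;
}
```

string.IsNullOrEmpty would let "   " and "\t" through as distinct source names, which is the case the added Theory rows cover.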
rogerbarreto force-pushed the copilot/fix-maf-telemetry-flow branch from 7ed8676 to 2848b66 May 13, 2026 11:49
Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

.NET: [Bug]: MAF telemetry not flowing by default / requires per-client manual enablement

6 participants