Fix thread safety issues in WorkerNodeTelemetryData #13413
AR-May merged 3 commits into dotnet:main
Conversation
Pull request overview
This PR addresses a concurrency bug in MSBuild’s internal worker-node telemetry aggregation when running in in-proc multi-threaded mode (/m /mt), where multiple RequestBuilder instances can concurrently mutate shared telemetry dictionaries and corrupt their state.
Changes:
- Switch telemetry reporting in `RequestBuilder.UpdateStatisticsPostBuild` from per-target/per-task updates to batching into a local `WorkerNodeTelemetryData` and merging once.
- Replace `ITelemetryForwarder.AddTask`/`AddTarget` with a single `MergeWorkerData` API and update the provider implementations accordingly.
- Add explicit locking around aggregation in the internal telemetry-consuming logger.
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
| src/Framework/Telemetry/WorkerNodeTelemetryData.cs | Adds method documentation and minor refactor while preserving merge/aggregation behavior. |
| src/Build/TelemetryInfra/TelemetryForwarderProvider.cs | Replaces per-item update APIs with MergeWorkerData and exposes key creation helper for batching. |
| src/Build/TelemetryInfra/InternalTelemetryConsumingLogger.cs | Adds a lock around worker telemetry aggregation. |
| src/Build/TelemetryInfra/ITelemetryForwarder.cs | Updates forwarder contract to support batched merging. |
| src/Build/BackEnd/Components/RequestBuilder/RequestBuilder.cs | Implements local accumulation + single merge under a lock to prevent concurrent dictionary writes. |
/azp run

Azure Pipelines successfully started running 1 pipeline(s).
Pull request overview
Fixes telemetry thread-safety and counter inflation in in-proc multithreaded (/m /mt) builds by batching telemetry per RequestBuilder, merging into a shared forwarder under a lock, and preventing repeated “send the whole buffer” behavior during engine shutdown.
Changes:
- Accumulate task/target telemetry in a per-`RequestBuilder` `WorkerNodeTelemetryData` and merge once into the shared forwarder.
- Make `TelemetryForwarder` thread-safe and change finalization to "swap-and-send" to avoid Nx duplication across multiple `BuildRequestEngine` finalizers.
- Add unit tests for `WorkerNodeTelemetryData.IsEmpty` and forwarder reset behavior after `FinalizeProcessing`.
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| src/Framework/Telemetry/WorkerNodeTelemetryData.cs | Adds IsEmpty and improves merge/add method clarity used by the forwarder swap-and-send logic. |
| src/Framework/Telemetry/TaskOrTargetTelemetryKey.cs | Introduces a helper factory (Create) to centralize key construction used by RequestBuilder. |
| src/Build/TelemetryInfra/TelemetryForwarderProvider.cs | Adds locking, batch merge entrypoint, and swap-and-send finalization to prevent races and duplicate sends. |
| src/Build/TelemetryInfra/ITelemetryForwarder.cs | Replaces per-item APIs with a batch merge API (MergeWorkerData). |
| src/Build/BackEnd/Components/RequestBuilder/RequestBuilder.cs | Switches to batch-then-merge telemetry collection to remove dictionary contention. |
| src/Build.UnitTests/Telemetry/Telemetry_Tests.cs | Adds tests for the new reset/empty behavior and forwarder finalization semantics. |
Related to #12867
Context
This PR fixes two bugs in the telemetry infrastructure when using /m /mt (in-proc multithreaded) mode:
1. Thread-safety crash: all in-proc nodes share a single `TelemetryForwarderProvider` singleton. Multiple `RequestBuilder` instances run on dedicated threads and call `AddTask`/`AddTarget` concurrently on the same `WorkerNodeTelemetryData` dictionary fields, causing race conditions and dictionary corruption.
2. Nx telemetry duplication: in /m /mt mode, N `BuildRequestEngine` instances share one `TelemetryForwarderProvider` singleton. Each engine calls `FinalizeProcessing` on shutdown, sending the entire accumulated data each time. The `InternalTelemetryConsumingLogger` merges all N copies, inflating every counter N times.
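The Nx inflation is easy to model in isolation. The sketch below is an illustrative Java analogue, not MSBuild's C# code (`NaiveForwarder`, `sink`, and the key names are hypothetical): when a forwarder re-sends its whole buffer on every finalization and N engines each finalize once, every counter is multiplied by N.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of the bug: a shared forwarder whose finalize
// re-sends the entire accumulated buffer, called once per engine.
class NaiveForwarder {
    final Map<String, Integer> buffer = new HashMap<>();
    final Map<String, Integer> sink = new HashMap<>();   // stands in for the consuming logger

    void add(String key, int count) {
        buffer.merge(key, count, Integer::sum);
    }

    // Sends the whole accumulated buffer on every call (the bug).
    void finalizeProcessing() {
        buffer.forEach((k, v) -> sink.merge(k, v, Integer::sum));
    }
}

public class DuplicationDemo {
    public static void main(String[] args) {
        NaiveForwarder f = new NaiveForwarder();
        f.add("Csc", 10);                       // one task executed 10 times
        int engines = 4;                        // N engine instances share the forwarder
        for (int i = 0; i < engines; i++) {
            f.finalizeProcessing();             // each engine finalizes on shutdown
        }
        System.out.println(f.sink.get("Csc"));  // prints 40: inflated N times instead of 10
    }
}
```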
Reproduction: 20+ non-SDK .NET Framework library projects + 1 exe referencing all of them, built with `MSBuild.exe Repro.sln /m /mt`.
Changes Made
Fix
Batch-then-merge in `RequestBuilder`: each `RequestBuilder` now accumulates task/target telemetry into a local `WorkerNodeTelemetryData` instance (zero contention), then merges once into the shared state via `TelemetryForwarder.MergeWorkerData()`.
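The batch-then-merge pattern can be sketched as follows. This is a hedged Java illustration under assumed names (`SharedTelemetry`, `mergeWorkerData`), not the actual C# implementation: each worker fills a private map with no synchronization at all, then takes the shared lock exactly once to merge.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical shared aggregation point, analogous to the forwarder's shared state.
class SharedTelemetry {
    private final Object lock = new Object();
    private final Map<String, Long> totals = new HashMap<>();

    // One synchronized merge per worker, instead of one per task/target.
    void mergeWorkerData(Map<String, Long> local) {
        synchronized (lock) {
            local.forEach((k, v) -> totals.merge(k, v, Long::sum));
        }
    }

    long get(String key) {
        synchronized (lock) { return totals.getOrDefault(key, 0L); }
    }
}

public class BatchMergeDemo {
    public static void main(String[] args) throws InterruptedException {
        SharedTelemetry shared = new SharedTelemetry();
        ExecutorService pool = Executors.newFixedThreadPool(8);
        for (int w = 0; w < 8; w++) {
            pool.submit(() -> {
                Map<String, Long> local = new HashMap<>();  // per-worker batch, zero contention
                for (int i = 0; i < 1000; i++) {
                    local.merge("Target:Build", 1L, Long::sum);
                }
                shared.mergeWorkerData(local);              // single merge at the end
            });
        }
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);
        System.out.println(shared.get("Target:Build"));     // prints 8000: no lost updates
    }
}
```

The same workload with each increment applied directly to an unsynchronized shared dictionary is exactly the corruption scenario the PR describes.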
Thread-safe `TelemetryForwarder`: added an internal lock protecting both `MergeWorkerData` and `FinalizeProcessing`. The forwarder is a singleton shared across `BuildRequestEngine` instances in /m /mt mode, so concurrent access is expected.
Swap-and-send in `FinalizeProcessing`: instead of sending the same accumulated data on every call, `FinalizeProcessing` atomically swaps the internal data with a fresh empty instance under the lock, then sends only if non-empty. This ensures each engine's shutdown sends only the data accumulated since the last finalization (no Nx duplication, and no data loss).
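The swap-and-send finalization can be sketched like this. Again an illustrative Java model with hypothetical names (`SwapForwarder`, `sink`), not the MSBuild source: finalize swaps the buffer for a fresh instance under the lock, so later finalizations find an empty buffer and send nothing.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical swap-and-send forwarder: repeated finalizations never re-send old data.
class SwapForwarder {
    private final Object lock = new Object();
    private Map<String, Integer> buffer = new HashMap<>();
    final Map<String, Integer> sink = new HashMap<>();    // stands in for the consuming logger

    void add(String key, int count) {
        synchronized (lock) { buffer.merge(key, count, Integer::sum); }
    }

    void finalizeProcessing() {
        Map<String, Integer> toSend;
        synchronized (lock) {
            toSend = buffer;
            buffer = new HashMap<>();   // atomically swap in a fresh, empty buffer
        }
        if (!toSend.isEmpty()) {        // send only when there is new data
            toSend.forEach((k, v) -> sink.merge(k, v, Integer::sum));
        }
    }
}

public class SwapDemo {
    public static void main(String[] args) {
        SwapForwarder f = new SwapForwarder();
        f.add("Csc", 10);
        for (int i = 0; i < 4; i++) {
            f.finalizeProcessing();     // N engine shutdowns
        }
        System.out.println(f.sink.get("Csc"));  // prints 10, not 40
    }
}
```

Data added between two finalizations still lands in the buffer and is sent by the next finalization, which is why the swap avoids both duplication and loss.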
Testing
- Verified locally that the issue is gone on a repro project.
- Added unit tests.