Include effective temp directory in node-reuse handshake salt (fixes #13594)#13651
Closed
JanProvaznik wants to merge 1 commit into
Closed
Include effective temp directory in node-reuse handshake salt (fixes #13594)#13651JanProvaznik wants to merge 1 commit into
JanProvaznik wants to merge 1 commit into
Conversation
Contributor
There was a problem hiding this comment.
Pull request overview
This PR fixes incorrect MSBuild node reuse across concurrent invocations that differ only in their effective temp directory, by incorporating Path.GetTempPath() into the node-reuse handshake salt (and corresponding MSBuild server handshake hash).
Changes:
- Include
Path.GetTempPath()in the handshake salt for framework node reuse (src/Framework/BackEnd/Handshake.cs). - Mirror the same temp-dir salt contribution in the legacy CLR2 TaskHost handshake (
src/MSBuildTaskHost/CommunicationsUtilities.cs). - Add unit tests validating same-temp produces same key and different-temp produces different keys/hashes (
src/Build.UnitTests/BackEnd/HandshakeTempDir_Tests.cs).
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| src/MSBuildTaskHost/CommunicationsUtilities.cs | Adds effective temp directory to TaskHost handshake salt to prevent cross-temp node reuse. |
| src/Framework/BackEnd/Handshake.cs | Adds effective temp directory to core node handshake salt (affects node reuse and server scoping). |
| src/Build.UnitTests/BackEnd/HandshakeTempDir_Tests.cs | New regression tests for handshake key/hash differences across temp directories. |
c9c33ea to
7f27bee
Compare
Two MSBuild invocations launched with different TMP/TEMP/TMPDIR environment values were able to bind to each other's reusable worker nodes because the handshake salt was computed only from MSBUILDNODEHANDSHAKESALT and the tools-directory path. When build A finished and cleared its per-build temp folder, a still-running build B that had inherited the same node would break with warnings such as MSB5018 or fatal "system cannot find the path specified" errors pointing at A's deleted folder. Fixes dotnet#13594 Mix Path.GetTempPath() into the salt for non-TaskHost handshakes in src/Framework/BackEnd/Handshake.cs. Both parent and child read the same env at handshake construction time, so matching environments still produce matching salts (legitimate worker-node reuse continues to work) while distinct temp environments now produce distinct salts and refuse to bind. ServerNodeHandshake derives from Handshake, so the MSBuildServer pipe name and per-server mutex are likewise temp-dir scoped. TaskHost paths are intentionally excluded from the temp-dir salt input: - NET TaskHost (parent .NET Framework MSBuild ↔ child shipped from a separately-released .NET SDK) tolerates version skew via the magic NetTaskHostHandshakeVersion. Adding a salt input that isn't synchronized across both release trains would break VS+SDK combinations until both sides picked up the change. The temp-dir fix for NET TaskHost is left for a coordinated VS+SDK rollout follow-up. - CLR2 TaskHost has NodeReuse disabled by design (see NodeProviderOutOfProcTaskHost.IsNodeReuseEnabled), so there is no cross-build reuse surface to protect. Add HandshakeTempDir_Tests covering: - Worker handshake key differs across distinct temp dirs (regression). - Worker handshake key is stable for identical temp dirs (sanity). - ServerNodeHandshake.ComputeHash differs across distinct temp dirs. - NET TaskHost handshake key is unchanged across distinct temp dirs (locks in the deferred behavior that preserves VS+SDK compat). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
7f27bee to
9972459
Compare
Member
Author
|
@rainersigwald I see 2 options
|
This was referenced May 4, 2026
Member
Author
|
too heavy |
This was referenced May 13, 2026
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Fixes #13594
Two MSBuild invocations launched with different
TMP/TEMP/TMPDIRenvironments could bind to each other''s reusable worker nodes because the handshake salt was derived only fromMSBUILDNODEHANDSHAKESALTand the tools-directory path. When build A finished and cleaned its per-build temp folder, the still-running build B (now driving a node that thought A''s temp was its own) failed with warnings such asMSB5018or fatal "system cannot find the path specified" errors pointing at A''s deleted folder.Fix
Mix
Path.GetTempPath()into the salt for non-TaskHost handshakes insrc/Framework/BackEnd/Handshake.cs. Both parent and child read the same env at handshake-construction time (the launcher copies the parent env wholesale and only ever setsDOTNET_ROOT*overrides — never temp vars), so matching environments still produce matching salts (legitimate worker-node reuse keeps working) and distinct temp environments now produce distinct salts and refuse to bind.ServerNodeHandshakederives fromHandshake, so MSBuild Server pipe names and the per-server mutex are isolated per temp env as well.The exclusion is implemented as a single
IsHandshakeOptionEnabled(nodeType, HandshakeOptions.TaskHost)check. Worker nodes (HandshakeOptions.NodeReusewithoutTaskHost) andServerNodeHandshake(HandshakeOptions.None) keep the new salt input. TaskHost paths skip it.Why TaskHost paths are exempted (and why that''s a non-issue in practice)
Cross-build TaskHost reuse is not on the default path:
MSBUILDREUSETASKHOSTNODES=1is required.MSBuildTaskHost/OutOfProcTaskHostNode.cs:509-512and the parallelMSBuild/OutOfProcTaskHostNode.cs:1275:Framework/Traits.cs:334:ReuseTaskHostNodes = Environment.GetEnvironmentVariable("MSBUILDREUSETASKHOSTNODES") == "1". Default is off. Even though the parent advertisesPrepareForReuse=true, every TaskHost shuts down at end-of-build unless the user has flipped this rarely-used escape hatch.MSBuildTaskHost/OutOfProcTaskHostNode.cs:570-578:For #13594 to bite on the TaskHost path, all of the following must hold simultaneously:
MSBUILDREUSETASKHOSTNODES=1set,Runtime="NET"(orMSBUILDFORCEALLTASKSOUTOFPROC=1) tasks actually used, the first parent exits cleanly, the second parent has a differentTMP, and tasks in the second build consumePath.GetTempPath()-derived files. That combination is essentially nonexistent in real workloads, and users who do hit it have always hadMSBUILDDISABLENODEREUSE=1and per-buildMSBUILDNODEHANDSHAKESALTas workarounds (both of which the PR leaves in place and continues to honor).The exemption is therefore not a deferred-problem caveat — it is the right design:
NetTaskHostHandshakeVersion = 99in the file-version slots. Adding a salt input that isn''t synchronized across both release trains would introduce a hard handshake mismatch on any VS+SDK pairing where one side has the change and the other doesn''t. Since the bug doesn''t manifest there in practice, the salt input is correctly omitted.NodeReusedisabled by design (NodeProviderOutOfProcTaskHost.IsNodeReuseEnabled, line 762:&& !Handshake.IsHandshakeOptionEnabled(hostContext, HandshakeOptions.CLR2)), so each CLR2 host is single-shot and exits at end of build. No bug surface.Before — handshake salt does not depend on temp dir
sequenceDiagram autonumber participant ShellA as Shell A<br/>TMP=C:\tmpA participant BuildA as Build A (parent) participant Node as Reusable worker node participant BuildB as Build B (parent) participant ShellB as Shell B<br/>TMP=C:\tmpB ShellA->>BuildA: msbuild ... BuildA->>BuildA: salt = hash(saltVar + toolsDir) BuildA->>Node: spawn (inherits TMP=C:\tmpA) Node->>Node: opens C:\tmpA\MSBuildTemp{guid} Note over BuildA,Node: salt matches → bound ShellB->>BuildB: msbuild ... BuildB->>BuildB: salt = hash(saltVar + toolsDir) Note over BuildA,BuildB: ⚠ same salt despite different TMP BuildB->>Node: discover & reuse (handshake passes) Note over BuildB,Node: Node is still rooted under C:\tmpA BuildA-->>ShellA: done ShellA->>ShellA: rm -rf C:\tmpA (per-build cleanup) Node->>Node: write to C:\tmpA\... 💥 Node-->>BuildB: MSB5018 / "path not found" BuildB-->>ShellB: ❌ build failsAfter — temp dir is part of the salt for worker nodes
sequenceDiagram autonumber participant ShellA as Shell A<br/>TMP=C:\tmpA participant BuildA as Build A (parent) participant NodeA as Worker node A participant NodeB as Worker node B participant BuildB as Build B (parent) participant ShellB as Shell B<br/>TMP=C:\tmpB ShellA->>BuildA: msbuild ... BuildA->>BuildA: salt = hash(saltVar + toolsDir + "C:\tmpA\") BuildA->>NodeA: spawn (inherits TMP=C:\tmpA) NodeA->>NodeA: salt = hash(... + "C:\tmpA\") ✅ matches BuildA ShellB->>BuildB: msbuild ... BuildB->>BuildB: salt = hash(saltVar + toolsDir + "C:\tmpB\") BuildB->>NodeA: probe (handshake) NodeA-->>BuildB: salt mismatch → reject BuildB->>NodeB: spawn fresh (inherits TMP=C:\tmpB) NodeB->>NodeB: salt = hash(... + "C:\tmpB\") ✅ matches BuildB BuildA-->>ShellA: done ShellA->>ShellA: rm -rf C:\tmpA NodeA-->>NodeA: idles (no other build can bind) NodeB->>NodeB: keeps writing under C:\tmpB ✅ BuildB-->>ShellB: ✔ build succeedsSalt composition
flowchart LR subgraph BEFORE [Before] A1[MSBUILDNODEHANDSHAKESALT env] A2[toolsDirectory] A1 --> AC{{concat}} A2 --> AC AC --> AH[GetHashCode → int salt] end subgraph AFTER [After - non-TaskHost only] B1[MSBUILDNODEHANDSHAKESALT env] B2[toolsDirectory] B3[Path.GetTempPath] B1 --> BC{{concat}} B2 --> BC B3 --> BC BC --> BH[GetHashCode → int salt] end subgraph AFTER_TH [After - TaskHost paths unchanged] C1[MSBUILDNODEHANDSHAKESALT env] C2[toolsDirectory] C1 --> CC{{concat}} C2 --> CC CC --> CH[GetHashCode → int salt] endWhy this approach
Path.GetTempPath()from the same inherited environment at handshake time.IsHandshakeOptionEnabled(..., TaskHost)check.Alternatives considered and rejected: hashing the full environment block (kills all reuse —
PWD,_,OLDPWD,CMDCMDLINEetc. churn per shell), moving the per-node temp folder to a fixed system path (ignores users who deliberately setTMP, doesn''t reachToolTask/ user task code that re-resolvesPath.GetTempPath()), pinning temp at startup (same gap), runtime detection + respawn (reactive and racy — corruption already happened), full env diff over the wire (much larger handshake, no extra benefit over hashing into the salt), and documentingMSBUILDNODEHANDSHAKESALTas the workaround (the existing workaround — pushes the burden to every caller).Blast radius
Microsoft.Buildworker nodesMSBuildServer-{hash}pipe +Global\msbuild-server-running-{hash}mutex)TMPvalues no longer share a server.--shutdownfrom a different temp env will not see the foreign server, but stale servers idle out via existing timeouts.MSBUILDREUSETASKHOSTNODES=1(off by default) and on the parent surviving long enough to send a clean shutdown; the bug from #13594 cannot manifest on this path under normal conditions, and excluding the salt input avoids introducing a handshake-protocol skew across the VS+SDK release boundary.NodeReuseis disabled there by design, so no bug surface.MSBuildTaskHost.exe(legacy CLR2 binary)Tests
src/Build.UnitTests/BackEnd/HandshakeTempDir_Tests.cs(new) covers:Handshake_DifferentTempDirectory_ProducesDifferentKeyHandshake_SameTempDirectory_ProducesSameKeyServerNodeHandshake_DifferentTempDirectory_ProducesDifferentHashHandshake_NetTaskHost_DifferentTempDirectory_ProducesSameKeyAll tests use the standard
ITestOutputHelperinjection pattern so test diagnostics are captured. Existing tests in the related neighborhood (AppHostSupport_Tests,UnixNodeReuseFixes_Tests,TaskHostNodeKey_Tests) continue to pass.