Fix Linux Mutex Utilization #2229

nagilson · 2025-04-09T18:32:27Z

Now that the mutex is implemented correctly, I can check that the code using it is also correct.
One issue I discovered is that on Linux, we use a sudo process folder to exchange files with a master sudo process.
This folder is shared across all instances of VS Code. If you kill VS Code, the sudo process does not go away.

When a new instance checks whether it should spawn a master sudo process, it tries to communicate with the old one. But the old one may be busy for multiple minutes, and the old logic only waited 1 second. This can result in race conditions where the old sudo process and the new one perform some of the same tasks. Now, I've given each VS Code instance its own folder to run sudo commands under.

The script that runs these commands is still in the same protected location. It only runs commands that we have authorized and approved, so spawning a new directory is OK.
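As a rough illustration of the per-instance folder idea, here is a minimal sketch. The helper name `createInstanceCommunicationDir`, the `sudo-` prefix, and the `baseDir` parameter are assumptions made for this example, not the names used in the PR.

```typescript
import * as crypto from 'crypto';
import * as fs from 'fs';
import * as os from 'os';
import * as path from 'path';

// Hypothetical helper: creates a communication directory that only this instance uses.
// `baseDir` stands in for wherever the sudo exchange files are kept.
function createInstanceCommunicationDir(baseDir: string): string
{
    // A random UUID makes a collision between two instances extremely unlikely.
    const instanceDir = path.join(baseDir, `sudo-${crypto.randomUUID()}`);
    fs.mkdirSync(instanceDir, { recursive: true, mode: 0o700 });
    return instanceDir;
}

// Usage: two "instances" sharing one base folder end up with distinct directories.
const demoBase = fs.mkdtempSync(path.join(os.tmpdir(), 'sudo-demo-'));
const firstInstanceDir = createInstanceCommunicationDir(demoBase);
const secondInstanceDir = createInstanceCommunicationDir(demoBase);
console.log(firstInstanceDir !== secondInstanceDir);
```

With separate directories, a leftover sudo process from a killed instance keeps writing into its own folder and cannot interleave files with a newly spawned one.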

Resolves #2224

It seems like it dropped the execution lock (maybe it threw an error) while executing a command: apt-get -o DPkg::Lock::Timeout=180 upgrade -y dotnet-sdk-8.0 was forwarded to the master process to run. It waited about 2 seconds, then freed the server before the command was done. The free happens before the throw, so it might have failed for some reason, then failed again because stdout.txt didn't exist yet (the process wasn't done running), and then freed. When it freed, the command was still running, but that allowed the sudo master process killer to run, which caused the later failure. I'm still not sure why that would happen; it might be that the locking was not implemented correctly.
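The failure mode described here is a lock being released while the guarded work is still running. A minimal in-process sketch of the intended ordering follows; `SimpleLock` is a hypothetical class for illustration only, and the extension's real lock is a cross-process one with a different API.

```typescript
// Hypothetical in-process lock, used only to show the release ordering.
class SimpleLock
{
    private tail: Promise<void> = Promise.resolve();

    public async runExclusive<T>(work: () => Promise<T>): Promise<T>
    {
        const previous = this.tail;
        let release!: () => void;
        this.tail = new Promise<void>(resolve => { release = resolve; });
        await previous; // wait until every earlier holder has released
        try
        {
            // Awaiting here keeps the lock held until the work has settled.
            return await work();
        }
        finally
        {
            // Released only after the work finished or threw, never mid-flight.
            release();
        }
    }
}

const sleep = (ms: number) => new Promise<void>(resolve => setTimeout(resolve, ms));

// Usage: task B cannot start until the slow task A has finished.
async function demoLockOrdering(): Promise<string[]>
{
    const order: string[] = [];
    const lock = new SimpleLock();
    await Promise.all([
        lock.runExclusive(async () => { order.push('A start'); await sleep(50); order.push('A end'); }),
        lock.runExclusive(async () => { order.push('B start'); order.push('B end'); }),
    ]);
    return order;
}
```

If `work()` were started without `await`, the `finally` block would run immediately and a second holder (here, something like the sudo process killer) could start while the first command was still in progress.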

Copilot

Copilot reviewed 5 out of 5 changed files in this pull request and generated 1 comment.

Comments suppressed due to low confidence (2)

vscode-dotnet-runtime-library/src/Utils/CommandExecutor.ts:123

Locking now uses the script path instead of the communication directory, while subsequent operations continue to use the directory. Verify if this change is intentional to avoid potential race conditions or locking inconsistencies.

return executeWithLock(this.context.eventStream, false, RUN_UNDER_SUDO_LOCK(this.sudoProcessScript), SUDO_LOCK_PING_DURATION_MS, waitForLockTimeMs,

vscode-dotnet-runtime-library/src/Utils/CommandExecutor.ts:149

[nitpick] Reducing the timeout from 1000ms to 500ms for checking process liveness may cause premature failures. Confirm that this new timeout value meets the operational requirements.

if (await this.sudoProcIsLive(false, fullCommandString, 500))

vscode-dotnet-runtime-library/src/Utils/FileUtilities.ts

vscode-dotnet-runtime-library/src/Acquisition/DotnetPathFinder.ts

nagilson · 2025-04-10T17:09:40Z

vscode-dotnet-runtime-library/src/Utils/CommandExecutor.ts

+        catch (error: any)
+        {
+            // eslint-disable-next-line @typescript-eslint/no-unsafe-member-access
+            error.message = error.message + `\nFailed to create ${this.sudoProcessCommunicationDir}. Please check your permissions or install dotnet manually.`;


This may happen in the rare event of a collision where the dir already exists, so it's OK to keep going here.

Can you have fallback dirs? Like, if dir exists, check dir1, then dir2, etc.? I don't feel great about catching any exception without even verifying that it failed for the reason you mentioned. You could also have sudoProcessCommunicationDir be unexpectedly null; that would be a bug, and we should care, I think.

Forgind

I don't think any of these are too serious. I'm a little concerned about the undefined thing, but that might just be my lack of JS knowledge.

Forgind · 2025-04-10T17:16:35Z

Forgind · 2025-04-10T17:22:43Z

vscode-dotnet-runtime-library/src/Utils/CommandExecutor.ts

        {
-            if (await this.sudoProcIsLive(false, fullCommandString, 1000)) // If the sudo process was spawned by another instance of code, we do not want to have 2 at once but also do not waste a lot of time checking
-            // As it should not be in the middle of an operation which may cause it to take a while.
+            if (await this.sudoProcIsLive(false, fullCommandString, 500)) // If the sudo process was spawned by another instance of code, we do not want to have 2 at once but also do not waste a lot of time checking


I'm not sure users would really notice the difference between 500ms and 1000ms. I don't know how long this is expected to take (IPC, right? I think speed then depends on the OS), but I could imagine it taking longer than 500ms on a heavily loaded system.

That's true. This check only exists for the case where an existing sudo process is working in the same individualized folder.

That should only happen when we think we have not spawned a live sudo process ourselves, or when the sudo process spawned unsuccessfully and threw an error. This check can probably be removed entirely.
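For context on the 500ms versus 1000ms question, a bounded liveness check can be sketched as a race between the probe and a timer. `respondsWithin` and `slowProbe` are hypothetical names for this example; the PR's `sudoProcIsLive` has its own implementation, which is not shown in this thread.

```typescript
// Hypothetical helper: resolves to false if `probe` has not answered within `timeoutMs`.
async function respondsWithin(probe: Promise<boolean>, timeoutMs: number): Promise<boolean>
{
    let timer: ReturnType<typeof setTimeout> | undefined;
    const timedOut = new Promise<boolean>(resolve =>
    {
        timer = setTimeout(() => resolve(false), timeoutMs);
    });
    try
    {
        // Whichever settles first wins: the probe's answer or the timeout's `false`.
        return await Promise.race([probe, timedOut]);
    }
    finally
    {
        clearTimeout(timer); // avoid leaving a stray timer behind
    }
}

// Usage: a probe that needs 100 ms, standing in for a busy or heavily loaded process.
const slowProbe = () => new Promise<boolean>(resolve => setTimeout(() => resolve(true), 100));

respondsWithin(slowProbe(), 20).then(alive => console.log('20 ms budget:', alive));
respondsWithin(slowProbe(), 500).then(alive => console.log('500 ms budget:', alive));
```

A timeout reports "not live" for any process that is merely slow, which is why a short budget is risky on a loaded system and why removing the check avoids the trade-off altogether.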

vscode-dotnet-runtime-library/src/Utils/CommandExecutor.ts

Forgind · 2025-04-10T17:27:48Z

vscode-dotnet-runtime-library/src/Utils/CommandExecutor.ts

-            stderr: (fs.readFileSync(stderrFile, 'utf8')).trim(),
-            status: (fs.readFileSync(statusFile, 'utf8')).trim()
+            stdout: (await (this.fileUtil as FileUtilities).read(stdoutFile)).trim(),
+            stderr: (await (this.fileUtil as FileUtilities).read(stderrFile)).trim(),


This defaults to utf8? I'm also wondering if you can have these three going in parallel then just 'join' them afterwards.

Yes! The function does default to utf8. I like the parallelization idea, though I want to avoid adding that complexity here.
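For reference, the parallel version suggested above could look roughly like the sketch below. It uses Node's `fs.promises.readFile` directly rather than the PR's `FileUtilities.read`, and the file names `stderr.txt` and `status.txt` are assumed by analogy with the `stdout.txt` mentioned in the description.

```typescript
import * as fs from 'fs';
import * as os from 'os';
import * as path from 'path';

interface CommandOutput { stdout: string; stderr: string; status: string; }

// Hypothetical helper: starts all three reads at once, then waits for all of them.
async function readCommandOutput(dir: string): Promise<CommandOutput>
{
    const read = (name: string) => fs.promises.readFile(path.join(dir, name), 'utf8');
    const [stdout, stderr, status] = await Promise.all([
        read('stdout.txt'),
        read('stderr.txt'),
        read('status.txt'),
    ]);
    return { stdout: stdout.trim(), stderr: stderr.trim(), status: status.trim() };
}

// Usage: write three small files to a temp dir and read them back together.
async function demoParallelRead(): Promise<CommandOutput>
{
    const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'read-demo-'));
    fs.writeFileSync(path.join(dir, 'stdout.txt'), 'done\n');
    fs.writeFileSync(path.join(dir, 'stderr.txt'), '\n');
    fs.writeFileSync(path.join(dir, 'status.txt'), '0\n');
    return readCommandOutput(dir);
}
```

`Promise.all` rejects as soon as any one read fails, so a missing file still surfaces as an error, the same as with sequential awaits.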

vscode-dotnet-runtime-library/src/Utils/FileUtilities.ts

vscode-dotnet-runtime-library/src/test/unit/TestUtility.ts

vscode-dotnet-runtime-library/src/Utils/CommandExecutor.ts

nagilson · 2025-04-10T18:44:19Z

Can you have fallback dirs? Like, if dir exists, check dir1, then dir2, etc.? I don't feel great about catching any exception without even verifying that it failed for the reason you mentioned. You could also have sudoProcessCommunicationDir be unexpectedly null; that would be a bug, and we should care, I think.

That's a good point. If we can't make the first dir, I doubt we can make the second. Perhaps we could use the existing directory as a fallback, though that would reintroduce the race condition if it happens multiple times. I could condition it to not fail on EEXIST. Though, I just don't know what other types of errors the kernel could throw at us :/
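One way to act on this is to inspect the error code instead of swallowing every exception. The sketch below is an assumption about how that could look, not the code that was merged; `ensureCommunicationDir`, `preferredDir`, and `fallbackDir` are hypothetical names.

```typescript
import * as fs from 'fs';
import * as os from 'os';
import * as path from 'path';

// Hypothetical helper: returns the directory that should be used for sudo file exchange.
function ensureCommunicationDir(preferredDir: string, fallbackDir: string): string
{
    try
    {
        fs.mkdirSync(preferredDir); // non-recursive, so an existing path raises EEXIST
        return preferredDir;
    }
    catch (error: unknown)
    {
        const code = (error as NodeJS.ErrnoException).code;
        if (code === 'EEXIST')
        {
            return preferredDir; // rare collision: the directory is already there
        }
        if (code === 'EACCES' || code === 'EPERM')
        {
            return fallbackDir; // permission problem: fall back to the old shared directory
        }
        throw error; // anything else is unexpected and should surface as a bug
    }
}

// Usage: the first call creates the directory, the second tolerates EEXIST.
const mkdirDemoRoot = fs.mkdtempSync(path.join(os.tmpdir(), 'mkdir-demo-'));
const preferred = path.join(mkdirDemoRoot, 'instance');
const shared = path.join(mkdirDemoRoot, 'shared');
const createdDir = ensureCommunicationDir(preferred, shared);
const reusedDir = ensureCommunicationDir(preferred, shared);
console.log(createdDir === reusedDir);
```

Rethrowing unknown codes addresses the concern about catching any exception, at the cost of failing on errors that might have been harmless.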

… kernel's EACCES as well. This may be something that will not resolve, though if some admin thing is blocking and holding the file, perhaps this is a good resolution. I never encountered this when trying different permissions in my demos, but it seems that some people are.

nagilson · 2025-04-10T19:01:05Z

@Forgind, I think I resolved the 2 issues you raised at the top: we now throw the error in certain cases when mkdir fails or otherwise use a backup dir, and I removed the extra 500 ms timeout check. Thanks for looking, and let me know what you think!

Forgind

I'm happy; thanks!

nagilson added 6 commits April 9, 2025 11:30

Fix lint

f7769db

Add timeout to the logging

31dd58d

Make each sudo process instance get its own directory.

10fb7eb

This will help prevent an existing process from colliding with another.

Dont delay anymore since the chance of this is very low

5bc102e

this is basically like random int collision at this point, could probably remove.

Fix lint

fe8c2de

nagilson marked this pull request as ready for review April 9, 2025 22:25

Copilot AI review requested due to automatic review settings April 9, 2025 22:25

Copilot AI reviewed Apr 9, 2025

vscode-dotnet-runtime-library/src/Utils/FileUtilities.ts

nagilson and others added 2 commits April 9, 2025 15:31

Throw an error if stderr DNE

c763555

Co-authored-by: Copilot <[email protected]>

Fix support matrix

2ab61ef

nagilson mentioned this pull request Apr 9, 2025

[NETE2ESDK][Ubuntu24.04] Error occurs when re-install runtime or SDK after re-launch VSCode which is killed before #2224

Closed

nagilson added 2 commits April 9, 2025 15:57

Dont echo env in linux

8377c27

Fix copilot change

09fac83

nagilson requested a review from a team April 10, 2025 16:40

Fix lint

6e32916

nagilson commented Apr 10, 2025

vscode-dotnet-runtime-library/src/Acquisition/DotnetPathFinder.ts Outdated

nagilson added 2 commits April 10, 2025 10:08

Fix env var logging

ed420a6

Fix env var logging

cd1fe2b

nagilson commented Apr 10, 2025

Forgind reviewed Apr 10, 2025

MiYanni approved these changes Apr 10, 2025

vscode-dotnet-runtime-library/src/Utils/CommandExecutor.ts

nagilson added 4 commits April 10, 2025 11:56

Merge branch 'nagilson-linux-rc' of https://github.com/nagilson/vscod…

e83ee9a

…e-dotnet-runtime into nagilson-linux-rc

Remove extraneous check

adf66f4

Use the old dir as a fallback if we cant create dirs and throw

185c59c

Lint

0aa0e03

Forgind approved these changes Apr 10, 2025

Delete the directory

a4b99b2

might as well do it in this PR

nagilson enabled auto-merge (squash) April 10, 2025 20:46

nagilson merged commit 902bb39 into dotnet:main Apr 10, 2025
8 checks passed

nagilson mentioned this pull request Jul 21, 2025

C#DK Performance & Quality via .NET Install Tool dotnet/sdk#49885

Open

3 participants
