EvaluateCacheFile is causing restore crashes on Linux #9199
Comments
Updated to include more of the builds / jobs where this failed.
Are all the failures coming from the SDK resolver? Might be related to #8793.
I clicked on a couple of random ones and all of them had resolver stack traces. I'll be handling priority issues next week (conveniently), so I'll take an initial stab at it.
FYI @jeffkl
That seems weird, the directory should exist. I suppose …
There's a permissions trick here. The tmp folder on Linux is global, so whichever operation creates /tmp/NuGetScratch first sets the permissions for that folder. In every scenario where NuGet writes to the temp folder, we call https://github.com/NuGet/NuGet.Client/blob/dev/src/NuGet.Core/NuGet.Common/PathUtil/DirectoryUtility.cs instead of Directory.CreateDirectory, because it creates the folder with the correct permissions, allowing other users on the machine to write to the same folder and therefore to run restores themselves. We don't want the dgspec folder to have the same open permissions, though; it should use the defaults, because the dgspec folder lives under the project's obj folder. Since the resolver runs a preview restore, I think fixing #8793 should be enough: it makes the temp project location irrelevant, since we don't write anything to it.
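To make the distinction concrete, here is a minimal sketch of the idea. This is not NuGet's actual DirectoryUtility code: the UnixFileMode overload used below only exists in .NET 7+, and the exact permission bits are my assumption, modeled on /tmp itself.

```csharp
using System.IO;

class ScratchFolderSketch
{
    // Machine-wide scratch folder named in the comment above.
    const string ScratchPath = "/tmp/NuGetScratch";

    static void Main()
    {
        // Shared temp folder: create it world-writable (with the sticky bit,
        // like /tmp itself) so that whichever user happens to create it first
        // does not lock every other user on the machine out of restore.
        // Note: if the folder already exists, its permissions are left as-is.
        Directory.CreateDirectory(
            ScratchPath,
            UnixFileMode.UserRead | UnixFileMode.UserWrite | UnixFileMode.UserExecute |
            UnixFileMode.GroupRead | UnixFileMode.GroupWrite | UnixFileMode.GroupExecute |
            UnixFileMode.OtherRead | UnixFileMode.OtherWrite | UnixFileMode.OtherExecute |
            UnixFileMode.StickyBit);

        // dgspec folder: per-project state under obj/, so plain CreateDirectory
        // with default (umask-derived) permissions is the right call here.
        Directory.CreateDirectory("obj");
    }
}
```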
Added more affected builds.
This is currently causing 6% of our builds in dotnet/runtime to fail. Is there a step we can run before calling restore to work around the problem until a real fix comes in? Waiting on NuGet means we'll likely stay at a 6% failure rate for several weeks at a minimum, given the state of code flow.
It's a multi-step thing, but it should unblock you: create the /tmp/NuGetScratch folder ahead of time, before any restore runs, with permissions open enough that every user on the machine can write to it.
Would a plain mkdir of that folder before the build work too?
Yeah, that'd work too.
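For reference, a minimal sketch of what such a pre-restore step could look like; this is my own illustration rather than a command from this thread. File.SetUnixFileMode requires .NET 7+, and on a CI image a plain mkdir followed by chmod would achieve the same thing.

```csharp
using System.IO;

// Sketch of the workaround discussed above: pre-create the shared scratch
// folder before any restore runs, so its permissions are known-good no
// matter which user's restore would otherwise have created it first.
// The path comes from the earlier comment; the permission bits are an
// assumption modeled on a world-writable /tmp-style folder.
Directory.CreateDirectory("/tmp/NuGetScratch");
File.SetUnixFileMode("/tmp/NuGetScratch",
    UnixFileMode.UserRead | UnixFileMode.UserWrite | UnixFileMode.UserExecute |
    UnixFileMode.GroupRead | UnixFileMode.GroupWrite | UnixFileMode.GroupExecute |
    UnixFileMode.OtherRead | UnixFileMode.OtherWrite | UnixFileMode.OtherExecute);
```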
@nkolev92 that workaround of creating the folder ahead of time doesn't appear to have helped; we're still seeing failures: https://dev.azure.com/dnceng/public/_build/results?buildId=540894&view=logs
This is worrisome for several reasons:
I'm fairly confident it's the correct fix; the dependency hasn't flowed into dotnet yet: dotnet/sdk#10726.
@nkolev92 we are still hitting this with the newer bits:
https://dev.azure.com/dnceng/public/_build/results?buildId=551633&view=logs
Sorry, I was wrong: we aren't using the right bits. The commit points to an old build from early February: 7bac015acc5b7e0161a058c8febc98abe2386d51
Details about Problem
Restore during build using dotnet msbuild is crashing on Unix operating systems; the problem appears to be a simple unhandled exception.
Builds where this has occurred
This is starting to become a blocking issue for the dotnet/runtime repository. It started showing up yesterday, and now I'm seeing it across a number of builds.
The runs do have binlogs available from the restores, but they don't seem to contain much helpful information. Example:
https://dev.azure.com/dnceng/9ee6d478-d288-47f7-aacc-f6e6d082ae6d/_apis/build/builds/529965/artifacts?artifactName=BuildLogs_Mono_Linux_x64_release&fileId=50260E1C6BB8177752CB6690A24F7D636CC8A076F6FB6CC6FAB6F5DB3AED4BC502&fileName=Build.binlog&api-version=5.0-preview.3