Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Investigate PR build failure #25293

Closed
ajaybhargavb opened this issue Aug 26, 2020 · 21 comments
Closed

Investigate PR build failure #25293

ajaybhargavb opened this issue Aug 26, 2020 · 21 comments
Assignees
Labels
area-infrastructure Includes: MSBuild projects/targets, build scripts, CI, Installers and shared framework investigate

Comments

@ajaybhargavb
Copy link
Contributor

ajaybhargavb commented Aug 26, 2020

This PR had failures due to build ordering issues. Need to investigate.

More context, #25283 (comment)

cc @dotnet/aspnet-build

@ajaybhargavb ajaybhargavb added investigate area-infrastructure Includes: MSBuild projects/targets, build scripts, CI, Installers and shared framework labels Aug 26, 2020
@JunTaoLuo
Copy link
Contributor

@dougbu
Copy link
Member

dougbu commented Aug 29, 2020

Is the problem usually or always showing up in Microsoft.AspNetCore.App.Runtime❔

@sebastienros
Copy link
Member

Update:

From the binlog viewer, this node

Project Name=Project "Microsoft.AspNetCore.App.Runtime.csproj" (GetNativeManifest target(s)): 

Has a property _BuildOutputInPackage that has the pdbs duplicated:

F:\workspace\_work\1\s\artifacts\bin\Microsoft.AspNetCore.Antiforgery\Release\net5.0\Microsoft.AspNetCore.Antiforgery.pdb
F:\workspace\_work\1\s\artifacts\bin\Microsoft.AspNetCore.App.Runtime\Release\net5.0\win-x64\Microsoft.AspNetCore.Antiforgery.pdb

@dougbu says:

_GetBuildOutputFilesWithTfm is a target in the Runtime project. Sounds like it's either (a) wasting time adding duplicates to an iterm group or (b) not removing duplicates when it should. Either way, the root cause is likely a change somewhere in the .NET SDK that rejiggered the inputs to that target.

They might be copied to the same directory. The bin/Microsoft.AspNetCore.App.Runtime folder may also be intended as a staging area for the real layout. You'll need to look more at the project and perhaps a dotnet msbuild /pp:pp.out file for the project to figure out the project's expectations and whether the SDK is doing something different now. I often find myself fliipping back 'n forth betwen the log viewer and VS Code to try and figure these problems out.

@dougbu
Copy link
Member

dougbu commented Sep 3, 2020

The PDB already in a package problem is impacting a lot of builds:

> runfo search-buildlog --definition=aspnet --pr --value="pdb' is not added because the package already contains file" --count 50
https://dev.azure.com/dnceng/public/_build/results?buildId=799262
https://dev.azure.com/dnceng/public/_build/results?buildId=799186
https://dev.azure.com/dnceng/public/_build/results?buildId=799129
https://dev.azure.com/dnceng/public/_build/results?buildId=799129
https://dev.azure.com/dnceng/public/_build/results?buildId=799079
https://dev.azure.com/dnceng/public/_build/results?buildId=797240
https://dev.azure.com/dnceng/public/_build/results?buildId=798807
https://dev.azure.com/dnceng/public/_build/results?buildId=798686
https://dev.azure.com/dnceng/public/_build/results?buildId=797948
https://dev.azure.com/dnceng/public/_build/results?buildId=797948
https://dev.azure.com/dnceng/public/_build/results?buildId=798502
https://dev.azure.com/dnceng/public/_build/results?buildId=798307
https://dev.azure.com/dnceng/public/_build/results?buildId=798151
https://dev.azure.com/dnceng/public/_build/results?buildId=798151
https://dev.azure.com/dnceng/public/_build/results?buildId=798151
https://dev.azure.com/dnceng/public/_build/results?buildId=797993
https://dev.azure.com/dnceng/public/_build/results?buildId=797993
https://dev.azure.com/dnceng/public/_build/results?buildId=798020
https://dev.azure.com/dnceng/public/_build/results?buildId=798020
https://dev.azure.com/dnceng/public/_build/results?buildId=798020
https://dev.azure.com/dnceng/public/_build/results?buildId=798018
https://dev.azure.com/dnceng/public/_build/results?buildId=798018
https://dev.azure.com/dnceng/public/_build/results?buildId=798018
https://dev.azure.com/dnceng/public/_build/results?buildId=797973
https://dev.azure.com/dnceng/public/_build/results?buildId=797928
https://dev.azure.com/dnceng/public/_build/results?buildId=797928
https://dev.azure.com/dnceng/public/_build/results?buildId=797896
https://dev.azure.com/dnceng/public/_build/results?buildId=797896
https://dev.azure.com/dnceng/public/_build/results?buildId=797616
https://dev.azure.com/dnceng/public/_build/results?buildId=797887
https://dev.azure.com/dnceng/public/_build/results?buildId=797887
https://dev.azure.com/dnceng/public/_build/results?buildId=797772
https://dev.azure.com/dnceng/public/_build/results?buildId=797772
https://dev.azure.com/dnceng/public/_build/results?buildId=797813
https://dev.azure.com/dnceng/public/_build/results?buildId=797644
https://dev.azure.com/dnceng/public/_build/results?buildId=797724
https://dev.azure.com/dnceng/public/_build/results?buildId=797724

Evaluated 50 builds
Impacted 23 builds
Impacted 37 jobs
Searched 14002 records

Strangely, though we have warn-as-error enabled, this isn't failing our builds or at least not all of our builds. I picked https://dev.azure.com/dnceng/public/_build/results?buildId=797973&view=logs&j=1b89928a-2219-5ef9-602f-f95beb3da4dc&t=bcce746a-741e-59b7-adec-a4a82d8c5e52 somewhat randomly and see lots of error NU5118 but also Build succeeded.. @rainersigwald any ideas why dotnet msbuild is fine w/ these particular errors❔ We aren't suppressing them in the repo.

@TanayParikh
Copy link
Contributor

/datadisks/disk1/workspace/_work/1/s/src/Components/WebAssembly/Sdk/src/targets/Microsoft.NET.Sdk.BlazorWebAssembly.Current.targets(256,5): error MSB3954: Failed to compute hash for file '/datadisks/disk1/workspace/_work/1/s/src/Components/Web.JS/dist/Release/blazor.webassembly.js' because it does not exist or is inaccessible. [/datadisks/disk1/workspace/_work/1/s/src/Components/benchmarkapps/Wasm.Performance/TestApp/Wasm.Performance.TestApp.csproj]
##[error]src/Components/WebAssembly/Sdk/src/targets/Microsoft.NET.Sdk.BlazorWebAssembly.Current.targets(256,5): error MSB3954: (NETCORE_ENGINEERING_TELEMETRY=Build) Failed to compute hash for file '/datadisks/disk1/workspace/_work/1/s/src/Components/Web.JS/dist/Release/blazor.webassembly.js' because it does not exist or is inaccessible.
/datadisks/disk1/workspace/_work/1/s/src/Components/WebAssembly/Sdk/src/targets/Microsoft.NET.Sdk.BlazorWebAssembly.Current.targets(256,5): error MSB3954: Failed to compute hash for file '/datadisks/disk1/workspace/_work/1/s/src/Components/Web.JS/dist/Release/blazor.webassembly.js' because it does not exist or is inaccessible. [/datadisks/disk1/workspace/_work/1/s/src/Components/WebAssembly/testassets/HostedInAspNet.Client/HostedInAspNet.Client.csproj]
##[error]src/Components/WebAssembly/Sdk/src/targets/Microsoft.NET.Sdk.BlazorWebAssembly.Current.targets(256,5): error MSB3954: (NETCORE_ENGINEERING_TELEMETRY=Build) Failed to compute hash for file '/datadisks/disk1/workspace/_work/1/s/src/Components/Web.JS/dist/Release/blazor.webassembly.js' because it does not exist or is inaccessible.
/datadisks/disk1/workspace/_work/1/s/src/Components/WebAssembly/Sdk/src/targets/Microsoft.NET.Sdk.BlazorWebAssembly.Current.targets(256,5): error MSB3954: Failed to compute hash for file '/datadisks/disk1/workspace/_work/1/s/src/Components/Web.JS/dist/Release/blazor.webassembly.js' because it does not exist or is inaccessible. [/datadisks/disk1/workspace/_work/1/s/src/Components/WebAssembly/testassets/StandaloneApp/StandaloneApp.csproj]
##[error]src/Components/WebAssembly/Sdk/src/targets/Microsoft.NET.Sdk.BlazorWebAssembly.Current.targets(256,5): error MSB3954: (NETCORE_ENGINEERING_TELEMETRY=Build) Failed to compute hash for file '/datadisks/disk1/workspace/_work/1/s/src/Components/Web.JS/dist/Release/blazor.webassembly.js' because it does not exist or is inaccessible.

#25577 (comment)

@dougbu
Copy link
Member

dougbu commented Sep 4, 2020

More on the PDB duplicates: May be able to tell from the following (which ignores PRs) when the problem started…

> runfo search-buildlog --definition=aspnet --value="pdb' is not added because the package already contains file" --count 50
https://dev.azure.com/dnceng/public/_build/results?buildId=799262
https://dev.azure.com/dnceng/public/_build/results?buildId=799262
https://dev.azure.com/dnceng/public/_build/results?buildId=798151
https://dev.azure.com/dnceng/public/_build/results?buildId=798151
https://dev.azure.com/dnceng/public/_build/results?buildId=798151
https://dev.azure.com/dnceng/public/_build/results?buildId=798020
https://dev.azure.com/dnceng/public/_build/results?buildId=798020
https://dev.azure.com/dnceng/public/_build/results?buildId=797887
https://dev.azure.com/dnceng/public/_build/results?buildId=797644
https://dev.azure.com/dnceng/public/_build/results?buildId=796964
https://dev.azure.com/dnceng/public/_build/results?buildId=796399
https://dev.azure.com/dnceng/public/_build/results?buildId=795369
https://dev.azure.com/dnceng/public/_build/results?buildId=795151
https://dev.azure.com/dnceng/public/_build/results?buildId=795131
https://dev.azure.com/dnceng/public/_build/results?buildId=794466
https://dev.azure.com/dnceng/public/_build/results?buildId=794020
https://dev.azure.com/dnceng/public/_build/results?buildId=794020
https://dev.azure.com/dnceng/public/_build/results?buildId=793901

Evaluated 50 builds
Impacted 13 builds
Impacted 18 jobs
Searched 14426 records

No, I haven't checked whether that's every build since some date.

@dougbu
Copy link
Member

dougbu commented Sep 4, 2020

The missing JavaScript file problem seems to have affected just one non-PR build:

> runfo search-buildlog --definition=aspnet --value="blazor.webassembly.js' because it does not exist or is inaccessible" --count 50
https://dev.azure.com/dnceng/public/_build/results?buildId=799302
https://dev.azure.com/dnceng/public/_build/results?buildId=799302

Evaluated 50 builds
Impacted 1 builds
Impacted 2 jobs
Searched 14426 records

@dougbu
Copy link
Member

dougbu commented Sep 4, 2020

that issue also affected one PR:

> runfo search-buildlog --definition=aspnet --value="because it does not exist or is inaccessible" --count 50 --pr
https://dev.azure.com/dnceng/public/_build/results?buildId=802001
https://dev.azure.com/dnceng/public/_build/results?buildId=802001

Evaluated 50 builds
Impacted 1 builds
Impacted 2 jobs
Searched 13940 records

@dougbu
Copy link
Member

dougbu commented Sep 6, 2020

@Pilchie just got really unlucky w/ four PRs opened at about the same time:

@dougbu
Copy link
Member

dougbu commented Sep 8, 2020

I'm able to reproduce the PDB duplication problem locally and seeing the build fail when it occurs. Still can't see why we sometimes hit those NU5118 errors without failing the build. Any ideas on that strange part of this @rainersigwald

@dougbu
Copy link
Member

dougbu commented Sep 8, 2020

Assigned @BrennanConroy because he's got #25707 out and @wtgodbe because he's delving into the duplicate PDB problem

@dougbu
Copy link
Member

dougbu commented Sep 8, 2020

@SteveSandersonMS is anyone looking at the Components ordering issue that leads to occasional errors like the following❔ This isn't as urgent as the other problems but we should try to get to the bottom of it.

@halter73 if nobody gets to this part of the issue today, please include it in your hand-off notes for the next person in the rotation.

/datadisks/disk1/workspace/_work/1/s/src/Components/WebAssembly/Sdk/src/targets/Microsoft.NET.Sdk.BlazorWebAssembly.Current.targets(256,5): error MSB3954: Failed to compute hash for file '/datadisks/disk1/workspace/_work/1/s/src/Components/Web.JS/dist/Release/blazor.webassembly.js' because it does not exist or is inaccessible. [/datadisks/disk1/workspace/_work/1/s/src/Components/benchmarkapps/Wasm.Performance/TestApp/Wasm.Performance.TestApp.csproj]

@SteveSandersonMS
Copy link
Member

SteveSandersonMS commented Sep 8, 2020

It's the first I've heard of it. I don't recall seeing this happen on local builds. Is there any known context, such as when it started happening? Might it be a side-effect of whatever ordering issue causes the duplicate PDB problem?

@dougbu
Copy link
Member

dougbu commented Sep 8, 2020

Is there any known context, such as when it started happening?

The exact timing isn't clear because the problem happens sporadically. I reported a couple of runfo results for builds failing that way in #25293 (comment) and #25293 (comment)


Might it be a side-effect of whatever ordering issue causes the duplicate PDB problem?

No, that only occurs when building the Runtime project.

@dougbu
Copy link
Member

dougbu commented Sep 8, 2020

And,

It's the first I've heard of it.

Ooopsie

@wtgodbe
Copy link
Member

wtgodbe commented Sep 8, 2020

Where I'm at so far -

https://github.com/NuGet/NuGet.Client/blob/3f573425ac74d90b96a61e5e85ed39a39aad8a7f/src/NuGet.Core/NuGet.Build.Tasks.Pack/NuGet.Build.Tasks.Pack.targets#L207-L223

It looks like BuildOutputInPackage is getting duplicate .pdb entries - one from the output folder of Microsoft.AspNetCore.App.Runtime, and one from the output of the project that produced the .pdb. That item group gets populated here:

https://github.com/NuGet/NuGet.Client/blob/3f573425ac74d90b96a61e5e85ed39a39aad8a7f/src/NuGet.Core/NuGet.Build.Tasks.Pack/NuGet.Build.Tasks.Pack.targets#L420-L429

The size of the binlog has made this hard to trace back locally, and building locally hasn't repro'd it yet for me. But, I did notice that the duplicates appear to be present even in green CI builds. Still poking around.

@wtgodbe
Copy link
Member

wtgodbe commented Sep 8, 2020

Looks like in 3.1, _GetBuildOutputFilesWithTfm created output that was populated with just .dll's and .pdb's from the Microsoft.AspNetCore.App.Runtime output folder, whereas today it also has .pdb's from the output folders of the individual projects

@wtgodbe
Copy link
Member

wtgodbe commented Sep 8, 2020

Not sure when this started, but what's happening:

<Target Name="_ResolveRuntimePackBuildOutput" DependsOnTargets="ResolveReferences;Crossgen">
<ItemGroup>
<BuiltProjectOutputGroupOutput Include="$(ProjectDepsFilePath)" />
<BuiltProjectOutputGroupOutput Include="@(ReferenceCopyLocalPaths)" Condition=" '%(ReferenceCopyLocalPaths.IsNativeImage)' != 'true' " />
<!-- Include all .pdbs in build output. Some .pdb files are part of ReferenceCopyLocalPaths, but this can change when running /t:Pack /p:NoBuild=true. -->
<BuiltProjectOutputGroupOutput Include="$(TargetDir)*.pdb" Exclude="@(ReferenceCopyLocalPaths)" />
<!-- Crossgen symbols for Linux include a GUID in the file name which cannot be predicted. -->
<BuiltProjectOutputGroupOutput Include="$(TargetDir)*.map" Condition="'$(CrossGenSymbolsType)' == 'PerfMap'" />
<NuGetPackInput Include="@(BuiltProjectOutputGroupOutput)" />
</ItemGroup>
</Target>

ResolveReferences adds the .pdb's from the individual project output dirs to ReferenceCopyLocalPaths, then it appears the condition on

<BuiltProjectOutputGroupOutput Include="$(TargetDir)*.pdb" Exclude="@(ReferenceCopyLocalPaths)" />
isn't working, since the .pdb's in the output directory of the SharedFx project don't have the same %(Identity) as the ones in the in the output dirs of the individual projects. Tweaking that condition should fix the issue

@wtgodbe
Copy link
Member

wtgodbe commented Sep 9, 2020

Resolved by #25733

@wtgodbe wtgodbe reopened this Sep 10, 2020
@wtgodbe
Copy link
Member

wtgodbe commented Sep 10, 2020

Looks like this is still recurring. #25733 did make NuGetPackInput free of duplicates, but we're still seeing this failure, e.g. https://dev.azure.com/dnceng/public/_build/results?buildId=809582&view=results. Looking into it.

@wtgodbe
Copy link
Member

wtgodbe commented Sep 11, 2020

Fixed by #25785

@wtgodbe wtgodbe closed this as completed Sep 11, 2020
@ghost ghost locked as resolved and limited conversation to collaborators Oct 11, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-infrastructure Includes: MSBuild projects/targets, build scripts, CI, Installers and shared framework investigate
Projects
None yet
Development

No branches or pull requests

8 participants