Windows ARM64 build gets stuck at installer.tasks #38902

Closed
pgovind opened this issue Jul 8, 2020 · 18 comments · Fixed by #41947

Comments

@pgovind

pgovind commented Jul 8, 2020

I'm trying to build dotnet/runtime for windows on ARM64 with the following command line:

build clr+libs+libs.tests -a arm64 -c release

The build seems to be stuck at the installer.tasks step:

C:\Users\prgovi\Desktop\Work\runtime>build clr+libs+libs.tests -a arm64 -c release
  Determining projects to restore...
  Tool 'coverlet.console' (version '1.7.2') was restored. Available commands: coverlet
  Tool 'dotnet-reportgenerator-globaltool' (version '4.5.8') was restored. Available commands: reportgenerator
  Tool 'microsoft.dotnet.xharness.cli' (version '1.0.0-prerelease.20329.4') was restored. Available commands: xharness
  Restore was successful.
  All projects are up-to-date for restore.
  Determining projects to restore...
  All projects are up-to-date for restore.
  installer.tasks -> C:\Users\prgovi\Desktop\Work\runtime\artifacts\bin\installer.tasks\Debug\netstandard2.0\installer.tasks.dll
**The build just gets stuck at this point**

You might need a device to repro this. Let me know if you want to RDP into mine to test/repro or want logs.

cc @Anipik @ericstj @safern @carlossanlop @eiriktsarpalis

@ghost

ghost commented Jul 8, 2020

Tagging subscribers to this area: @ViktorHofer
Notify danmosemsft if you want to be subscribed.

@Dotnet-GitSync-Bot Dotnet-GitSync-Bot added the untriaged New issue has not been triaged by the area owner label Jul 8, 2020
@pgovind pgovind added this to the Future milestone Jul 8, 2020
@pgovind pgovind changed the title ARM64 build gets stuck at installer.tasks Windows ARM64 build gets stuck at installer.tasks Jul 8, 2020
@safern
Member

safern commented Jul 8, 2020

Probably an MSBuild/SDK issue?

cc: @rainersigwald

@safern safern removed the untriaged New issue has not been triaged by the area owner label Jul 8, 2020
@ViktorHofer ViktorHofer removed their assignment Jul 8, 2020
@ViktorHofer
Member

cc @eiriktsarpalis

@eiriktsarpalis
Member

I'm reproducing the exact same issue in dotnet/runtime. The issue started after dotnet-installer merged this fix.

Possibly related: dotnet/msbuild#5474

cc @danmosemsft

@rainersigwald
Member

Is it possible to get memory dumps during the hang? Or at least build with a diag-verbosity text log (binlogs are better except when they can't be flushed because of a hang/crash in the build) to get more clues about where things are going wrong?

@ladipro is chasing arm64 MSBuild problems, so I'll add this to his pile.

@safern
Member

safern commented Jul 8, 2020

@carlossanlop has a minimal repro for this with diagnostics verbosity. Can you share the whole output in this issue?

@eiriktsarpalis
Member

Here's my output using build.cmd clr+libs -a arm64 -rc release -v diag

log.txt

@safern
Member

safern commented Jul 8, 2020

@eiriktsarpalis I think we can reduce the log and make it smaller if you do:

dotnet build tools-local\tasks\installer.tasks\ /p:TargetArchitecture=arm64 /v:diag

@eiriktsarpalis
Member

Sure, here you go

log2.txt

@carlossanlop
Member

carlossanlop commented Jul 8, 2020

Here's my repro as requested by @safern:

  1. My laptop is Surface Pro X, with Windows 10 Version 1909 (Build 18363.900).
  2. Installed PowerShell Core ARM64 - No installer, need to download the zip from here and extract it in "C:\Program Files (Arm)\PowerShell\7-preview".
  3. Installed VS and all requirements, CMake (note there's no Arm64 version, not sure if relevant), and cloned runtime.
  4. Installed dotnet via the dotnet-install script, using the version required by the runtime repo, double checked that the environment variable got set to point to the install location:
.\dotnet-install.ps1 -Architecture arm64 -InstallDir "C:\Program Files (Arm)\dotnet" -JsonFile C:\Users\myusername\source\repos\runtime\global.json 
  5. Building runtime gets stuck in an "installer.tasks" step:
PS C:\Users\myusername\source\repos\runtime> .\build.cmd clr+libs -rc release -a arm64
  Determining projects to restore...
  Tool 'coverlet.console' (version '1.7.2') was restored. Available commands: coverlet
  Tool 'dotnet-reportgenerator-globaltool' (version '4.5.8') was restored. Available commands: reportgenerator
  Tool 'microsoft.dotnet.xharness.cli' (version '1.0.0-prerelease.20329.4') was restored. Available commands: xharness

  Restore was successful.
  Restored C:\Users\myusername\.nuget\packages\microsoft.dotnet.arcade.sdk\5.0.0-beta.20316.1\tools\Tools.proj (in 419 ms).
  Determining projects to restore...
  Restored C:\Users\myusername\source\repos\runtime\tools-local\tasks\tasks.proj (in 262 ms).
  Restored C:\Users\myusername\source\repos\runtime\tools-local\tasks\installer.tasks\installer.tasks.csproj (in 2.52 sec).
  installer.tasks -> C:\Users\myusername\source\repos\runtime\artifacts\bin\installer.tasks\Debug\netstandard2.0\installer.tasks.dll
  6. Attempting to build installer.tasks.csproj by itself with diagnostics logging gets stuck after these messages, and freezes PowerShell (can't kill the process).
    I uploaded the full binlog file in OneDrive (it's ~5MB). If you'd like to see it, please contact me via Teams so I can give you the link.
PS C:\Users\myusername\source\repos\runtime> dotnet build /v:diag .\tools-local\tasks\installer.tasks\installer.tasks.csproj
...
13:01:07.920   1:2>Project "C:\Users\myusername\source\repos\runtime\tools-local\tasks\installer.tasks\installer.tasks.csproj" (1:2) is building "C:\Users\myusername\source\repos\runtime\tools-local\tasks\installer.tasks\installer.tasks.csproj" (1:4) on node 2 (Build target(s)).
13:01:08.878   1:4>Building with tools version "Current".
                   Target "ErrorForMissingTestRunner" skipped, due to false condition; ('$(IsTestProject)' == 'true' AND '$(TestRunnerName)' != '') was evaluated as ('false' == 'true' AND '' != '').
                   Target "_CheckForUnsupportedTargetFramework" skipped, due to false condition; ('$(_UnsupportedTargetFrameworkError)' == 'true') was evaluated as ('' == 'true').
13:01:08.888   1:4>Target "_CollectTargetFrameworkForTelemetry: (TargetId:2)" in file "C:\Program Files (Arm)\dotnet\sdk\5.0.100-preview.6.20310.4\Sdks\Microsoft.NET.Sdk\targets\Microsoft.NET.TargetFrameworkInference.targets" from project "C:\Users\myusername\source\repos\runtime\tools-local\tasks\installer.tasks\installer.tasks.csproj" (target "_CheckForInvalidConfigurationAndPlatform" depends on it):
                   Using "AllowEmptyTelemetry" task from assembly "C:\Program Files (Arm)\dotnet\sdk\5.0.100-preview.6.20310.4\Sdks\Microsoft.NET.Sdk\targets\..\tools\net5.0/Microsoft.NET.Build.Tasks.dll".
                   Task "AllowEmptyTelemetry" (TaskId:2)
                     Task Parameter:EventName=targetframeworkeval (TaskId:2)
                     Task Parameter:EventData=TargetFrameworkVersion=.NETFramework,Version=v4.6;RuntimeIdentifier=;SelfContained=;UseApphost=;OutputType=Library (TaskId:2)

@pgovind
Author

pgovind commented Jul 8, 2020

Adding something I found here: I updated my Surface X, created a clean WSL2 Ubuntu environment, and ran the build, which almost completed (I killed it to check something). When I tried running it again, it got stuck again after the installer.tasks step. @safern suggested killing all the dotnet processes and retrying the build, and voila, it succeeded. Interestingly, the same steps did not work on Windows (my Windows build is still stuck after the installer.tasks step). Perhaps it's related to locks or the number of build processes running at the same time?
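
For anyone retrying this, here is a minimal sketch of the "kill all dotnet processes, then rebuild" step. It is written in Python so the same script can run on Windows and in WSL2. `taskkill` and `pkill` are the standard OS tools; the helper names are hypothetical and nothing here is part of the repo's build scripts.

```python
import subprocess
import sys
from typing import List


def kill_dotnet_command(platform: str) -> List[str]:
    """Return the OS command that force-kills lingering dotnet processes."""
    if platform.startswith("win"):
        # /F forces termination, /IM matches by image name.
        return ["taskkill", "/F", "/IM", "dotnet.exe"]
    # pkill -x matches the exact process name on Linux/WSL2.
    return ["pkill", "-x", "dotnet"]


def kill_dotnet_processes() -> int:
    """Run the kill command and return its exit code.

    A non-zero exit code usually just means no dotnet process was running.
    """
    return subprocess.run(kill_dotnet_command(sys.platform), check=False).returncode


if __name__ == "__main__":
    kill_dotnet_processes()
```

Run it before re-invoking the build so that no worker nodes from the previous attempt are left around.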

@ericstj
Member

ericstj commented Jul 8, 2020

There's probably nothing specific about that project, it just happens to be the first CSProj that gets built.

I haven't seen anyone here share the dump file that @rainersigwald suggested. To get that, once you have a hang, go into Task Manager, right-click the process, and choose "Create Dump File". You should grab all the dotnet msbuild processes. You can also try passing -m:1 (which tells MSBuild to use only one process) to see if it works around this; however, it will slow down the build significantly.
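
A rough sketch of the single-process variant, assuming the installer.tasks project path and the `/p:TargetArchitecture=arm64` property from the repro commands earlier in this thread. `-m:1` is MSBuild's `-maxCpuCount` switch; the helper name is hypothetical.

```python
import subprocess
from typing import List

# Project path taken from the repro command earlier in this thread.
INSTALLER_TASKS = r"tools-local\tasks\installer.tasks\installer.tasks.csproj"


def single_node_build_command(project: str = INSTALLER_TASKS) -> List[str]:
    """Build the dotnet CLI invocation with MSBuild limited to one node."""
    return [
        "dotnet", "build", project,
        "/p:TargetArchitecture=arm64",  # property used in the repro commands
        "/v:diag",                      # diagnostic verbosity for the text log
        "-m:1",                         # -maxCpuCount:1, a single build node
    ]


if __name__ == "__main__":
    # Run from the root of a dotnet/runtime clone.
    subprocess.run(single_node_build_command(), check=False)
```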

The task that's hanging calls into MSBuild to send telemetry:
https://github.com/dotnet/sdk/blob/d8ef7ef571f035fa10f8e13935aa6afe54316579/src/Tasks/Microsoft.NET.Build.Tasks/AllowEmptyTelemetry.cs#L47
You can try no-oping the target that calls that task to see if the build proceeds further. You can do that by adding <Target Name="_CollectTargetFrameworkForTelemetry" /> to

@carlossanlop
Member

@ericstj @rainersigwald I collected the dump files and sent you an email with their location.

@carlossanlop
Member

carlossanlop commented Jul 9, 2020

@ericstj opting out of telemetry as you suggested seemed to do the trick: the build is not stopping anymore. That's what was causing installer.tasks to hang. I hope the dump files help figure out the root cause.

Edit: It seems the build is now failing for a bunch of other reasons, which I will investigate. They are probably unrelated to the installer.tasks issue.

@eiriktsarpalis
Member

FWIW a quick workaround to the issue is to make a local edit on this line, replacing the default $architecture parameter with 'x86'. Then just delete the .dotnet folder and re-run the build.

@carlossanlop
Member

Follow-up to my last comment: After adding the line suggested by @ericstj to bypass the installer.tasks step, and editing the line suggested by @eiriktsarpalis to force the usage of x86, I built using this command: .\build.cmd clr -c debug -arch arm64, which failed with an error in crossgen-corelib. Since it's unrelated to the installer.tasks problem reported here, I will open a separate issue to track that.

@ladipro
Member

ladipro commented Jul 22, 2020

The root cause of the hang is #39701

It's rearing its ugly head at this callstack:

        Child SP               IP Call Site
000000DB4757EB40 00007ff9e6eb39ec [PrestubMethodFrame: 000000db4757eb40] System.Runtime.CompilerServices.DependentHandle.nSetPrimary(IntPtr, System.Object)
000000DB4757ED40 00007ff9aea62124 System.Runtime.CompilerServices.ConditionalWeakTable`2+Container[[System.__Canon, System.Private.CoreLib],[System.__Canon, System.Private.CoreLib]].Remove(System.__Canon) [/_/src/libraries/System.Private.CoreLib/src/System/Runtime/CompilerServices/ConditionalWeakTable.cs @ 577]
000000DB4757ED70 00007ff9aea60f18 System.Runtime.CompilerServices.ConditionalWeakTable`2[[System.__Canon, System.Private.CoreLib],[System.__Canon, System.Private.CoreLib]].Remove(System.__Canon)
000000DB4757EDB0 00007ff9aeb8b0ac System.Collections.Generic.Dictionary`2[[System.__Canon, System.Private.CoreLib],[System.__Canon, System.Private.CoreLib]].OnDeserialization(System.Object) [/_/src/libraries/System.Private.CoreLib/src/System/Collections/Generic/Dictionary.cs @ 687]
000000DB4757EE00 00007ff9b2b99c18 System.Runtime.Serialization.ObjectManager.RaiseDeserializationEvent() [/_/src/libraries/System.Runtime.Serialization.Formatters/src/System/Runtime/Serialization/ObjectManager.cs @ 971]
000000DB4757EE20 00007ff9b2ba9908 System.Runtime.Serialization.Formatters.Binary.ObjectReader.Deserialize(System.Runtime.Serialization.Formatters.Binary.BinaryParser, Boolean) [/_/src/libraries/System.Runtime.Serialization.Formatters/src/System/Runtime/Serialization/Formatters/Binary/BinaryObjectReader.cs @ 128]
000000DB4757EE70 00007ff9b2ba31ac System.Runtime.Serialization.Formatters.Binary.BinaryFormatter.Deserialize(System.IO.Stream, Boolean) [/_/src/libraries/System.Runtime.Serialization.Formatters/src/System/Runtime/Serialization/Formatters/Binary/BinaryFormatter.cs @ 69]
000000DB4757EEB0 00007ff9b2ba2ffc System.Runtime.Serialization.Formatters.Binary.BinaryFormatter.Deserialize(System.IO.Stream) [/_/src/libraries/System.Runtime.Serialization.Formatters/src/System/Runtime/Serialization/Formatters/Binary/BinaryFormatter.cs @ 41]
000000DB4757EEC0 00007ff955bcd60c Microsoft.Build.BackEnd.BinaryTranslator+BinaryReadTranslator.TranslateDotNet[[System.__Canon, System.Private.CoreLib]](System.__Canon ByRef) [C:\src\msbuild\src\Shared\BinaryTranslator.cs @ 416]
000000DB4757EF20 00007ff955bcab70 Microsoft.Build.Shared.LogMessagePacketBase.ReadFromStream(Microsoft.Build.BackEnd.ITranslator) [C:\src\msbuild\src\Shared\LogMessagePacketBase.cs @ 400]
000000DB4757F000 00007ff955bc96ac Microsoft.Build.Shared.LogMessagePacketBase.Translate(Microsoft.Build.BackEnd.ITranslator) [C:\src\msbuild\src\Shared\LogMessagePacketBase.cs @ 261]
000000DB4757F030 00007ff955bc95e0 Microsoft.Build.Shared.LogMessagePacketBase..ctor(Microsoft.Build.BackEnd.ITranslator) [C:\src\msbuild\src\Shared\LogMessagePacketBase.cs @ 192]
000000DB4757F050 00007ff955bc957c Microsoft.Build.BackEnd.LogMessagePacket..ctor(Microsoft.Build.BackEnd.ITranslator) [C:\src\msbuild\src\Build\BackEnd\Components\Communications\LogMessagePacket.cs @ 36]
000000DB4757F070 00007ff955bc9538 Microsoft.Build.BackEnd.LogMessagePacket.FactoryForDeserialization(Microsoft.Build.BackEnd.ITranslator) [C:\src\msbuild\src\Build\BackEnd\Components\Communications\LogMessagePacket.cs @ 45]
000000DB4757F090 00007ff955bc94d0 Microsoft.Build.BackEnd.NodePacketFactory+PacketFactoryRecord.DeserializeAndRoutePacket(Int32, Microsoft.Build.BackEnd.ITranslator) [C:\src\msbuild\src\Shared\NodePacketFactory.cs @ 104]
000000DB4757F0D0 00007ff955bc93c0 Microsoft.Build.BackEnd.NodePacketFactory.DeserializeAndRoutePacket(Int32, Microsoft.Build.BackEnd.NodePacketType, Microsoft.Build.BackEnd.ITranslator) [C:\src\msbuild\src\Shared\NodePacketFactory.cs @ 60]
000000DB4757F110 00007ff955bc92ac Microsoft.Build.BackEnd.NodeManager.DeserializeAndRoutePacket(Int32, Microsoft.Build.BackEnd.NodePacketType, Microsoft.Build.BackEnd.ITranslator) [C:\src\msbuild\src\Build\BackEnd\Components\Communications\NodeManager.cs @ 277]
000000DB4757F140 00007ff955bc7d68 Microsoft.Build.BackEnd.NodeProviderOutOfProcBase+NodeContext.ReadAndRoutePacket(Microsoft.Build.BackEnd.NodePacketType, Byte[], Int32) [C:\src\msbuild\src\Build\BackEnd\Components\Communications\NodeProviderOutOfProcBase.cs @ 928]
000000DB4757F1D0 00007ff955b8b3d8 Microsoft.Build.BackEnd.NodeProviderOutOfProcBase+NodeContext+d__12.MoveNext() [C:\src\msbuild\src\Build\BackEnd\Components\Communications\NodeProviderOutOfProcBase.cs @ 719]

The exception goes unhandled and the out-of-proc packet processing loop breaks.

Until the underlying FCall issue is fixed, please try setting the MSBuildUseLegacyStringInterner environment variable to 1. This disables the code that calls GCHandle.set_Target, which should work around the collision.
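
A minimal sketch of applying that workaround to a single build invocation, assuming the `build.cmd clr+libs -a arm64 -c release` command line used earlier in this thread. The helper name is hypothetical; only the `MSBuildUseLegacyStringInterner` variable comes from the comment above.

```python
import os
import subprocess
from typing import Dict, Mapping


def legacy_interner_env(base: Mapping[str, str]) -> Dict[str, str]:
    """Return a copy of the environment with the legacy string interner enabled."""
    env = dict(base)
    env["MSBuildUseLegacyStringInterner"] = "1"
    return env


if __name__ == "__main__":
    # cmd /c is used because build.cmd is a batch script, not an executable.
    # Run from the root of a dotnet/runtime clone on Windows.
    subprocess.run(
        ["cmd", "/c", "build.cmd", "clr+libs", "-a", "arm64", "-c", "release"],
        env=legacy_interner_env(os.environ),
        check=False,
    )
```

Processes read their environment at startup, so it is probably safest to stop any dotnet processes left over from an earlier attempt before launching the build this way.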

@adamsitnik
Member

I've confirmed that the incoming SDK version update from @ViktorHofer (#41947) solves this particular problem.

@ghost ghost locked as resolved and limited conversation to collaborators Dec 8, 2020