Internal CLR error (0x80131506) when running IBC collection #90962

Closed
DrewScoggins opened this issue Aug 22, 2023 · 16 comments

@DrewScoggins
Member

Starting between 8.0.100-rc.1.23404.2 and 8.0.100-rc.1.23415.5 the dotnet-optimization runs started failing with the below stack trace.

To repro this you will need to clone the dotnet-optimization repo, and then run

.\build.cmd -configuration Release -build
.\train.cmd -jobs CLRx64WINmasIBC -output C:\git\dotnet-optimization\artifacts\out

@davidwrighton @jeffschwMSFT

Fatal error. Internal CLR error. (0x80131506)
   at System.Runtime.Serialization.SerializationGuard.<ThrowIfDeserializationInProgress>g__ThrowIfDeserializationInProgress|0_0(System.Runtime.Serialization.SerializationInfo, System.String, Int32 ByRef)
   at System.Runtime.Serialization.SerializationGuard.ThrowIfDeserializationInProgress(System.String, Int32 ByRef)
   at System.Diagnostics.Process.Start()
   at Microsoft.DotNet.Cli.Utils.ProcessStartInfoExtensions.ExecuteAndCaptureOutput(System.Diagnostics.ProcessStartInfo, System.String ByRef, System.String ByRef)
   at Microsoft.DotNet.Cli.Telemetry.MacAddressGetter.GetShellOutMacAddressOutput()
   at Microsoft.DotNet.Cli.Telemetry.MacAddressGetter.GetMacAddress()
   at Microsoft.DotNet.Cli.Telemetry.TelemetryCommonProperties.GetMachineId()
   at Microsoft.DotNet.Configurer.UserLevelCacheWriter.RunWithCacheInFilePath(System.String, System.Func`1<System.String>)
   at Microsoft.DotNet.Configurer.UserLevelCacheWriter.RunWithCache(System.String, System.Func`1<System.String>)
   at Microsoft.DotNet.Cli.Telemetry.TelemetryCommonProperties.GetTelemetryCommonProperties()
   at Microsoft.DotNet.Cli.Telemetry.Telemetry.InitializeTelemetry()
   at Microsoft.DotNet.Cli.Telemetry.Telemetry.<.ctor>b__13_0()
   at System.Threading.Tasks.Task.InnerInvoke()
   at System.Threading.Tasks.Task+<>c.<.cctor>b__281_0(System.Object)
   at System.Threading.ExecutionContext.RunFromThreadPoolDispatchLoop(System.Threading.Thread, System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object)
   at System.Threading.Tasks.Task.ExecuteWithThreadLocal(System.Threading.Tasks.Task ByRef, System.Threading.Thread)
   at System.Threading.Tasks.Task.ExecuteEntryUnsafe(System.Threading.Thread)
   at System.Threading.Tasks.Task.ExecuteFromThreadPool(System.Threading.Thread)
   at System.Threading.ThreadPoolWorkQueue.DispatchWorkItem(System.Object, System.Threading.Thread)
   at System.Threading.ThreadPoolWorkQueue.Dispatch()
   at System.Threading.PortableThreadPool+WorkerThread.WorkerDoWork(System.Threading.PortableThreadPool, Boolean ByRef)
   at System.Threading.PortableThreadPool+WorkerThread.WorkerThreadStart()
   at System.Threading.Thread+StartHelper.RunWorker()
   at System.Threading.Thread+StartHelper.Run()
   at System.Threading.Thread.StartCallback()
[12:05:08] Failure during training scenario 'DotNet_FirstTimeXP'.  Exception details: System.Exception: C:\git\dotnet-optimization\artifacts\out\sdk\x64\dotnet.exe new console exited with code -1073741819
   at Microsoft.DotNet.Optimization.Utilities.Execute(ProcessStartInfo startInfo, ICollection`1 allowedExitCodes) in C:\git\dotnet-optimization\src\core\Utilities.cs:line 36
   at Microsoft.DotNet.Optimization.NetCoreApp.DotNet_FirstTimeXP.Execute(Product product, OptimizationTool optTool, String destinationDirectory) in C:\git\dotnet-optimization\src\scenarios\netcoreapp\DotNet_FirstTimeXP.cs:line 44
   at Microsoft.DotNet.Optimization.AnyOS_IBC.RunScenario(Product product, Scenario scenario, String destDirectory) in C:\git\dotnet-optimization\src\optimizationtools\AnyOS_IBC.cs:line 102
   at Microsoft.DotNet.Optimization.TrainingJob.GenerateTrainingData(String layoutDirectory) in C:\git\dotnet-optimization\src\core\TrainingJob.cs:line 146
Unhandled exception. System.Exception: C:\git\dotnet-optimization\artifacts\out\sdk\x64\dotnet.exe new console exited with code -1073741819
   at Microsoft.DotNet.Optimization.Utilities.Execute(ProcessStartInfo startInfo, ICollection`1 allowedExitCodes) in C:\git\dotnet-optimization\src\core\Utilities.cs:line 36
   at Microsoft.DotNet.Optimization.NetCoreApp.DotNet_FirstTimeXP.Execute(Product product, OptimizationTool optTool, String destinationDirectory) in C:\git\dotnet-optimization\src\scenarios\netcoreapp\DotNet_FirstTimeXP.cs:line 44
   at Microsoft.DotNet.Optimization.AnyOS_IBC.RunScenario(Product product, Scenario scenario, String destDirectory) in C:\git\dotnet-optimization\src\optimizationtools\AnyOS_IBC.cs:line 102
   at Microsoft.DotNet.Optimization.TrainingJob.GenerateTrainingData(String layoutDirectory) in C:\git\dotnet-optimization\src\core\TrainingJob.cs:line 146
   at Microsoft.DotNet.Optimization.TrainingJob.Execute() in C:\git\dotnet-optimization\src\core\TrainingJob.cs:line 46
   at Microsoft.DotNet.Optimization.AutomatedOptimizationJob.processJob(Job job, Action`1 onSuccess) in C:\git\dotnet-optimization\src\jobs\AutomatedOptimizationJob.cs:line 21
   at Microsoft.DotNet.Optimization.AutomatedOptimizationJob.runJobs() in C:\git\dotnet-optimization\src\jobs\AutomatedOptimizationJob.cs:line 181
   at Microsoft.DotNet.Optimization.AutomatedOptimizationJob.Execute() in C:\git\dotnet-optimization\src\jobs\AutomatedOptimizationJob.cs:line 32
   at Microsoft.DotNet.Optimization.Program.Main(String[] args) in C:\git\dotnet-optimization\src\Program.cs:line 28
@dotnet-issue-labeler dotnet-issue-labeler bot added the needs-area-label An area label is needed to ensure this gets routed to the appropriate area owners label Aug 22, 2023
@ghost ghost added the untriaged New issue has not been triaged by the area owner label Aug 22, 2023
@DrewScoggins
Member Author

I didn't notice in the stack that we are doing telemetry when this crash happens. Going to add some people from the SDK/installer side. Please feel free to add the right people. @marcpopMSFT @elinor-fung. Also adding @mangod9.

@marcpopMSFT
Member

The ExecuteAndCaptureOutput method hasn't changed on the SDK side. We changed the caller of that method slightly to pass in the full path to the application rather than relying on the path. That's the only thing I can think of that changed recently in those code paths.

https://github.com/dotnet/sdk/pull/34515/files#diff-e57aacb201a371dc4949ecd087b4ad58702db1155420f3a1d28d9368b823b8d7

CC @Forgind

@DrewScoggins
Member Author

Yeah, I am not convinced that this is from the SDK side, but I wanted to make sure. Given that we only see this error when we have the IBC collection infrastructure turned on, it seemed unlikely.

@EgorBo
Member

EgorBo commented Aug 22, 2023

only IBC is failing? I mean MIBC jobs are passing?

@mangod9
Member

mangod9 commented Aug 22, 2023

Is there a dump or repro available? I assume it's failing consistently?

@Forgind
Member

Forgind commented Aug 22, 2023

As I understand it, that's an access violation:
The instruction at 0x%08lx referenced memory at 0x%08lx. The memory could not be %s.

I don't think that would be the result if the file just didn't exist, but I may be wrong.
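
For reference, the exit code in the log (-1073741819) is the signed 32-bit form of an NTSTATUS value. The sketch below reinterprets it as unsigned hex; the helper name `to_ntstatus` is made up for illustration and is not part of any tool in this thread.

```python
def to_ntstatus(exit_code: int) -> str:
    """Reinterpret a signed 32-bit process exit code as unsigned NTSTATUS hex."""
    # Masking with 0xFFFFFFFF maps a negative value onto its two's-complement
    # unsigned 32-bit equivalent.
    return f"0x{exit_code & 0xFFFFFFFF:08X}"


# Exit code reported for 'dotnet.exe new console' in the log above.
print(to_ntstatus(-1073741819))  # 0xC0000005, i.e. STATUS_ACCESS_VIOLATION
```

This matches the reading of the failure as an access violation rather than a missing file.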

@DrewScoggins
Member Author

only IBC is failing? I mean MIBC jobs are passing?

https://dev.azure.com/dnceng/internal/_build/results?buildId=2249779&view=logs&j=c2dee064-8ea3-54b9-cdc0-a020102b67c9&t=8b7fd894-4ebd-5855-62d2-e0ff2885cf11

This is what is failing. I thought this was technically MIBC.

@DrewScoggins
Member Author

I mentioned in the issue how to repro this. I was able to reproduce it quite easily on my machine.

@EgorBo
Member

EgorBo commented Aug 23, 2023

@marcpopMSFT how do we make dotnet repeat the "first time xp" on every launch? Do we need to delete some file in the SDK or change a config?

@EgorBo
Member

EgorBo commented Aug 23, 2023

From a quick glance it doesn't look TieredPGO related - it still reproduces if I disable TieredPGO and other flags.
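
A minimal sketch of that kind of check, assuming the `DOTNET_TieredPGO=0` environment variable is used to turn TieredPGO off. The "other flags" are not named in this thread, so none are guessed here, and both helper names are hypothetical.

```python
import os
import subprocess


def env_without_tiered_pgo(base=None):
    """Return a copy of the environment with TieredPGO disabled."""
    env = dict(os.environ if base is None else base)
    # Assumption: DOTNET_TieredPGO=0 is the runtime knob that disables TieredPGO.
    env["DOTNET_TieredPGO"] = "0"
    return env


def run_with_tiered_pgo_disabled(command):
    """Run a command with TieredPGO disabled and return its exit code."""
    return subprocess.run(command, env=env_without_tiered_pgo()).returncode
```

For example, `run_with_tiered_pgo_disabled(["dotnet", "new", "console"])` would rerun the failing scenario command; if the exit code is unchanged, TieredPGO is unlikely to be the cause.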

@jkotas
Member

jkotas commented Aug 23, 2023

at System.Runtime.Serialization.SerializationGuard.<ThrowIfDeserializationInProgress>g__ThrowIfDeserializationInProgress|0_0(System.Runtime.Serialization.SerializationInfo, System.String, Int32 ByRef)
at System.Runtime.Serialization.SerializationGuard.ThrowIfDeserializationInProgress(System.String, Int32 ByRef)

This is likely the problem fixed by #90826. The fix was checked in 4 days ago. Has the runtime with the fix been ingested into dotnet-optimization?

@EgorBo
Member

EgorBo commented Aug 23, 2023

at System.Runtime.Serialization.SerializationGuard.<ThrowIfDeserializationInProgress>g__ThrowIfDeserializationInProgress|0_0(System.Runtime.Serialization.SerializationInfo, System.String, Int32 ByRef)
at System.Runtime.Serialization.SerializationGuard.ThrowIfDeserializationInProgress(System.String, Int32 ByRef)

This is likely the problem fixed by #90826. The fix was checked in 4 days ago. Has the runtime with the fix been ingested into dotnet-optimization?

dotnet-optimization tried to update to the latest dotnet/installer (yesterday's) that uses dotnet/runtime from this commit: 90b92bb

So looks like it doesn't yet include the fix.

@DrewScoggins
Member Author

OK. I am going to go ahead and run the LKG runtime through the opt data collection pipeline, and then we will just wait for this change to flow through to installer. I didn't see any PRs in the SDK repo to take an updated version of runtime, but I assume this will land eventually.

@marcpopMSFT
Member

@marcpopMSFT how do we make dotnet repeat the "first time xp" on every launch? Do we need to delete some file in the SDK or change a config?

We check for sentinel files in the users\<username>\.dotnet folder (on Windows, and a similar location on other platforms).
https://github.com/dotnet/sdk/blob/main/src/Cli/Microsoft.DotNet.Configurer/DotnetFirstTimeUseConfigurer.cs

There are three sentinels: one for first use, one for the tool path, and one for the ASP.NET certificate, and they are per version of the SDK. If you don't care about being specific (which I would guess you might not), you can probably just delete *sentinel in that folder.
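
A rough sketch of that cleanup, assuming the sentinels sit directly in the `.dotnet` folder under the user's home directory and that their names end in `sentinel`, as described above. The function name and the `dry_run` switch are hypothetical.

```python
from pathlib import Path


def clear_first_run_sentinels(dotnet_dir: Path, dry_run: bool = True) -> list:
    """List (and optionally delete) files matching *sentinel in dotnet_dir."""
    # Only direct children of the folder are considered; subfolders are left alone.
    matches = sorted(p for p in dotnet_dir.glob("*sentinel") if p.is_file())
    if not dry_run:
        for path in matches:
            path.unlink()
    return matches
```

Calling `clear_first_run_sentinels(Path.home() / ".dotnet")` only reports what would be removed; pass `dry_run=False` to actually delete the files so the next `dotnet` launch goes through the first-run path again.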

@vcsjones vcsjones removed the needs-area-label An area label is needed to ensure this gets routed to the appropriate area owners label Aug 28, 2023
@mangod9 mangod9 removed the untriaged New issue has not been triaged by the area owner label Aug 31, 2023
@mangod9 mangod9 added this to the 9.0.0 milestone Aug 31, 2023
@jkotas
Member

jkotas commented Aug 31, 2023

This is fixed: #89311 (comment)

@jkotas jkotas closed this as completed Aug 31, 2023
@EgorBo
Member

EgorBo commented Sep 12, 2023

@ghost ghost locked as resolved and limited conversation to collaborators Oct 12, 2023