Bundle_Is_Extracted test failing intermittently on Linux #43316

Closed
runfoapp bot opened this issue Oct 12, 2020 · 12 comments · Fixed by #45523
Labels: area-Single-File, blocking-clean-ci (Blocking PR or rolling runs of 'runtime' or 'runtime-extra-platforms')

Comments

@runfoapp

runfoapp bot commented Oct 12, 2020

Runfo Tracking Issue: Bundle_Is_Extracted test failing intermittently on Linux

| Build | Definition | Kind | Run Name |
|---|---|---|---|
| 1306544 | runtime | PR 57707 | Installer-coreclr-Linux_x64-Release |
| 1306544 | runtime | PR 57707 | Installer-coreclr-Linux_musl_x64-Release |
| 1303326 | runtime | PR 57613 | Installer-coreclr-Linux_musl_x64-Release |
| 1299064 | runtime | PR 56501 | Installer-coreclr-Linux_musl_x64-Release |
| 1298012 | runtime | PR 57274 | Installer-coreclr-Linux_x64-Release |
| 1295343 | runtime | Rolling | Installer-coreclr-Linux_musl_x64-Release |
| 1295250 | runtime | PR 57471 | Installer-coreclr-Linux_x64-Release |
| 1292506 | runtime | PR 57357 | Installer-coreclr-Linux_x64-Release |
| 1289058 | runtime | PR 57155 | Installer-coreclr-Linux_x64-Release |
| 1278169 | runtime | PR 56907 | Installer-coreclr-Linux_musl_x64-Release |
| 1276600 | runtime | PR 56859 | Installer-coreclr-Linux_x64-Release |
| 1272901 | runtime | Rolling | Installer-coreclr-Linux_musl_x64-Release |
| 1271025 | runtime | PR 56714 | Installer-coreclr-Linux_musl_x64-Release |
| 1270207 | runtime | PR 56669 | Installer-coreclr-Linux_x64-Release |
| 1270207 | runtime | PR 56669 | Installer-coreclr-Linux_musl_x64-Release |
| 1270083 | runtime | PR 56669 | Installer-coreclr-Linux_x64-Release |
| 1270083 | runtime | PR 56669 | Installer-coreclr-Linux_musl_x64-Release |
| 1268562 | runtime | PR 56316 | Installer-coreclr-Linux_x64-Release |
| 1268095 | runtime | PR 56455 | Installer-coreclr-Linux_musl_x64-Release |
| 1265180 | runtime | PR 56324 | Installer-coreclr-Linux_x64-Release |
| 1260507 | runtime | PR 56363 | Installer-coreclr-Linux_x64-Release |
| 1260448 | runtime | PR 56360 | Installer-coreclr-Linux_x64-Release |

Build Result Summary

| Day Hit Count | Week Hit Count | Month Hit Count |
|---|---|---|
| 0 | 4 | 19 |

@Dotnet-GitSync-Bot Dotnet-GitSync-Bot added area-Tracing-coreclr untriaged New issue has not been triaged by the area owner labels Oct 12, 2020
@jkoritzinsky jkoritzinsky added area-Single-File blocking-clean-ci Blocking PR or rolling runs of 'runtime' or 'runtime-extra-platforms' and removed area-Tracing-coreclr untriaged New issue has not been triaged by the area owner labels Oct 12, 2020
@ghost

ghost commented Oct 12, 2020

Tagging subscribers to this area: @agocke, @vitek-karas
See info in area-owners.md if you want to be subscribed.

@stephentoub
Member

Failed again here:
https://dev.azure.com/dnceng/public/_build/results?buildId=888977&view=ms.vss-test-web.build-test-results-tab&runId=28428970&resultId=100011&paneView=debug


Error message
System.ComponentModel.Win32Exception : Text file busy


Stack trace
   at System.Diagnostics.Process.ForkAndExecProcess(String filename, String[] argv, String[] envp, String cwd, Boolean redirectStdin, Boolean redirectStdout, Boolean redirectStderr, Boolean setCredentials, UInt32 userId, UInt32 groupId, UInt32[] groups, Int32& stdinFd, Int32& stdoutFd, Int32& stderrFd, Boolean usesTerminal, Boolean throwOnNoExec)
   at System.Diagnostics.Process.StartCore(ProcessStartInfo startInfo)
   at System.Diagnostics.Process.Start()
   at Microsoft.DotNet.Cli.Build.Framework.Command.Start() in /_/src/installer/tests/TestUtils/Command.cs:line 199
   at Microsoft.DotNet.Cli.Build.Framework.Command.Execute(Boolean fExpectedToFail) in /_/src/installer/tests/TestUtils/Command.cs:line 239
   at Microsoft.DotNet.Cli.Build.Framework.Command.Execute() in /_/src/installer/tests/TestUtils/Command.cs:line 171
   at AppHost.Bundle.Tests.NetCoreApp3CompatModeTests.Bundle_Is_Extracted() in /_/src/installer/tests/Microsoft.NET.HostModel.Tests/AppHost.Bundle.Tests/NetCoreApp3CompatModeTests.cs:line 33

@agocke
Member

agocke commented Nov 17, 2020

I'll take a look at this

@stephentoub
Member

Thanks, Andy

@agocke agocke self-assigned this Nov 30, 2020
agocke added a commit to agocke/runtime that referenced this issue Dec 3, 2020
Tests in the AppHost.Bundle.Tests assembly seem to randomly fail due to a race condition
with the file system. They try to create separate '0','1','2'... subdirectories to isolate
the published files for each test, but I think what's happening is that files may be
marked for deletion, but then not deleted until a later write. For instance, files in
'2' may be marked for deletion and some may fail a File.Exists check, which leads to
'2' being recreated, at which point deletion may occur, which will cause the current test
to fail due to a concurrent write operation.

This change tries to simplify the system by sharing the test state across all the classes
in the assembly, instead of per-class, and then cleaning up only when all of them are
finished executing.

Fixes dotnet#43316
agocke added a commit that referenced this issue Dec 8, 2020
Tests in the AppHost.Bundle.Tests assembly seem to randomly fail due to a race condition
with the file system. They try to create separate '0','1','2'... subdirectories to isolate
the published files for each test, but I think what's happening is that files may be
marked for deletion, but then not deleted until a later write. For instance, files in
'2' may be marked for deletion and some may fail a File.Exists check, which leads to
'2' being recreated, at which point deletion may occur, which will cause the current test
to fail due to a concurrent write operation.

This change tries to avoid locking & contention by randomly generating folder names and
using a (hopefully atomically created) lock file to indicate ownership of a particular name.

Fixes #43316
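
The approach described in this commit message (random folder names plus an exclusively created lock file) can be sketched roughly as follows. This is a language-agnostic illustration in Python, not the code from the actual change, which lives in the C# test utilities; the function names `claim_unique_dir` and `release_dir`, the `.lock` suffix, and the directory layout are all hypothetical.

```python
import os
import secrets
import shutil


def claim_unique_dir(base_dir, max_attempts=100):
    """Claim a randomly named subdirectory of base_dir via a lock file.

    Returns (dir_path, lock_path). Names and layout are hypothetical.
    """
    os.makedirs(base_dir, exist_ok=True)
    for _ in range(max_attempts):
        name = secrets.token_hex(4)  # random 8-character hex name
        lock_path = os.path.join(base_dir, name + ".lock")
        try:
            # O_CREAT | O_EXCL makes the call fail if the file already
            # exists, so only one caller can create a given lock file.
            fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
        except FileExistsError:
            continue  # Another owner holds this name; pick a new one.
        os.close(fd)
        dir_path = os.path.join(base_dir, name)
        os.mkdir(dir_path)
        return dir_path, lock_path
    raise RuntimeError("could not claim a unique directory")


def release_dir(dir_path, lock_path):
    """Delete the claimed directory first, then give up the lock."""
    shutil.rmtree(dir_path, ignore_errors=True)
    os.remove(lock_path)
```

Because each test owns a name nobody else can claim while the lock file exists, a delayed deletion of one test's directory should not collide with another test recreating the same path.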
@ViktorHofer
Member

ViktorHofer commented Dec 10, 2020

@hoyosjs
Member

hoyosjs commented Mar 15, 2021

Closing this until this happens again. FYI @agocke

@hoyosjs hoyosjs closed this as completed Mar 15, 2021
@ghost ghost locked as resolved and limited conversation to collaborators May 12, 2021
@radical
Member

radical commented May 28, 2021

Failed again on https://github.com/dotnet/runtime/pull/53280/checks?check_run_id=2690948600, build: https://dev.azure.com/dnceng/public/_build/results?buildId=1160943&view=logs&jobId=fcbd3ad6-b1a4-55f2-40cc-7db598750d43&j=fcbd3ad6-b1a4-55f2-40cc-7db598750d43&t=0fc7de3b-e85e-5109-ff09-70d9e625d9be

System.ComponentModel.Win32Exception : Text file busy

Stack trace
   at System.Diagnostics.Process.ForkAndExecProcess(String filename, String[] argv, String[] envp, String cwd, Boolean redirectStdin, Boolean redirectStdout, Boolean redirectStderr, Boolean setCredentials, UInt32 userId, UInt32 groupId, UInt32[] groups, Int32& stdinFd, Int32& stdoutFd, Int32& stderrFd, Boolean usesTerminal, Boolean throwOnNoExec)
   at System.Diagnostics.Process.StartCore(ProcessStartInfo startInfo)
   at System.Diagnostics.Process.Start()
   at Microsoft.DotNet.Cli.Build.Framework.Command.Start() in /_/src/installer/tests/TestUtils/Command.cs:line 199
   at Microsoft.DotNet.Cli.Build.Framework.Command.Execute(Boolean fExpectedToFail) in /_/src/installer/tests/TestUtils/Command.cs:line 239
   at Microsoft.DotNet.Cli.Build.Framework.Command.Execute() in /_/src/installer/tests/TestUtils/Command.cs:line 171
   at AppHost.Bundle.Tests.NetCoreApp3CompatModeTests.Bundle_Is_Extracted() in /_/src/installer/tests/Microsoft.NET.HostModel.Tests/AppHost.Bundle.Tests/NetCoreApp3CompatModeTests.cs:line 33

@radical radical reopened this May 28, 2021
@ViktorHofer
Member

Still failing every day in CI but likely with different causes?

System.AggregateException : One or more errors occurred. (Command failed with exit code 1: /root/runtime/.dotnet/dotnet publish /bl:PublishProject.binlog --no-restore --runtime linux-musl-x64 --framework net6.0 /p:NetCoreAppCurrent=net6.0 -o /root/runtime/artifacts/tests/Release/ahb/smlum2yw.yee/SingleFileApiTests/publish /p:TestTargetRid=linux-musl-x64 /p:MNAVersion=7.0.0-ci /p:AddFile=/root/runtime/artifacts/bin/linux-musl-x64.Release/corehost_test/libmockcoreclr.so\nStandard Output:\nMicrosoft (R) Build Engine version 17.0.0-preview-21411-04+6ca861613 for .NET\nCopyright (C) Microsoft Corporation. All rights reserved.\n\n/root/runtime/.dotnet/sdk/6.0.100-rc.1.21411.28/MSBuild.dll -maxcpucount -property:NetCoreAppCurrent=net6.0 -property:TestTargetRid=linux-musl-x64 -property:MNAVersion=7.0.0-ci -property:AddFile=/root/runtime/artifacts/bin/linux-musl-x64.Release/corehost_test/libmockcoreclr.so -property:PublishDir=/root/runtime/artifacts/tests/Release/ahb/smlum2yw.yee/SingleFileApiTests/publish -property:TargetFramework=net6.0 -property:RuntimeIdentifier=linux-musl-x64 -property:_CommandLineDefinedRuntimeIdentifier=true -target:Publish -verbosity:m /bl:PublishProject.binlog ./SingleFileApiTests.csproj\n/root/runtime/.dotnet/sdk/6.0.100-rc.1.21411.28/Current/Microsoft.Common.props(33,3): error MSB4024: The imported project file "/root/runtime/artifacts/tests/Release/ahb/Directory.Build.props" could not be loaded. Root element is missing. 
[/root/runtime/artifacts/tests/Release/ahb/smlum2yw.yee/SingleFileApiTests/SingleFileApiTests.csproj]\n\n) (The following constructor parameters did not have matching fixture data: SingleFileSharedState fixture)\n---- Microsoft.DotNet.Cli.Build.Framework.BuildFailureException : Command failed with exit code 1: /root/runtime/.dotnet/dotnet publish /bl:PublishProject.binlog --no-restore --runtime linux-musl-x64 --framework net6.0 /p:NetCoreAppCurrent=net6.0 -o /root/runtime/artifacts/tests/Release/ahb/smlum2yw.yee/SingleFileApiTests/publish /p:TestTargetRid=linux-musl-x64 /p:MNAVersion=7.0.0-ci /p:AddFile=/root/runtime/artifacts/bin/linux-musl-x64.Release/corehost_test/libmockcoreclr.so\nStandard Output:\nMicrosoft (R) Build Engine version 17.0.0-preview-21411-04+6ca861613 for .NET\nCopyright (C) Microsoft Corporation. All rights reserved.\n\n/root/runtime/.dotnet/sdk/6.0.100-rc.1.21411.28/MSBuild.dll -maxcpucount -property:NetCoreAppCurrent=net6.0 -property:TestTargetRid=linux-musl-x64 -property:MNAVersion=7.0.0-ci -property:AddFile=/root/runtime/artifacts/bin/linux-musl-x64.Release/corehost_test/libmockcoreclr.so -property:PublishDir=/root/runtime/artifacts/tests/Release/ahb/smlum2yw.yee/SingleFileApiTests/publish -property:TargetFramework=net6.0 -property:RuntimeIdentifier=linux-musl-x64 -property:_CommandLineDefinedRuntimeIdentifier=true -target:Publish -verbosity:m /bl:PublishProject.binlog ./SingleFileApiTests.csproj\n/root/runtime/.dotnet/sdk/6.0.100-rc.1.21411.28/Current/Microsoft.Common.props(33,3): error MSB4024: The imported project file "/root/runtime/artifacts/tests/Release/ahb/Directory.Build.props" could not be loaded. Root element is missing. [/root/runtime/artifacts/tests/Release/ahb/smlum2yw.yee/SingleFileApiTests/SingleFileApiTests.csproj]\n\n\n---- The following constructor parameters did not have matching fixture data: SingleFileSharedState fixture
----- Inner Stack Trace #1 (Microsoft.DotNet.Cli.Build.Framework.BuildFailureException) -----
   at Microsoft.DotNet.Cli.Build.Framework.CommandResult.EnsureSuccessful(Boolean suppressOutput) in /_/src/installer/tests/TestUtils/CommandResult.cs:line 46
   at Microsoft.DotNet.CoreSetup.Test.TestProjectFixture.PublishProject(DotNetCli dotnet, String runtime, String framework, Nullable`1 selfContained, String outputDirectory, Boolean singleFile, Boolean restore, String[] extraArgs) in /_/src/installer/tests/TestUtils/TestProjectFixture.cs:line 317
   at AppHost.Bundle.Tests.BundleTestBase.SharedTestStateBase.PreparePublishedSelfContainedTestProject(String projectName, String[] extraArgs) in /_/src/installer/tests/Microsoft.NET.HostModel.Tests/AppHost.Bundle.Tests/BundleTestBase.cs:line 83
   at AppHost.Bundle.Tests.SingleFileSharedState..ctor() in /_/src/installer/tests/Microsoft.NET.HostModel.Tests/AppHost.Bundle.Tests/SingleFileSharedState.cs:line 21
----- Inner Stack Trace #2 (Xunit.Sdk.TestClassException) -----

@agocke can you please take a look?

Example build: https://dev.azure.com/dnceng/public/_build/results?buildId=1303326&view=ms.vss-test-web.build-test-results-tab&runId=38393040&resultId=100001&paneView=debug

@agocke
Member

agocke commented Aug 20, 2021

Yup, I'm actively investigating this. Currently trying to repro the latest version. It still seems like a race condition.

@agocke
Member

agocke commented Aug 23, 2021

Dup of #53587

@agocke agocke closed this as completed Aug 23, 2021
@ViktorHofer
Member

@agocke isn't that a different cause? In the log above (in my post) I don't see any mention of "Text file busy".

@agocke
Member

agocke commented Aug 23, 2021

I think there's some race condition in the publishing, and they might be the same one. ~~From that build in particular, there were dozens of different failures from different tests and I'm not sure that it should even be categorized as the same failure.~~

Nvm, I was looking at the wrong build. For this one, I think it's still a race in publishing the test, since it looks like the file hasn't been completely written. I think the same thing is probably causing this failure as the others. We can reassess if we can't find a similar root cause.
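
For context on the "file hasn't been completely written" symptom: one common way to keep a concurrent reader from seeing a half-written file is to write to a temporary file and then rename it into place. The sketch below is a generic Python illustration of that technique under stated assumptions. It is not the fix adopted for this issue, and the helper name `write_file_atomically` is hypothetical.

```python
import os
import tempfile


def write_file_atomically(path, text):
    """Write text to path so readers never observe a partial file."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temp file in the target directory so the final rename
    # stays on the same filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            tmp.write(text)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the data to disk before renaming
        # os.replace swaps the new file in as a single rename operation.
        os.replace(tmp_path, path)
    except BaseException:
        # Remove the temp file if anything failed before the rename.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

With this pattern, a reader that opens the file (for example a `Directory.Build.props` consumed by a parallel build) sees either the old contents or the complete new contents, not an empty or truncated file.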

7 participants