win-arm64 tests failing with Assert failure: (uImageBase == uImageBaseFromOS) && (memcmp(pFunctionEntry, pFunctionEntryFromOS, sizeof(RUNTIME_FUNCTION)) == 0) #102337

Closed
jakobbotsch opened this issue May 16, 2024 · 8 comments · Fixed by #102350
Labels: area-VM-coreclr; blocking-clean-ci-optional (Blocking optional rolling runs); Known Build Error (Use this to report build issues in the .NET Helix tab)

jakobbotsch commented May 16, 2024

Build Information

Build: https://dev.azure.com/dnceng-public/public/_build/results?buildId=678076&view=ms.vss-test-web.build-test-results-tab
Build error leg or test failing:

Example console log: https://helixre8s23ayyeko0k025g8.blob.core.windows.net/dotnet-runtime-refs-pull-102261-merge-14575ba9d22e4201a6/Common.Tests/1/console.59225bcf.log?helixlogtype=result

C:\h\w\A215090B\w\AA7609C4\e>"C:\h\w\A215090B\p\dotnet.exe" exec --runtimeconfig Common.Tests.runtimeconfig.json --depsfile Common.Tests.deps.json xunit.console.dll Common.Tests.dll -xml testResults.xml -nologo -nocolor -notrait category=IgnoreForCI -notrait category=OuterLoop -notrait category=failing  
  Discovering: Common.Tests (method display = ClassAndMethod, method display options = None)
  Discovered:  Common.Tests (found 247 of 257 test cases)
  Starting:    Common.Tests (parallel test collections = on [2 threads], stop on fail = off)

Assert failure(PID 7932 [0x00001efc], Thread: 8332 [0x208c]): (uImageBase == uImageBaseFromOS) && (memcmp(pFunctionEntry, pFunctionEntryFromOS, sizeof(RUNTIME_FUNCTION)) == 0)

CORECLR! Thread::VirtualUnwindCallFrame + 0xDC (0x00007ff8`8e589f7c)
CORECLR! EECodeManager::EnsureCallerContextIsValid + 0x70 (0x00007ff8`8e4c8f38)
CORECLR! StackFrameIterator::CheckForSkippedFrames + 0x34 (0x00007ff8`8e585dac)
CORECLR! StackFrameIterator::ProcessCurrentFrame + 0x12C (0x00007ff8`8e58928c)
CORECLR! StackFrameIterator::NextRaw + 0x5C4 (0x00007ff8`8e588b1c)
CORECLR! Thread::StackWalkFramesEx + 0x28C (0x00007ff8`8e589d7c)
CORECLR! Thread::StackWalkFrames + 0x130 (0x00007ff8`8e589a48)
CORECLR! ScanStackRoots + 0x290 (0x00007ff8`8e6297f0)
CORECLR! GCToEEInterface::GcScanRoots + 0x1A4 (0x00007ff8`8e628564)
CORECLR! WKS::gc_heap::background_mark_phase + 0xFC (0x00007ff8`8e7d6b54)
    File: D:\a\_work\1\s\src\coreclr\vm\stackwalk.cpp:596
    Image: C:\h\w\A215090B\p\dotnet.exe

Error Message

Fill the error message using step by step known issues guidance.

{
  "ErrorMessage": "(uImageBase == uImageBaseFromOS) && (memcmp(pFunctionEntry, pFunctionEntryFromOS, sizeof(RUNTIME_FUNCTION)) == 0)",
  "BuildRetry": false,
  "ExcludeConsoleLog": false
}

Observed in libraries-jitstress, but I think it's unlikely to be related to jitstress.

Known issue validation

Build: 🔎 https://dev.azure.com/dnceng-public/public/_build/results?buildId=678076
Error message validated: [(uImageBase == uImageBaseFromOS) && (memcmp(pFunctionEntry, pFunctionEntryFromOS, sizeof(RUNTIME_FUNCTION)) == 0)]
Result validation: ✅ Known issue matched with the provided build.
Validation performed at: 5/16/2024 8:32:28 PM UTC

Report

| Build | Definition | Test | Pull Request |
| --- | --- | --- | --- |
| 678333 | dotnet/runtime | System.Formats.Tar.Tests.WorkItemExecution | #102261 |
| 678076 | dotnet/runtime | System.Globalization.Nls.Tests.WorkItemExecution | #102261 |

Summary

| 24-Hour Hit Count | 7-Day Hit Count | 1-Month Count |
| --- | --- | --- |
| 2 | 2 | 2 |
jakobbotsch added the Known Build Error label May 16, 2024
dotnet-policy-service bot added the untriaged label May 16, 2024

Tagging subscribers to this area: @mangod9
See info in area-owners.md if you want to be subscribed.

jakobbotsch added the blocking-clean-ci label May 16, 2024

mangod9 commented May 16, 2024

@janvorli, could this be related to any recent change on win-arm64?

@jakobbotsch

Hmm, I don't see the failures outside jitstress, so perhaps there is some relation after all. I'm going to put this into blocking-clean-ci-optional instead.

jakobbotsch added the blocking-clean-ci-optional label and removed the blocking-clean-ci label May 16, 2024

mangod9 commented May 16, 2024

is it a consistent failure under jitstress?

@janvorli

I wonder if it could have stemmed from my most recent Windows ARM64 unwinding fix. I'll take a look.

janvorli self-assigned this May 16, 2024
@jakobbotsch

> is it a consistent failure under jitstress?

Yes, seems like it (at least widespread). Here is a run from main: https://dev.azure.com/dnceng-public/public/_build/results?buildId=678186&view=ms.vss-test-web.build-test-results-tab

@janvorli

I am investigating it now.

@janvorli

I think I understand what's wrong. My change now moves the PC back to the call instruction when unwinding from a frame that was unwound to a call. The reason is that in native code, when a no-return function is called, the return address is outside of any function.
The problem here is that this PC adjustment doesn't happen for managed code frames. In a dump from one of the failures, I can see that the return address is at the beginning of a block that has a separate RUNTIME_FUNCTION from the block containing the call. For validation purposes, we compare the runtime function we get from the codeInfo with the one we get from the OS for the adjusted PC. But the code info was created using the unadjusted PC, hence they don't match.
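
To make the mismatch concrete, here is a minimal sketch of the situation described above. It is not the runtime's code: `FunctionEntry`, `LookupEntry`, `EntriesAgree`, and the addresses are hypothetical stand-ins, and the adjustment is assumed to step back one 4-byte ARM64 instruction. It only illustrates how a lookup with the unadjusted return address and a lookup with the adjusted PC can land in different entries when the return address is the first address of a new block.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical stand-in for a RUNTIME_FUNCTION entry: one unwind record
// covering the half-open address range [begin, end).
struct FunctionEntry {
    std::uint64_t begin;
    std::uint64_t end;
    int id;  // illustrative identifier only
};

// Hypothetical stand-in for a function-table lookup: returns the entry
// whose range contains pc, if there is one.
std::optional<FunctionEntry> LookupEntry(const std::vector<FunctionEntry>& table,
                                         std::uint64_t pc) {
    for (const FunctionEntry& e : table) {
        if (pc >= e.begin && pc < e.end) return e;
    }
    return std::nullopt;
}

// Assumption: the PC adjustment steps back one fixed-width (4-byte) ARM64
// instruction so that it points at the call itself.
constexpr std::uint64_t kInstructionSize = 4;

// Mirrors the shape of the failing check: the entry found with the
// unadjusted return address (playing the role of codeInfo) must match the
// entry found with the adjusted PC (playing the role of the OS lookup).
bool EntriesAgree(const std::vector<FunctionEntry>& table,
                  std::uint64_t returnAddress) {
    assert(returnAddress >= kInstructionSize);
    std::optional<FunctionEntry> cached = LookupEntry(table, returnAddress);
    std::optional<FunctionEntry> fromOs =
        LookupEntry(table, returnAddress - kInstructionSize);
    return cached && fromOs && cached->id == fromOs->id;
}
```

With two adjacent entries covering [0x1000, 0x1040) and [0x1040, 0x1080), a return address of 0x1040 resolves to the second entry while the adjusted PC 0x103C resolves to the first, so `EntriesAgree` returns false. A return address in the middle of a block, such as 0x1020, resolves to the same entry either way.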

dotnet-policy-service bot removed the untriaged label May 17, 2024
github-actions bot locked and limited conversation to collaborators Jun 17, 2024