[CI ONLY] [NativeAOT] ObjWriter in C# #92705
Tagging subscribers to this area: @dotnet/area-system-reflection-metadata
Issue Details
Ref: #77178
This is a reimplementation of the NativeAOT ObjWriter in pure C# instead of depending on LLVM. It implements Mach-O, ELF, and COFF object file emitters with DWARF and CodeView debugging information. Only x64 and arm64 targets are implemented, to cover the officially supported platforms. Certain features are not implemented yet, e.g. COMDAT in ELF. Other features, like DWARF debugging info generation, are currently slower than in the previous implementation. Limited testing was done on osx-arm64, win-x64, and linux-x64. A previous version of the branch was also tested on osx-x64, win-arm64, and linux-arm64.
Caveat: This is NOT for review; the draft PR is opened specifically to run smoke tests only. The code was rebased over the current main branch and updated to reflect most ObjWriter changes from the past year (both on the runtime repo side and in the LLVM fork repo). The performance and structure of the code are not in their final shape, and I expect to rewrite certain parts before submitting this for actual review.
cc @TIHan
Tagging subscribers to this area: @agocke, @MichalStrehovsky, @jkotas
Still not sure what is going on with the win-arm64 tests. They pass locally on my Windows Dev Kit 2023. |
This one looks a bit less mysterious:
perhaps BTW, CI is using precisely this toolchain version:
to cross-compile for arm64 on x64: |
Thanks, @am11. I will check that one. I will likely wait until tomorrow to have direct access to the Win/ARM machine instead of just RDP-over-RDP-over-Tailscale. I have uncommitted code that matches the COFF output of the old ObjWriter more closely, so the two are diffable to a certain extent and differences are easier to spot. The DynamicGenerics test produces a high number of sections and relocations and triggers the "big obj" code paths, which didn't get much testing, so that's my primary suspect. |
Thank you for making the PR @filipnavara . I'm out for a bit and hopefully will look at this next week. So far it looks promising. |
Turns out the error was somewhere between the chair and the keyboard. I had a wrong branch checked out. |
I found the issue in the COFF/ARM64 code. I was incorrectly ignoring the addend for While searching for the root cause I found a couple more issues that are mostly harmless but should be fixed nevertheless. I'll clean up the code and commit it soon. |
I think the code has now reached a point where it passes the smoke tests for the supported platforms, which was the primary purpose of this PR. I'll keep it open for the moment, but it has served its purpose. Aside from some general structural improvements, these are the areas I intend to explore next (in no particular order):
According to
during System.Runtime.Tests publishing (and output object size is 273M vs. main's 244M). With published-ilc, this test OOMs (code 137, the current CI failure). |
@am11 Thanks for looking into it, really appreciated! I didn't focus on performance and memory usage outside of isolated scenarios. I profiled some code paths for COFF and CodeView, but there's very little overlap with what ELF and DWARF do. I'll check the trace you provided and do some profiling on my side as well. The output size difference is expected. It's mostly caused by extra relocations in the output and a non-optimized string table (as mentioned in the "future work" list above). There's also some difference in debugging info size, but it is not nearly as big. Notably, the memory usage of the DWARF debugging info emitter is pretty high. It's not easy to refactor without significant changes to LibObjectFile. It operates on a "document" model (akin to I did some comparisons with the debugging info turned off, and the results were largely favorable to the C# implementation. That said, I did it primarily on Mach-O, not ELF. |
Wow, I totally didn't expect |
Since some people are apparently following the PR and looking at some of the performance issues, I made an isolated sample showing the problem with I used three different algorithms to build the string table:
Note that Without further ado, here are the results on my MacBook Air M1 for a sample string set taken from
You can clearly see that the approach used by -- The input is pre-sorted, so the 3-way radix quick sort in -- I updated the |
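For readers wondering why building a string table involves sorting at all: one common size optimization for ELF string tables is suffix sharing (tail merging), where a string that is a suffix of another reuses the longer string's bytes. The sketch below is an illustration under stated assumptions. It is not necessarily one of the three algorithms compared above, and it is not the PR's code. `TailMergedStringTable` and `Build` are hypothetical names, a plain comparison sort stands in for the 3-way radix quicksort, and names are assumed to be ASCII so that character and byte offsets coincide.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical helper for illustration only; not taken from the PR.
public static class TailMergedStringTable
{
    // Builds a NUL-terminated string table in which a string that is a suffix
    // of another string shares that string's bytes. Assumes ASCII input.
    public static (byte[] Table, Dictionary<string, int> Offsets) Build(IEnumerable<string> strings)
    {
        // Deduplicate, then order so that every string directly follows
        // a string that ends with it (if such a string exists).
        var sorted = new List<string>(new HashSet<string>(strings));
        sorted.Sort(CompareFromEndDescending);

        var table = new List<byte> { 0 };            // offset 0 is the empty string
        var offsets = new Dictionary<string, int>();
        string previous = "";                        // the empty string at offset 0
        int previousOffset = 0;

        foreach (string s in sorted)
        {
            int offset;
            if (previous.EndsWith(s, StringComparison.Ordinal))
            {
                // Reuse the tail of the previously placed string.
                offset = previousOffset + (previous.Length - s.Length);
            }
            else
            {
                offset = table.Count;
                table.AddRange(Encoding.ASCII.GetBytes(s));
                table.Add(0);
            }
            offsets[s] = offset;
            previous = s;
            previousOffset = offset;
        }
        return (table.ToArray(), offsets);
    }

    // Compares strings character by character starting from the end, in
    // descending order; when one string is a suffix of the other, the
    // longer string sorts first.
    private static int CompareFromEndDescending(string a, string b)
    {
        for (int i = a.Length - 1, j = b.Length - 1; i >= 0 && j >= 0; i--, j--)
        {
            if (a[i] != b[j])
                return b[j].CompareTo(a[i]);
        }
        return b.Length.CompareTo(a.Length);
    }
}
```

With the inputs `printf`, `intf`, `f`, and `main`, only `main` and `printf` are written to the table; `intf` and `f` resolve to offsets inside `printf`.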
Force-pushed from 00bfd74 to 1c0d928
Great optimizations! linux-arm64 object size is 241M vs. 244M on main, and CI leg isn't jamming. :) |
@filipnavara really good info. Agreed that ElfStringTable isn't looking great |
I really appreciate that you checked and helped diagnose this issue. 👍 I am really happy to have some working baseline version of the changes before I proceed to do further optimizations and experiments. |
I committed an experimental fix and it passed the CI. I'll submit it upstream and then focus on the further work I mentioned above. |
@filipnavara , this is looking really good and glad others were able to look at it. The performance improvements do look great. How far do you think your solution is from matching the existing functionality? It looks like the smoke tests are passing on all the platforms. @agocke , are there any scenarios that we need to cover that are not covered by the tests? |
- Section names need to come first in the string table because of limited space for their reference by offset. This caused the "managedcode$I" and "modules$I" section names to be garbage when there were many symbols.
- Fix missing array pool return.
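The constraint in the first bullet follows from the COFF section header layout: the Name field is only 8 bytes, so a longer name is written as `/` followed by the decimal offset of the name in the string table, which leaves room for at most seven digits. Below is a minimal sketch of that encoding, with `CoffSectionName` and `Encode` as hypothetical names that are not taken from the PR.

```csharp
using System;
using System.Globalization;
using System.Text;

// Hypothetical helper for illustration only; not taken from the PR.
public static class CoffSectionName
{
    // Largest offset that fits in the seven decimal digits after the '/'.
    public const uint MaxStringTableOffset = 9_999_999;

    // Returns the 8-byte Name field of a COFF section header.
    // Names of up to 8 UTF-8 bytes are stored inline and NUL-padded.
    // Longer names are stored as "/<decimal offset>", where the offset
    // points at the full name in the string table.
    public static byte[] Encode(string name, uint stringTableOffset)
    {
        byte[] field = new byte[8];
        byte[] utf8 = Encoding.UTF8.GetBytes(name);

        if (utf8.Length <= field.Length)
        {
            utf8.CopyTo(field, 0);
            return field;
        }

        if (stringTableOffset > MaxStringTableOffset)
        {
            throw new ArgumentOutOfRangeException(
                nameof(stringTableOffset),
                "Long section names must sit within the first 10,000,000 bytes of the string table.");
        }

        string reference = "/" + stringTableOffset.ToString(CultureInfo.InvariantCulture);
        Encoding.ASCII.GetBytes(reference, 0, reference.Length, field, 0);
        return field;
    }
}
```

Both `managedcode$I` (13 bytes) and `modules$I` (9 bytes) exceed the inline limit, so each must be referenced by offset. Writing section names before the symbol names keeps those offsets small no matter how many symbols follow.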
Remove LibObjectFile dependency
…h NativeAOT-LLVM branch
Force-pushed from b8c46e3 to c7b4984
@@ -238,7 +238,7 @@ private CodeViewRegister GetCVRegNum(uint regNum)

 foreach (var sequencePoint in sequencePoints)
 {
-    if (lastFileName == null || lastFileName != sequencePoint.FileName)
+    if (lastFileName is null || lastFileName != sequencePoint.FileName)
If you prefer pattern matching terse syntax, we can reduce the verbosity a bit in those long RelocType.XX conditions:
using static ILCompiler.DependencyAnalysis.RelocType;
...
if (relocType is IMAGE_REL_BASED_ARM64_BRANCH26 or IMAGE_REL_BASED_ARM64_PAGEBASE_REL21 or
IMAGE_REL_BASED_ARM64_PAGEOFFSET_12A or IMAGE_REL_AARCH64_TLSLE_ADD_TPREL_HI12 or ..)
...
(the duplication due to the IMAGE_REL_ prefix is enough as-is 😅)
I used several different code styles as I progressed, sometimes intentionally, sometimes as a consequence of reusing existing code... I am generally open to ideas how to keep the code as terse and readable as possible ;-)
Tracking list of issues found by the CI:
System.NotSupportedException: Unsupported relocation: IMAGE_REL_TLSGD
relocation R_AARCH64_TLSDESC_ADR_PAGE21 cannot be used against symbol 'tls_InlinedThreadStatics'
libunwind: malformed DW_CFA_register DWARF unwind, reg too big
warning: ignoring file /tmp/helix/working/B5450932/w/C6420AD0/e/publish/native/iOS.Device.Aot.Test.o, building for iOS-arm64 but attempting to link with file built for unknown-unsupported file format ( 0x7F 0x45 0x4C 0x46 0x02 0x01 0x01 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 )
Framework.lib(System.Private.CoreLib.obj) : fatal error LNK1243: invalid or corrupt file: COMDAT section 0x20001 associated with following section 0x0
(MultiModule test) CoreFXTestLibrary.AssertTestException: Assert.AreEqual: Expected: [System.Func2[CommonType10[],CommonType10[]]]. Actual: [System.Func2[CommonType10[],CommonType10[]]].