NativeAOT codegen optimization opportunities #64242
Comments
Tagging subscribers to this area: @JulieLeeMSFT
|
What are the requirements for that? |
On a high level, the restrictions are around allocating reference types with GC-pointer fields and p/invokes. The interpreter lives here: https://github.com/dotnet/runtime/blob/a5cf724aa2c9a67eb45827b56d6315acd4edea84/src/coreclr/tools/aot/ILCompiler.Compiler/Compiler/TypePreinit.cs
The tests for this paint a pretty good picture of what's supported: https://github.com/dotnet/runtime/blob/a5cf724aa2c9a67eb45827b56d6315acd4edea84/src/tests/nativeaot/SmokeTests/Preinitialization/Preinitialization.cs (look for
The coolest demo is probably: runtime/src/tests/nativeaot/SmokeTests/Preinitialization/Preinitialization.cs, lines 747 to 793 in a5cf724
Which compiles to an actual circle in the |
cc @dotnet/jit-contrib. |
@MichalStrehovsky I can definitely help! For loop alignment we need the help of @kunalspathak
Isn't it better to run an ILLink substep and devirtualize everything there? E.g. Xamarin.iOS does that https://github.com/xamarin/xamarin-macios/blob/a20d417bf794445aac19a5b07c5db5a70d16dde5/tools/linker/MonoTouch.Tuner/SealerSubStep.cs#L11-L13 |
That just puts runtime/src/coreclr/tools/aot/ILCompiler.Compiler/Compiler/ILScanner.cs Lines 418 to 441 in 4d16d09
Doing that helps for classes (not interfaces), but we can do better if we can hook this into the PGO-driven devirtualization mechanism in RyuJIT: it can handle things like "this interface is implemented by only two types" or "this class is derived by two types". I looked into actually feeding this through the PGO JitInterface methods (by having the driver fake PGO data), but it looked like it would be messy, and this can actually have better codegen than PGO (because no fallback virtual call is needed - the list of classes is complete). |
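To make the difference between the two shapes of guarded devirtualization concrete, here is a minimal sketch in C++ (used only as illustration; this is not RyuJIT code or output). `IShape`, `Triangle`, `Square` and both functions are hypothetical names, and the "list is complete" property is simply assumed rather than proven by a compiler driver.

```cpp
#include <typeinfo>

// Hypothetical interface that, by assumption, has exactly two implementations
// in the whole program.
struct IShape {
    virtual ~IShape() = default;
    virtual int Sides() const = 0;
};
struct Triangle final : IShape {
    int Sides() const override { return 3; }
};
struct Square final : IShape {
    int Sides() const override { return 4; }
};

// PGO-style guard: the profile only says Triangle is likely, so a virtual
// call has to stay behind as the fallback for every other type.
int SidesGuardedWithFallback(const IShape& s) {
    if (typeid(s) == typeid(Triangle)) {
        // Qualified call: dispatched directly, not through the vtable.
        return static_cast<const Triangle&>(s).Triangle::Sides();
    }
    return s.Sides();  // fallback virtual call
}

// Whole-program guard: {Triangle, Square} is assumed to be the complete list
// of implementers, so the last case needs neither a check nor a fallback.
int SidesExhaustive(const IShape& s) {
    if (typeid(s) == typeid(Triangle)) {
        return static_cast<const Triangle&>(s).Triangle::Sides();
    }
    return static_cast<const Square&>(s).Square::Sides();
}
```

Both functions return the same results for `Triangle` and `Square`; the second one is only valid as long as no third implementer can ever exist, which is the guarantee the compiler driver would have to provide.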
@MichalStrehovsky - Happy to help with loop alignment. Let me know if you have any questions. |
Great idea! I have a PR that expands GDV to handle multiple cases, so I guess it shouldn't be difficult to do. We would then also basically need to implement the getLikelyClasses API in the runtime. |
Early on we were going to have the runtime track which classes implemented an interface so it could report that to the jit and we could use that to guess if we did not have PGO. But the bookkeeping got messy. For the AOT case we should add some extra info to the |
I initialize program in |
readonly fields aren't, but readonly static fields are. |
cc @kunalspathak to capture JIT work.
|
TLS access and classes with few derived classes are the only checkboxes left but we did that work. I think this is fixed 🎉 |
(Note this is milestoned as Future on purpose - none of this is blocking.)
A couple of relatively low-hanging optimization opportunities in the JIT space.
Loop alignment seems to have some pretty nice benefits. NativeAOT/crossgen2 will already respect requests to align at 32-byte boundaries (Add support in crossgen2 for 32-byte alignment #32602), but RyuJIT never requests that for prejit. RyuJIT seems to also have assumptions that the code buffer to write instructions into is 32-byte aligned. It's a managed byte[] in our AOT compilers and it's hard to align that. RyuJIT should ideally count bytes from the beginning of the buffer.
Static fields always go through a helper call. It can be done better. Improve static field access for NativeAOT #63620 is a WIP pull request. I don't know when I'll get back to it.
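On the loop alignment point, the arithmetic behind "count bytes from the beginning of the buffer" can be sketched as follows. This is an illustration in C++, not RyuJIT's emitter code: both function names are hypothetical, and the sketch assumes the final image places the start of the method body on a 32-byte boundary, so only the offset within the buffer matters.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: padding needed so that `offset`, counted from the
// start of the code buffer, lands on a multiple of `alignment`.
constexpr std::size_t PaddingFromOffset(std::size_t offset,
                                        std::size_t alignment = 32) {
    return (alignment - (offset % alignment)) % alignment;
}

// For contrast: padding derived from the buffer's current address. If the
// buffer is a managed array whose base is not 32-byte aligned, the answer
// differs from what the final, aligned image needs.
constexpr std::size_t PaddingFromAddress(std::uintptr_t bufferBase,
                                         std::size_t offset,
                                         std::size_t alignment = 32) {
    return (alignment - ((bufferBase + offset) % alignment)) % alignment;
}
```

For a loop head at offset 40, `PaddingFromOffset(40)` gives 24 bytes. With a buffer that happens to start at an address ending in 8, `PaddingFromAddress(8, 40)` gives 16 bytes instead, which would leave the loop misaligned once the code is copied to an aligned location.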
If there's a static constructor, the EE side will ask RyuJIT to run the class constructor check (CORINFO_INITCLASS_USE_HELPER). This is a helper call that checks whether the class constructor already ran and if not, runs it. The helper looks like this: runtime/src/coreclr/tools/aot/ILCompiler.Compiler/Compiler/DependencyAnalysis/Target_X64/X64ReadyToRunHelperNode.cs, lines 58 to 66 in 74d8d0d
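The referenced helper is not reproduced here. As a rough sketch of the "check whether the class constructor already ran and if not, run it" pattern, the C++ below uses hypothetical names throughout (`TypeStaticsState`, `RunClassConstructor`, `EnsureClassConstructorRun`, `ReadStaticField`) and is single-threaded; a real runtime also has to deal with concurrent and recursive initialization, which this sketch ignores.

```cpp
// Hypothetical per-type bookkeeping for the sketch.
struct TypeStaticsState {
    bool cctorHasRun = false;  // the flag the check tests
    int cctorRunCount = 0;     // only here to observe behaviour in the demo
    int value = 0;             // stands in for a static field
};

// Stand-in for the type's static constructor.
inline void RunClassConstructor(TypeStaticsState& state) {
    state.value = 42;
    state.cctorRunCount++;
    state.cctorHasRun = true;
}

// The check: cheap test on the fast path, slow path runs the constructor.
inline void EnsureClassConstructorRun(TypeStaticsState& state) {
    if (!state.cctorHasRun) {
        RunClassConstructor(state);
    }
}

// Every static field access is preceded by the check.
inline int ReadStaticField(TypeStaticsState& state) {
    EnsureClassConstructorRun(state);
    return state.value;
}
```

Calling `ReadStaticField` twice on a fresh `TypeStaticsState` runs the constructor once; the second call only pays for the flag test, which is the part that is small enough to be a candidate for inlining.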
Emit the platform-specific native sequence to access thread statics inline instead of calling a helper.
NativeAOT can run static constructors at compile time and provide values of readonly static fields. RyuJIT can already consume that (see code around CORINFO_FLG_FIELD_FINAL), but in doing so it assumes the address returned from getFieldAddress is an actual address that can be dereferenced. It's a handle when precompiling. We would want to introduce a proper JitInterface API for this. The static data preinitializer can return more than just primitive types - we could also make this work for reference types (we can return a handle that points to an object in the frozen data segment). Might be better to look into this after "Optimize static field access" above is done.
The compiler driver has a pretty good idea of what types will be allocated over the lifetime of the app. It can provide answers to questions such as "what are all the types that could implement this interface". This can be used to drive devirtualization. Some of it doesn't really even have to be guarded (there might only be a single possibility, for example).
Cc @EgorBo - might be in your area of interest
category:cq
theme:ready-to-run
skill-level:expert
cost:medium
impact:medium