-
Notifications
You must be signed in to change notification settings - Fork 4.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[NativeAOT] Move RequiresAlign8 flag from RareFlags into ExtendedFlags #106010
Conversation
Tagging subscribers to this area: @agocke, @MichalStrehovsky, @jkotas |
Closing for now, the benchmark is flawed. |
Updated the OP with a fixed benchmark and more meaningful numbers. I'll will tinker a bit more with it. |
I rewrote the Benchmark results (on Raspberry Pi):
|
How do we store |
We don’t. We just check the element type of the array for |
If the getter is not supposed to be called on arrays, we can omit the HasComponentSizeFlag tricks from it and just test the single bit. |
The |
I would expect that string allocations to be always special cased. Where is the |
Presumably you are right. I checked all the code paths and couldn’t find one that would be hit with |
This should definitely run through the NativeAOT outerloop tests if the PR looks reasonable in other aspects. |
/azp run runtime-nativeaot-outerloop |
Azure Pipelines successfully started running 1 pipeline(s). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
A lot of tests on linux-arm crashing with stackoverflow:
|
It was a very stupid mistake in the last change. |
/azp run runtime-nativeaot-outerloop |
Azure Pipelines successfully started running 1 pipeline(s). |
Implementation of idea from https://github.com/dotnet/runtime/pull/105931/files#r1704319780.
The benchmark code and results are below. I tested it on Raspberry Pi. The tests were run in succession, CPU temperature was around 48 degrees, so no thermal throttling should be involved.
Predictably, the performance improves for `RequiresAlign8` cases where we don't need to decode the optional fields block. There's about 5% perf hit for non-align8 array allocations and 2% perf hit for non-align8 value type allocations. Some perf hit for arrays was expected, the other code path needs more investigation though as no hit is expected there.Updated benchmark results here: #106010 (comment)
This would make it easier to set the
RequiresAlign8
flag for truncated method tables used inGCStaticEETypeNode
.Benchmark code:
Benchmark results (on Raspberry Pi):