-
Notifications
You must be signed in to change notification settings - Fork 4.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
compile composite with avx2 on x64 #55057
Conversation
@@ -32,6 +32,9 @@ | |||
|
|||
<ItemGroup> | |||
<PublishReadyToRunCrossgen2ExtraArgsList Include="--compositekeyfile:$(AssemblyOriginatorKeyFile)"/> | |||
<!-- Compile with avx2 on x64 --> | |||
<PublishReadyToRunCrossgen2ExtraArgsList Condition="'$(TargetArchitecture)' == 'x64'" Include="--inputbubble"/> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
currently inputbubble
is required with instruction-set
. Shouldnt composite
just imply that? @AntonLapounov @trylek ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it implies; I think the check in Crossgen2 should be relaxed accordingly. @davidwrighton has filed dotnet/sdk#17760, but I am not sure why it is filed on the SDK side.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah this needs to be fixed in cg2. perhaps we should just move the bug over.
src/installer/pkg/sfx/Microsoft.NETCore.App/Microsoft.NETCore.App.Runtime.Composite.sfxproj
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
wonder whether we should also compile with |
That comment was about "workloads that use R2R with tiered JIT compilation disabled", which is not the normal configuration. |
yeah true, that's unlikely for containers. |
@@ -32,6 +32,9 @@ | |||
|
|||
<ItemGroup> | |||
<PublishReadyToRunCrossgen2ExtraArgsList Include="--compositekeyfile:$(AssemblyOriginatorKeyFile)"/> | |||
<!-- Compile with avx2 on x64 --> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the reasoning for "just AVX2"?
That is, specifying avx2
will imply anything in the direct heirarchy, so it will include:
- X86Base
- SSE
- SSE2
- SSE3
- SSSE3
- SSE4.1
- SSE4.2
- AVX
However, there are certain branch instruction sets that aren't part of the direct hierarchy. These are generally considered to exist due to age or vendor (AMD vs Intel) differences in how things are exposed and so should be reasonable to also include:
- AES (2010+)
- PCLMULQDQ (2010+)
- POPCNT (2008+)
Likewise, there are instruction sets that were introduced alongside AVX2 (which was introduced in Haswell, 2013+) and are realistically provided SxS in machines that provide AVX2 (only missing in machines like newer Atom processors that are also missing AVX2):
- BMI1
- BMI2
- FMA
- LZCNT
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
avx2 is what we have been running our benchmarks with, which showed decent startup improvements. We could subsequently add others as you suggest, if there is measurable perf improvement when enabled.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
POPCNT
and LZCNT
are both used in various hot paths (e.g. in Span
or UTF8<->UTF16
conversion).
Intel technically considers POPCNT
part of SSE4.2
and LZCNT
part of BMI1
, they are split out as not all vendors (such as AMD) consider them this way (but they are functionally present in the same scenarios on the other vendors).
FMA
and BMI2
aren't used by any of our own startup paths but may be called from user code if they are doing anything around Math.BigMul
or Math/MathF.FusedMultiplyAdd
AES
and PCLMULQDQ
are likely fine to not enable for R2R as they are edge case scenarios, but they should also be harmless to enable on any machine that supports AVX2.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok thanks for the info.
POPCNT and LZCNT are both used in various hot paths (e.g. in Span or UTF8<->UTF16 conversion).
Will add these to current benchmarks to measure the perf improvement and then enable for composite, assuming most modern CPUs support these.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
assuming most modern CPUs support these.
They should. POPCNT will be in basically any CPU that supports SSE4.2 (2008+) and LZCNT/BMI1 in basically any CPU that supports AVX2 (2013+).
Since we're enabling AVX2 here, both should effectively be available in the same scenarios.
- There are minor nuances here for pre-AVX2 CPUs between AMD and Intel, but that isn't particularly relevant if we are saying AVX2 is "safe" to be the baseline/default
Compile Fx composite images using avx2.