remove unnecessary Avx512VL ISA flags. #103144
@@ -73,17 +73,13 @@ private static class XArchIntrinsicConstants
public const int Avx512f = 0x8000;
public const int Avx512f_vl = 0x10000;
Can we just have this be Avx512vl, as that's the actual CPUID feature name and matches the casing we use elsewhere?
I agree, will rename this part accordingly.
public const int Avx10v1_v256 = 0x800000;
public const int Avx10v1_v512 = 0x1000000;
It'd be nice to remove these as well.
The former is unnecessary and the latter is implied by Avx512F existing.
I can make this change. Just to make sure: does this part conflict with the ongoing Avx10 PR? I know some changes are happening there for this as well, but I didn't follow them closely.
That PR is removing Avx10v1_v256, so removing it here as well should help avoid a conflict.
I saw Avx10v1_V256 was removed from InstructionSetDesc.txt, so what we want here is to not even define Avx10v1_V256 and Avx10v1_V512, and to rely on EVEX to make sure the JIT backend correctly handles Avx10 nodes? Just to confirm.
Right. Avx10v1_v256 is just part of Avx10v1, so it's unnecessary.
Likewise, Avx10v1_v512 is really just Avx10v1 + Avx512, so doing those checks instead fits the need and saves us bits.
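The point above can be sketched as follows. This is a minimal illustration, not the runtime's actual code: the class and method names are hypothetical, and the Avx10v1 bit value is an assumption (only Avx512f = 0x8000 appears in the diff).

```csharp
using System;

// Hypothetical sketch: derive the AVX10.1 vector-length variants from
// existing feature bits instead of spending extra bits on them.
static class CpuFeatureSketch
{
    public const int Avx512f = 0x8000;   // value shown in the diff
    public const int Avx10v1 = 0x400000; // ASSUMED value, for illustration only

    // The 256-bit variant is part of Avx10v1 itself, so no separate bit is needed.
    public static bool SupportsAvx10v1_V256(int cpuFeatures) =>
        (cpuFeatures & Avx10v1) != 0;

    // The 512-bit variant is Avx10v1 + Avx512: both bits must be present.
    public static bool SupportsAvx10v1_V512(int cpuFeatures)
    {
        const int required = Avx10v1 | Avx512f;
        return (cpuFeatures & required) == required;
    }
}
```

With this shape, the two dedicated constants can be dropped while the corresponding instruction sets are still reported correctly.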
Thanks for the input.
There are duplicated implications there: https://github.com/dotnet/runtime/blob/main/src/coreclr/tools/Common/JitInterface/ThunkGenerator/InstructionSetDesc.txt#L141
Are they intended to be like this, or should they be removed? I can make the changes if needed.
Edit:
If we remove Avx10v1_V512, then the node definitions here might need some refactoring? It looks fine to just remove it from XArchIntrinsicConstants but keep the definition as an ISA, and as mentioned, set InstructionSet_Avx10v1_V512 based on XArchIntrinsicConstants_Avx10v1 and XArchIntrinsicConstants_Avx512f.
Are they intended to be like this, or this shall be removed, I can make the changes if needed.
No and I'm actually fixing that in #103241
It looks fine to just remove it from XArchIntrinsicConstants but keep the definition as an ISA, and as mentioned, set InstructionSet_Avx10v1_V512 based on XArchIntrinsicConstants_Avx10v1 and XArchIntrinsicConstants_Avx512f
Right, this is what I had meant should happen.
Any refactoring that changes how InstructionSet_* works might be possible but would be a more involved change.
Thanks for the explanation. In that case, I can wait for that PR to go in first, and I will handle the conflict accordingly.
You will also need to fix this to check that all feature bits are present for features described by multiple bits.
switch (instructionSet)
{
    case InstructionSet.X64_AVX10v1_V512:
    case InstructionSet.X64_AVX10v1_V512_X64:
    case InstructionSet.X64_AVX512F_VL:
    case InstructionSet.X64_AVX512F_VL_X64:
    case InstructionSet.X64_AVX512BW_VL:
    case InstructionSet.X64_AVX512BW_VL_X64:
    case InstructionSet.X64_AVX512CD_VL:
    case InstructionSet.X64_AVX512CD_VL_X64:
    case InstructionSet.X64_AVX512DQ_VL:
    case InstructionSet.X64_AVX512DQ_VL_X64:
    case InstructionSet.X64_AVX512VBMI_VL:
    case InstructionSet.X64_AVX512VBMI_VL_X64:
switch (instructionSet)
{
    case InstructionSet.X64_AVX10v1_V512:
    case InstructionSet.X64_AVX10v1_V512_X64:
    case InstructionSet.X64_AVX512F_VL:
    case InstructionSet.X64_AVX512F_VL_X64:
    case InstructionSet.X64_AVX512BW_VL:
    case InstructionSet.X64_AVX512BW_VL_X64:
    case InstructionSet.X64_AVX512CD_VL:
    case InstructionSet.X64_AVX512CD_VL_X64:
    case InstructionSet.X64_AVX512DQ_VL:
    case InstructionSet.X64_AVX512DQ_VL_X64:
    case InstructionSet.X64_AVX512VBMI_VL:
    case InstructionSet.X64_AVX512VBMI_VL_X64:
if (!uint.IsPow2((uint)flag))
{
You should be able to just check whether the flag has one or more bits.
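A minimal sketch of that idea, under stated assumptions: the class and method names below are hypothetical and not taken from the PR. A flag that is a power of two describes a feature with a single bit, so a non-zero test is enough. Any other flag describes a feature spread over several bits, and all of them must be present.

```csharp
using System;

// Hypothetical helper illustrating the single-bit versus multi-bit check.
static class FlagCheckSketch
{
    // True when the flag has exactly one bit set.
    public static bool IsSingleBit(int flag) => uint.IsPow2((uint)flag);

    // Single-bit flags only need a non-zero test; multi-bit flags
    // require every bit of the flag to be present in cpuFeatures.
    public static bool HasAllBits(int cpuFeatures, int flag) =>
        IsSingleBit(flag)
            ? (cpuFeatures & flag) != 0
            : (cpuFeatures & flag) == flag;
}
```

The equality form `(cpuFeatures & flag) == flag` would also be correct for single-bit flags. Keeping the two cases separate only matters when the simpler test lets the generated code be shorter, which appears to be the motivation in the review.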
70c8947 to a1d8a65
codeStream.EmitLdc(flag);
codeStream.Emit(ILOpcode.and);
codeStream.EmitLdc(flag);
codeStream.Emit(ILOpcode.beq);
codeStream.Emit(ILOpcode.beq);
codeStream.Emit(ILOpcode.ceq);
beq is "branch on equal". This needs to be a compare instead.
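To illustrate the difference, here is a small sketch that emits the same and/compare sequence. It uses System.Reflection.Emit rather than the compiler's own codeStream API, and the class and method names are hypothetical. `ceq` pops two values and pushes 1 or 0, which is what a method returning a boolean needs. `beq` pops two values and jumps to a label, so it needs a branch target and leaves no result on the stack.

```csharp
using System;
using System.Reflection.Emit;

// Hypothetical sketch: build a method equivalent to
//   bool Check(int cpuFeatures) => (cpuFeatures & flag) == flag;
static class CeqSketch
{
    public static Func<int, bool> BuildAllBitsCheck(int flag)
    {
        var method = new DynamicMethod("AllBitsCheck", typeof(bool), new[] { typeof(int) });
        ILGenerator il = method.GetILGenerator();

        il.Emit(OpCodes.Ldarg_0);      // push cpuFeatures
        il.Emit(OpCodes.Ldc_I4, flag); // push flag
        il.Emit(OpCodes.And);          // cpuFeatures & flag
        il.Emit(OpCodes.Ldc_I4, flag); // push flag again
        il.Emit(OpCodes.Ceq);          // push 1 if equal, otherwise 0
        il.Emit(OpCodes.Ret);          // return the comparison result

        return (Func<int, bool>)method.CreateDelegate(typeof(Func<int, bool>));
    }
}
```

Calling `CeqSketch.BuildAllBitsCheck(0x6)` yields a delegate that returns true only when both bits of 0x6 are set in its argument.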
Corrected, thanks for pointing out!
Rebased the branch to resolve conflicts. I kept The failures look like a connection issue.
Hi @tannergooding, just to make sure: I am looking at #103241, and in that PR all the Avx512 bits are folded into one. Is that the expected direction going forward, or are we still looking to separate them out to be Edit:
ad4b0f9 to 613494e
Tagging subscribers to this area: @dotnet/area-system-runtime-intrinsics
Apart from known failures, the build failure looks like a timeout and is not related to the changes.
Rerunning CI to get it in a clean state
Build Analysis is green, so the PR should be in a clean state.
@jkotas, did you have any other feedback here or can I merge once CI is showing green?
LGTM
Thanks for all the suggestions and help!
More context in #103019 (comment)
To save some space in XArchIntrinsicConstants, this PR turns the per-subset Avx512*_VL ISA flags into a single converged flag, Avx512F_VL, so that we can hold more ISAs within this enum. This is pre-work for the ongoing APX project.