Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

remove unnecessary Avx512VL ISA flags. #103144

Merged
merged 2 commits into from
Jun 29, 2024

Conversation

Ruihan-Yin
Copy link
Contributor

More context in #103019 (comment)

To save some space in XArchIntrinsicConstants, this PR turns Avx512*_VL ISA flags per subset into a converged flag: Avx512F_VL, such that we can hold more ISAs within this enum, this will be the pre-work for the ongoing APX project.

@dotnet-issue-labeler dotnet-issue-labeler bot added the needs-area-label An area label is needed to ensure this gets routed to the appropriate area owners label Jun 6, 2024
@dotnet-policy-service dotnet-policy-service bot added the community-contribution Indicates that the PR has been added by a community member label Jun 6, 2024
@@ -73,17 +73,13 @@ private static class XArchIntrinsicConstants
public const int Avx512f = 0x8000;
public const int Avx512f_vl = 0x10000;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we just have this be Avx512vl, as that's the actual CPUID feature name and matches the casing we use elsewhere?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, will rename this part accordingly.

Comment on lines 81 to 82
public const int Avx10v1_v256 = 0x800000;
public const int Avx10v1_v512 = 0x1000000;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It'd be nice to remove these as well.

The former is unnecessary and the latter is implied by Avx512F existing.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can make this change, and just make sure, does this part have conflicting changes with the ongoing Avx10 PR? I know some changes happening there for this as well, but didn't follow closely.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That PR is removing Avx10v1_v256, so removing it here as well should help avoid a conflict.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I saw Avx10v1_V256 was removed from InstructionSetDesc.txt, so what we want here is don't even define Avx10v1_V256 and Avx10v1_V512, and rely on EVEX to make sure JIT backend will correctly handle Avx10 nodes? Just to confirm.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right. Avx10v1_v256 is just part of Avx10v1, so it's unnecessary.

Likewise, Avx10v1_v512 is really just Avx10v1 + Avx512, so doing those checks instead fits the need and saves us bits.

Copy link
Contributor Author

@Ruihan-Yin Ruihan-Yin Jun 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the inputs,

There are duplicated implications there: https://github.com/dotnet/runtime/blob/main/src/coreclr/tools/Common/JitInterface/ThunkGenerator/InstructionSetDesc.txt#L141

Are they intended to be like this, or this shall be removed, I can make the changes if needed.

Edit:
If we remove Avx10v1_V512, then the node definitions here might need some refactoring? It looks fine to just remove it from XArchIntrinsicConstants but keep the definition as an ISA, and as mentioned, set InstructionSet_Avx10v1_V512 based on XArchIntrinsicConstants_Avx10v1 and XArchIntrinsicConstants_Avx512f

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are they intended to be like this, or this shall be removed, I can make the changes if needed.

No and I'm actually fixing that in #103241

It looks fine to just remove it from XArchIntrinsicConstants but keep the definition as an ISA, and as mentioned, set InstructionSet_Avx10v1_V512 based on XArchIntrinsicConstants_Avx10v1 and XArchIntrinsicConstants_Avx512f

Right, this is what I had meant should happen.

Any refactoring that changes how InstructionSet_* works might be possible but would be a more involved change.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the explanation, in that case, I can wait for this PR go in first, and I will handle the conflict accordingly.

@jkotas
Copy link
Member

jkotas commented Jun 6, 2024

You will also need to fix this to check that all feature bits are present for features described by multiple bits.

Comment on lines 52 to 65
switch (instructionSet)
{
case InstructionSet.X64_AVX10v1_V512:
case InstructionSet.X64_AVX10v1_V512_X64:
case InstructionSet.X64_AVX512F_VL:
case InstructionSet.X64_AVX512F_VL_X64:
case InstructionSet.X64_AVX512BW_VL:
case InstructionSet.X64_AVX512BW_VL_X64:
case InstructionSet.X64_AVX512CD_VL:
case InstructionSet.X64_AVX512CD_VL_X64:
case InstructionSet.X64_AVX512DQ_VL:
case InstructionSet.X64_AVX512DQ_VL_X64:
case InstructionSet.X64_AVX512VBMI_VL:
case InstructionSet.X64_AVX512VBMI_VL_X64:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
switch (instructionSet)
{
case InstructionSet.X64_AVX10v1_V512:
case InstructionSet.X64_AVX10v1_V512_X64:
case InstructionSet.X64_AVX512F_VL:
case InstructionSet.X64_AVX512F_VL_X64:
case InstructionSet.X64_AVX512BW_VL:
case InstructionSet.X64_AVX512BW_VL_X64:
case InstructionSet.X64_AVX512CD_VL:
case InstructionSet.X64_AVX512CD_VL_X64:
case InstructionSet.X64_AVX512DQ_VL:
case InstructionSet.X64_AVX512DQ_VL_X64:
case InstructionSet.X64_AVX512VBMI_VL:
case InstructionSet.X64_AVX512VBMI_VL_X64:
if (!uint.IsPow2((uint)flag))
{

You should be able to just check whether the flag has one or more bits.

codeStream.EmitLdc(flag);
codeStream.Emit(ILOpcode.and);
codeStream.EmitLdc(flag);
codeStream.Emit(ILOpcode.beq);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
codeStream.Emit(ILOpcode.beq);
codeStream.Emit(ILOpcode.ceq);

beq is "branch on equal". This is needs to be compare instead.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Corrected, thanks for pointing out!

@Ruihan-Yin
Copy link
Contributor Author

rebased the branch to resolve conflicts.

I kept Avx10v1_V512 as an ISA to make it consistent with the node definition in hwintrinsicxarchlist.h. If we need further changes, I can try to make it within this PR.

The fails look like connection issue.

@Ruihan-Yin Ruihan-Yin marked this pull request as ready for review June 12, 2024 16:30
@Ruihan-Yin
Copy link
Contributor Author

Ruihan-Yin commented Jun 18, 2024

Hi @tannergooding,

Just making sure, I am looking at the #103241, in that PR, we have folded all the Avx512 bits into 1, is that expected moving on, or we are still looking for separate them out to be F+VL+BW+CD+DQ as discussed here. If not, I suppose the only thing left for this PR would be removing Avx10v1_V512?

Edit:
Conflicts have been resolved, I have only pushed the changes for Avx10v1 part first. For Avx512, I kept it to what has been changes in #103241, if we want to expand Avx512 flags, I will make the changes accordingly.

@am11 am11 added area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI avx512 Related to the AVX-512 architecture area-VM-coreclr and removed needs-area-label An area label is needed to ensure this gets routed to the appropriate area owners area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI labels Jun 18, 2024
Copy link
Contributor

Tagging subscribers to this area: @dotnet/area-system-runtime-intrinsics
See info in area-owners.md if you want to be subscribed.

@Ruihan-Yin
Copy link
Contributor Author

Except known failures, the build fail looks like due to timeout, and not related the changes.

@tannergooding
Copy link
Member

Rerunning CI to get it in a clean state

@Ruihan-Yin
Copy link
Contributor Author

Build Analysis has shown green, PR should be in clean state.

@tannergooding
Copy link
Member

@jkotas, did you have any other feedback here or can I merge once CI is showing green?

@jkotas
Copy link
Member

jkotas commented Jun 28, 2024

LGTM

@Ruihan-Yin
Copy link
Contributor Author

image
I was not able to reproduce the CI fail locally.

From the call stack, it is more likely to be connection issue.

@tannergooding tannergooding merged commit 9906682 into dotnet:main Jun 29, 2024
162 checks passed
@Ruihan-Yin
Copy link
Contributor Author

Thanks for all the suggestions and help!

@github-actions github-actions bot locked and limited conversation to collaborators Aug 1, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-System.Runtime.Intrinsics avx512 Related to the AVX-512 architecture community-contribution Indicates that the PR has been added by a community member
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants