Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[JIT] Fixed improper peephole zero-extension removal when cdq/cwde instructions are involved #82733

Merged
merged 16 commits into from
Mar 5, 2023
6 changes: 6 additions & 0 deletions src/coreclr/jit/emitinl.h
Original file line number Diff line number Diff line change
Expand Up @@ -33,6 +33,12 @@ inline bool emitter::instrHasImplicitRegPairDest(instruction ins)
return (ins == INS_mulEAX) || (ins == INS_imulEAX) || (ins == INS_div) || (ins == INS_idiv);
}

/* static */
inline bool emitter::instrHasImplicitRegSingleDest(instruction ins)
{
return (ins == INS_cdq) || (ins == INS_cwde);
}

// Because we don't actually have support for encoding these 3-op
// multiplies we fake it with special opcodes. Make sure they are
// contiguous.
Expand Down
1 change: 1 addition & 0 deletions src/coreclr/jit/emitpub.h
Original file line number Diff line number Diff line change
Expand Up @@ -102,6 +102,7 @@ UNATIVE_OFFSET emitDataSize();
static bool instrIs3opImul(instruction ins);
static bool instrIsExtendedReg3opImul(instruction ins);
static bool instrHasImplicitRegPairDest(instruction ins);
static bool instrHasImplicitRegSingleDest(instruction ins);
static void check3opImulValues();
static regNumber inst3opImulReg(instruction ins);
static instruction inst3opImulForReg(regNumber reg);
Expand Down
10 changes: 10 additions & 0 deletions src/coreclr/jit/emitxarch.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -526,6 +526,16 @@ bool emitter::AreUpper32BitsZero(regNumber reg)
}
}

// This is a special case for cdq/cdqe/cwde.
// They always write to and sign-extend RAX.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

notably: cwd is AX -> DX:AX
cdq is EAX -> EDX:EAX
cqo is RAX -> RDX:RAX

There are also a few other instructions that implicitly write specific outputs (typically edx and eax) or which take implicit inputs (often edx, eax, ecx, esi, or edi)

Wonder if they should be more generally modeled or tracked?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They probably should be.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will need to check cdq for EDX.

Copy link
Member

@tannergooding tannergooding Feb 27, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cmpxchg16b also can write RDX:RAX (and reads RCX:RBX)

div, idiv, imul, and mul writes RDX:RAX (and generally, but not always, reads the same)

Might be others that I'm forgetting.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we emit cmpxchg16b or even cmpxchg8b AFAIK. But I did need to check for cmpxchg.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am wondering if there is a good way to track such instructions or generalize the process with adequate asserts so we can catch the issues earlier? Tracking these issues with bad code gen are really hard to investigate.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All I can think of is to make it more table-driven, otherwise a function that asks the question should be adequate.

Given we are only doing these checks here, we probably could create a function that simply asks "Did the instruction write to this register?" and the result is either 'true' or 'false', and if 'true' we can also provide if it zero-extended or not.
I think I'm going to have to do this anyway for #82733 because I need to re-use the checks of whether or not a register was written to.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am more worried if we miss an entry of one of the instructions that we support, we might have similar bug. Is there a manual that we can refer and put all those instructions in the table or whatever data structure we have and also post the link of the manual so someone looking at the code in future can know where it is coming from?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Intel and AMD architecture manuals are the relevant resource.

The ones I listed are the ones that write two registers. There are others, like cmpxchg, which may write a single dedicated register.

It's likely just going to require someone going through the instructions and checking for cases that use a fixed/non-specified register (for input and/or output)

if (instrHasImplicitRegSingleDest(id->idIns()))
{
if (reg == REG_RAX)
{
return PEEPHOLE_ABORT;
}
}

switch (id->idInsFmt())
{
case IF_RWR:
Expand Down
40 changes: 40 additions & 0 deletions src/tests/JIT/opt/Regressions/Regression6.cs
Original file line number Diff line number Diff line change
@@ -0,0 +1,40 @@
// Licensed to the .NET Foundation under one or more agreements.
// The .NET Foundation licenses this file to you under the MIT license.

// Generated by Fuzzlyn v1.5 on 2023-02-26 16:54:59
TIHan marked this conversation as resolved.
Show resolved Hide resolved
// Run on X64 Linux
// Seed: 15812590312404596729
public class C0
{
public ulong F2;
public C0(ulong f2)
{
F2 = f2;
}
}

public class Program
{
public static ushort s_2;
public static C0 s_7;
public static int Main()
{
ulong result = 1234;
for (int vr0 = 0; vr0 < 2; vr0++)
{
s_7 = M5();
ulong vr1 = s_7.F2;
result = vr1;
}

if (result != 18446744069414584320)
return 0;

return 100;
}

public static C0 M5()
{
return new C0(~(ulong)(4294967295U & (-(1 % (s_2-- | 1)))));
}
}
12 changes: 12 additions & 0 deletions src/tests/JIT/opt/Regressions/Regression6.csproj
Original file line number Diff line number Diff line change
@@ -0,0 +1,12 @@
<Project Sdk="Microsoft.NET.Sdk">
<PropertyGroup>
<OutputType>Exe</OutputType>
</PropertyGroup>
<PropertyGroup>
<DebugType>None</DebugType>
<Optimize>True</Optimize>
</PropertyGroup>
<ItemGroup>
<Compile Include="$(MSBuildProjectName).cs" />
</ItemGroup>
</Project>