[X86][MC] Support decoding of EGPR for APX #72102

Merged on Nov 15, 2023 (4 commits)

KanRobert (Contributor) commented Nov 13, 2023

#70958 adds registers R16-R31 (EGPR). This patch

  1. Supports decoding of EGPR for instructions w/ a REX2 prefix
  2. Supports decoding of EGPR for instructions w/ an EVEX prefix

For simplicity's sake, we

  1. Simulate the REX prefix w/ the 1st payload of REX2
  2. Simulate the REX2 prefix w/ the 2nd and 3rd payloads of EVEX
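
The REX2-to-REX simulation can be sketched in isolation. The snippet below is a minimal illustration, not the patch itself: the bit extractors mirror the `*FromREX2` macros added in X86DisassemblerDecoder.h, while `simulateREX`, `rmIndex`, and `selfCheck` are hypothetical helpers introduced only for this example.

```cpp
#include <cassert>
#include <cstdint>

// Bit extractors for the REX2 payload byte (the byte that follows 0xd5).
// Written as functions here; they mirror the macros added by the patch.
constexpr uint8_t b2FromREX2(uint8_t p) { return (p >> 4) & 0x1; }
constexpr uint8_t wFromREX2(uint8_t p) { return (p >> 3) & 0x1; }
constexpr uint8_t rFromREX2(uint8_t p) { return (p >> 2) & 0x1; }
constexpr uint8_t xFromREX2(uint8_t p) { return (p >> 1) & 0x1; }
constexpr uint8_t bFromREX2(uint8_t p) { return p & 0x1; }

// Hypothetical helper: fold the low nibble of the payload into a legacy
// REX byte (0100WRXB), the same shape readPrefixes() stores in rexPrefix.
constexpr uint8_t simulateREX(uint8_t p) {
  return static_cast<uint8_t>(0x40 | (wFromREX2(p) << 3) |
                              (rFromREX2(p) << 2) | (xFromREX2(p) << 1) |
                              bFromREX2(p));
}

// Hypothetical helper: 5-bit register number for ModRM.rm.
// Bit 3 comes from the simulated REX.B, bit 4 from REX2.B4.
constexpr uint8_t rmIndex(uint8_t p, uint8_t modrm) {
  return static_cast<uint8_t>((modrm & 0x7) | ((simulateREX(p) & 0x1) << 3) |
                              (b2FromREX2(p) << 4));
}

// Worked example: payload 0x19 (0b00011001) sets B4, W, and B3.
inline void selfCheck() {
  assert(simulateREX(0x19) == 0x49); // 0x40 | W<<3 | B
  assert(rmIndex(0x19, 0xc0) == 24); // rm=0, B3=1, B4=1 -> register 24
}
```

Because the low nibble of the payload has the same W/R/X/B layout as a legacy REX byte, existing code that reads `rexPrefix` keeps working and only the fifth register bit needs new handling.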

RFC:
https://discourse.llvm.org/t/rfc-design-for-apx-feature-egpr-and-ndd-support/73031/4

Explanations for some changes:

  1. invalid-EVEX-R2.txt is deleted b/c 0x62 0xe1 0xff 0x08 0x79 0xc0 is now valid and decodes to vcvtsd2usi %xmm0, %r16.
  2. One line in x86-64-err.txt is removed b/c APX relaxes the restrictions on the 1st and 2nd payloads of the EVEX prefix, so the error message changes.
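
To see why that byte sequence now yields %r16, the bit arithmetic can be worked through by hand. The sketch below is illustrative only: the extractors mirror the `*FromEVEX*` macros in X86DisassemblerDecoder.h, while `SimulatedPrefixes`, `simulateFromEVEX`, `regIndex`, and `selfCheck` are hypothetical names that do not exist in LLVM.

```cpp
#include <cassert>
#include <cstdint>

// Extractors mirroring the EVEX macros; most fields are stored inverted.
constexpr uint8_t rFromEVEX2of4(uint8_t e) { return ((~e) & 0x80) >> 7; }
constexpr uint8_t xFromEVEX2of4(uint8_t e) { return ((~e) & 0x40) >> 6; }
constexpr uint8_t bFromEVEX2of4(uint8_t e) { return ((~e) & 0x20) >> 5; }
constexpr uint8_t r2FromEVEX2of4(uint8_t e) { return ((~e) & 0x10) >> 4; }
constexpr uint8_t b2FromEVEX2of4(uint8_t e) { return (e & 0x8) >> 3; }
constexpr uint8_t wFromEVEX3of4(uint8_t e) { return (e & 0x80) >> 7; }
constexpr uint8_t x2FromEVEX3of4(uint8_t e) { return ((~e) & 0x4) >> 2; }

// Hypothetical container for the two simulated prefixes.
struct SimulatedPrefixes {
  uint8_t rex;         // legacy REX byte, 0100WRXB
  uint8_t rex2Payload; // only the R4/X4/B4 bits (6, 5, 4) are filled in
};

// Hypothetical helper: p0 and p1 are the two bytes that follow 0x62.
constexpr SimulatedPrefixes simulateFromEVEX(uint8_t p0, uint8_t p1) {
  return {static_cast<uint8_t>(0x40 | (wFromEVEX3of4(p1) << 3) |
                               (rFromEVEX2of4(p0) << 2) |
                               (xFromEVEX2of4(p0) << 1) | bFromEVEX2of4(p0)),
          static_cast<uint8_t>((r2FromEVEX2of4(p0) << 6) |
                               (x2FromEVEX3of4(p1) << 5) |
                               (b2FromEVEX2of4(p0) << 4))};
}

// Hypothetical helper: 5-bit register number for ModRM.reg.
constexpr uint8_t regIndex(SimulatedPrefixes s, uint8_t modrm) {
  return static_cast<uint8_t>(((modrm >> 3) & 0x7) |
                              (((s.rex >> 2) & 0x1) << 3) |
                              (((s.rex2Payload >> 6) & 0x1) << 4));
}

// 0x62 0xe1 0xff 0x08 0x79 0xc0: p0 = 0xe1, p1 = 0xff, ModRM = 0xc0.
inline void selfCheck() {
  constexpr SimulatedPrefixes s = simulateFromEVEX(0xe1, 0xff);
  assert(s.rex == 0x48);           // only W is set
  assert(s.rex2Payload == 0x40);   // only the R4 bit is set
  assert(regIndex(s, 0xc0) == 16); // ModRM.reg = 0 plus R4 -> register 16
}
```

With the GPR limit raised from 0xf to 0x1f, index 16 is accepted for a 64-bit GPR operand, which is consistent with the `%r16` destination quoted above.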

@llvmbot llvmbot added backend:X86 mc Machine (object) code labels Nov 13, 2023
llvmbot (Collaborator) commented Nov 13, 2023

@llvm/pr-subscribers-backend-x86

Author: Shengchen Kan (KanRobert)

Changes

Patch is 33.37 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/72102.diff

7 Files Affected:

  • (modified) llvm/lib/Target/X86/Disassembler/X86Disassembler.cpp (+89-27)
  • (modified) llvm/lib/Target/X86/Disassembler/X86DisassemblerDecoder.h (+132-6)
  • (added) llvm/test/MC/Disassembler/X86/apx/evex-format.txt (+70)
  • (added) llvm/test/MC/Disassembler/X86/apx/rex2-bit.txt (+238)
  • (added) llvm/test/MC/Disassembler/X86/apx/rex2-format.txt (+344)
  • (removed) llvm/test/MC/Disassembler/X86/invalid-EVEX-R2.txt (-4)
  • (modified) llvm/test/MC/Disassembler/X86/x86-64-err.txt (-2)
diff --git a/llvm/lib/Target/X86/Disassembler/X86Disassembler.cpp b/llvm/lib/Target/X86/Disassembler/X86Disassembler.cpp
index 2ec7a57093f4ba3..2c477758ba8ec7b 100644
--- a/llvm/lib/Target/X86/Disassembler/X86Disassembler.cpp
+++ b/llvm/lib/Target/X86/Disassembler/X86Disassembler.cpp
@@ -206,6 +206,10 @@ static bool isREX(struct InternalInstruction *insn, uint8_t prefix) {
   return insn->mode == MODE_64BIT && prefix >= 0x40 && prefix <= 0x4f;
 }
 
+static bool isREX2(struct InternalInstruction *insn, uint8_t prefix) {
+  return insn->mode == MODE_64BIT && prefix == 0xd5;
+}
+
 // Consumes all of an instruction's prefix bytes, and marks the
 // instruction as having them.  Also sets the instruction's default operand,
 // address, and other relevant data sizes to report operands correctly.
@@ -337,8 +341,7 @@ static int readPrefixes(struct InternalInstruction *insn) {
       return -1;
     }
 
-    if ((insn->mode == MODE_64BIT || (byte1 & 0xc0) == 0xc0) &&
-        ((~byte1 & 0x8) == 0x8) && ((byte2 & 0x4) == 0x4)) {
+    if ((insn->mode == MODE_64BIT || (byte1 & 0xc0) == 0xc0)) {
       insn->vectorExtensionType = TYPE_EVEX;
     } else {
       --insn->readerCursor; // unconsume byte1
@@ -357,13 +360,19 @@ static int readPrefixes(struct InternalInstruction *insn) {
         return -1;
       }
 
-      // We simulate the REX prefix for simplicity's sake
       if (insn->mode == MODE_64BIT) {
+        // We simulate the REX prefix for simplicity's sake
         insn->rexPrefix = 0x40 |
                           (wFromEVEX3of4(insn->vectorExtensionPrefix[2]) << 3) |
                           (rFromEVEX2of4(insn->vectorExtensionPrefix[1]) << 2) |
                           (xFromEVEX2of4(insn->vectorExtensionPrefix[1]) << 1) |
                           (bFromEVEX2of4(insn->vectorExtensionPrefix[1]) << 0);
+
+        // We simulate the REX2 prefix for simplicity's sake
+        insn->rex2ExtensionPrefix[1] =
+            (r2FromEVEX2of4(insn->vectorExtensionPrefix[1]) << 6) |
+            (x2FromEVEX3of4(insn->vectorExtensionPrefix[2]) << 5) |
+            (b2FromEVEX2of4(insn->vectorExtensionPrefix[1]) << 4);
       }
 
       LLVM_DEBUG(
@@ -474,6 +483,23 @@ static int readPrefixes(struct InternalInstruction *insn) {
                                   insn->vectorExtensionPrefix[1],
                                   insn->vectorExtensionPrefix[2]));
     }
+  } else if (isREX2(insn, byte)) {
+    uint8_t byte1;
+    if (peek(insn, byte1)) {
+      LLVM_DEBUG(dbgs() << "Couldn't read second byte of REX2");
+      return -1;
+    }
+    insn->rex2ExtensionPrefix[0] = byte;
+    consume(insn, insn->rex2ExtensionPrefix[1]);
+
+    // We simulate the REX prefix for simplicity's sake
+    insn->rexPrefix = 0x40 | (wFromREX2(insn->rex2ExtensionPrefix[1]) << 3) |
+                      (rFromREX2(insn->rex2ExtensionPrefix[1]) << 2) |
+                      (xFromREX2(insn->rex2ExtensionPrefix[1]) << 1) |
+                      (bFromREX2(insn->rex2ExtensionPrefix[1]) << 0);
+    LLVM_DEBUG(dbgs() << format("Found REX2 prefix 0x%hhx 0x%hhx",
+                                insn->rex2ExtensionPrefix[0],
+                                insn->rex2ExtensionPrefix[1]));
   } else if (isREX(insn, byte)) {
     if (peek(insn, nextByte))
       return -1;
@@ -532,7 +558,8 @@ static int readSIB(struct InternalInstruction *insn) {
   if (consume(insn, insn->sib))
     return -1;
 
-  index = indexFromSIB(insn->sib) | (xFromREX(insn->rexPrefix) << 3);
+  index = indexFromSIB(insn->sib) | (xFromREX(insn->rexPrefix) << 3) |
+          (x2FromREX2(insn->rex2ExtensionPrefix[1]) << 4);
 
   if (index == 0x4) {
     insn->sibIndex = SIB_INDEX_NONE;
@@ -542,7 +569,8 @@ static int readSIB(struct InternalInstruction *insn) {
 
   insn->sibScale = 1 << scaleFromSIB(insn->sib);
 
-  base = baseFromSIB(insn->sib) | (bFromREX(insn->rexPrefix) << 3);
+  base = baseFromSIB(insn->sib) | (bFromREX(insn->rexPrefix) << 3) |
+         (b2FromREX2(insn->rex2ExtensionPrefix[1]) << 4);
 
   switch (base) {
   case 0x5:
@@ -604,7 +632,7 @@ static int readDisplacement(struct InternalInstruction *insn) {
 
 // Consumes all addressing information (ModR/M byte, SIB byte, and displacement.
 static int readModRM(struct InternalInstruction *insn) {
-  uint8_t mod, rm, reg, evexrm;
+  uint8_t mod, rm, reg;
   LLVM_DEBUG(dbgs() << "readModRM()");
 
   if (insn->consumedModRM)
@@ -636,14 +664,13 @@ static int readModRM(struct InternalInstruction *insn) {
     break;
   }
 
-  reg |= rFromREX(insn->rexPrefix) << 3;
-  rm |= bFromREX(insn->rexPrefix) << 3;
+  reg |= (rFromREX(insn->rexPrefix) << 3) |
+         (r2FromREX2(insn->rex2ExtensionPrefix[1]) << 4);
+  rm |= (bFromREX(insn->rexPrefix) << 3) |
+        (b2FromREX2(insn->rex2ExtensionPrefix[1]) << 4);
 
-  evexrm = 0;
-  if (insn->vectorExtensionType == TYPE_EVEX && insn->mode == MODE_64BIT) {
+  if (insn->vectorExtensionType == TYPE_EVEX && insn->mode == MODE_64BIT)
     reg |= r2FromEVEX2of4(insn->vectorExtensionPrefix[1]) << 4;
-    evexrm = xFromEVEX2of4(insn->vectorExtensionPrefix[1]) << 4;
-  }
 
   insn->reg = (Reg)(insn->regBase + reg);
 
@@ -731,7 +758,7 @@ static int readModRM(struct InternalInstruction *insn) {
       break;
     case 0x3:
       insn->eaDisplacement = EA_DISP_NONE;
-      insn->eaBase = (EABase)(insn->eaRegBase + rm + evexrm);
+      insn->eaBase = (EABase)(insn->eaRegBase + rm);
       break;
     }
     break;
@@ -741,6 +768,8 @@ static int readModRM(struct InternalInstruction *insn) {
   return 0;
 }
 
+#define MAX_GPR_NUM (0x1f)
+
 #define GENERIC_FIXUP_FUNC(name, base, prefix, mask)                           \
   static uint16_t name(struct InternalInstruction *insn, OperandType type,     \
                        uint8_t index, uint8_t *valid) {                        \
@@ -754,7 +783,7 @@ static int readModRM(struct InternalInstruction *insn) {
       return base + index;                                                     \
     case TYPE_R8:                                                              \
       index &= mask;                                                           \
-      if (index > 0xf)                                                         \
+      if (index > MAX_GPR_NUM)                                                 \
         *valid = 0;                                                            \
       if (insn->rexPrefix && index >= 4 && index <= 7) {                       \
         return prefix##_SPL + (index - 4);                                     \
@@ -763,17 +792,17 @@ static int readModRM(struct InternalInstruction *insn) {
       }                                                                        \
     case TYPE_R16:                                                             \
       index &= mask;                                                           \
-      if (index > 0xf)                                                         \
+      if (index > MAX_GPR_NUM)                                                 \
         *valid = 0;                                                            \
       return prefix##_AX + index;                                              \
     case TYPE_R32:                                                             \
       index &= mask;                                                           \
-      if (index > 0xf)                                                         \
+      if (index > MAX_GPR_NUM)                                                 \
         *valid = 0;                                                            \
       return prefix##_EAX + index;                                             \
     case TYPE_R64:                                                             \
       index &= mask;                                                           \
-      if (index > 0xf)                                                         \
+      if (index > MAX_GPR_NUM)                                                 \
         *valid = 0;                                                            \
       return prefix##_RAX + index;                                             \
     case TYPE_ZMM:                                                             \
@@ -825,7 +854,7 @@ static int readModRM(struct InternalInstruction *insn) {
 //                field is valid for the register class; 0 if not.
 // @return      - The proper value.
 GENERIC_FIXUP_FUNC(fixupRegValue, insn->regBase, MODRM_REG, 0x1f)
-GENERIC_FIXUP_FUNC(fixupRMValue, insn->eaRegBase, EA_REG, 0xf)
+GENERIC_FIXUP_FUNC(fixupRMValue, insn->eaRegBase, EA_REG, 0x1f)
 
 // Consult an operand specifier to determine which of the fixup*Value functions
 // to use in correcting readModRM()'ss interpretation.
@@ -855,8 +884,31 @@ static int fixupReg(struct InternalInstruction *insn,
     if (!valid)
       return -1;
     break;
-  case ENCODING_SIB:
   CASE_ENCODING_RM:
+    if (insn->vectorExtensionType == TYPE_EVEX && insn->mode == MODE_64BIT &&
+        modFromModRM(insn->modRM) == 3) {
+      // EVEX_X can extend the register id to 32 for a non-GPR register that is
+      // encoded in RM.
+      // mode : MODE_64_BIT
+      //  Only 8 vector registers are available in 32 bit mode
+      // mod : 3
+      //  RM encodes a register
+      switch (op->type) {
+      case TYPE_Rv:
+      case TYPE_R8:
+      case TYPE_R16:
+      case TYPE_R32:
+      case TYPE_R64:
+        break;
+      default:
+        insn->eaBase =
+            (EABase)(insn->eaBase +
+                     (xFromEVEX2of4(insn->vectorExtensionPrefix[1]) << 4));
+        break;
+      }
+    }
+    [[fallthrough]];
+  case ENCODING_SIB:
     if (insn->eaBase >= insn->eaRegBase) {
       insn->eaBase = (EABase)fixupRMValue(
           insn, (OperandType)op->type, insn->eaBase - insn->eaRegBase, &valid);
@@ -945,6 +997,10 @@ static bool readOpcode(struct InternalInstruction *insn) {
       insn->opcodeType = XOPA_MAP;
       return consume(insn, insn->opcode);
     }
+  } else if (mFromREX2(insn->rex2ExtensionPrefix[1])) {
+    // m bit indicates opcode map 1
+    insn->opcodeType = TWOBYTE;
+    return consume(insn, insn->opcode);
   }
 
   if (consume(insn, current))
@@ -1390,8 +1446,10 @@ static int readOpcodeRegister(struct InternalInstruction *insn, uint8_t size) {
 
   switch (size) {
   case 1:
-    insn->opcodeRegister = (Reg)(
-        MODRM_REG_AL + ((bFromREX(insn->rexPrefix) << 3) | (insn->opcode & 7)));
+    insn->opcodeRegister =
+        (Reg)(MODRM_REG_AL + ((bFromREX(insn->rexPrefix) << 3) |
+                              (b2FromREX2(insn->rex2ExtensionPrefix[1]) << 4) |
+                              (insn->opcode & 7)));
     if (insn->rexPrefix && insn->opcodeRegister >= MODRM_REG_AL + 0x4 &&
         insn->opcodeRegister < MODRM_REG_AL + 0x8) {
       insn->opcodeRegister =
@@ -1400,18 +1458,22 @@ static int readOpcodeRegister(struct InternalInstruction *insn, uint8_t size) {
 
     break;
   case 2:
-    insn->opcodeRegister = (Reg)(
-        MODRM_REG_AX + ((bFromREX(insn->rexPrefix) << 3) | (insn->opcode & 7)));
+    insn->opcodeRegister =
+        (Reg)(MODRM_REG_AX + ((bFromREX(insn->rexPrefix) << 3) |
+                              (b2FromREX2(insn->rex2ExtensionPrefix[1]) << 4) |
+                              (insn->opcode & 7)));
     break;
   case 4:
     insn->opcodeRegister =
-        (Reg)(MODRM_REG_EAX +
-              ((bFromREX(insn->rexPrefix) << 3) | (insn->opcode & 7)));
+        (Reg)(MODRM_REG_EAX + ((bFromREX(insn->rexPrefix) << 3) |
+                               (b2FromREX2(insn->rex2ExtensionPrefix[1]) << 4) |
+                               (insn->opcode & 7)));
     break;
   case 8:
     insn->opcodeRegister =
-        (Reg)(MODRM_REG_RAX +
-              ((bFromREX(insn->rexPrefix) << 3) | (insn->opcode & 7)));
+        (Reg)(MODRM_REG_RAX + ((bFromREX(insn->rexPrefix) << 3) |
+                               (b2FromREX2(insn->rex2ExtensionPrefix[1]) << 4) |
+                               (insn->opcode & 7)));
     break;
   }
 
diff --git a/llvm/lib/Target/X86/Disassembler/X86DisassemblerDecoder.h b/llvm/lib/Target/X86/Disassembler/X86DisassemblerDecoder.h
index 2d728143d3c9aa4..afbe5c38964fb9d 100644
--- a/llvm/lib/Target/X86/Disassembler/X86DisassemblerDecoder.h
+++ b/llvm/lib/Target/X86/Disassembler/X86DisassemblerDecoder.h
@@ -33,13 +33,24 @@ namespace X86Disassembler {
 #define xFromREX(rex)        (((rex) & 0x2) >> 1)
 #define bFromREX(rex)        ((rex) & 0x1)
 
+#define mFromREX2(rex2)        (((rex2) >> 7) & 0x1)
+#define r2FromREX2(rex2)       (((rex2) >> 6) & 0x1)
+#define x2FromREX2(rex2)       (((rex2) >> 5) & 0x1)
+#define b2FromREX2(rex2)       (((rex2) >> 4) & 0x1)
+#define wFromREX2(rex2)        (((rex2) >> 3) & 0x1)
+#define rFromREX2(rex2)        (((rex2) >> 2) & 0x1)
+#define xFromREX2(rex2)        (((rex2) >> 1) & 0x1)
+#define bFromREX2(rex2)        ((rex2) & 0x1)
+
 #define rFromEVEX2of4(evex)     (((~(evex)) & 0x80) >> 7)
 #define xFromEVEX2of4(evex)     (((~(evex)) & 0x40) >> 6)
 #define bFromEVEX2of4(evex)     (((~(evex)) & 0x20) >> 5)
 #define r2FromEVEX2of4(evex)    (((~(evex)) & 0x10) >> 4)
+#define b2FromEVEX2of4(evex)    (((evex) & 0x8) >> 3)
 #define mmmFromEVEX2of4(evex)   ((evex) & 0x7)
 #define wFromEVEX3of4(evex)     (((evex) & 0x80) >> 7)
 #define vvvvFromEVEX3of4(evex)  (((~(evex)) & 0x78) >> 3)
+#define x2FromEVEX3of4(evex)    (((~(evex)) & 0x4) >> 2)
 #define ppFromEVEX3of4(evex)    ((evex) & 0x3)
 #define zFromEVEX4of4(evex)     (((evex) & 0x80) >> 7)
 #define l2FromEVEX4of4(evex)    (((evex) & 0x40) >> 6)
@@ -89,6 +100,22 @@ namespace X86Disassembler {
   ENTRY(R13B)         \
   ENTRY(R14B)         \
   ENTRY(R15B)         \
+  ENTRY(R16B)         \
+  ENTRY(R17B)         \
+  ENTRY(R18B)         \
+  ENTRY(R19B)         \
+  ENTRY(R20B)         \
+  ENTRY(R21B)         \
+  ENTRY(R22B)         \
+  ENTRY(R23B)         \
+  ENTRY(R24B)         \
+  ENTRY(R25B)         \
+  ENTRY(R26B)         \
+  ENTRY(R27B)         \
+  ENTRY(R28B)         \
+  ENTRY(R29B)         \
+  ENTRY(R30B)         \
+  ENTRY(R31B)         \
   ENTRY(SPL)          \
   ENTRY(BPL)          \
   ENTRY(SIL)          \
@@ -110,7 +137,23 @@ namespace X86Disassembler {
   ENTRY(R12W)           \
   ENTRY(R13W)           \
   ENTRY(R14W)           \
-  ENTRY(R15W)
+  ENTRY(R15W)           \
+  ENTRY(R16W)           \
+  ENTRY(R17W)           \
+  ENTRY(R18W)           \
+  ENTRY(R19W)           \
+  ENTRY(R20W)           \
+  ENTRY(R21W)           \
+  ENTRY(R22W)           \
+  ENTRY(R23W)           \
+  ENTRY(R24W)           \
+  ENTRY(R25W)           \
+  ENTRY(R26W)           \
+  ENTRY(R27W)           \
+  ENTRY(R28W)           \
+  ENTRY(R29W)           \
+  ENTRY(R30W)           \
+  ENTRY(R31W)
 
 #define REGS_16BIT    \
   ENTRY(AX)           \
@@ -128,7 +171,23 @@ namespace X86Disassembler {
   ENTRY(R12W)         \
   ENTRY(R13W)         \
   ENTRY(R14W)         \
-  ENTRY(R15W)
+  ENTRY(R15W)         \
+  ENTRY(R16W)         \
+  ENTRY(R17W)         \
+  ENTRY(R18W)         \
+  ENTRY(R19W)         \
+  ENTRY(R20W)         \
+  ENTRY(R21W)         \
+  ENTRY(R22W)         \
+  ENTRY(R23W)         \
+  ENTRY(R24W)         \
+  ENTRY(R25W)         \
+  ENTRY(R26W)         \
+  ENTRY(R27W)         \
+  ENTRY(R28W)         \
+  ENTRY(R29W)         \
+  ENTRY(R30W)         \
+  ENTRY(R31W)
 
 #define EA_BASES_32BIT  \
   ENTRY(EAX)            \
@@ -146,7 +205,23 @@ namespace X86Disassembler {
   ENTRY(R12D)           \
   ENTRY(R13D)           \
   ENTRY(R14D)           \
-  ENTRY(R15D)
+  ENTRY(R15D)           \
+  ENTRY(R16D)           \
+  ENTRY(R17D)           \
+  ENTRY(R18D)           \
+  ENTRY(R19D)           \
+  ENTRY(R20D)           \
+  ENTRY(R21D)           \
+  ENTRY(R22D)           \
+  ENTRY(R23D)           \
+  ENTRY(R24D)           \
+  ENTRY(R25D)           \
+  ENTRY(R26D)           \
+  ENTRY(R27D)           \
+  ENTRY(R28D)           \
+  ENTRY(R29D)           \
+  ENTRY(R30D)           \
+  ENTRY(R31D)
 
 #define REGS_32BIT  \
   ENTRY(EAX)        \
@@ -164,7 +239,24 @@ namespace X86Disassembler {
   ENTRY(R12D)       \
   ENTRY(R13D)       \
   ENTRY(R14D)       \
-  ENTRY(R15D)
+  ENTRY(R15D)       \
+  ENTRY(R16D)       \
+  ENTRY(R17D)       \
+  ENTRY(R18D)       \
+  ENTRY(R19D)       \
+  ENTRY(R20D)       \
+  ENTRY(R21D)       \
+  ENTRY(R22D)       \
+  ENTRY(R23D)       \
+  ENTRY(R24D)       \
+  ENTRY(R25D)       \
+  ENTRY(R26D)       \
+  ENTRY(R27D)       \
+  ENTRY(R28D)       \
+  ENTRY(R29D)       \
+  ENTRY(R30D)       \
+  ENTRY(R31D)
+
 
 #define EA_BASES_64BIT  \
   ENTRY(RAX)            \
@@ -182,7 +274,23 @@ namespace X86Disassembler {
   ENTRY(R12)            \
   ENTRY(R13)            \
   ENTRY(R14)            \
-  ENTRY(R15)
+  ENTRY(R15)            \
+  ENTRY(R16)            \
+  ENTRY(R17)            \
+  ENTRY(R18)            \
+  ENTRY(R19)            \
+  ENTRY(R20)            \
+  ENTRY(R21)            \
+  ENTRY(R22)            \
+  ENTRY(R23)            \
+  ENTRY(R24)            \
+  ENTRY(R25)            \
+  ENTRY(R26)            \
+  ENTRY(R27)            \
+  ENTRY(R28)            \
+  ENTRY(R29)            \
+  ENTRY(R30)            \
+  ENTRY(R31)
 
 #define REGS_64BIT  \
   ENTRY(RAX)        \
@@ -200,7 +308,23 @@ namespace X86Disassembler {
   ENTRY(R12)        \
   ENTRY(R13)        \
   ENTRY(R14)        \
-  ENTRY(R15)
+  ENTRY(R15)        \
+  ENTRY(R16)        \
+  ENTRY(R17)        \
+  ENTRY(R18)        \
+  ENTRY(R19)        \
+  ENTRY(R20)        \
+  ENTRY(R21)        \
+  ENTRY(R22)        \
+  ENTRY(R23)        \
+  ENTRY(R24)        \
+  ENTRY(R25)        \
+  ENTRY(R26)        \
+  ENTRY(R27)        \
+  ENTRY(R28)        \
+  ENTRY(R29)        \
+  ENTRY(R30)        \
+  ENTRY(R31)
 
 #define REGS_MMX  \
   ENTRY(MM0)      \
@@ -540,6 +664,8 @@ struct InternalInstruction {
   uint8_t vectorExtensionPrefix[4];
   // The type of the vector extension prefix
   VectorExtensionType vectorExtensionType;
+  // The value of the REX2 prefix, if present
+  uint8_t rex2ExtensionPrefix[2];
   // The value of the REX prefix, if present
   uint8_t rexPrefix;
   // The segment override type
diff --git a/llvm/test/MC/Disassembler/X86/apx/evex-format.txt b/llvm/test/MC/Disassembler/X86/apx/evex-format.txt
new file mode 100644
index 000000000000000..4543413c2d4a4f1
--- /dev/null
+++ b/llvm/test/MC/Disassembler/X86/apx/evex-format.txt
@@ -0,0 +1,70 @@
+## NOTE: This file needs to be updated after promoted instruction is supported
+# RUN: llvm-mc -triple x86_64 -disassemble %s | FileCheck %s --check-prefix=ATT
+# RUN: llvm-mc -triple x86_64 -disassemble -output-asm-variant=1 %s | FileCheck %s --check-prefix=INTEL
+
+## MRMDestMem
+
+# ATT:   vextractf32x4	$1, %zmm0, (%r16,%r17)
+# INTEL: vextractf32x4	xmmword ptr [r16 + r17], zmm0, 1
+0x62,0xfb,0x79,0x48,0x19,0x04,0x08,0x01
+
+## MRMSrcMem
+
+# ATT:   vbroadcasti32x4	(%r16,%r17), %zmm0
+# INTEL: vbroadcasti32x4	zmm0, xmmword ptr [r16 + r17]
+0x62,0xfa,0x79,0x48,0x5a,0x04,0x08
+
+## MRM0m
+
+# ATT:   vprorq	$0, (%r16,%r17), %zmm0
+# INTEL: vprorq	zmm0, zmmword ptr [r16 + r17], 0
+0x62,0xf9,0xf9,0x48,0x72,0x04,0x08,0x00
+
+## MRM1m
+
+# ATT:   vprolq	$0, (%r16,%r17), %zmm0
+# INTEL: vprolq	zmm0, zmmword ptr [r16 + r17], 0
+0x62,0xf9,0xf9,0x48,0x72,0x0c,0x08,0x00
+
+## MRM2m
+
+# ATT:   vpsrlq	$0, (%r16,%r17), %zmm0
+# INTEL: vpsrlq	zmm0, zmmword ptr [r16 + r17], 0
+0x62,0xf9,0xf9,0x48,0x73,0x14,0x08,0x00
+
+## MRM3m
+
+# ATT:   vpsrldq	$0, (%r16,%r17), %zmm0
+# INTEL: vpsrldq	zmm0, zmmword ptr [r16 + r17], 0
+0x62,0xf9,0x79,0x48,0x73,0x1c,0x08,0x00
+
+## MRM4m
+
+# ATT:   vpsraq	$0, (%r16,%r17), %zmm0
+# INTEL: vpsraq	zmm0, zmmword ptr [r16 + r17], 0
+0x62,0xf9,0xf9,0x48,0x72,0x24,0x08,0x00
+
+## MRM5m
+
+## xed bug
+# ATT:   vscatterpf0dps	(%r16,%zmm0) {%k1}
+# INTEL: vscatterpf0dps	{k1}, zmmword ptr [r16 + zmm0]
+0x62,0xfa,0x7d,0x49,0xc6,0x2c,0x00
+
+## MRM6m
+
+# ATT:   vpsllq	$0, (%r16,%r17), %zmm0
+# INTEL: vpsllq	zmm0, zmmword ptr [r16 + r17], 0
+0x62,0xf9,0xf9,0x48,0x73,0x34,0x08,0x00
+
+## MRM7m
+
+# ATT:   vpslldq	$0, (%r16,%r17), %zmm0
+# INTEL: vpslldq	zmm0, zmmword ptr [r16 + r17], 0
+0x62,0xf9,0x79,0x48,0x73,0x3c,0x08,0x00
+
+## MRMDestReg
+
+# ATT:   vextractps	$1, %xmm16, %r16d
+# INTEL: vextractps	r16d, xmm16, 1
+0x62,0xeb,0x7d,0x08,0x17,0xc0,0x01
diff --git a/llvm/test/MC/Disassembler/X86/apx/rex2-bit...
[truncated]
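
The memory-operand tests in evex-format.txt can be checked by hand in the same way. The sketch below assumes the first test vector, `0x62,0xfb,0x79,0x48,0x19,0x04,0x08,0x01`, and recomputes the SIB base and index that should print as `(%r16,%r17)`. The extractors mirror the macros in X86DisassemblerDecoder.h; `sibIndex`, `sibBase`, and `selfCheck` are hypothetical helpers written for this illustration.

```cpp
#include <cassert>
#include <cstdint>

// Extractors mirroring the EVEX macros used for SIB extension bits.
constexpr uint8_t xFromEVEX2of4(uint8_t e) { return ((~e) & 0x40) >> 6; }
constexpr uint8_t bFromEVEX2of4(uint8_t e) { return ((~e) & 0x20) >> 5; }
constexpr uint8_t b2FromEVEX2of4(uint8_t e) { return (e & 0x8) >> 3; }
constexpr uint8_t x2FromEVEX3of4(uint8_t e) { return ((~e) & 0x4) >> 2; }

// Hypothetical helper: 5-bit index register number.
// SIB.index supplies bits 0-2, X3 supplies bit 3, X4 supplies bit 4.
constexpr uint8_t sibIndex(uint8_t p0, uint8_t p1, uint8_t sib) {
  return static_cast<uint8_t>(((sib >> 3) & 0x7) | (xFromEVEX2of4(p0) << 3) |
                              (x2FromEVEX3of4(p1) << 4));
}

// Hypothetical helper: 5-bit base register number.
// SIB.base supplies bits 0-2, B3 supplies bit 3, B4 supplies bit 4.
constexpr uint8_t sibBase(uint8_t p0, uint8_t sib) {
  return static_cast<uint8_t>((sib & 0x7) | (bFromEVEX2of4(p0) << 3) |
                              (b2FromEVEX2of4(p0) << 4));
}

// p0 = 0xfb, p1 = 0x79, ModRM = 0x04 (SIB follows), SIB = 0x08.
inline void selfCheck() {
  assert(sibBase(0xfb, 0x08) == 16);        // base  -> r16
  assert(sibIndex(0xfb, 0x79, 0x08) == 17); // index -> r17
}
```

Note the polarity difference visible in the macros: B4 is read as stored, while X4 is read inverted, like the older R, X, and B bits.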

llvmbot (Collaborator) commented Nov 13, 2023

@llvm/pr-subscribers-mc

+                              (insn->opcode & 7)));
     if (insn->rexPrefix && insn->opcodeRegister >= MODRM_REG_AL + 0x4 &&
         insn->opcodeRegister < MODRM_REG_AL + 0x8) {
       insn->opcodeRegister =
@@ -1400,18 +1458,22 @@ static int readOpcodeRegister(struct InternalInstruction *insn, uint8_t size) {
 
     break;
   case 2:
-    insn->opcodeRegister = (Reg)(
-        MODRM_REG_AX + ((bFromREX(insn->rexPrefix) << 3) | (insn->opcode & 7)));
+    insn->opcodeRegister =
+        (Reg)(MODRM_REG_AX + ((bFromREX(insn->rexPrefix) << 3) |
+                              (b2FromREX2(insn->rex2ExtensionPrefix[1]) << 4) |
+                              (insn->opcode & 7)));
     break;
   case 4:
     insn->opcodeRegister =
-        (Reg)(MODRM_REG_EAX +
-              ((bFromREX(insn->rexPrefix) << 3) | (insn->opcode & 7)));
+        (Reg)(MODRM_REG_EAX + ((bFromREX(insn->rexPrefix) << 3) |
+                               (b2FromREX2(insn->rex2ExtensionPrefix[1]) << 4) |
+                               (insn->opcode & 7)));
     break;
   case 8:
     insn->opcodeRegister =
-        (Reg)(MODRM_REG_RAX +
-              ((bFromREX(insn->rexPrefix) << 3) | (insn->opcode & 7)));
+        (Reg)(MODRM_REG_RAX + ((bFromREX(insn->rexPrefix) << 3) |
+                               (b2FromREX2(insn->rex2ExtensionPrefix[1]) << 4) |
+                               (insn->opcode & 7)));
     break;
   }
 
diff --git a/llvm/lib/Target/X86/Disassembler/X86DisassemblerDecoder.h b/llvm/lib/Target/X86/Disassembler/X86DisassemblerDecoder.h
index 2d728143d3c9aa4..afbe5c38964fb9d 100644
--- a/llvm/lib/Target/X86/Disassembler/X86DisassemblerDecoder.h
+++ b/llvm/lib/Target/X86/Disassembler/X86DisassemblerDecoder.h
@@ -33,13 +33,24 @@ namespace X86Disassembler {
 #define xFromREX(rex)        (((rex) & 0x2) >> 1)
 #define bFromREX(rex)        ((rex) & 0x1)
 
+#define mFromREX2(rex2)        (((rex2) >> 7) & 0x1)
+#define r2FromREX2(rex2)       (((rex2) >> 6) & 0x1)
+#define x2FromREX2(rex2)       (((rex2) >> 5) & 0x1)
+#define b2FromREX2(rex2)       (((rex2) >> 4) & 0x1)
+#define wFromREX2(rex2)        (((rex2) >> 3) & 0x1)
+#define rFromREX2(rex2)        (((rex2) >> 2) & 0x1)
+#define xFromREX2(rex2)        (((rex2) >> 1) & 0x1)
+#define bFromREX2(rex2)        ((rex2) & 0x1)
+
 #define rFromEVEX2of4(evex)     (((~(evex)) & 0x80) >> 7)
 #define xFromEVEX2of4(evex)     (((~(evex)) & 0x40) >> 6)
 #define bFromEVEX2of4(evex)     (((~(evex)) & 0x20) >> 5)
 #define r2FromEVEX2of4(evex)    (((~(evex)) & 0x10) >> 4)
+#define b2FromEVEX2of4(evex)    (((evex) & 0x8) >> 3)
 #define mmmFromEVEX2of4(evex)   ((evex) & 0x7)
 #define wFromEVEX3of4(evex)     (((evex) & 0x80) >> 7)
 #define vvvvFromEVEX3of4(evex)  (((~(evex)) & 0x78) >> 3)
+#define x2FromEVEX3of4(evex)    (((~(evex)) & 0x4) >> 2)
 #define ppFromEVEX3of4(evex)    ((evex) & 0x3)
 #define zFromEVEX4of4(evex)     (((evex) & 0x80) >> 7)
 #define l2FromEVEX4of4(evex)    (((evex) & 0x40) >> 6)
@@ -89,6 +100,22 @@ namespace X86Disassembler {
   ENTRY(R13B)         \
   ENTRY(R14B)         \
   ENTRY(R15B)         \
+  ENTRY(R16B)         \
+  ENTRY(R17B)         \
+  ENTRY(R18B)         \
+  ENTRY(R19B)         \
+  ENTRY(R20B)         \
+  ENTRY(R21B)         \
+  ENTRY(R22B)         \
+  ENTRY(R23B)         \
+  ENTRY(R24B)         \
+  ENTRY(R25B)         \
+  ENTRY(R26B)         \
+  ENTRY(R27B)         \
+  ENTRY(R28B)         \
+  ENTRY(R29B)         \
+  ENTRY(R30B)         \
+  ENTRY(R31B)         \
   ENTRY(SPL)          \
   ENTRY(BPL)          \
   ENTRY(SIL)          \
@@ -110,7 +137,23 @@ namespace X86Disassembler {
   ENTRY(R12W)           \
   ENTRY(R13W)           \
   ENTRY(R14W)           \
-  ENTRY(R15W)
+  ENTRY(R15W)           \
+  ENTRY(R16W)           \
+  ENTRY(R17W)           \
+  ENTRY(R18W)           \
+  ENTRY(R19W)           \
+  ENTRY(R20W)           \
+  ENTRY(R21W)           \
+  ENTRY(R22W)           \
+  ENTRY(R23W)           \
+  ENTRY(R24W)           \
+  ENTRY(R25W)           \
+  ENTRY(R26W)           \
+  ENTRY(R27W)           \
+  ENTRY(R28W)           \
+  ENTRY(R29W)           \
+  ENTRY(R30W)           \
+  ENTRY(R31W)
 
 #define REGS_16BIT    \
   ENTRY(AX)           \
@@ -128,7 +171,23 @@ namespace X86Disassembler {
   ENTRY(R12W)         \
   ENTRY(R13W)         \
   ENTRY(R14W)         \
-  ENTRY(R15W)
+  ENTRY(R15W)         \
+  ENTRY(R16W)         \
+  ENTRY(R17W)         \
+  ENTRY(R18W)         \
+  ENTRY(R19W)         \
+  ENTRY(R20W)         \
+  ENTRY(R21W)         \
+  ENTRY(R22W)         \
+  ENTRY(R23W)         \
+  ENTRY(R24W)         \
+  ENTRY(R25W)         \
+  ENTRY(R26W)         \
+  ENTRY(R27W)         \
+  ENTRY(R28W)         \
+  ENTRY(R29W)         \
+  ENTRY(R30W)         \
+  ENTRY(R31W)
 
 #define EA_BASES_32BIT  \
   ENTRY(EAX)            \
@@ -146,7 +205,23 @@ namespace X86Disassembler {
   ENTRY(R12D)           \
   ENTRY(R13D)           \
   ENTRY(R14D)           \
-  ENTRY(R15D)
+  ENTRY(R15D)           \
+  ENTRY(R16D)           \
+  ENTRY(R17D)           \
+  ENTRY(R18D)           \
+  ENTRY(R19D)           \
+  ENTRY(R20D)           \
+  ENTRY(R21D)           \
+  ENTRY(R22D)           \
+  ENTRY(R23D)           \
+  ENTRY(R24D)           \
+  ENTRY(R25D)           \
+  ENTRY(R26D)           \
+  ENTRY(R27D)           \
+  ENTRY(R28D)           \
+  ENTRY(R29D)           \
+  ENTRY(R30D)           \
+  ENTRY(R31D)
 
 #define REGS_32BIT  \
   ENTRY(EAX)        \
@@ -164,7 +239,24 @@ namespace X86Disassembler {
   ENTRY(R12D)       \
   ENTRY(R13D)       \
   ENTRY(R14D)       \
-  ENTRY(R15D)
+  ENTRY(R15D)       \
+  ENTRY(R16D)       \
+  ENTRY(R17D)       \
+  ENTRY(R18D)       \
+  ENTRY(R19D)       \
+  ENTRY(R20D)       \
+  ENTRY(R21D)       \
+  ENTRY(R22D)       \
+  ENTRY(R23D)       \
+  ENTRY(R24D)       \
+  ENTRY(R25D)       \
+  ENTRY(R26D)       \
+  ENTRY(R27D)       \
+  ENTRY(R28D)       \
+  ENTRY(R29D)       \
+  ENTRY(R30D)       \
+  ENTRY(R31D)
+
 
 #define EA_BASES_64BIT  \
   ENTRY(RAX)            \
@@ -182,7 +274,23 @@ namespace X86Disassembler {
   ENTRY(R12)            \
   ENTRY(R13)            \
   ENTRY(R14)            \
-  ENTRY(R15)
+  ENTRY(R15)            \
+  ENTRY(R16)            \
+  ENTRY(R17)            \
+  ENTRY(R18)            \
+  ENTRY(R19)            \
+  ENTRY(R20)            \
+  ENTRY(R21)            \
+  ENTRY(R22)            \
+  ENTRY(R23)            \
+  ENTRY(R24)            \
+  ENTRY(R25)            \
+  ENTRY(R26)            \
+  ENTRY(R27)            \
+  ENTRY(R28)            \
+  ENTRY(R29)            \
+  ENTRY(R30)            \
+  ENTRY(R31)
 
 #define REGS_64BIT  \
   ENTRY(RAX)        \
@@ -200,7 +308,23 @@ namespace X86Disassembler {
   ENTRY(R12)        \
   ENTRY(R13)        \
   ENTRY(R14)        \
-  ENTRY(R15)
+  ENTRY(R15)        \
+  ENTRY(R16)        \
+  ENTRY(R17)        \
+  ENTRY(R18)        \
+  ENTRY(R19)        \
+  ENTRY(R20)        \
+  ENTRY(R21)        \
+  ENTRY(R22)        \
+  ENTRY(R23)        \
+  ENTRY(R24)        \
+  ENTRY(R25)        \
+  ENTRY(R26)        \
+  ENTRY(R27)        \
+  ENTRY(R28)        \
+  ENTRY(R29)        \
+  ENTRY(R30)        \
+  ENTRY(R31)
 
 #define REGS_MMX  \
   ENTRY(MM0)      \
@@ -540,6 +664,8 @@ struct InternalInstruction {
   uint8_t vectorExtensionPrefix[4];
   // The type of the vector extension prefix
   VectorExtensionType vectorExtensionType;
+  // The value of the REX2 prefix, if present
+  uint8_t rex2ExtensionPrefix[2];
   // The value of the REX prefix, if present
   uint8_t rexPrefix;
   // The segment override type
diff --git a/llvm/test/MC/Disassembler/X86/apx/evex-format.txt b/llvm/test/MC/Disassembler/X86/apx/evex-format.txt
new file mode 100644
index 000000000000000..4543413c2d4a4f1
--- /dev/null
+++ b/llvm/test/MC/Disassembler/X86/apx/evex-format.txt
@@ -0,0 +1,70 @@
+## NOTE: This file needs to be updated after promoted instruction is supported
+# RUN: llvm-mc -triple x86_64 -disassemble %s | FileCheck %s --check-prefix=ATT
+# RUN: llvm-mc -triple x86_64 -disassemble -output-asm-variant=1 %s | FileCheck %s --check-prefix=INTEL
+
+## MRMDestMem
+
+# ATT:   vextractf32x4	$1, %zmm0, (%r16,%r17)
+# INTEL: vextractf32x4	xmmword ptr [r16 + r17], zmm0, 1
+0x62,0xfb,0x79,0x48,0x19,0x04,0x08,0x01
+
+## MRMSrcMem
+
+# ATT:   vbroadcasti32x4	(%r16,%r17), %zmm0
+# INTEL: vbroadcasti32x4	zmm0, xmmword ptr [r16 + r17]
+0x62,0xfa,0x79,0x48,0x5a,0x04,0x08
+
+## MRM0m
+
+# ATT:   vprorq	$0, (%r16,%r17), %zmm0
+# INTEL: vprorq	zmm0, zmmword ptr [r16 + r17], 0
+0x62,0xf9,0xf9,0x48,0x72,0x04,0x08,0x00
+
+## MRM1m
+
+# ATT:   vprolq	$0, (%r16,%r17), %zmm0
+# INTEL: vprolq	zmm0, zmmword ptr [r16 + r17], 0
+0x62,0xf9,0xf9,0x48,0x72,0x0c,0x08,0x00
+
+## MRM2m
+
+# ATT:   vpsrlq	$0, (%r16,%r17), %zmm0
+# INTEL: vpsrlq	zmm0, zmmword ptr [r16 + r17], 0
+0x62,0xf9,0xf9,0x48,0x73,0x14,0x08,0x00
+
+## MRM3m
+
+# ATT:   vpsrldq	$0, (%r16,%r17), %zmm0
+# INTEL: vpsrldq	zmm0, zmmword ptr [r16 + r17], 0
+0x62,0xf9,0x79,0x48,0x73,0x1c,0x08,0x00
+
+## MRM4m
+
+# ATT:   vpsraq	$0, (%r16,%r17), %zmm0
+# INTEL: vpsraq	zmm0, zmmword ptr [r16 + r17], 0
+0x62,0xf9,0xf9,0x48,0x72,0x24,0x08,0x00
+
+## MRM5m
+
+## xed bug
+# ATT:   vscatterpf0dps	(%r16,%zmm0) {%k1}
+# INTEL: vscatterpf0dps	{k1}, zmmword ptr [r16 + zmm0]
+0x62,0xfa,0x7d,0x49,0xc6,0x2c,0x00
+
+## MRM6m
+
+# ATT:   vpsllq	$0, (%r16,%r17), %zmm0
+# INTEL: vpsllq	zmm0, zmmword ptr [r16 + r17], 0
+0x62,0xf9,0xf9,0x48,0x73,0x34,0x08,0x00
+
+## MRM7m
+
+# ATT:   vpslldq	$0, (%r16,%r17), %zmm0
+# INTEL: vpslldq	zmm0, zmmword ptr [r16 + r17], 0
+0x62,0xf9,0x79,0x48,0x73,0x3c,0x08,0x00
+
+## MRMDestReg
+
+# ATT:   vextractps	$1, %xmm16, %r16d
+# INTEL: vextractps	r16d, xmm16, 1
+0x62,0xeb,0x7d,0x08,0x17,0xc0,0x01
diff --git a/llvm/test/MC/Disassembler/X86/apx/rex2-bit...
[truncated]
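
As a cross-check of the new EVEX tests, the register numbers in two of the vectors from evex-format.txt can be recomputed by hand. The sketch below is not the disassembler's code: the extractor functions copy the bit positions of the `*FromEVEX2of4` / `x2FromEVEX3of4` macros in the diff, while the helper names (`modrmRegNum`, `modrmRmGprNum`, `sibBaseNum`, `sibIndexNum`) and the way the bits are combined into a 5-bit GPR number are assumptions made for illustration. They are consistent with the expected output in the test file.

```cpp
#include <cstdint>

// Same bit positions as the macros in X86DisassemblerDecoder.h.
// R, X, B, R2 and X2 are stored inverted; B2 is stored as-is.
constexpr unsigned rFromEVEX2of4(uint8_t e)  { return (~unsigned(e) & 0x80u) >> 7; }
constexpr unsigned xFromEVEX2of4(uint8_t e)  { return (~unsigned(e) & 0x40u) >> 6; }
constexpr unsigned bFromEVEX2of4(uint8_t e)  { return (~unsigned(e) & 0x20u) >> 5; }
constexpr unsigned r2FromEVEX2of4(uint8_t e) { return (~unsigned(e) & 0x10u) >> 4; }
constexpr unsigned b2FromEVEX2of4(uint8_t e) { return (unsigned(e) & 0x08u) >> 3; }
constexpr unsigned x2FromEVEX3of4(uint8_t e) { return (~unsigned(e) & 0x04u) >> 2; }

// Hypothetical helpers: p0/p1 are the 2nd and 3rd bytes of the EVEX prefix.
constexpr unsigned modrmRegNum(uint8_t p0, uint8_t modrm) {
  return (r2FromEVEX2of4(p0) << 4) | (rFromEVEX2of4(p0) << 3) | ((modrm >> 3) & 7u);
}
constexpr unsigned modrmRmGprNum(uint8_t p0, uint8_t modrm) {
  return (b2FromEVEX2of4(p0) << 4) | (bFromEVEX2of4(p0) << 3) | (modrm & 7u);
}
constexpr unsigned sibBaseNum(uint8_t p0, uint8_t sib) {
  return (b2FromEVEX2of4(p0) << 4) | (bFromEVEX2of4(p0) << 3) | (sib & 7u);
}
// Only meaningful for a GPR index; vector indices (VSIB) are extended differently.
constexpr unsigned sibIndexNum(uint8_t p0, uint8_t p1, uint8_t sib) {
  return (x2FromEVEX3of4(p1) << 4) | (xFromEVEX2of4(p0) << 3) | ((sib >> 3) & 7u);
}
```

For `0x62,0xeb,0x7d,0x08,0x17,0xc0,0x01` (expected `vextractps $1, %xmm16, %r16d`), `modrmRegNum(0xeb, 0xc0)` and `modrmRmGprNum(0xeb, 0xc0)` both give 16. For `0x62,0xfb,0x79,0x48,0x19,0x04,0x08,0x01` (expected `(%r16,%r17)`), `sibBaseNum(0xfb, 0x08)` gives 16 and `sibIndexNum(0xfb, 0x79, 0x08)` gives 17.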

github-actions bot commented Nov 13, 2023

✅ With the latest revision this PR passed the C/C++ code formatter.
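
For readers following the REX2 part of the patch, here is a minimal sketch of how the payload byte splits into fields and how an opcode-embedded register number is formed. The extractors copy the bit positions of the `*FromREX2` macros added to X86DisassemblerDecoder.h. The helper `opcodeRegNum` is hypothetical. Unlike `readOpcodeRegister` in the diff, it reads the low B bit straight from the REX2 payload rather than from a simulated REX prefix, which is assumed to be equivalent because the low nibble of the payload has the same W/R/X/B layout as REX.

```cpp
#include <cstdint>

// Field extractors for the REX2 payload (the byte after the prefix byte),
// using the same bit positions as the macros in the patch.
constexpr unsigned mFromREX2(uint8_t p)  { return (p >> 7) & 1u; } // opcode map 1
constexpr unsigned r2FromREX2(uint8_t p) { return (p >> 6) & 1u; }
constexpr unsigned x2FromREX2(uint8_t p) { return (p >> 5) & 1u; }
constexpr unsigned b2FromREX2(uint8_t p) { return (p >> 4) & 1u; }
constexpr unsigned wFromREX2(uint8_t p)  { return (p >> 3) & 1u; }
constexpr unsigned rFromREX2(uint8_t p)  { return (p >> 2) & 1u; }
constexpr unsigned xFromREX2(uint8_t p)  { return (p >> 1) & 1u; }
constexpr unsigned bFromREX2(uint8_t p)  { return p & 1u; }

// Hypothetical helper: 5-bit register number for a register encoded in the
// low three bits of the opcode, extended by B (bit 3) and B2 (bit 4).
constexpr unsigned opcodeRegNum(uint8_t rex2Payload, uint8_t opcode) {
  return (b2FromREX2(rex2Payload) << 4) | (bFromREX2(rex2Payload) << 3) |
         (opcode & 7u);
}
```

With payload `0x10` and an opcode whose low three bits are 0, `opcodeRegNum` yields 16, which is the offset of R16 in the 64-bit register list. With payload `0x00` the result stays within 0 to 7, matching legacy behaviour.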

LLVM_DEBUG(dbgs() << "Couldn't read second byte of REX2");
return -1;
}
insn->rex2ExtensionPrefix[0] = byte;
Contributor

Why do we need this since it's a constant?

Contributor Author

Because we need to check later whether a REX2 prefix is present. Without insn->rex2ExtensionPrefix[0] we would need a separate flag such as insn->isrex2; the two approaches are equivalent.

Contributor

Can we compare the payload with 0 directly? I don't think there is a REX2 prefix whose payload is all zeros.

Contributor Author

A REX2 prefix with payload 0 does occur, in two cases:

  1. jmpabs
  2. pseudo {rex2}
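
The point of this exchange can be shown with a small sketch. It assumes, as the snippet under review suggests, that `rex2ExtensionPrefix[0]` holds the prefix byte itself (0xD5) and `rex2ExtensionPrefix[1]` holds the payload. The struct and function names below are hypothetical stand-ins, not the decoder's actual types.

```cpp
#include <cstdint>

// Hypothetical stand-in for the relevant field of InternalInstruction.
struct Rex2State {
  uint8_t rex2ExtensionPrefix[2]; // [0] = prefix byte, [1] = payload
};

constexpr uint8_t kRex2PrefixByte = 0xD5;

// State for an instruction without REX2 (zero-initialised fields).
constexpr Rex2State noRex2() { return Rex2State{{0, 0}}; }

// State after a REX2 prefix with the given payload has been read.
constexpr Rex2State withRex2(uint8_t payload) {
  return Rex2State{{kRex2PrefixByte, payload}};
}

// Presence test based on the stored prefix byte: works for any payload.
constexpr bool hasRex2(const Rex2State &s) {
  return s.rex2ExtensionPrefix[0] == kRex2PrefixByte;
}

// Presence test based on the payload alone: misses payload 0, which is
// what jmpabs and the {rex2} pseudo prefix produce.
constexpr bool hasRex2ByPayload(const Rex2State &s) {
  return s.rex2ExtensionPrefix[1] != 0;
}
```

For `withRex2(0x00)`, `hasRex2` returns true while `hasRex2ByPayload` returns false, which is why the payload cannot double as the presence flag.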

Contributor

@phoebewang phoebewang left a comment

LGTM except one comment not addressed

@KanRobert KanRobert merged commit 51c351f into llvm:main Nov 15, 2023
2 of 3 checks passed
zahiraam pushed a commit to zahiraam/llvm-project that referenced this pull request Nov 20, 2023
Guzhu-AMD pushed a commit to GPUOpen-Drivers/llvm-project that referenced this pull request Nov 23, 2023
Local branch amd-gfx bedf99a Merged main:3f743fd3a319 into amd-gfx:fa9cd7924a4e
Remote branch main 51c351f [X86][MC] Support decoding of EGPR for APX (llvm#72102)