[BPF] Add load-acquire and store-release instructions under -mcpu=v5

As discussed in [1], introduce BPF instructions with load-acquire and
store-release semantics under -mcpu=v5.

A "load_acquire" is a BPF_LDX instruction with a new mode modifier,
BPF_MEMACQ ("acquiring atomic load").  Similarly, a "store_release" is a
BPF_STX instruction with another new mode modifier, BPF_MEMREL
("releasing atomic store").

BPF_MEMACQ and BPF_MEMREL share the same numeric value, 0x7 (or 0b111);
the instruction class (BPF_LDX or BPF_STX) tells them apart.  For example:

  long foo(long *ptr) {
      return __atomic_load_n(ptr, __ATOMIC_ACQUIRE);
  }

foo() can be compiled to:

  f9 10 00 00 00 00 00 00 r0 = load_acquire((u64 *)(r1 + 0x0))
  95 00 00 00 00 00 00 00 exit

Opcode 0xf9, or 0b11111001, can be decoded as:

  0b 111        11     001
     BPF_MEMACQ BPF_DW BPF_LDX

Similarly:

  void bar(short *ptr, short val) {
      __atomic_store_n(ptr, val, __ATOMIC_RELEASE);
  }

bar() can be compiled to:

  eb 21 00 00 00 00 00 00 store_release((u16 *)(r1 + 0x0), w2)
  95 00 00 00 00 00 00 00 exit

Opcode 0xeb, or 0b11101011, can be decoded as:

  0b 111        01    011
     BPF_MEMREL BPF_H BPF_STX
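
As a rough illustration of the two decodings above, the following C sketch splits an opcode byte into its mode, size, and class fields. The helper names (`bpf_opcode_fields`, `decode_opcode`, `is_load_acquire`, `is_store_release`, `check_commit_examples`) are hypothetical and not part of this patch; the numeric values come from the examples above and the existing BPF opcode layout.

```c
#include <assert.h>
#include <stdint.h>

/* Opcode byte layout for BPF loads/stores: mode (3 bits) | size (2 bits) | class (3 bits). */
enum { BPF_LDX = 0x1, BPF_STX = 0x3 };                        /* instruction classes */
enum { BPF_W = 0x0, BPF_H = 0x1, BPF_B = 0x2, BPF_DW = 0x3 }; /* access sizes */
enum { BPF_MEMACQ = 0x7, BPF_MEMREL = 0x7 };                  /* new modes, same value */

/* Hypothetical helper type for illustration only. */
struct bpf_opcode_fields {
    uint8_t mode;
    uint8_t size;
    uint8_t insn_class;
};

static inline struct bpf_opcode_fields decode_opcode(uint8_t opcode)
{
    struct bpf_opcode_fields f;
    f.mode = (uint8_t)((opcode >> 5) & 0x7);
    f.size = (uint8_t)((opcode >> 3) & 0x3);
    f.insn_class = (uint8_t)(opcode & 0x7);
    return f;
}

/* The shared mode value is disambiguated by the instruction class. */
static inline int is_load_acquire(uint8_t opcode)
{
    struct bpf_opcode_fields f = decode_opcode(opcode);
    return f.insn_class == BPF_LDX && f.mode == BPF_MEMACQ;
}

static inline int is_store_release(uint8_t opcode)
{
    struct bpf_opcode_fields f = decode_opcode(opcode);
    return f.insn_class == BPF_STX && f.mode == BPF_MEMREL;
}

/* Checks the two opcodes shown in the commit message. */
static inline void check_commit_examples(void)
{
    assert(is_load_acquire(0xf9) && decode_opcode(0xf9).size == BPF_DW);
    assert(is_store_release(0xeb) && decode_opcode(0xeb).size == BPF_H);
}
```

Because BPF_MEMACQ and BPF_MEMREL are both 0x7, the sketch has to look at the class field before deciding which of the two an opcode encodes.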

Inline assembly is also supported.  For example:

  asm volatile("%0 = load_acquire((u64 *)(%1 + 0x0))" :
               "=r"(ret) : "r"(ptr) : "memory");

Make 'llvm-objdump -d' use -mcpu=v5 by default, as commit
0395868 ("[BPF] Make llvm-objdump disasm default cpu v4
(#102166)") did for v4.

Add two macros, __BPF_FEATURE_LOAD_ACQUIRE and
__BPF_FEATURE_STORE_RELEASE, to let developers detect these new features
in source code.  The new instructions can also be disabled using two new
llc options, -disable-load-acquire and -disable-store-release,
respectively.
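
A minimal sketch of how source code might test for one of the new macros. `HAVE_LOAD_ACQUIRE`, `read_acquire64`, and `check_read_acquire64` are hypothetical names used only for illustration; the non-BPF branch of the `#if` is an assumption added so the snippet also builds with a host compiler.

```c
#include <assert.h>
#include <stdint.h>

/*
 * HAVE_LOAD_ACQUIRE is a hypothetical helper macro, not defined by the patch.
 * It is 1 when the BPF target advertises __BPF_FEATURE_LOAD_ACQUIRE, or when
 * building for a non-BPF host (where __bpf__ is not defined).
 */
#if defined(__BPF_FEATURE_LOAD_ACQUIRE) || !defined(__bpf__)
#define HAVE_LOAD_ACQUIRE 1
#else
#define HAVE_LOAD_ACQUIRE 0
#endif

/*
 * Stores the loaded value in *out and returns 1 if an acquiring load is
 * available; otherwise returns 0 and leaves *out untouched, so the caller
 * can pick some other synchronization scheme.
 */
static inline int read_acquire64(int64_t *ptr, int64_t *out)
{
#if HAVE_LOAD_ACQUIRE
    *out = __atomic_load_n(ptr, __ATOMIC_ACQUIRE);
    return 1;
#else
    (void)ptr;
    (void)out;
    return 0;
#endif
}

/* Usage example: the return value mirrors the compile-time feature test. */
static inline void check_read_acquire64(void)
{
    int64_t value = 42, out = 0;
    int ok = read_acquire64(&value, &out);
    assert(ok == HAVE_LOAD_ACQUIRE);
    if (ok)
        assert(out == 42);
}
```

The same pattern would apply to __BPF_FEATURE_STORE_RELEASE with `__atomic_store_n` and `__ATOMIC_RELEASE`.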

Also use ACQUIRE (for loads) or RELEASE (for stores) if the user requests
a weaker memory order (RELAXED or CONSUME), until those orders are
actually supported.  Requesting a stronger memory order (i.e. SEQ_CST)
causes an error.
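
The memory-order handling described above can be modeled roughly as follows. This is only a sketch of the rule as stated in this message, not the actual instruction-selection code; the enum and function names are hypothetical, and orders that are not mentioned here (for example a releasing load) are simply treated as errors in the model.

```c
#include <assert.h>

/* Hypothetical model types, for illustration only. */
enum mem_order { ORDER_RELAXED, ORDER_CONSUME, ORDER_ACQUIRE, ORDER_RELEASE, ORDER_SEQ_CST };
enum lowering { LOWER_LOAD_ACQUIRE, LOWER_STORE_RELEASE, LOWER_ERROR };

/* Loads: RELAXED and CONSUME are strengthened to ACQUIRE; SEQ_CST is rejected. */
static inline enum lowering lower_atomic_load(enum mem_order order)
{
    switch (order) {
    case ORDER_RELAXED:
    case ORDER_CONSUME:
    case ORDER_ACQUIRE:
        return LOWER_LOAD_ACQUIRE;
    default:
        return LOWER_ERROR;
    }
}

/* Stores: RELAXED is strengthened to RELEASE; SEQ_CST is rejected. */
static inline enum lowering lower_atomic_store(enum mem_order order)
{
    switch (order) {
    case ORDER_RELAXED:
    case ORDER_RELEASE:
        return LOWER_STORE_RELEASE;
    default:
        return LOWER_ERROR;
    }
}

/* Usage example covering the cases named in the commit message. */
static inline void check_lowering_model(void)
{
    assert(lower_atomic_load(ORDER_RELAXED) == LOWER_LOAD_ACQUIRE);
    assert(lower_atomic_store(ORDER_RELAXED) == LOWER_STORE_RELEASE);
    assert(lower_atomic_load(ORDER_SEQ_CST) == LOWER_ERROR);
    assert(lower_atomic_store(ORDER_SEQ_CST) == LOWER_ERROR);
}
```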

[1] https://lore.kernel.org/all/[email protected]/
peilin-ye committed Sep 20, 2024
1 parent 5b64851 commit 692d130
Showing 14 changed files with 257 additions and 10 deletions.
9 changes: 7 additions & 2 deletions clang/lib/Basic/Targets/BPF.cpp
@@ -67,10 +67,15 @@ void BPFTargetInfo::getTargetDefines(const LangOptions &Opts,
Builder.defineMacro("__BPF_FEATURE_GOTOL");
Builder.defineMacro("__BPF_FEATURE_ST");
}

if (CpuVerNum >= 5) {
Builder.defineMacro("__BPF_FEATURE_LOAD_ACQUIRE");
Builder.defineMacro("__BPF_FEATURE_STORE_RELEASE");
}
}

-static constexpr llvm::StringLiteral ValidCPUNames[] = {"generic", "v1", "v2",
-                                                        "v3", "v4", "probe"};
+static constexpr llvm::StringLiteral ValidCPUNames[] = {
+    "generic", "v1", "v2", "v3", "v4", "v5", "probe"};

bool BPFTargetInfo::isValidCPUName(StringRef Name) const {
return llvm::is_contained(ValidCPUNames, Name);
2 changes: 1 addition & 1 deletion clang/lib/Basic/Targets/BPF.h
@@ -106,7 +106,7 @@ class LLVM_LIBRARY_VISIBILITY BPFTargetInfo : public TargetInfo {
void fillValidCPUList(SmallVectorImpl<StringRef> &Values) const override;

bool setCPU(const std::string &Name) override {
-    if (Name == "v3" || Name == "v4") {
+    if (Name == "v3" || Name == "v4" || Name == "v5") {
HasAlu32 = true;
}

1 change: 1 addition & 0 deletions clang/test/Misc/target-invalid-cpu-note/bpf.c
@@ -10,6 +10,7 @@
// CHECK-SAME: {{^}}, v2
// CHECK-SAME: {{^}}, v3
// CHECK-SAME: {{^}}, v4
// CHECK-SAME: {{^}}, v5
// CHECK-SAME: {{^}}, probe
// CHECK-SAME: {{$}}

2 changes: 1 addition & 1 deletion llvm/lib/Object/ELFObjectFile.cpp
@@ -442,7 +442,7 @@ std::optional<StringRef> ELFObjectFileBase::tryGetCPUName() const {
case ELF::EM_PPC64:
return StringRef("future");
case ELF::EM_BPF:
-    return StringRef("v4");
+    return StringRef("v5");
default:
return std::nullopt;
}
2 changes: 2 additions & 0 deletions llvm/lib/Target/BPF/AsmParser/BPFAsmParser.cpp
@@ -237,6 +237,7 @@ struct BPFOperand : public MCParsedAsmOperand {
.Case("exit", true)
.Case("lock", true)
.Case("ld_pseudo", true)
.Case("store_release", true)
.Default(false);
}

@@ -273,6 +274,7 @@ struct BPFOperand : public MCParsedAsmOperand {
.Case("cmpxchg_64", true)
.Case("cmpxchg32_32", true)
.Case("addr_space_cast", true)
.Case("load_acquire", true)
.Default(false);
}
};
1 change: 1 addition & 0 deletions llvm/lib/Target/BPF/BPF.td
@@ -32,6 +32,7 @@ def : Proc<"v1", []>;
def : Proc<"v2", []>;
def : Proc<"v3", [ALU32]>;
def : Proc<"v4", [ALU32]>;
def : Proc<"v5", [ALU32]>;
def : Proc<"probe", []>;

def BPFInstPrinter : AsmWriter {
2 changes: 2 additions & 0 deletions llvm/lib/Target/BPF/BPFInstrFormats.td
@@ -94,6 +94,8 @@ def BPF_IND : BPFModeModifer<0x2>;
def BPF_MEM : BPFModeModifer<0x3>;
def BPF_MEMSX : BPFModeModifer<0x4>;
def BPF_ATOMIC : BPFModeModifer<0x6>;
def BPF_MEMACQ : BPFModeModifer<0x7>;
def BPF_MEMREL : BPFModeModifer<0x7>;

class BPFAtomicFlag<bits<4> val> {
bits<4> Value = val;
94 changes: 94 additions & 0 deletions llvm/lib/Target/BPF/BPFInstrInfo.td
@@ -60,6 +60,8 @@ def BPFHasSdivSmod : Predicate<"Subtarget->hasSdivSmod()">;
def BPFNoMovsx : Predicate<"!Subtarget->hasMovsx()">;
def BPFNoBswap : Predicate<"!Subtarget->hasBswap()">;
def BPFHasStoreImm : Predicate<"Subtarget->hasStoreImm()">;
def BPFHasLoadAcquire : Predicate<"Subtarget->hasLoadAcquire()">;
def BPFHasStoreRelease : Predicate<"Subtarget->hasStoreRelease()">;

class ImmediateAsmOperand<string name> : AsmOperandClass {
let Name = name;
@@ -514,13 +516,38 @@ class STORE<BPFWidthModifer SizeOp, BPFModeModifer ModOp, string AsmString, list
class STOREi64<BPFWidthModifer Opc, string OpcodeStr, PatFrag OpNode>
: STORE<Opc, BPF_MEM, "*("#OpcodeStr#" *)($addr) = $src", [(OpNode GPR:$src, ADDRri:$addr)]>;

class STORE_RELEASEi64<BPFWidthModifer Opc, string OpcodeStr>
: STORE<Opc, BPF_MEMREL, "store_release(("#OpcodeStr#" *)($addr), $src)", []>;

let Predicates = [BPFNoALU32] in {
def STW : STOREi64<BPF_W, "u32", truncstorei32>;
def STH : STOREi64<BPF_H, "u16", truncstorei16>;
def STB : STOREi64<BPF_B, "u8", truncstorei8>;
}
def STD : STOREi64<BPF_DW, "u64", store>;

class relaxed_store<PatFrag base>
: PatFrag<(ops node:$val, node:$ptr), (base node:$val, node:$ptr)> {
let IsAtomic = 1;
let IsAtomicOrderingReleaseOrStronger = 0;
}

class releasing_store<PatFrag base>
: PatFrag<(ops node:$val, node:$ptr), (base node:$val, node:$ptr)> {
let IsAtomic = 1;
let IsAtomicOrderingRelease = 1;
}

let Predicates = [BPFHasStoreRelease] in {
def STDREL : STORE_RELEASEi64<BPF_DW, "u64">;

foreach P = [[relaxed_store<atomic_store_64>, STDREL],
[releasing_store<atomic_store_64>, STDREL],
] in {
def : Pat<(P[0] GPR:$val, ADDRri:$addr), (P[1] GPR:$val, ADDRri:$addr)>;
}
}

class STORE_imm<BPFWidthModifer SizeOp,
string OpcodeStr, dag Pattern>
: TYPE_LD_ST<BPF_MEM.Value, SizeOp.Value,
@@ -584,6 +611,9 @@ class LOADi64<BPFWidthModifer SizeOp, BPFModeModifer ModOp, string OpcodeStr, Pa
: LOAD<SizeOp, ModOp, "$dst = *("#OpcodeStr#" *)($addr)",
[(set i64:$dst, (OpNode ADDRri:$addr))]>;

class LOAD_ACQUIREi64<BPFWidthModifer SizeOp, string OpcodeStr>
: LOAD<SizeOp, BPF_MEMACQ, "$dst = load_acquire(("#OpcodeStr#" *)($addr))", []>;

let isCodeGenOnly = 1 in {
class CORE_LD<RegisterClass RegClass, string Sz>
: TYPE_LD_ST<BPF_MEM.Value, BPF_W.Value,
@@ -621,6 +651,28 @@ let Predicates = [BPFHasLdsx] in {

def LDD : LOADi64<BPF_DW, BPF_MEM, "u64", load>;

class relaxed_load<PatFrags base>
: PatFrag<(ops node:$ptr), (base node:$ptr)> {
let IsAtomic = 1;
let IsAtomicOrderingAcquireOrStronger = 0;
}

class acquiring_load<PatFrags base>
: PatFrag<(ops node:$ptr), (base node:$ptr)> {
let IsAtomic = 1;
let IsAtomicOrderingAcquire = 1;
}

let Predicates = [BPFHasLoadAcquire] in {
def LDDACQ : LOAD_ACQUIREi64<BPF_DW, "u64">;

foreach P = [[relaxed_load<atomic_load_64>, LDDACQ],
[acquiring_load<atomic_load_64>, LDDACQ],
] in {
def : Pat<(P[0] ADDRri:$addr), (P[1] ADDRri:$addr)>;
}
}

class BRANCH<BPFJumpOp Opc, string OpcodeStr, list<dag> Pattern>
: TYPE_ALU_JMP<Opc.Value, BPF_K.Value,
(outs),
@@ -1086,10 +1138,19 @@ class STOREi32<BPFWidthModifer Opc, string OpcodeStr, PatFrag OpNode>
: STORE32<Opc, BPF_MEM, "*("#OpcodeStr#" *)($addr) = $src",
[(OpNode GPR32:$src, ADDRri:$addr)]>;

class STORE_RELEASEi32<BPFWidthModifer Opc, string OpcodeStr>
: STORE32<Opc, BPF_MEMREL, "store_release(("#OpcodeStr#" *)($addr), $src)", []>;

let Predicates = [BPFHasALU32], DecoderNamespace = "BPFALU32" in {
def STW32 : STOREi32<BPF_W, "u32", store>;
def STH32 : STOREi32<BPF_H, "u16", truncstorei16>;
def STB32 : STOREi32<BPF_B, "u8", truncstorei8>;

let Predicates = [BPFHasStoreRelease] in {
def STWREL32 : STORE_RELEASEi32<BPF_W, "u32">;
def STHREL32 : STORE_RELEASEi32<BPF_H, "u16">;
def STBREL32 : STORE_RELEASEi32<BPF_B, "u8">;
}
}

class LOAD32<BPFWidthModifer SizeOp, BPFModeModifer ModOp, string AsmString, list<dag> Pattern>
@@ -1110,10 +1171,19 @@ class LOADi32<BPFWidthModifer SizeOp, BPFModeModifer ModOp, string OpcodeStr, Pa
: LOAD32<SizeOp, ModOp, "$dst = *("#OpcodeStr#" *)($addr)",
[(set i32:$dst, (OpNode ADDRri:$addr))]>;

class LOAD_ACQUIREi32<BPFWidthModifer SizeOp, string OpcodeStr>
: LOAD32<SizeOp, BPF_MEMACQ, "$dst = load_acquire(("#OpcodeStr#" *)($addr))", []>;

let Predicates = [BPFHasALU32], DecoderNamespace = "BPFALU32" in {
def LDW32 : LOADi32<BPF_W, BPF_MEM, "u32", load>;
def LDH32 : LOADi32<BPF_H, BPF_MEM, "u16", zextloadi16>;
def LDB32 : LOADi32<BPF_B, BPF_MEM, "u8", zextloadi8>;

let Predicates = [BPFHasLoadAcquire] in {
def LDWACQ32 : LOAD_ACQUIREi32<BPF_W, "u32">;
def LDHACQ32 : LOAD_ACQUIREi32<BPF_H, "u16">;
def LDBACQ32 : LOAD_ACQUIREi32<BPF_B, "u8">;
}
}

let Predicates = [BPFHasALU32] in {
@@ -1143,6 +1213,30 @@ let Predicates = [BPFHasALU32] in {
(SUBREG_TO_REG (i64 0), (LDH32 ADDRri:$src), sub_32)>;
def : Pat<(i64 (extloadi32 ADDRri:$src)),
(SUBREG_TO_REG (i64 0), (LDW32 ADDRri:$src), sub_32)>;

let Predicates = [BPFHasLoadAcquire] in {
foreach P = [[relaxed_load<atomic_load_32>, LDWACQ32],
[relaxed_load<atomic_load_az_16>, LDHACQ32],
[relaxed_load<atomic_load_az_8>, LDBACQ32],
[acquiring_load<atomic_load_32>, LDWACQ32],
[acquiring_load<atomic_load_az_16>, LDHACQ32],
[acquiring_load<atomic_load_az_8>, LDBACQ32],
] in {
def : Pat<(P[0] ADDRri:$addr), (P[1] ADDRri:$addr)>;
}
}

let Predicates = [BPFHasStoreRelease] in {
foreach P = [[relaxed_store<atomic_store_32>, STWREL32],
[relaxed_store<atomic_store_16>, STHREL32],
[relaxed_store<atomic_store_8>, STBREL32],
[releasing_store<atomic_store_32>, STWREL32],
[releasing_store<atomic_store_16>, STHREL32],
[releasing_store<atomic_store_8>, STBREL32],
] in {
def : Pat<(P[0] GPR32:$val, ADDRri:$addr), (P[1] GPR32:$val, ADDRri:$addr)>;
}
}
}

let usesCustomInserter = 1, isCodeGenOnly = 1 in {
12 changes: 8 additions & 4 deletions llvm/lib/Target/BPF/BPFMISimplifyPatchable.cpp
@@ -100,21 +100,25 @@ static bool isST(unsigned Opcode) {
}

static bool isSTX32(unsigned Opcode) {
-  return Opcode == BPF::STB32 || Opcode == BPF::STH32 || Opcode == BPF::STW32;
+  return Opcode == BPF::STB32 || Opcode == BPF::STH32 || Opcode == BPF::STW32 ||
+         Opcode == BPF::STBREL32 || Opcode == BPF::STHREL32 ||
+         Opcode == BPF::STWREL32;
}

static bool isSTX64(unsigned Opcode) {
return Opcode == BPF::STB || Opcode == BPF::STH || Opcode == BPF::STW ||
-         Opcode == BPF::STD;
+         Opcode == BPF::STD || Opcode == BPF::STDREL;
}

static bool isLDX32(unsigned Opcode) {
-  return Opcode == BPF::LDB32 || Opcode == BPF::LDH32 || Opcode == BPF::LDW32;
+  return Opcode == BPF::LDB32 || Opcode == BPF::LDH32 || Opcode == BPF::LDW32 ||
+         Opcode == BPF::LDBACQ32 || Opcode == BPF::LDHACQ32 ||
+         Opcode == BPF::LDWACQ32;
}

static bool isLDX64(unsigned Opcode) {
return Opcode == BPF::LDB || Opcode == BPF::LDH || Opcode == BPF::LDW ||
-         Opcode == BPF::LDD;
+         Opcode == BPF::LDD || Opcode == BPF::LDDACQ;
}

static bool isLDSX(unsigned Opcode) {
13 changes: 13 additions & 0 deletions llvm/lib/Target/BPF/BPFSubtarget.cpp
@@ -40,6 +40,12 @@ static cl::opt<bool> Disable_gotol("disable-gotol", cl::Hidden, cl::init(false),
static cl::opt<bool>
Disable_StoreImm("disable-storeimm", cl::Hidden, cl::init(false),
cl::desc("Disable BPF_ST (immediate store) insn"));
static cl::opt<bool>
Disable_load_acquire("disable-load-acquire", cl::Hidden, cl::init(false),
cl::desc("Disable load-acquire insns"));
static cl::opt<bool>
Disable_store_release("disable-store-release", cl::Hidden, cl::init(false),
cl::desc("Disable store-release insns"));

void BPFSubtarget::anchor() {}

@@ -62,6 +68,8 @@ void BPFSubtarget::initializeEnvironment() {
HasSdivSmod = false;
HasGotol = false;
HasStoreImm = false;
HasLoadAcquire = false;
HasStoreRelease = false;
}

void BPFSubtarget::initSubtargetFeatures(StringRef CPU, StringRef FS) {
@@ -89,6 +97,11 @@ void BPFSubtarget::initSubtargetFeatures(StringRef CPU, StringRef FS) {
HasGotol = !Disable_gotol;
HasStoreImm = !Disable_StoreImm;
}

if (CpuVerNum >= 5) {
HasLoadAcquire = !Disable_load_acquire;
HasStoreRelease = !Disable_store_release;
}
}

BPFSubtarget::BPFSubtarget(const Triple &TT, const std::string &CPU,
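
The gating in initSubtargetFeatures() above can be summarized with a small stand-alone model. This is a sketch in plain C rather than the LLVM C++ code; `bpf_v5_features`, `init_v5_features`, and `check_v5_features` are hypothetical names, and the two boolean parameters stand in for the -disable-load-acquire and -disable-store-release options.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the two new subtarget flags. */
struct bpf_v5_features {
    bool has_load_acquire;
    bool has_store_release;
};

/*
 * Both features start out disabled and are only turned on for CPU version 5
 * or later, unless the corresponding disable option was given.
 */
static inline struct bpf_v5_features
init_v5_features(int cpu_ver_num, bool disable_load_acquire, bool disable_store_release)
{
    struct bpf_v5_features f = { false, false };
    if (cpu_ver_num >= 5) {
        f.has_load_acquire = !disable_load_acquire;
        f.has_store_release = !disable_store_release;
    }
    return f;
}

/* Usage example: v4 never enables the features, v5 honors the disable options. */
static inline void check_v5_features(void)
{
    assert(!init_v5_features(4, false, false).has_load_acquire);
    assert(init_v5_features(5, false, false).has_store_release);
    assert(!init_v5_features(5, true, false).has_load_acquire);
}
```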
5 changes: 5 additions & 0 deletions llvm/lib/Target/BPF/BPFSubtarget.h
@@ -66,6 +66,9 @@ class BPFSubtarget : public BPFGenSubtargetInfo {
// whether cpu v4 insns are enabled.
bool HasLdsx, HasMovsx, HasBswap, HasSdivSmod, HasGotol, HasStoreImm;

// whether cpu v5 insns are enabled.
bool HasLoadAcquire, HasStoreRelease;

std::unique_ptr<CallLowering> CallLoweringInfo;
std::unique_ptr<InstructionSelector> InstSelector;
std::unique_ptr<LegalizerInfo> Legalizer;
@@ -92,6 +95,8 @@ class BPFSubtarget : public BPFGenSubtargetInfo {
bool hasSdivSmod() const { return HasSdivSmod; }
bool hasGotol() const { return HasGotol; }
bool hasStoreImm() const { return HasStoreImm; }
bool hasLoadAcquire() const { return HasLoadAcquire; }
bool hasStoreRelease() const { return HasStoreRelease; }

bool isLittleEndian() const { return IsLittleEndian; }

7 changes: 5 additions & 2 deletions llvm/lib/Target/BPF/Disassembler/BPFDisassembler.cpp
@@ -58,7 +58,9 @@ class BPFDisassembler : public MCDisassembler {
BPF_IND = 0x2,
BPF_MEM = 0x3,
BPF_MEMSX = 0x4,
-    BPF_ATOMIC = 0x6
+    BPF_ATOMIC = 0x6,
+    BPF_MEMACQ = 0x7,
+    BPF_MEMREL = 0x7
};

BPFDisassembler(const MCSubtargetInfo &STI, MCContext &Ctx)
@@ -177,7 +179,8 @@ DecodeStatus BPFDisassembler::getInstruction(MCInst &Instr, uint64_t &Size,
uint8_t InstMode = getInstMode(Insn);
if ((InstClass == BPF_LDX || InstClass == BPF_STX) &&
getInstSize(Insn) != BPF_DW &&
-      (InstMode == BPF_MEM || InstMode == BPF_ATOMIC) &&
+      (InstMode == BPF_MEM || InstMode == BPF_ATOMIC ||
+       InstMode == BPF_MEMACQ /* or BPF_MEMREL */) &&
STI.hasFeature(BPF::ALU32))
Result = decodeInstruction(DecoderTableBPFALU3264, Instr, Insn, Address,
this, STI);
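
The condition in the hunk above decides when the disassembler consults the ALU32 decoder table. A stand-alone C model of that predicate might look like the sketch below; `uses_alu32_decoder_table` and `check_decoder_table_choice` are hypothetical names, and the field extraction assumes the mode(3) | size(2) | class(3) opcode layout described in the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum { BPF_LDX = 0x1, BPF_STX = 0x3 };                         /* instruction classes */
enum { BPF_DW = 0x3 };                                         /* 64-bit access size */
enum { BPF_MEM = 0x3, BPF_ATOMIC = 0x6, BPF_MEMACQ = 0x7 };    /* BPF_MEMREL is also 0x7 */

/*
 * Hypothetical model of the disassembler check: sub-64-bit loads and stores
 * in MEM, ATOMIC, or MEMACQ/MEMREL mode use the ALU32 table when the ALU32
 * feature is present.
 */
static inline bool uses_alu32_decoder_table(uint8_t opcode, bool has_alu32)
{
    uint8_t insn_class = (uint8_t)(opcode & 0x7);
    uint8_t size = (uint8_t)((opcode >> 3) & 0x3);
    uint8_t mode = (uint8_t)((opcode >> 5) & 0x7);

    return (insn_class == BPF_LDX || insn_class == BPF_STX) &&
           size != BPF_DW &&
           (mode == BPF_MEM || mode == BPF_ATOMIC || mode == BPF_MEMACQ) &&
           has_alu32;
}

/* Usage example with the two opcodes from the commit message. */
static inline void check_decoder_table_choice(void)
{
    assert(uses_alu32_decoder_table(0xeb, true));   /* 16-bit store_release */
    assert(!uses_alu32_decoder_table(0xf9, true));  /* 64-bit load_acquire */
    assert(!uses_alu32_decoder_table(0xeb, false)); /* ALU32 not enabled */
}
```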