[FMV][AArch64] Allow user to override version priority. #150267
@llvm/pr-subscribers-tablegen @llvm/pr-subscribers-backend-arm
Author: Alexandros Lamprineas (labrinea)
Changes
Implements ARM-software/acle#404
This allows the user to specify "featA+featB;priority=[1-31]", where priority=31 means highest priority. If the explicit priority string is omitted, then the priority of "featA+featB" is implied, which is lower than priority=1. Internally this gets expanded using special FMV features P0 ... P4, which can encode up to 31 priority levels (excluding all zeros). Those do not have a corresponding detection bit at pos FEAT_#enum, so I made this field optional in FMVInfo. They also don't affect the codegen or name mangling of versioned functions.
Patch is 39.33 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/150267.diff
23 Files Affected:
diff --git a/clang/include/clang/Basic/DiagnosticSemaKinds.td b/clang/include/clang/Basic/DiagnosticSemaKinds.td
index b2ea65ae111be..05fecee73270f 100644
--- a/clang/include/clang/Basic/DiagnosticSemaKinds.td
+++ b/clang/include/clang/Basic/DiagnosticSemaKinds.td
@@ -12747,6 +12747,12 @@ def warn_target_clone_duplicate_options
def warn_target_clone_no_impact_options
: Warning<"version list contains entries that don't impact code generation">,
InGroup<FunctionMultiVersioning>;
+def warn_version_priority_out_of_range
+ : Warning<"version priority '%0' is outside the allowed range [1-31]; ignoring priority">,
+ InGroup<FunctionMultiVersioning>;
+def warn_invalid_default_version_priority
+ : Warning<"priority of default version cannot be overridden; ignoring priority">,
+ InGroup<FunctionMultiVersioning>;
// three-way comparison operator diagnostics
def err_implied_comparison_category_type_not_found : Error<
diff --git a/clang/include/clang/Sema/SemaARM.h b/clang/include/clang/Sema/SemaARM.h
index e77d65f9362d8..1a2775d7b050e 100644
--- a/clang/include/clang/Sema/SemaARM.h
+++ b/clang/include/clang/Sema/SemaARM.h
@@ -92,7 +92,8 @@ class SemaARM : public SemaBase {
/// false otherwise.
bool areLaxCompatibleSveTypes(QualType FirstType, QualType SecondType);
- bool checkTargetVersionAttr(const StringRef Str, const SourceLocation Loc);
+ bool checkTargetVersionAttr(const StringRef Param, const SourceLocation Loc,
+ SmallString<64> &NewParam);
bool checkTargetClonesAttr(SmallVectorImpl<StringRef> &Params,
SmallVectorImpl<SourceLocation> &Locs,
SmallVectorImpl<SmallString<64>> &NewParams);
diff --git a/clang/include/clang/Sema/SemaRISCV.h b/clang/include/clang/Sema/SemaRISCV.h
index 844cc3ce4a440..863b8a143f48a 100644
--- a/clang/include/clang/Sema/SemaRISCV.h
+++ b/clang/include/clang/Sema/SemaRISCV.h
@@ -56,7 +56,8 @@ class SemaRISCV : public SemaBase {
std::unique_ptr<sema::RISCVIntrinsicManager> IntrinsicManager;
- bool checkTargetVersionAttr(const StringRef Param, const SourceLocation Loc);
+ bool checkTargetVersionAttr(const StringRef Param, const SourceLocation Loc,
+ SmallString<64> &NewParam);
bool checkTargetClonesAttr(SmallVectorImpl<StringRef> &Params,
SmallVectorImpl<SourceLocation> &Locs,
SmallVectorImpl<SmallString<64>> &NewParams);
diff --git a/clang/lib/CodeGen/Targets/AArch64.cpp b/clang/lib/CodeGen/Targets/AArch64.cpp
index b82c46966cf0b..f451562b86ec3 100644
--- a/clang/lib/CodeGen/Targets/AArch64.cpp
+++ b/clang/lib/CodeGen/Targets/AArch64.cpp
@@ -1337,9 +1337,10 @@ void AArch64ABIInfo::appendAttributeMangling(StringRef AttrStr,
llvm::SmallDenseSet<StringRef, 8> UniqueFeats;
for (auto &Feat : Features)
- if (auto Ext = llvm::AArch64::parseFMVExtension(Feat))
- if (UniqueFeats.insert(Ext->Name).second)
- Out << 'M' << Ext->Name;
+ if (getTarget().doesFeatureAffectCodeGen(Feat))
+ if (auto Ext = llvm::AArch64::parseFMVExtension(Feat))
+ if (UniqueFeats.insert(Ext->Name).second)
+ Out << 'M' << Ext->Name;
}
std::unique_ptr<TargetCodeGenInfo>
diff --git a/clang/lib/Sema/SemaARM.cpp b/clang/lib/Sema/SemaARM.cpp
index 8e27fabccd583..d3110dd18e927 100644
--- a/clang/lib/Sema/SemaARM.cpp
+++ b/clang/lib/Sema/SemaARM.cpp
@@ -1535,19 +1535,52 @@ bool SemaARM::areLaxCompatibleSveTypes(QualType FirstType,
IsLaxCompatible(SecondType, FirstType);
}
+static void appendFeature(StringRef Feat, SmallString<64> &Buffer) {
+ if (!Buffer.empty())
+ Buffer.append("+");
+ Buffer.append(Feat);
+}
+
+static void convertPriorityString(unsigned Priority,
+ SmallString<64> &NewParam) {
+ StringRef PriorityString[5] = {"P0", "P1", "P2", "P3", "P4"};
+
+ assert(Priority > 0 && Priority < 32 && "priority out of range");
+ // Convert priority=[1-31] -> P0 + ... + P4
+ for (unsigned BitPos = 0; BitPos < 5; ++BitPos)
+ if (Priority & (1U << BitPos))
+ appendFeature(PriorityString[BitPos], NewParam);
+}
+
bool SemaARM::checkTargetVersionAttr(const StringRef Param,
- const SourceLocation Loc) {
+ const SourceLocation Loc,
+ SmallString<64> &NewParam) {
using namespace DiagAttrParams;
+ auto [LHS, RHS] = Param.split(';');
+ bool IsDefault = false;
llvm::SmallVector<StringRef, 8> Features;
- Param.split(Features, '+');
+ LHS.split(Features, '+');
for (StringRef Feat : Features) {
Feat = Feat.trim();
if (Feat == "default")
- continue;
- if (!getASTContext().getTargetInfo().validateCpuSupports(Feat))
+ IsDefault = true;
+ else if (!getASTContext().getTargetInfo().validateCpuSupports(Feat))
return Diag(Loc, diag::warn_unsupported_target_attribute)
<< Unsupported << None << Feat << TargetVersion;
+ appendFeature(Feat, NewParam);
+ }
+
+ if (!RHS.empty() && RHS.consume_front("priority=")) {
+ if (IsDefault)
+ Diag(Loc, diag::warn_invalid_default_version_priority);
+ else {
+ unsigned Digit;
+ if (RHS.getAsInteger(0, Digit) || Digit < 1 || Digit > 31)
+ Diag(Loc, diag::warn_version_priority_out_of_range) << RHS;
+ else
+ convertPriorityString(Digit, NewParam);
+ }
}
return false;
}
@@ -1569,15 +1602,20 @@ bool SemaARM::checkTargetClonesAttr(
const StringRef Param = Params[I].trim();
const SourceLocation &Loc = Locs[I];
- if (Param.empty())
+ auto [LHS, RHS] = Param.split(';');
+ bool HasPriority = !RHS.empty() && RHS.consume_front("priority=");
+
+ if (LHS.empty())
return Diag(Loc, diag::warn_unsupported_target_attribute)
<< Unsupported << None << "" << TargetClones;
- if (Param == "default") {
+ if (LHS == "default") {
if (HasDefault)
Diag(Loc, diag::warn_target_clone_duplicate_options);
else {
- NewParams.push_back(Param);
+ if (HasPriority)
+ Diag(Loc, diag::warn_invalid_default_version_priority);
+ NewParams.push_back(LHS);
HasDefault = true;
}
continue;
@@ -1586,7 +1624,7 @@ bool SemaARM::checkTargetClonesAttr(
bool HasCodeGenImpact = false;
llvm::SmallVector<StringRef, 8> Features;
llvm::SmallVector<StringRef, 8> ValidFeatures;
- Param.split(Features, '+');
+ LHS.split(Features, '+');
for (StringRef Feat : Features) {
Feat = Feat.trim();
if (!getASTContext().getTargetInfo().validateCpuSupports(Feat)) {
@@ -1616,6 +1654,14 @@ bool SemaARM::checkTargetClonesAttr(
continue;
}
+ if (HasPriority) {
+ unsigned Digit;
+ if (RHS.getAsInteger(0, Digit) || Digit < 1 || Digit > 31)
+ Diag(Loc, diag::warn_version_priority_out_of_range) << RHS;
+ else
+ convertPriorityString(Digit, NewParam);
+ }
+
// Valid non-default argument.
NewParams.push_back(NewParam);
HasNonDefault = true;
diff --git a/clang/lib/Sema/SemaDeclAttr.cpp b/clang/lib/Sema/SemaDeclAttr.cpp
index 9a2950cf1648e..46f4bab9edc71 100644
--- a/clang/lib/Sema/SemaDeclAttr.cpp
+++ b/clang/lib/Sema/SemaDeclAttr.cpp
@@ -3333,19 +3333,20 @@ bool Sema::checkTargetAttr(SourceLocation LiteralLoc, StringRef AttrStr) {
static void handleTargetVersionAttr(Sema &S, Decl *D, const ParsedAttr &AL) {
StringRef Param;
SourceLocation Loc;
+ SmallString<64> NewParam;
if (!S.checkStringLiteralArgumentAttr(AL, 0, Param, &Loc))
return;
if (S.Context.getTargetInfo().getTriple().isAArch64()) {
- if (S.ARM().checkTargetVersionAttr(Param, Loc))
+ if (S.ARM().checkTargetVersionAttr(Param, Loc, NewParam))
return;
} else if (S.Context.getTargetInfo().getTriple().isRISCV()) {
- if (S.RISCV().checkTargetVersionAttr(Param, Loc))
+ if (S.RISCV().checkTargetVersionAttr(Param, Loc, NewParam))
return;
}
TargetVersionAttr *NewAttr =
- ::new (S.Context) TargetVersionAttr(S.Context, AL, Param);
+ ::new (S.Context) TargetVersionAttr(S.Context, AL, NewParam);
D->addAttr(NewAttr);
}
diff --git a/clang/lib/Sema/SemaRISCV.cpp b/clang/lib/Sema/SemaRISCV.cpp
index 994cd07c1e263..bb91bd7aefbb4 100644
--- a/clang/lib/Sema/SemaRISCV.cpp
+++ b/clang/lib/Sema/SemaRISCV.cpp
@@ -1636,7 +1636,8 @@ bool SemaRISCV::isValidFMVExtension(StringRef Ext) {
}
bool SemaRISCV::checkTargetVersionAttr(const StringRef Param,
- const SourceLocation Loc) {
+ const SourceLocation Loc,
+ SmallString<64> &NewParam) {
using namespace DiagAttrParams;
llvm::SmallVector<StringRef, 8> AttrStrs;
@@ -1682,6 +1683,7 @@ bool SemaRISCV::checkTargetVersionAttr(const StringRef Param,
return Diag(Loc, diag::warn_unsupported_target_attribute)
<< Unsupported << None << Param << TargetVersion;
+ NewParam = Param;
return false;
}
diff --git a/clang/test/AST/attr-target-version.c b/clang/test/AST/attr-target-version.c
index b537f5e685a31..c7b83bef1b91b 100644
--- a/clang/test/AST/attr-target-version.c
+++ b/clang/test/AST/attr-target-version.c
@@ -2,7 +2,23 @@
int __attribute__((target_version("sve2-bitperm + sha2"))) foov(void) { return 1; }
int __attribute__((target_clones(" lse + fp + sha3 ", "default"))) fooc(void) { return 2; }
-// CHECK: TargetVersionAttr
-// CHECK: sve2-bitperm + sha2
-// CHECK: TargetClonesAttr
-// CHECK: fp+lse+sha3 default
+
+int __attribute__((target_version("aes;priority=1"))) explicit_priority(void) { return 1; }
+int __attribute__((target_version("bf16;priority=2"))) explicit_priority(void) { return 2; }
+int __attribute__((target_version("crc;priority=4"))) explicit_priority(void) { return 4; }
+int __attribute__((target_version("dpb2;priority=8"))) explicit_priority(void) { return 8; }
+int __attribute__((target_version("fp16fml;priority=16"))) explicit_priority(void) { return 16; }
+
+int __attribute__((target_clones("simd;priority=31", "default"))) explicit_priority(void) {
+ return 0;
+}
+
+// CHECK: TargetVersionAttr {{.*}} "sve2-bitperm+sha2"
+// CHECK: TargetClonesAttr {{.*}} fp+lse+sha3 default
+
+// CHECK: TargetVersionAttr {{.*}} "aes+P0"
+// CHECK: TargetVersionAttr {{.*}} "bf16+P1"
+// CHECK: TargetVersionAttr {{.*}} "crc+P2"
+// CHECK: TargetVersionAttr {{.*}} "dpb2+P3"
+// CHECK: TargetVersionAttr {{.*}} "fp16fml+P4"
+// CHECK: TargetClonesAttr {{.*}} simd+P0+P1+P2+P3+P4 default
diff --git a/clang/test/CodeGen/AArch64/fmv-duplicate-mangled-name.c b/clang/test/CodeGen/AArch64/fmv-duplicate-mangled-name.c
index e7e611e09542e..ebe5b75cf7946 100644
--- a/clang/test/CodeGen/AArch64/fmv-duplicate-mangled-name.c
+++ b/clang/test/CodeGen/AArch64/fmv-duplicate-mangled-name.c
@@ -1,5 +1,7 @@
// RUN: %clang_cc1 -triple aarch64-linux-gnu -verify -emit-llvm-only %s -DCHECK_IMPLICIT_DEFAULT
// RUN: %clang_cc1 -triple aarch64-linux-gnu -verify -emit-llvm-only %s -DCHECK_EXPLICIT_DEFAULT
+// RUN: %clang_cc1 -triple aarch64-linux-gnu -verify -emit-llvm-only %s -DCHECK_EXPLICIT_VERSION_PRIORITY
+// RUN: %clang_cc1 -triple aarch64-linux-gnu -verify -emit-llvm-only %s -DCHECK_EXPLICIT_CLONES_PRIORITY
#if defined(CHECK_IMPLICIT_DEFAULT)
@@ -21,4 +23,18 @@ __attribute__((target_version("default"))) int explicit_default_bad(void) { retu
// expected-note@-2 {{previous definition is here}}
__attribute__((target_clones("aes", "lse", "default"))) int explicit_default_bad(void) { return 1; }
+#elif defined(CHECK_EXPLICIT_VERSION_PRIORITY)
+
+__attribute__((target_version("aes"))) int explicit_version_priority(void) { return 0; }
+// expected-error@+2 {{definition with same mangled name 'explicit_version_priority._Maes' as another definition}}
+// expected-note@-2 {{previous definition is here}}
+__attribute__((target_version("aes;priority=10"))) int explicit_version_priority(void) { return 1; }
+
+#elif defined(CHECK_EXPLICIT_CLONES_PRIORITY)
+
+__attribute__((target_version("aes;priority=20"))) int explicit_clones_priority(void) { return 0; }
+// expected-error@+2 {{definition with same mangled name 'explicit_clones_priority._Maes' as another definition}}
+// expected-note@-2 {{previous definition is here}}
+__attribute__((target_clones("aes;priority=5", "lse"))) int explicit_clones_priority(void) { return 1; }
+
#endif
diff --git a/clang/test/CodeGen/AArch64/fmv-explicit-priority.c b/clang/test/CodeGen/AArch64/fmv-explicit-priority.c
new file mode 100644
index 0000000000000..437221c95542b
--- /dev/null
+++ b/clang/test/CodeGen/AArch64/fmv-explicit-priority.c
@@ -0,0 +1,193 @@
+// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py UTC_ARGS: --function-signature --check-attributes --check-globals --include-generated-funcs
+// RUN: %clang_cc1 -triple aarch64-none-linux-gnu -O3 -fno-inline -emit-llvm -o - %s | FileCheck %s
+
+__attribute__((target_version("lse;priority=30"))) int foo(void) { return 1; }
+__attribute__((target_version("sve2;priority=20"))) int foo(void) { return 2; }
+__attribute__((target_version("sve;priority=10"))) int foo(void) { return 3; }
+__attribute__((target_version( "default"))) int foo(void) { return 0; }
+
+__attribute__((target_clones("lse+sve2;priority=3", "lse;priority=2", "sve;priority=1", "default")))
+int fmv_caller(void) { return foo(); }
+
+
+__attribute__((target_version("aes"))) int bar(void) { return 1; }
+__attribute__((target_version("sm4;priority=5"))) int bar(void) { return 2; }
+__attribute__((target_version("default"))) int bar(void) { return 0; }
+
+__attribute__((target("aes"))) int regular_caller_aes() { return bar(); }
+__attribute__((target("sm4"))) int regular_caller_sm4() { return bar(); }
+//.
+// CHECK: @__aarch64_cpu_features = external dso_local local_unnamed_addr global { i64 }
+// CHECK: @foo = weak_odr ifunc i32 (), ptr @foo.resolver
+// CHECK: @fmv_caller = weak_odr ifunc i32 (), ptr @fmv_caller.resolver
+// CHECK: @bar = weak_odr ifunc i32 (), ptr @bar.resolver
+//.
+// CHECK: Function Attrs: mustprogress nofree noinline norecurse nosync nounwind willreturn memory(none)
+// CHECK-LABEL: define {{[^@]+}}@foo._Mlse
+// CHECK-SAME: () #[[ATTR0:[0-9]+]] {
+// CHECK-NEXT: entry:
+// CHECK-NEXT: ret i32 1
+//
+//
+// CHECK: Function Attrs: mustprogress nofree noinline norecurse nosync nounwind willreturn memory(none) vscale_range(1,16)
+// CHECK-LABEL: define {{[^@]+}}@foo._Msve2
+// CHECK-SAME: () #[[ATTR1:[0-9]+]] {
+// CHECK-NEXT: entry:
+// CHECK-NEXT: ret i32 2
+//
+//
+// CHECK: Function Attrs: mustprogress nofree noinline norecurse nosync nounwind willreturn memory(none) vscale_range(1,16)
+// CHECK-LABEL: define {{[^@]+}}@foo._Msve
+// CHECK-SAME: () #[[ATTR2:[0-9]+]] {
+// CHECK-NEXT: entry:
+// CHECK-NEXT: ret i32 3
+//
+//
+// CHECK: Function Attrs: mustprogress nofree noinline norecurse nosync nounwind willreturn memory(none)
+// CHECK-LABEL: define {{[^@]+}}@foo.default
+// CHECK-SAME: () #[[ATTR3:[0-9]+]] {
+// CHECK-NEXT: entry:
+// CHECK-NEXT: ret i32 0
+//
+//
+// CHECK: Function Attrs: mustprogress nofree noinline norecurse nosync nounwind willreturn memory(none) vscale_range(1,16)
+// CHECK-LABEL: define {{[^@]+}}@fmv_caller._MlseMsve2
+// CHECK-SAME: () #[[ATTR4:[0-9]+]] {
+// CHECK-NEXT: entry:
+// CHECK-NEXT: [[CALL:%.*]] = tail call i32 @foo._Mlse()
+// CHECK-NEXT: ret i32 [[CALL]]
+//
+//
+// CHECK: Function Attrs: mustprogress nofree noinline norecurse nosync nounwind willreturn memory(none) vscale_range(1,16)
+// CHECK-LABEL: define {{[^@]+}}@fmv_caller._Mlse
+// CHECK-SAME: () #[[ATTR5:[0-9]+]] {
+// CHECK-NEXT: entry:
+// CHECK-NEXT: [[CALL:%.*]] = tail call i32 @foo._Mlse()
+// CHECK-NEXT: ret i32 [[CALL]]
+//
+//
+// CHECK: Function Attrs: noinline nounwind vscale_range(1,16)
+// CHECK-LABEL: define {{[^@]+}}@fmv_caller._Msve
+// CHECK-SAME: () #[[ATTR6:[0-9]+]] {
+// CHECK-NEXT: entry:
+// CHECK-NEXT: [[CALL:%.*]] = tail call i32 @foo() #[[ATTR12:[0-9]+]]
+// CHECK-NEXT: ret i32 [[CALL]]
+//
+//
+// CHECK: Function Attrs: mustprogress nofree noinline norecurse nosync nounwind willreturn memory(none) vscale_range(1,16)
+// CHECK-LABEL: define {{[^@]+}}@fmv_caller.default
+// CHECK-SAME: () #[[ATTR7:[0-9]+]] {
+// CHECK-NEXT: entry:
+// CHECK-NEXT: [[CALL:%.*]] = tail call i32 @foo.default()
+// CHECK-NEXT: ret i32 [[CALL]]
+//
+//
+// CHECK: Function Attrs: mustprogress nofree noinline norecurse nosync nounwind willreturn memory(none)
+// CHECK-LABEL: define {{[^@]+}}@bar._Maes
+// CHECK-SAME: () #[[ATTR8:[0-9]+]] {
+// CHECK-NEXT: entry:
+// CHECK-NEXT: ret i32 1
+//
+//
+// CHECK: Function Attrs: mustprogress nofree noinline norecurse nosync nounwind willreturn memory(none)
+// CHECK-LABEL: define {{[^@]+}}@bar._Msm4
+// CHECK-SAME: () #[[ATTR9:[0-9]+]] {
+// CHECK-NEXT: entry:
+// CHECK-NEXT: ret i32 2
+//
+//
+// CHECK: Function Attrs: mustprogress nofree noinline norecurse nosync nounwind willreturn memory(none)
+// CHECK-LABEL: define {{[^@]+}}@bar.default
+// CHECK-SAME: () #[[ATTR3]] {
+// CHECK-NEXT: entry:
+// CHECK-NEXT: ret i32 0
+//
+//
+// CHECK: Function Attrs: noinline nounwind
+// CHECK-LABEL: define {{[^@]+}}@regular_caller_aes
+// CHECK-SAME: () local_unnamed_addr #[[ATTR10:[0-9]+]] {
+// CHECK-NEXT: entry:
+// CHECK-NEXT: [[CALL:%.*]] = tail call i32 @bar() #[[ATTR12]]
+// CHECK-NEXT: ret i32 [[CALL]]
+//
+//
+// CHECK: Function Attrs: mustprogress nofree noinline norecurse nosync nounwind willreturn memory(none)
+// CHECK-LABEL: define {{[^@]+}}@regular_caller_sm4
+// CHECK-SAME: () local_unnamed_addr #[[ATTR11:[0-9]+]] {
+// CHECK-NEXT: entry:
+// CHECK-NEXT: [[CALL:%.*]] = tail call i32 @bar._Msm4()
+// CHECK-NEXT: ret i32 [[CALL]]
+//
+//
+// CHECK-LABEL: define {{[^@]+}}@foo.resolver() comdat {
+// CHECK-NEXT: resolver_entry:
+// CHECK-NEXT: tail call void @__init_cpu_features_resolver()
+// CHECK-NEXT: [[TMP0:%.*]] = load i64, ptr @__aarch64_cpu_features, align 8
+// CHECK-NEXT: [[TMP1:%.*]] = and i64 [[TMP0]], 128
+// CHECK-NEXT: [[DOTNOT:%.*]] = icmp eq i64 [[TMP1]], 0
+// CHECK-NEXT: br i1 [[DOTNOT]], label [[RESOLVER_ELSE:%.*]], label [[COMMON_RET:%.*]]
+// CHECK: common.ret:
+// CHECK-NEXT: [[COMMON_RET_OP:%.*]] = phi ptr [ @foo._Mlse, [[RESOLVER_ENTRY:%.*]] ], [ @foo._Msve2, [[RESOLVER_ELSE]] ], [ [[FOO__MSVE_FOO_DEFAULT:%.*]], [[RESOLVER_ELSE2:%.*]] ]
+// CHECK-NEXT: ret ptr [[COMMON_RET_OP]]
+// CHECK: resolver_else:
+// CHECK-NEXT: [[TMP2:%.*]] = and i64 [[TMP0]], 69793284352
+// CHECK-NEXT: [[TMP3:%.*]] = icmp eq i64 [[TMP2]], 69793284352
+// CHECK-NEXT: br i1 [[TMP3]], label [[COMMON_RET]], label [[RESOLVER_ELSE2]]
+// CHECK: resolver_else2:
+// CHECK-NEXT: [[TMP4:%.*]] = and i64 [[TMP0]], 1073807616
+// CHECK-NEXT: [[TMP5:%.*]] = icmp eq i64 [[TMP4]], 1073807616
+// CHECK-NEXT: [[FOO__MSVE_FOO_DEFAULT]] = select i1 [[TMP5]], ptr @foo._Msve, ptr @foo.default
+// CHECK-NEXT: br label [[COMMON_RET]]
+//
+//
+// CHECK-LABEL: define {{[^@]+}}@fmv_caller.resolver() comdat {
+// CHECK-NEXT: resolver_entry:
+// CHECK-NEXT: tail call void @__init_cpu_features_resolver()
+// CHECK-NEXT: [[TMP0:%.*]] = load i64, ptr @__aarch64_cpu_features, align 8
+// CHECK-NEXT: [[TMP1:%.*]] = and i64 [[TMP0]], 69793284480
+// CHECK-NEXT: [[TMP2:%.*]] = icmp eq i64 [[TMP1]], 69793284480
+// CHECK-NEXT: br i1 [[TMP2]], label [[COMMON_RET:%.*]], label [[RESOLVER_ELSE:%.*]]
+// CHECK: common.ret:
+// CHECK-NEXT: [[COMMON_RET_OP:%.*]] = phi ptr [ @fmv_caller._MlseMsve2, [[RESOLVER_ENTRY:%.*]] ], [ @fmv_caller._Mlse, [[RESOLVER_ELSE]] ], [ [[FMV_CALLER__MSVE_FMV_CALLER_DEFAULT:%.*]], [[RESOLVER_ELSE2:%.*]] ]
+// CHECK-NEXT: ret ptr [[COMMON_RET_OP]]
+// CHECK: resolver_else:
+// CHECK-NEXT: [[TMP3:%.*]] = and i64 [[TMP0]], 128
+// CHECK-NEXT: [[DOTNOT:%.*]] = icmp eq i64 [[TMP3]], 0
+// CHECK-NEXT: br i1 [[DOTNOT]], label [[RESOLVER_ELSE2]], label [[COMMON_RET]]
+// CHECK: resolver_else2:
+// CHECK-NEXT: [[TMP4:%.*]] = and i64 [[TMP0]], 1073807616
+// CHECK-NEXT: [[TMP5:%.*]] = icmp eq i64 [[TMP4]], 1073807616
+// CHECK-NEXT: [[FMV_CALLER__MSVE_FMV_CALLER_DEFAULT]] = select i1 [[TMP5]], ptr @fmv_call...
[truncated]
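For readers skimming the truncated diff, the rewrite that Sema performs on the attribute string can be approximated with a small standalone sketch. This is not the patch's code: it uses `std::string` instead of `StringRef`/`SmallString`, and the names `encodePriority` and `rewriteVersionString` are hypothetical. It also assumes decimal priorities only, does not trim whitespace around features, and skips the "default" diagnostics. What it does mirror from the diff is the split on ';', the "priority=" prefix, the [1-31] range check, and the one-flag-per-set-bit expansion into P0 ... P4.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Append Feat to Buffer, separating entries with '+' (same idea as the
// appendFeature helper in the diff, but on std::string).
static void appendFeature(const std::string &Feat, std::string &Buffer) {
  if (!Buffer.empty())
    Buffer += '+';
  Buffer += Feat;
}

// Hypothetical name: expand a priority in [1, 31] into the pseudo-features
// P0..P4, emitting Pn for every bit n that is set in the priority value.
static void encodePriority(unsigned Priority, std::string &Buffer) {
  static const char *const Names[5] = {"P0", "P1", "P2", "P3", "P4"};
  assert(Priority >= 1 && Priority <= 31 && "priority out of range");
  for (unsigned Bit = 0; Bit < 5; ++Bit)
    if (Priority & (1u << Bit))
      appendFeature(Names[Bit], Buffer);
}

// Hypothetical name: turn "feats;priority=N" into "feats+Px+...".
// A missing, malformed, or out-of-range priority is ignored, which is where
// the real patch would emit a warning instead.
std::string rewriteVersionString(const std::string &Param) {
  const std::string Key = "priority=";
  const std::size_t Semi = Param.find(';');
  std::string Result = Param.substr(0, Semi); // feature list (LHS)
  if (Semi == std::string::npos)
    return Result;

  const std::string RHS = Param.substr(Semi + 1);
  if (RHS.compare(0, Key.size(), Key) != 0)
    return Result;

  const std::string Digits = RHS.substr(Key.size());
  if (Digits.empty())
    return Result;

  // Assumption: decimal digits only; stop early once the value exceeds 31.
  unsigned Value = 0;
  for (char C : Digits) {
    if (C < '0' || C > '9')
      return Result;
    Value = Value * 10 + static_cast<unsigned>(C - '0');
    if (Value > 31)
      return Result;
  }
  if (Value < 1)
    return Result;

  encodePriority(Value, Result);
  return Result;
}
```

Under these assumptions, `rewriteVersionString("aes;priority=1")` yields `"aes+P0"` and `rewriteVersionString("simd;priority=31")` yields `"simd+P0+P1+P2+P3+P4"`, matching the strings checked in clang/test/AST/attr-target-version.c above. Because the P flags are filtered out before mangling, "aes" and "aes;priority=10" still collide on the same mangled name, as the fmv-duplicate-mangled-name.c test expects.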
+
+__attribute__((target("aes"))) int regular_caller_aes() { return bar(); }
+__attribute__((target("sm4"))) int regular_caller_sm4() { return bar(); }
+//.
+// CHECK: @__aarch64_cpu_features = external dso_local local_unnamed_addr global { i64 }
+// CHECK: @foo = weak_odr ifunc i32 (), ptr @foo.resolver
+// CHECK: @fmv_caller = weak_odr ifunc i32 (), ptr @fmv_caller.resolver
+// CHECK: @bar = weak_odr ifunc i32 (), ptr @bar.resolver
+//.
+// CHECK: Function Attrs: mustprogress nofree noinline norecurse nosync nounwind willreturn memory(none)
+// CHECK-LABEL: define {{[^@]+}}@foo._Mlse
+// CHECK-SAME: () #[[ATTR0:[0-9]+]] {
+// CHECK-NEXT: entry:
+// CHECK-NEXT: ret i32 1
+//
+//
+// CHECK: Function Attrs: mustprogress nofree noinline norecurse nosync nounwind willreturn memory(none) vscale_range(1,16)
+// CHECK-LABEL: define {{[^@]+}}@foo._Msve2
+// CHECK-SAME: () #[[ATTR1:[0-9]+]] {
+// CHECK-NEXT: entry:
+// CHECK-NEXT: ret i32 2
+//
+//
+// CHECK: Function Attrs: mustprogress nofree noinline norecurse nosync nounwind willreturn memory(none) vscale_range(1,16)
+// CHECK-LABEL: define {{[^@]+}}@foo._Msve
+// CHECK-SAME: () #[[ATTR2:[0-9]+]] {
+// CHECK-NEXT: entry:
+// CHECK-NEXT: ret i32 3
+//
+//
+// CHECK: Function Attrs: mustprogress nofree noinline norecurse nosync nounwind willreturn memory(none)
+// CHECK-LABEL: define {{[^@]+}}@foo.default
+// CHECK-SAME: () #[[ATTR3:[0-9]+]] {
+// CHECK-NEXT: entry:
+// CHECK-NEXT: ret i32 0
+//
+//
+// CHECK: Function Attrs: mustprogress nofree noinline norecurse nosync nounwind willreturn memory(none) vscale_range(1,16)
+// CHECK-LABEL: define {{[^@]+}}@fmv_caller._MlseMsve2
+// CHECK-SAME: () #[[ATTR4:[0-9]+]] {
+// CHECK-NEXT: entry:
+// CHECK-NEXT: [[CALL:%.*]] = tail call i32 @foo._Mlse()
+// CHECK-NEXT: ret i32 [[CALL]]
+//
+//
+// CHECK: Function Attrs: mustprogress nofree noinline norecurse nosync nounwind willreturn memory(none) vscale_range(1,16)
+// CHECK-LABEL: define {{[^@]+}}@fmv_caller._Mlse
+// CHECK-SAME: () #[[ATTR5:[0-9]+]] {
+// CHECK-NEXT: entry:
+// CHECK-NEXT: [[CALL:%.*]] = tail call i32 @foo._Mlse()
+// CHECK-NEXT: ret i32 [[CALL]]
+//
+//
+// CHECK: Function Attrs: noinline nounwind vscale_range(1,16)
+// CHECK-LABEL: define {{[^@]+}}@fmv_caller._Msve
+// CHECK-SAME: () #[[ATTR6:[0-9]+]] {
+// CHECK-NEXT: entry:
+// CHECK-NEXT: [[CALL:%.*]] = tail call i32 @foo() #[[ATTR12:[0-9]+]]
+// CHECK-NEXT: ret i32 [[CALL]]
+//
+//
+// CHECK: Function Attrs: mustprogress nofree noinline norecurse nosync nounwind willreturn memory(none) vscale_range(1,16)
+// CHECK-LABEL: define {{[^@]+}}@fmv_caller.default
+// CHECK-SAME: () #[[ATTR7:[0-9]+]] {
+// CHECK-NEXT: entry:
+// CHECK-NEXT: [[CALL:%.*]] = tail call i32 @foo.default()
+// CHECK-NEXT: ret i32 [[CALL]]
+//
+//
+// CHECK: Function Attrs: mustprogress nofree noinline norecurse nosync nounwind willreturn memory(none)
+// CHECK-LABEL: define {{[^@]+}}@bar._Maes
+// CHECK-SAME: () #[[ATTR8:[0-9]+]] {
+// CHECK-NEXT: entry:
+// CHECK-NEXT: ret i32 1
+//
+//
+// CHECK: Function Attrs: mustprogress nofree noinline norecurse nosync nounwind willreturn memory(none)
+// CHECK-LABEL: define {{[^@]+}}@bar._Msm4
+// CHECK-SAME: () #[[ATTR9:[0-9]+]] {
+// CHECK-NEXT: entry:
+// CHECK-NEXT: ret i32 2
+//
+//
+// CHECK: Function Attrs: mustprogress nofree noinline norecurse nosync nounwind willreturn memory(none)
+// CHECK-LABEL: define {{[^@]+}}@bar.default
+// CHECK-SAME: () #[[ATTR3]] {
+// CHECK-NEXT: entry:
+// CHECK-NEXT: ret i32 0
+//
+//
+// CHECK: Function Attrs: noinline nounwind
+// CHECK-LABEL: define {{[^@]+}}@regular_caller_aes
+// CHECK-SAME: () local_unnamed_addr #[[ATTR10:[0-9]+]] {
+// CHECK-NEXT: entry:
+// CHECK-NEXT: [[CALL:%.*]] = tail call i32 @bar() #[[ATTR12]]
+// CHECK-NEXT: ret i32 [[CALL]]
+//
+//
+// CHECK: Function Attrs: mustprogress nofree noinline norecurse nosync nounwind willreturn memory(none)
+// CHECK-LABEL: define {{[^@]+}}@regular_caller_sm4
+// CHECK-SAME: () local_unnamed_addr #[[ATTR11:[0-9]+]] {
+// CHECK-NEXT: entry:
+// CHECK-NEXT: [[CALL:%.*]] = tail call i32 @bar._Msm4()
+// CHECK-NEXT: ret i32 [[CALL]]
+//
+//
+// CHECK-LABEL: define {{[^@]+}}@foo.resolver() comdat {
+// CHECK-NEXT: resolver_entry:
+// CHECK-NEXT: tail call void @__init_cpu_features_resolver()
+// CHECK-NEXT: [[TMP0:%.*]] = load i64, ptr @__aarch64_cpu_features, align 8
+// CHECK-NEXT: [[TMP1:%.*]] = and i64 [[TMP0]], 128
+// CHECK-NEXT: [[DOTNOT:%.*]] = icmp eq i64 [[TMP1]], 0
+// CHECK-NEXT: br i1 [[DOTNOT]], label [[RESOLVER_ELSE:%.*]], label [[COMMON_RET:%.*]]
+// CHECK: common.ret:
+// CHECK-NEXT: [[COMMON_RET_OP:%.*]] = phi ptr [ @foo._Mlse, [[RESOLVER_ENTRY:%.*]] ], [ @foo._Msve2, [[RESOLVER_ELSE]] ], [ [[FOO__MSVE_FOO_DEFAULT:%.*]], [[RESOLVER_ELSE2:%.*]] ]
+// CHECK-NEXT: ret ptr [[COMMON_RET_OP]]
+// CHECK: resolver_else:
+// CHECK-NEXT: [[TMP2:%.*]] = and i64 [[TMP0]], 69793284352
+// CHECK-NEXT: [[TMP3:%.*]] = icmp eq i64 [[TMP2]], 69793284352
+// CHECK-NEXT: br i1 [[TMP3]], label [[COMMON_RET]], label [[RESOLVER_ELSE2]]
+// CHECK: resolver_else2:
+// CHECK-NEXT: [[TMP4:%.*]] = and i64 [[TMP0]], 1073807616
+// CHECK-NEXT: [[TMP5:%.*]] = icmp eq i64 [[TMP4]], 1073807616
+// CHECK-NEXT: [[FOO__MSVE_FOO_DEFAULT]] = select i1 [[TMP5]], ptr @foo._Msve, ptr @foo.default
+// CHECK-NEXT: br label [[COMMON_RET]]
+//
+//
+// CHECK-LABEL: define {{[^@]+}}@fmv_caller.resolver() comdat {
+// CHECK-NEXT: resolver_entry:
+// CHECK-NEXT: tail call void @__init_cpu_features_resolver()
+// CHECK-NEXT: [[TMP0:%.*]] = load i64, ptr @__aarch64_cpu_features, align 8
+// CHECK-NEXT: [[TMP1:%.*]] = and i64 [[TMP0]], 69793284480
+// CHECK-NEXT: [[TMP2:%.*]] = icmp eq i64 [[TMP1]], 69793284480
+// CHECK-NEXT: br i1 [[TMP2]], label [[COMMON_RET:%.*]], label [[RESOLVER_ELSE:%.*]]
+// CHECK: common.ret:
+// CHECK-NEXT: [[COMMON_RET_OP:%.*]] = phi ptr [ @fmv_caller._MlseMsve2, [[RESOLVER_ENTRY:%.*]] ], [ @fmv_caller._Mlse, [[RESOLVER_ELSE]] ], [ [[FMV_CALLER__MSVE_FMV_CALLER_DEFAULT:%.*]], [[RESOLVER_ELSE2:%.*]] ]
+// CHECK-NEXT: ret ptr [[COMMON_RET_OP]]
+// CHECK: resolver_else:
+// CHECK-NEXT: [[TMP3:%.*]] = and i64 [[TMP0]], 128
+// CHECK-NEXT: [[DOTNOT:%.*]] = icmp eq i64 [[TMP3]], 0
+// CHECK-NEXT: br i1 [[DOTNOT]], label [[RESOLVER_ELSE2]], label [[COMMON_RET]]
+// CHECK: resolver_else2:
+// CHECK-NEXT: [[TMP4:%.*]] = and i64 [[TMP0]], 1073807616
+// CHECK-NEXT: [[TMP5:%.*]] = icmp eq i64 [[TMP4]], 1073807616
+// CHECK-NEXT: [[FMV_CALLER__MSVE_FMV_CALLER_DEFAULT]] = select i1 [[TMP5]], ptr @fmv_call...
[truncated]
|
|
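The `foo.resolver` CHECK lines in the diff above test the feature word against one mask per version, in priority order, and fall through to the default. Below is a minimal C++ sketch of that selection order. The mask constants are copied from the CHECK lines; `FooVersion` and `resolveFoo` are hypothetical names used only for illustration and are not part of the patch.

```cpp
#include <cassert>
#include <cstdint>

// Mask values copied from the autogenerated CHECK lines for @foo.resolver.
constexpr uint64_t LseMask = 128ULL;          // "lse;priority=30"
constexpr uint64_t Sve2Mask = 69793284352ULL; // "sve2;priority=20"
constexpr uint64_t SveMask = 1073807616ULL;   // "sve;priority=10"

// Hypothetical stand-in for the function pointer the real resolver returns.
enum class FooVersion { Lse, Sve2, Sve, Default };

// Try versions from highest to lowest explicit priority. A version is
// selected only if every bit of its mask is set in the feature word.
FooVersion resolveFoo(uint64_t CpuFeatures) {
  if ((CpuFeatures & LseMask) == LseMask)
    return FooVersion::Lse;
  if ((CpuFeatures & Sve2Mask) == Sve2Mask)
    return FooVersion::Sve2;
  if ((CpuFeatures & SveMask) == SveMask)
    return FooVersion::Sve;
  return FooVersion::Default;
}
```

With these masks, a feature word that has both the lse bit and all sve2 bits resolves to the lse version, because its explicit priority (30) outranks sve2 (20).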
@llvm/pr-subscribers-llvm-analysis
|
|
✅ With the latest revision this PR passed the C/C++ code formatter. |
Implements ARM-software/acle#404. This allows the user to specify "featA+featB;priority=[1-255]", where priority=255 is the highest priority. If the explicit priority string is omitted, the priority is implied by "featA+featB" and ranks lower than priority=1. Internally this gets expanded using special FMV features P0 ... P7, which can encode up to 255 priority levels (excluding all zeros). These have no corresponding detection bit at position FEAT_#enum, so I made this field optional in FMVInfo. They also do not affect the codegen or name mangling of versioned functions.
6101d58 to 0575957
update comment
|
ping |
efriedma-quic
left a comment
Drive-by comment: I'd like to see some optimizer tests for the optimizer changes.
I can try to find time to do a more detailed review on this at some point, but not sure when that will be.
This reverts commit 1ffde15. The test was wrong; I will contrive a valid one in a follow-up commit.
The reverted one had versions which could not be selected.
peterwaller-arm
left a comment
The approach seems reasonable to me. I spent a bit of time looking over this and couldn't find anything significant. Since it is a big change, please allow a couple of days for any remaining comments to come in.
|
LLVM Buildbot has detected a new failure. Full details are available at: https://lab.llvm.org/buildbot/#/builders/144/builds/41893 |
|
LLVM Buildbot has detected a new failure. Full details are available at: https://lab.llvm.org/buildbot/#/builders/46/builds/27570 |
|
LLVM Buildbot has detected a new failure. Full details are available at: https://lab.llvm.org/buildbot/#/builders/154/builds/25017 |
|
LLVM Buildbot has detected a new failure. Full details are available at: https://lab.llvm.org/buildbot/#/builders/3/builds/26000 |
|
The -O3 on the clang test seems to have different behavior on different hosts. My intention was to run an integration test which will use llvm's globalopt pass, but there's no need. We have unittests for it. I'll modify and regenerate the test. |
|
Raised #171457 |
This fixes the buildbot failures from #150267. I could not reproduce them locally, but judging from the error logs I suspect that the -O3 option on the RUN line behaves inconsistently on different hosts. My intention was to run an integration test that uses LLVM's globalopt pass, but that turns out to be unnecessary: we already have unit tests in place for it.
The commit llvm#150267 allows the user to override version priority. As a result, it is now possible to define an unreachable function version if a higher-priority version requires only a subset of its FMV features. For example, in target_clones("sve;priority=2", "sve2;priority=1") the sve2 version is unreachable: a CPU that has sve2 also has sve, so the higher-priority sve version is always selected first, and a CPU without sve cannot have sve2 either. The patch emits a warning about such cases, ignores those versions when generating the resolver, and removes their definitions.
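The subset condition described above can be sketched with plain bitmasks. This is only an illustration under stated assumptions, not the code from the follow-up patch: `Version`, `FeatureMask`, and `findUnreachable` are hypothetical names, each mask is assumed to already include implied features (so the sve2 mask also contains the sve bit), and only explicit priorities are compared.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical model of one function version.
struct Version {
  uint64_t FeatureMask; // required features, including implied ones
  unsigned Priority;    // larger value is tried first by the resolver
};

// Version I is unreachable if some version J with strictly higher priority
// requires only a subset of I's features: whenever I's features are all
// present, J's are too, and J is selected first.
std::vector<bool> findUnreachable(const std::vector<Version> &Versions) {
  std::vector<bool> Unreachable(Versions.size(), false);
  for (std::size_t I = 0; I < Versions.size(); ++I)
    for (std::size_t J = 0; J < Versions.size(); ++J)
      if (I != J && Versions[J].Priority > Versions[I].Priority &&
          (Versions[J].FeatureMask & ~Versions[I].FeatureMask) == 0)
        Unreachable[I] = true;
  return Unreachable;
}
```

For the example from the commit message, modelling sve as bit 0 and sve2 as bits 0 and 1 marks only the sve2 version as unreachable.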
Implements ARM-software/acle#404
This allows the user to specify "featA+featB;priority=[1-255]", where priority=255 is the highest priority. If the explicit priority string is omitted, the priority is implied by "featA+featB" and ranks lower than priority=1.
Internally this gets expanded using special FMV features P0 ... P7, which can encode up to 255 priority levels (excluding all zeros). These have no corresponding detection bit at position FEAT_#enum, so I made this field optional in FMVInfo. They also do not affect the codegen or name mangling of versioned functions.
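The expansion into P-features amounts to writing the priority in binary and appending one `P<n>` feature per set bit. Below is a minimal standalone C++ sketch of that idea, using `std::string` instead of LLVM's `SmallString`. `encodePriority` and its `NumBits` parameter are hypothetical names for illustration; the earlier revision quoted in this thread used 5 bits (priority 1-31), while this description uses 8.

```cpp
#include <cassert>
#include <string>

// Append "P<bit>" to the '+'-separated feature string for every set bit of
// Priority. NumBits = 8 covers priority=[1-255]; NumBits = 5 covers [1-31].
std::string encodePriority(std::string Features, unsigned Priority,
                           unsigned NumBits = 8) {
  assert(NumBits > 0 && NumBits < 32 && "unsupported bit width");
  assert(Priority > 0 && Priority < (1u << NumBits) && "priority out of range");
  for (unsigned BitPos = 0; BitPos < NumBits; ++BitPos) {
    if (Priority & (1u << BitPos)) {
      if (!Features.empty())
        Features += '+';
      Features += 'P';
      Features += std::to_string(BitPos);
    }
  }
  return Features;
}
```

For instance, `encodePriority("crc", 4)` yields `"crc+P2"`, which matches the `"crc+P2"` string expected by the AST test in the diff.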