[SPIRV] Emit intrinsics for globals only in function that references them#178143
jmmartinez merged 11 commits into main
@llvm/pr-subscribers-backend-spir-v

Author: Juan Manuel Martinez Caamaño (jmmartinez)

Changes

In the SPIRV backend, the [...] These intrinsics are used to keep track of global variables, their types and initializers.

In SPIRV everything is an instruction (even globals/constants). We currently represent these global entities as individual instructions on every function. Later, the [...]

These instructions associated with global entities, when emitted on functions that do not reference them, lead to a bloated intermediate representation and high memory consumption (as happened in #170339).

Consider this example:

    int A[1024] = { 0, 1, 2, ..., 1023 };
    int get_a(int i) {
      return A[i];
    }
    void say_hi() {
      puts("hi!\n");
    }

Although, [...]

This patch doesn't fix the underlying issue, but it mitigates it by only emitting the global-variable SPIRV intrinsics on the functions that use them. If a global is not referenced by any function, we just pick the first function definition.

With this patch, the example in #170339 drops from ~33 GB of maximum resident set size to less than ~570 MB, and compile time goes from 2:09 min to 1:26 min.

The changes in the tests are due to changes in the order in which the instructions appear.

Patch is 21.54 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/178143.diff
5 Files Affected:
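The selection rule described above (emit in the functions that reference the global, otherwise fall back to the first function definition) can be modeled outside of LLVM. The following is only a minimal sketch under stated assumptions: `Node`, `shouldEmitIn`, and the integer function ids are hypothetical stand-ins invented for illustration and are not LLVM's API. The actual implementation is the one shown in the patch.

```cpp
#include <cassert>
#include <unordered_set>
#include <vector>

// Toy stand-in for an IR value; hypothetical, not LLVM's Value class.
struct Node {
  enum Kind { Constant, Instruction };
  Kind kind = Constant;            // a global is modeled as a constant
  int function = -1;               // owning function id (instructions only)
  std::vector<const Node *> users; // values that use this node
};

// Returns true when the intrinsics for `global` should be emitted in the
// function with id `currF`. `firstDef` is the id of the first function
// definition in the (hypothetical) module.
bool shouldEmitIn(const Node &global, int currF, int firstDef) {
  assert(currF >= 0 && firstDef >= 0 && "function ids are non-negative");

  std::unordered_set<const Node *> visited;
  std::vector<const Node *> worklist{&global};
  bool referencedByAnotherFunction = false;

  while (!worklist.empty()) {
    const Node *v = worklist.back();
    worklist.pop_back();
    if (!visited.insert(v).second)
      continue; // already seen through another use chain

    if (v->kind == Node::Instruction) {
      if (v->function == currF)
        return true; // the current function references the global
      referencedByAnotherFunction = true;
      continue; // do not walk through instructions
    }

    // Constants (e.g. constant expressions) may forward the reference.
    worklist.insert(worklist.end(), v->users.begin(), v->users.end());
  }

  // Some other function uses it, so that function will emit it.
  if (referencedByAnotherFunction)
    return false;

  // Unreferenced globals are emitted once, on the first definition.
  return currF == firstDef;
}
```

In terms of the example above, if `get_a` is function 0 and `say_hi` is function 1, a load of `A` inside `get_a` makes `shouldEmitIn` return true for function 0 and false for function 1.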
diff --git a/llvm/lib/Target/SPIRV/SPIRVEmitIntrinsics.cpp b/llvm/lib/Target/SPIRV/SPIRVEmitIntrinsics.cpp
index 0ae30a2cdf1ac..84815f6dc5767 100644
--- a/llvm/lib/Target/SPIRV/SPIRVEmitIntrinsics.cpp
+++ b/llvm/lib/Target/SPIRV/SPIRVEmitIntrinsics.cpp
@@ -2117,13 +2117,49 @@ Instruction *SPIRVEmitIntrinsics::visitUnreachableInst(UnreachableInst &I) {
return &I;
}
-void SPIRVEmitIntrinsics::processGlobalValue(GlobalVariable &GV,
- IRBuilder<> &B) {
+static bool shouldEmitIntrinsicsForGlobalValue(const GlobalVariable &GV,
+ const Function *CurrF) {
// Skip special artificial variables.
static const StringSet<> ArtificialGlobals{"llvm.global.annotations",
"llvm.compiler.used"};
if (ArtificialGlobals.contains(GV.getName()))
+ return false;
+
+ SmallPtrSet<const Value *, 8> Visited;
+ SmallVector<const Value *> Worklist = {&GV};
+ bool ReferencedByAnotherFunction = false;
+ while (!Worklist.empty()) {
+ const Value *V = Worklist.pop_back_val();
+ if (!Visited.insert(V).second)
+ continue;
+
+ if (const Instruction *I = dyn_cast<Instruction>(V)) {
+ if (I->getFunction() == CurrF)
+ return true;
+ ReferencedByAnotherFunction = true;
+ continue;
+ }
+
+ if (const Constant *C = dyn_cast<Constant>(V))
+ Worklist.append(C->user_begin(), C->user_end());
+ }
+
+ // Do not emit the intrinsics in this function, it's going to be emitted on
+ // the functions that reference it.
+ if (ReferencedByAnotherFunction)
+ return false;
+
+ // Emit definitions for globals that are not referenced by any function on the
+ // first function definition.
+ const Module &M = *CurrF->getParent();
+ const Function &FirstDefinition = *M.getFunctionDefs().begin();
+ return CurrF == &FirstDefinition;
+}
+
+void SPIRVEmitIntrinsics::processGlobalValue(GlobalVariable &GV,
+ IRBuilder<> &B) {
+ if (!shouldEmitIntrinsicsForGlobalValue(GV, CurrF))
return;
Constant *Init = nullptr;
diff --git a/llvm/test/CodeGen/SPIRV/extensions/SPV_KHR_float_controls2/exec_mode3.ll b/llvm/test/CodeGen/SPIRV/extensions/SPV_KHR_float_controls2/exec_mode3.ll
index 9cc9870c2cfe1..656030b4f5916 100644
--- a/llvm/test/CodeGen/SPIRV/extensions/SPV_KHR_float_controls2/exec_mode3.ll
+++ b/llvm/test/CodeGen/SPIRV/extensions/SPV_KHR_float_controls2/exec_mode3.ll
@@ -8,39 +8,54 @@
; CHECK: OpEntryPoint Kernel %[[#KERNEL_FLOAT_V:]] "k_float_controls_float_v"
; CHECK: OpEntryPoint Kernel %[[#KERNEL_ALL_V:]] "k_float_controls_all_v"
+; CHECK-DAG: %[[#INT32_TYPE:]] = OpTypeInt 32 0
+; CHECK-DAG: %[[#HALF_TYPE:]] = OpTypeFloat 16
+; CHECK-DAG: %[[#FLOAT_TYPE:]] = OpTypeFloat 32
+; CHECK-DAG: %[[#DOUBLE_TYPE:]] = OpTypeFloat 64
+; CHECK-DAG: %[[#CONST0:]] = OpConstantNull %[[#INT32_TYPE]]
+; CHECK-DAG: %[[#CONST131079:]] = OpConstant %[[#INT32_TYPE]] 131079
+
+; CHECK-DAG: %[[#HALF_V_TYPE:]] = OpTypeVector %[[#HALF_TYPE]]
+; CHECK-DAG: %[[#FLOAT_V_TYPE:]] = OpTypeVector %[[#FLOAT_TYPE]]
+; CHECK-DAG: %[[#DOUBLE_V_TYPE:]] = OpTypeVector %[[#DOUBLE_TYPE]]
+
; We expect 130179 for float type.
-; CHECK-DAG: OpExecutionModeId %[[#KERNEL_FLOAT]] FPFastMathDefault %[[#FLOAT_TYPE:]] %[[#CONST131079:]]
-; CHECK-DAG: OpExecutionModeId %[[#KERNEL_ALL]] FPFastMathDefault %[[#FLOAT_TYPE:]] %[[#CONST131079]]
+; CHECK-DAG: OpExecutionModeId %[[#KERNEL_FLOAT]] FPFastMathDefault %[[#FLOAT_TYPE]] %[[#CONST131079]]
+; CHECK-DAG: OpExecutionModeId %[[#KERNEL_ALL]] FPFastMathDefault %[[#FLOAT_TYPE]] %[[#CONST131079]]
; We expect 0 for the rest of types because it's SignedZeroInfNanPreserve.
-; CHECK-DAG: OpExecutionModeId %[[#KERNEL_ALL]] FPFastMathDefault %[[#HALF_TYPE:]] %[[#CONST0:]]
-; CHECK-DAG: OpExecutionModeId %[[#KERNEL_ALL]] FPFastMathDefault %[[#DOUBLE_TYPE:]] %[[#CONST0]]
+; CHECK-DAG: OpExecutionModeId %[[#KERNEL_ALL]] FPFastMathDefault %[[#HALF_TYPE]] %[[#CONST0]]
+; CHECK-DAG: OpExecutionModeId %[[#KERNEL_ALL]] FPFastMathDefault %[[#DOUBLE_TYPE]] %[[#CONST0]]
; We expect 130179 for float type.
-; CHECK-DAG: OpExecutionModeId %[[#KERNEL_FLOAT_V]] FPFastMathDefault %[[#FLOAT_TYPE:]] %[[#CONST131079]]
-; CHECK-DAG: OpExecutionModeId %[[#KERNEL_ALL_V]] FPFastMathDefault %[[#FLOAT_TYPE:]] %[[#CONST131079]]
+; CHECK-DAG: OpExecutionModeId %[[#KERNEL_FLOAT_V]] FPFastMathDefault %[[#FLOAT_TYPE]] %[[#CONST131079]]
+; CHECK-DAG: OpExecutionModeId %[[#KERNEL_ALL_V]] FPFastMathDefault %[[#FLOAT_TYPE]] %[[#CONST131079]]
; We expect 0 for the rest of types because it's SignedZeroInfNanPreserve.
-; CHECK-DAG: OpExecutionModeId %[[#KERNEL_ALL_V]] FPFastMathDefault %[[#DOUBLE_TYPE:]] %[[#CONST0]]
-; CHECK-DAG: OpExecutionModeId %[[#KERNEL_ALL_V]] FPFastMathDefault %[[#HALF_TYPE:]] %[[#CONST0]]
-
-; CHECK-DAG: OpDecorate %[[#addRes:]] FPFastMathMode NotNaN|NotInf|NSZ|AllowReassoc
-; CHECK-DAG: OpDecorate %[[#addResH:]] FPFastMathMode None
-; CHECK-DAG: OpDecorate %[[#addResF:]] FPFastMathMode NotNaN|NotInf|NSZ|AllowReassoc
-; CHECK-DAG: OpDecorate %[[#addResD:]] FPFastMathMode None
-; CHECK-DAG: OpDecorate %[[#addRes_V:]] FPFastMathMode NotNaN|NotInf|NSZ|AllowReassoc
-; CHECK-DAG: OpDecorate %[[#addResH_V:]] FPFastMathMode None
-; CHECK-DAG: OpDecorate %[[#addResF_V:]] FPFastMathMode NotNaN|NotInf|NSZ|AllowReassoc
-; CHECK-DAG: OpDecorate %[[#addResD_V:]] FPFastMathMode None
+; CHECK-DAG: OpExecutionModeId %[[#KERNEL_ALL_V]] FPFastMathDefault %[[#DOUBLE_TYPE]] %[[#CONST0]]
+; CHECK-DAG: OpExecutionModeId %[[#KERNEL_ALL_V]] FPFastMathDefault %[[#HALF_TYPE]] %[[#CONST0]]
-; CHECK-DAG: %[[#INT32_TYPE:]] = OpTypeInt 32 0
-; CHECK-DAG: %[[#HALF_TYPE]] = OpTypeFloat 16
-; CHECK-DAG: %[[#FLOAT_TYPE]] = OpTypeFloat 32
-; CHECK-DAG: %[[#DOUBLE_TYPE]] = OpTypeFloat 64
-; CHECK-DAG: %[[#CONST0]] = OpConstantNull %[[#INT32_TYPE]]
-; CHECK-DAG: %[[#CONST131079]] = OpConstant %[[#INT32_TYPE]] 131079
+; CHECK-DAG: OpName %[[#G_addResH:]] "G_addResH"
+; CHECK-DAG: OpName %[[#G_addResF:]] "G_addResF"
+; CHECK-DAG: OpName %[[#G_addResD:]] "G_addResD"
+; CHECK-DAG: OpName %[[#G_addResV:]] "G_addResV"
+; CHECK-DAG: OpName %[[#G_addResH_V:]] "G_addResH_V"
+; CHECK-DAG: OpName %[[#G_addResF_V:]] "G_addResF_V"
+; CHECK-DAG: OpName %[[#G_addResD_V:]] "G_addResD_V"
-; CHECK-DAG: %[[#HALF_V_TYPE:]] = OpTypeVector %[[#HALF_TYPE]]
-; CHECK-DAG: %[[#FLOAT_V_TYPE:]] = OpTypeVector %[[#FLOAT_TYPE]]
-; CHECK-DAG: %[[#DOUBLE_V_TYPE:]] = OpTypeVector %[[#DOUBLE_TYPE]]
+; CHECK-DAG: OpStore %[[#G_addResH]] %[[#addResH:]]
+; CHECK-DAG: OpStore %[[#G_addResF]] %[[#addResF:]]
+; CHECK-DAG: OpStore %[[#G_addResD]] %[[#addResD:]]
+; CHECK-DAG: OpStore %[[#G_addResV]] %[[#addResV:]]
+; CHECK-DAG: OpStore %[[#G_addResH_V]] %[[#addResH_V:]]
+; CHECK-DAG: OpStore %[[#G_addResF_V]] %[[#addResF_V:]]
+; CHECK-DAG: OpStore %[[#G_addResD_V]] %[[#addResD_V:]]
+
+; CHECK-DAG: OpDecorate %[[#addResH]] FPFastMathMode None
+; CHECK-DAG: OpDecorate %[[#addResF]] FPFastMathMode NotNaN|NotInf|NSZ|AllowReassoc
+; CHECK-DAG: OpDecorate %[[#addResD]] FPFastMathMode None
+; CHECK-DAG: OpDecorate %[[#addResV]] FPFastMathMode NotNaN|NotInf|NSZ|AllowReassoc
+; CHECK-DAG: OpDecorate %[[#addResH_V]] FPFastMathMode None
+; CHECK-DAG: OpDecorate %[[#addResF_V]] FPFastMathMode NotNaN|NotInf|NSZ|AllowReassoc
+; CHECK-DAG: OpDecorate %[[#addResD_V]] FPFastMathMode None
@G_addRes = global float 0.0
@G_addResH = global half 0.0
@@ -53,7 +68,8 @@
define dso_local dllexport spir_kernel void @k_float_controls_float(float %f) {
entry:
-; CHECK-DAG: %[[#addRes]] = OpFAdd %[[#FLOAT_TYPE]]
+; CHECK-DAG: %[[#addRes:]] = OpFAdd %[[#FLOAT_TYPE]]
+; CHECK-DAG: OpDecorate %[[#addRes]] FPFastMathMode NotNaN|NotInf|NSZ|AllowReassoc
%addRes = fadd float %f, %f
store volatile float %addRes, ptr @G_addRes
ret void
@@ -75,7 +91,7 @@ entry:
define dso_local dllexport spir_kernel void @k_float_controls_float_v(<2 x float> %f) {
entry:
-; CHECK-DAG: %[[#addRes_V]] = OpFAdd %[[#FLOAT_V_TYPE]]
+; CHECK-DAG: %[[#addResV]] = OpFAdd %[[#FLOAT_V_TYPE]]
%addRes = fadd <2 x float> %f, %f
store volatile <2 x float> %addRes, ptr @G_addResV
ret void
diff --git a/llvm/test/CodeGen/SPIRV/extensions/SPV_NV_shader_atomic_fp16_vector/atomicrmw_faddfsub_vec_float16.ll b/llvm/test/CodeGen/SPIRV/extensions/SPV_NV_shader_atomic_fp16_vector/atomicrmw_faddfsub_vec_float16.ll
index 36f6e38fc75de..733b356d82ff5 100644
--- a/llvm/test/CodeGen/SPIRV/extensions/SPV_NV_shader_atomic_fp16_vector/atomicrmw_faddfsub_vec_float16.ll
+++ b/llvm/test/CodeGen/SPIRV/extensions/SPV_NV_shader_atomic_fp16_vector/atomicrmw_faddfsub_vec_float16.ll
@@ -8,20 +8,20 @@
; CHECK-DAG: Capability AtomicFloat16VectorNV
; CHECK: Extension "SPV_NV_shader_atomic_fp16_vector"
; CHECK-DAG: %[[TyF16:[0-9]+]] = OpTypeFloat 16
-; CHECK: %[[TyF16Vec2:[0-9]+]] = OpTypeVector %[[TyF16]] 2
-; CHECK: %[[TyF16Vec4:[0-9]+]] = OpTypeVector %[[TyF16]] 4
-; CHECK: %[[TyF16Vec4Ptr:[0-9]+]] = OpTypePointer {{[a-zA-Z]+}} %[[TyF16Vec4]]
-; CHECK: %[[TyF16Vec2Ptr:[0-9]+]] = OpTypePointer {{[a-zA-Z]+}} %[[TyF16Vec2]]
-; CHECK: %[[TyInt32:[0-9]+]] = OpTypeInt 32 0
-; CHECK: %[[ConstF16:[0-9]+]] = OpConstant %[[TyF16]] 20800{{$}}
-; CHECK: %[[Const0F16Vec2:[0-9]+]] = OpConstantNull %[[TyF16Vec2]]
-; CHECK: %[[f:[0-9]+]] = OpVariable %[[TyF16Vec2Ptr]] CrossWorkgroup %[[Const0F16Vec2]]
-; CHECK: %[[Const0F16Vec4:[0-9]+]] = OpConstantNull %[[TyF16Vec4]]
-; CHECK: %[[g:[0-9]+]] = OpVariable %[[TyF16Vec4Ptr]] CrossWorkgroup %[[Const0F16Vec4]]
-; CHECK: %[[ConstF16Vec2:[0-9]+]] = OpConstantComposite %[[TyF16Vec2]] %[[ConstF16]] %[[ConstF16]]
-; CHECK: %[[ScopeAllSvmDevices:[0-9]+]] = OpConstantNull %[[TyInt32]]
-; CHECK: %[[MemSeqCst:[0-9]+]] = OpConstant %[[TyInt32]] 16{{$}}
-; CHECK: %[[ConstF16Vec4:[0-9]+]] = OpConstantComposite %[[TyF16Vec4]] %[[ConstF16]] %[[ConstF16]] %[[ConstF16]] %[[ConstF16]]
+; CHECK-DAG: %[[TyF16Vec2:[0-9]+]] = OpTypeVector %[[TyF16]] 2
+; CHECK-DAG: %[[TyF16Vec4:[0-9]+]] = OpTypeVector %[[TyF16]] 4
+; CHECK-DAG: %[[TyF16Vec4Ptr:[0-9]+]] = OpTypePointer {{[a-zA-Z]+}} %[[TyF16Vec4]]
+; CHECK-DAG: %[[TyF16Vec2Ptr:[0-9]+]] = OpTypePointer {{[a-zA-Z]+}} %[[TyF16Vec2]]
+; CHECK-DAG: %[[TyInt32:[0-9]+]] = OpTypeInt 32 0
+; CHECK-DAG: %[[ConstF16:[0-9]+]] = OpConstant %[[TyF16]] 20800{{$}}
+; CHECK-DAG: %[[Const0F16Vec2:[0-9]+]] = OpConstantNull %[[TyF16Vec2]]
+; CHECK-DAG: %[[f:[0-9]+]] = OpVariable %[[TyF16Vec2Ptr]] CrossWorkgroup %[[Const0F16Vec2]]
+; CHECK-DAG: %[[Const0F16Vec4:[0-9]+]] = OpConstantNull %[[TyF16Vec4]]
+; CHECK-DAG: %[[g:[0-9]+]] = OpVariable %[[TyF16Vec4Ptr]] CrossWorkgroup %[[Const0F16Vec4]]
+; CHECK-DAG: %[[ConstF16Vec2:[0-9]+]] = OpConstantComposite %[[TyF16Vec2]] %[[ConstF16]] %[[ConstF16]]
+; CHECK-DAG: %[[ScopeAllSvmDevices:[0-9]+]] = OpConstantNull %[[TyInt32]]
+; CHECK-DAG: %[[MemSeqCst:[0-9]+]] = OpConstant %[[TyInt32]] 16{{$}}
+; CHECK-DAG: %[[ConstF16Vec4:[0-9]+]] = OpConstantComposite %[[TyF16Vec4]] %[[ConstF16]] %[[ConstF16]] %[[ConstF16]] %[[ConstF16]]
@f = common dso_local local_unnamed_addr addrspace(1) global <2 x half> <half 0.000000e+00, half 0.000000e+00>
@g = common dso_local local_unnamed_addr addrspace(1) global <4 x half> <half 0.000000e+00, half 0.000000e+00, half 0.000000e+00, half 0.000000e+00>
@@ -44,4 +44,4 @@ entry:
%addval = atomicrmw fadd ptr addrspace(1) @g, <4 x half> <half 42.000000e+00, half 42.000000e+00, half 42.000000e+00, half 42.000000e+00> seq_cst
%subval = atomicrmw fsub ptr addrspace(1) @g, <4 x half> <half 42.000000e+00, half 42.000000e+00, half 42.000000e+00, half 42.000000e+00> seq_cst
ret void
-}
\ No newline at end of file
+}
diff --git a/llvm/test/CodeGen/SPIRV/extensions/SPV_NV_shader_atomic_fp16_vector/atomicrmw_fminfmax_vec_float16.ll b/llvm/test/CodeGen/SPIRV/extensions/SPV_NV_shader_atomic_fp16_vector/atomicrmw_fminfmax_vec_float16.ll
index 7ac772bf5d094..14e98a6fb1f05 100644
--- a/llvm/test/CodeGen/SPIRV/extensions/SPV_NV_shader_atomic_fp16_vector/atomicrmw_fminfmax_vec_float16.ll
+++ b/llvm/test/CodeGen/SPIRV/extensions/SPV_NV_shader_atomic_fp16_vector/atomicrmw_fminfmax_vec_float16.ll
@@ -8,20 +8,20 @@
; CHECK-DAG: Capability AtomicFloat16VectorNV
; CHECK: Extension "SPV_NV_shader_atomic_fp16_vector"
; CHECK-DAG: %[[TyF16:[0-9]+]] = OpTypeFloat 16
-; CHECK: %[[TyF16Vec2:[0-9]+]] = OpTypeVector %[[TyF16]] 2
-; CHECK: %[[TyF16Vec4:[0-9]+]] = OpTypeVector %[[TyF16]] 4
-; CHECK: %[[TyF16Vec4Ptr:[0-9]+]] = OpTypePointer {{[a-zA-Z]+}} %[[TyF16Vec4]]
-; CHECK: %[[TyF16Vec2Ptr:[0-9]+]] = OpTypePointer {{[a-zA-Z]+}} %[[TyF16Vec2]]
-; CHECK: %[[TyInt32:[0-9]+]] = OpTypeInt 32 0
-; CHECK: %[[ConstF16:[0-9]+]] = OpConstant %[[TyF16]] 20800{{$}}
-; CHECK: %[[Const0F16Vec2:[0-9]+]] = OpConstantNull %[[TyF16Vec2]]
-; CHECK: %[[f:[0-9]+]] = OpVariable %[[TyF16Vec2Ptr]] CrossWorkgroup %[[Const0F16Vec2]]
-; CHECK: %[[Const0F16Vec4:[0-9]+]] = OpConstantNull %[[TyF16Vec4]]
-; CHECK: %[[g:[0-9]+]] = OpVariable %[[TyF16Vec4Ptr]] CrossWorkgroup %[[Const0F16Vec4]]
-; CHECK: %[[ConstF16Vec2:[0-9]+]] = OpConstantComposite %[[TyF16Vec2]] %[[ConstF16]] %[[ConstF16]]
-; CHECK: %[[ScopeAllSvmDevices:[0-9]+]] = OpConstantNull %[[TyInt32]]
-; CHECK: %[[MemSeqCst:[0-9]+]] = OpConstant %[[TyInt32]] 16{{$}}
-; CHECK: %[[ConstF16Vec4:[0-9]+]] = OpConstantComposite %[[TyF16Vec4]] %[[ConstF16]] %[[ConstF16]] %[[ConstF16]] %[[ConstF16]]
+; CHECK-DAG: %[[TyF16Vec2:[0-9]+]] = OpTypeVector %[[TyF16]] 2
+; CHECK-DAG: %[[TyF16Vec4:[0-9]+]] = OpTypeVector %[[TyF16]] 4
+; CHECK-DAG: %[[TyF16Vec4Ptr:[0-9]+]] = OpTypePointer {{[a-zA-Z]+}} %[[TyF16Vec4]]
+; CHECK-DAG: %[[TyF16Vec2Ptr:[0-9]+]] = OpTypePointer {{[a-zA-Z]+}} %[[TyF16Vec2]]
+; CHECK-DAG: %[[TyInt32:[0-9]+]] = OpTypeInt 32 0
+; CHECK-DAG: %[[ConstF16:[0-9]+]] = OpConstant %[[TyF16]] 20800{{$}}
+; CHECK-DAG: %[[Const0F16Vec2:[0-9]+]] = OpConstantNull %[[TyF16Vec2]]
+; CHECK-DAG: %[[f:[0-9]+]] = OpVariable %[[TyF16Vec2Ptr]] CrossWorkgroup %[[Const0F16Vec2]]
+; CHECK-DAG: %[[Const0F16Vec4:[0-9]+]] = OpConstantNull %[[TyF16Vec4]]
+; CHECK-DAG: %[[g:[0-9]+]] = OpVariable %[[TyF16Vec4Ptr]] CrossWorkgroup %[[Const0F16Vec4]]
+; CHECK-DAG: %[[ConstF16Vec2:[0-9]+]] = OpConstantComposite %[[TyF16Vec2]] %[[ConstF16]] %[[ConstF16]]
+; CHECK-DAG: %[[ScopeAllSvmDevices:[0-9]+]] = OpConstantNull %[[TyInt32]]
+; CHECK-DAG: %[[MemSeqCst:[0-9]+]] = OpConstant %[[TyInt32]] 16{{$}}
+; CHECK-DAG: %[[ConstF16Vec4:[0-9]+]] = OpConstantComposite %[[TyF16Vec4]] %[[ConstF16]] %[[ConstF16]] %[[ConstF16]] %[[ConstF16]]
@f = common dso_local local_unnamed_addr addrspace(1) global <2 x half> <half 0.000000e+00, half 0.000000e+00>
@g = common dso_local local_unnamed_addr addrspace(1) global <4 x half> <half 0.000000e+00, half 0.000000e+00, half 0.000000e+00, half 0.000000e+00>
@@ -42,4 +42,4 @@ entry:
%minval = atomicrmw fmin ptr addrspace(1) @g, <4 x half> <half 42.000000e+00, half 42.000000e+00, half 42.000000e+00, half 42.000000e+00> seq_cst
%maxval = atomicrmw fmax ptr addrspace(1) @g, <4 x half> <half 42.000000e+00, half 42.000000e+00, half 42.000000e+00, half 42.000000e+00> seq_cst
ret void
-}
\ No newline at end of file
+}
diff --git a/llvm/test/CodeGen/SPIRV/pointers/fun-with-aggregate-arg-in-const-init.ll b/llvm/test/CodeGen/SPIRV/pointers/fun-with-aggregate-arg-in-const-init.ll
index ec3fd41f7de9e..ffac585669c26 100644
--- a/llvm/test/CodeGen/SPIRV/pointers/fun-with-aggregate-arg-in-const-init.ll
+++ b/llvm/test/CodeGen/SPIRV/pointers/fun-with-aggregate-arg-in-const-init.ll
@@ -6,47 +6,57 @@
; CHECK-DAG: OpExtension "SPV_INTEL_function_pointers"
; CHECK-DAG: OpName %[[#fArray:]] "array"
; CHECK-DAG: OpName %[[#fStruct:]] "struct"
+; CHECK-DAG: OpName %[[#f0:]] "f0"
+; CHECK-DAG: OpName %[[#f1:]] "f1"
+; CHECK-DAG: OpName %[[#f2:]] "f2"
; CHECK-DAG: %[[#Int8Ty:]] = OpTypeInt 8 0
-; CHECK: %[[#GlobalInt8PtrTy:]] = OpTypePointer CrossWorkgroup %[[#Int8Ty]]
-; CHECK: %[[#VoidTy:]] = OpTypeVoid
-; CHECK: %[[#TestFnTy:]] = OpTypeFunction %[[#VoidTy]] %[[#GlobalInt8PtrTy]]
-; CHECK: %[[#F16Ty:]] = OpTypeFloat 16
-; CHECK: %[[#t_halfTy:]] = OpTypeStruct %[[#F16Ty]]
-; CHECK: %[[#FnTy:]] = OpTypeFunction %[[#t_halfTy]] %[[#GlobalInt8PtrTy]] %[[#t_halfTy]]
-; CHECK: %[[#IntelFnPtrTy:]] = OpTypePointer CodeSectionINTEL %[[#FnTy]]
-; CHECK: %[[#Int8PtrTy:]] = OpTypePointer Function %[[#Int8Ty]]
-; CHECK: %[[#Int32Ty:]] = OpTypeInt 32 0
-; CHECK: %[[#I32Const3:]] = OpConstant %[[#Int32Ty]] 3
-; CHECK: %[[#FnArrTy:]] = OpTypeArray %[[#Int8PtrTy]] %[[#I32Const3]]
-; CHECK: %[[#GlobalFnArrPtrTy:]] = OpTypePointer CrossWorkgroup %[[#FnArrTy]]
-; CHECK: %[[#GlobalFnPtrTy:]] = OpTypePointer CrossWorkgroup %[[#FnTy]]
-; CHECK: %[[#FnPtrTy:]] = OpTypePointer Function %[[#FnTy]]
-; CHECK: %[[#StructWithPfnTy:]] = OpTypeStruct %[[#FnPtrTy]] %[[#FnPtrTy]] %[[#FnPtrTy]]
-; CHECK: %[[#ArrayOfPfnTy:]] = OpTypeArray %[[#FnPtrTy]] %[[#I32Const3]]
-; CHECK: %[[#Int64Ty:]] = OpTypeInt 64 0
-; CHECK: %[[#GlobalStructWithPfnPtrTy:]] = OpTypePointer CrossWorkgroup %[[#StructWithPfnTy]]
-; CHECK: %[[#GlobalArrOfPfnPtrTy:]] = OpTypePointer CrossWorkgroup %[[#ArrayOfPfnTy]]
-; CHECK: %[[#I64Const2:]] = OpConstant %[[#Int64Ty]] 2
-; CHECK: %[[#I64Const1:]] = OpConstant %[[#Int64Ty]] 1
-; CHECK: %[[#I64Const0:]] = OpConstantNull %[[#Int64Ty]]
-; CHECK: %[[#f0Pfn:]] = OpConstantFunctionPointerINTEL %[[#IntelFnPtrTy]] %28
-; CHECK: %[[#f1Pfn:]] = OpConstantFunctionPointerINTEL %[[#IntelFnPtrTy]] %32
-; CHECK: %[[#f2Pfn:]] = OpConstantFunctionPointerINTEL %[[#IntelFnPtrTy]] %36
-; CHECK: %[[#f0Cast:]] = OpSpecConstantOp %[[#FnPtrTy]] Bitcast %[[#f0Pfn]]
-; CHECK: %[[#f1Cast:]] = OpSpecConstantOp %[[#FnPtrTy]] Bitcast %[[#f1Pfn]]
-; CHECK: %[[#f2Cast:]] = OpSpecConstantOp %[[#FnPtrTy]] Bitcast %[[#f2Pfn]]
-; CHECK: %[[#fnptrTy:]] = OpConstantComposite %[[#ArrayOfPfnTy]] %[[#f0Cast]] %[[#f1Cast]] %[[#f2Cast]]
-; CHECK: %[[#fnptr:]] = OpVariable %[[#GlobalArrOfPfnPtrTy]] CrossWorkgroup %[[#fnptrTy]]
-; CHECK: %[[#fnstructTy:]] = OpConstantComposite %[[#StructWithPfnTy]] %[[#f0Cast]] %[[#f1Cast]] %[[#f2Cast]]
-; CHECK: %[[#fnstruct:]] = OpVariable %[[#GlobalStructWithPfnPtrTy:]] CrossWorkgroup %[[#fnstructTy]]
+; CHECK-DAG: %[[#GlobalInt8PtrTy:]] = OpTypePointer CrossWorkgroup %[[#Int8Ty]]
+; CHECK-DAG: %[[#VoidTy:]] = OpTypeVoid
+; CHECK-DAG: %[[#TestFnTy:]] = OpTypeFunction %[[#VoidTy]] %[[#GlobalInt8PtrTy]]
+; CHECK-DAG: %[[#F16Ty:]] = OpTypeFloat 16
+; CHECK-DAG: %[[#t_halfTy:]] = OpTypeStruct %[[#F16Ty]]
+; CHECK-DAG: %[[#FnTy:]] = OpTypeFunction %[[#t_halfTy]] %[[#GlobalInt8PtrTy]] %[[#t_halfTy]]
+; CHECK-DAG: %[[#IntelFnPtrTy:]] = OpTypePointer CodeSectionINTEL %[[#FnTy]]
+; CHECK-DAG: %[[#Int8PtrTy:]] = OpTypePointer Function %[[#Int8Ty]]
+; CHECK-DAG: %[[#Int32Ty:]] = OpTypeInt 32 0
+; CHECK-DAG: %[[#I32Const3:]] = OpConstant %[[#Int32Ty]] 3
+; CHECK-DAG: %[[#FnArrTy:]] = OpTypeArray %[[#Int8PtrTy]] %[[#I32Const3]]
+; CHECK-DAG: %[[#GlobalFnArrPtrTy:]] = OpTypePointer CrossWorkgroup %[[#FnArrTy]]
+; CHECK-DAG: %[[#GlobalFnPtrTy:]] = OpTypePointer CrossWorkgroup %[[#FnTy]]
+; CHECK-DAG: %[[#FnPtrTy:]] = OpTypePointer Function %[[#FnTy]]
+; CHECK-DAG: %[[#StructWithPfnTy:]] = OpTypeStruct %[[#FnPtrTy]] %[[#FnPtrTy]] %[[#FnPtrTy]]
+; CHECK-DAG: %[[#ArrayOfPfnTy:]] = OpTypeArray %[[#FnPtrTy]] %[[#I32Const3]]
+; CHECK-DAG: %[[#Int64Ty:]] = OpTypeInt 64 0
+; CHECK-DAG: %[[#GlobalStructWithPfnPtrTy:]] = OpTypePointer CrossWorkgroup %[[#StructWithPfnTy]]
+; CHECK-DAG: %[[#GlobalArrOfPfnPtrTy:]] = OpTypePointer CrossWorkgroup %[[#ArrayOfPfnTy]]
+; CHECK-DAG: %[[#I64Const2:]] = OpConstant %[[#Int64Ty]] 2
+; CHECK-DAG: %[[#I64Const1:]] = OpConstant %[[#Int64Ty]] 1
+; CHECK-DAG: %[[#I64Const0:]] = OpConstantNull %[[#Int64Ty]]
+; CHECK-DAG: %[[#f0Pfn:]] = OpConstantFunctionPointerINTEL %[[#IntelFnPtrTy]] %[[#f0]]
+; CHECK-DAG: %[[#f1Pfn:]] = OpConstantFunctionPointerINTEL %[[#IntelFnPtrTy]] %[[#f1]]
+; CHECK-DAG: %[[#f2Pfn:]] = OpConstantFunctionPointerINTEL %[[#IntelFnPtrTy]] %[[#f2]]
+
+; These constants appear twice (duplicated) at the moment
+; CHECK-DAG: %[[#f0Cast_0:]] = OpSpecConstantOp %[[#FnPtrTy]] Bitcast %[[#f0Pfn]]
+; CHECK-DAG: %[[#f1Cast_0:]] = OpSpecConstantOp %[[#FnPtrTy]] Bitcast %[[#f1Pfn]]
+; CHECK-DAG: %[[#f2Cast_0:]] = OpSpecConstantOp %[[#FnPtrTy]] Bitcast %[[#f2Pfn]]
+; CHECK-DAG: %[[#f0Cast_1:]] = OpSpecConstantOp %[[#FnPtrTy]] Bitcast %[[#f0Pfn]]
+; CHECK-DAG: %[[#f1Cast_1:]] = OpSpecConstantOp %[[#FnPtrTy]] Bitcast %[[#f1Pfn]]
+; CHECK-D...
[truncated]
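Most of the test updates in the diff above replace `CHECK` with `CHECK-DAG`, which fits the description's remark that only the order of the emitted instructions changed. As a rough illustration of why that makes the tests robust to reordering, here is a toy model of ordered versus order-insensitive matching. It is simplified to whole lines and plain substrings, the function names are hypothetical, and it does not reproduce FileCheck's actual algorithm.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Ordered matching: each pattern must be found on a line strictly after
// the line matched by the previous pattern (a simplification of CHECK).
bool matchInOrder(const std::vector<std::string> &lines,
                  const std::vector<std::string> &patterns) {
  std::size_t next = 0;
  for (const std::string &p : patterns) {
    bool found = false;
    for (; next < lines.size(); ++next) {
      if (lines[next].find(p) != std::string::npos) {
        found = true;
        ++next; // continue searching after the matched line
        break;
      }
    }
    if (!found)
      return false;
  }
  return true;
}

// Order-insensitive matching: each pattern must match a distinct line,
// in any order (a greedy simplification of CHECK-DAG).
bool matchAnyOrder(const std::vector<std::string> &lines,
                   const std::vector<std::string> &patterns) {
  std::vector<bool> used(lines.size(), false);
  for (const std::string &p : patterns) {
    bool found = false;
    for (std::size_t i = 0; i < lines.size(); ++i) {
      if (!used[i] && lines[i].find(p) != std::string::npos) {
        used[i] = true;
        found = true;
        break;
      }
    }
    if (!found)
      return false;
  }
  return true;
}

// Usage example: two type declarations emitted in swapped order.
inline void example() {
  const std::vector<std::string> output = {"%2 = OpTypeInt 32 0",
                                           "%1 = OpTypeFloat 16"};
  const std::vector<std::string> patterns = {"OpTypeFloat 16",
                                             "OpTypeInt 32 0"};
  assert(!matchInOrder(output, patterns)); // ordered check breaks
  assert(matchAnyOrder(output, patterns)); // unordered check still passes
}
```

The ordered variant fails as soon as two independent declarations swap places, while the order-insensitive variant only requires that each of them appears somewhere.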
|
kwk
left a comment
Thank you for working on this! To me this change in shouldEmitIntrinsicsForGlobalValue looks reasonable, but I don't feel I can judge the tests or the overall handling in context. That's why I'll wait for someone else to approve this change. But with your detailed description in the PR's initial comment and the source code comments, this looks good.
llvm/test/CodeGen/SPIRV/extensions/SPV_KHR_float_controls2/exec_mode3.ll
Sorry for the delays. I had to work on some other issues in parallel, and I wanted to make sure there was no other "quick" alternative. Would you be able to check whether this patch allows you to fully build libclc?
nikic
left a comment
I'm not sure I fully understand what the requirements here are. Is the problem here just that we need some place to emit these instructions, and as the LLVM backend only has a notion of instructions inside functions, we solve this by emitting them in each function?
Would it be possible to instead create some dummy __global_init function that holds all the global declaration/initialization instructions? Or is the problem then that we can't properly reference the global from within functions?
Yes, I'm running the build now and will check it later in the evening. I'm sorry not to have the results earlier, but I'm out during the day.
I think that something along those lines would be possible. We could have all the globals constructors in a single |
I believe it would cause a problem when a GV is referenced in another function. I feel that the handling of global declarations in the SPIR-V backend should be revisited, as the current approach causes multiple translation issues, including in the translation of debug information. Conceptually the current PR LGTM (though I haven't yet taken a look at the code). Redesigning global symbol lowering should be out of scope for the current effort.
2ce3e74 to 1e85a60
|
@jmmartinez When I compile 1e85a60 I run into the following error. I should say that I've built many of your commits yesterday and I never did a [...]
$ cmake ../llvm -DLLVM_ENABLE_PROJECTS="clang" -DLLVM_ENABLE_RUNTIMES="libclc" -DCMAKE_BUILD_TYPE="Debug" -DLLVM_PARALLEL_LINK_JOBS=1 -DLLVM_ENABLE_ZLIB:BOOL=FORCE_ON -DLLVM_ENABLE_ZSTD:BOOL=FORCE_ON -DLLVM_DEFAULT_TARGET_TRIPLE=x86_64-redhat-linux -DCMAKE_GENERATOR=Ninja -DLIBCLC_USE_SPIRV_BACKEND=ON
$ ninja
|
@kwk This looks like an unrelated issue. |
|
@kwk it looks like a different issue. If you can share the bitcode file |
@jmmartinez sure, here it is: builtins.link.spirv64-mesa3d-.zip |
FYI, it looks like [...] The minimal reproducer for this would be:
; RUN: llc -mtriple spirv64-- < %s
define spir_func void @_Z12__clc_remquoffPi() {
%a = call float @llvm.canonicalize.f32(float 1.000000e+00)
store float %a, ptr null
ret void
} |
|
Idea makes sense to me and code looks reasonable, but I don't feel qualified to formally approve :) |
3d01fff to af0668e
|
I'm merging this PR. The 2 test failures are also present on main and were triggered by an update to spirv-val: KhronosGroup/SPIRV-Tools#6524
|
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/66/builds/25808 Here is the relevant piece of the build log for the reference |
|
Reverting this due to a suspicious ASAN failure: #179268 |
…erences them (#178143 (#179268)

This reverts commit 1daef59.

From the ASAN buildbot:

```bash
FAIL: LLVM :: CodeGen/SPIRV/extensions/SPV_INTEL_function_pointers/fun-ptr-addrcast.ll (46596 of 94488)
******************** TEST 'LLVM :: CodeGen/SPIRV/extensions/SPV_INTEL_function_pointers/fun-ptr-addrcast.ll' FAILED ********************
Exit Code: 2

Command Output (stdout):
--
# RUN: at line 5
/home/b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm_build_asan/bin/llc -verify-machineinstrs -O0 -mtriple=spirv32-unknown-unknown /home/b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/llvm/test/CodeGen/SPIRV/extensions/SPV_INTEL_function_pointers/fun-ptr-addrcast.ll -o - --spirv-ext=+SPV_INTEL_function_pointers | /home/b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm_build_asan/bin/FileCheck /home/b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/llvm/test/CodeGen/SPIRV/extensions/SPV_INTEL_function_pointers/fun-ptr-addrcast.ll
# executed command: /home/b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm_build_asan/bin/llc -verify-machineinstrs -O0 -mtriple=spirv32-unknown-unknown /home/b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/llvm/test/CodeGen/SPIRV/extensions/SPV_INTEL_function_pointers/fun-ptr-addrcast.ll -o - --spirv-ext=+SPV_INTEL_function_pointers
# note: command had no output on stdout or stderr
# error: command failed with exit status: 1
# executed command: /home/b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm_build_asan/bin/FileCheck /home/b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/llvm/test/CodeGen/SPIRV/extensions/SPV_INTEL_function_pointers/fun-ptr-addrcast.ll
# .---command stderr------------
# | FileCheck error: '<stdin>' is empty.
# | FileCheck command line: /home/b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm_build_asan/bin/FileCheck /home/b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/llvm/test/CodeGen/SPIRV/extensions/SPV_INTEL_function_pointers/fun-ptr-addrcast.ll
# `-----------------------------
# error: command failed with exit status: 2
```

I did not confirm that this commit introduces the issue, but since it's likely related I'm reverting it until I confirm everything is OK or fix it.
It seems that this PR may not be the root cause of the failure. The failure is already present on main with a reduced example (this PR only triggered it). I'll work on fixing this and then reapply this PR without modifications.
; with ASAN
; llc -verify-machineinstrs -O0 -mtriple=spirv32-unknown-unknown reduced.ll -o /dev/null --spirv-ext=+SPV_INTEL_function_pointers
define void @foo() {
entry:
store ptr addrspace(4) addrspacecast (ptr @foo to ptr addrspace(4)), ptr null, align 8
ret void
} |
I'm seeing a crash with this:
FAILED: /home/fedora/src/llvm-project/main/build/lib/clang/23/lib/spirv32--/libclc.spv
cd /home/fedora/src/llvm-project/main/build/runtimes/runtimes-bins/libclc && /home/fedora/src/llvm-project/main/build/bin/clang-23 -c --target=spirv32-- -x ir -o /home/fedora/src/llvm-project/main/build/./lib/clang/23/lib/spirv32--/libclc.spv /home/fedora/src/llvm-project/main/build/runtimes/runtimes-bins/libclc/obj.libclc.dir/spirv-mesa3d-/builtins.link.spirv-mesa3d-.bc
fatal error: error in backend: unable to legalize instruction: %88:fid(s32) = G_FCANONICALIZE %87:fid (in function: _Z12__clc_remquoffPi)
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace, preprocessed source, and associated run script.
Stack dump:
0. Program arguments: /home/fedora/src/llvm-project/main/build/bin/clang-23 -c --target=spirv32-- -x ir -o /home/fedora/src/llvm-project/main/build/./lib/clang/23/lib/spirv32--/libclc.spv /home/fedora/src/llvm-project/main/build/runtimes/runtimes-bins/libclc/obj.libclc.dir/spirv-mesa3d-/builtins.link.spirv-mesa3d-.bc
1. Code generation
2. Running pass 'Function Pass Manager' on module '/home/fedora/src/llvm-project/main/build/runtimes/runtimes-bins/libclc/obj.libclc.dir/spirv-mesa3d-/builtins.link.spirv-mesa3d-.bc'.
3.
Running pass 'Legalizer' on function '@_Z12__clc_remquoffPi'
#0 0x0000000004b7539a llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) /home/fedora/src/llvm-project/main/llvm/lib/Support/Unix/Signals.inc:842:22
#1 0x0000000004b75866 PrintStackTraceSignalHandler(void*) /home/fedora/src/llvm-project/main/llvm/lib/Support/Unix/Signals.inc:924:1
#2 0x0000000004b72dba llvm::sys::RunSignalHandlers() /home/fedora/src/llvm-project/main/llvm/lib/Support/Signals.cpp:108:20
#3 0x0000000004b74bf8 llvm::sys::CleanupOnSignal(unsigned long) /home/fedora/src/llvm-project/main/llvm/lib/Support/Unix/Signals.inc:376:31
#4 0x0000000004a934af (anonymous namespace)::CrashRecoveryContextImpl::HandleCrash(int, unsigned long) /home/fedora/src/llvm-project/main/llvm/lib/Support/CrashRecoveryContext.cpp:73:5
#5 0x0000000004a93b54 llvm::CrashRecoveryContext::HandleExit(int) /home/fedora/src/llvm-project/main/llvm/lib/Support/CrashRecoveryContext.cpp:446:3
#6 0x0000000004b6ee21 llvm::sys::Process::Exit(int, bool) /home/fedora/src/llvm-project/main/llvm/lib/Support/Process.cpp:114:3
#7 0x000000000040ea84 ensureSufficientStack() /home/fedora/src/llvm-project/main/clang/tools/driver/cc1_main.cpp:87:37
#8 0x0000000004a9ff21 llvm::report_fatal_error(llvm::Twine const&, bool) /home/fedora/src/llvm-project/main/llvm/lib/Support/ErrorHandling.cpp:117:36
#9 0x0000000004aa0100 llvm::install_bad_alloc_error_handler(void (*)(void*, char const*, bool), void*) /home/fedora/src/llvm-project/main/llvm/lib/Support/ErrorHandling.cpp:160:61
#10 0x00000000067c847c reportGISelDiagnostic(llvm::DiagnosticSeverity, llvm::MachineFunction&, llvm::MachineOptimizationRemarkEmitter&, llvm::MachineOptimizationRemarkMissed&) /home/fedora/src/llvm-project/main/llvm/lib/CodeGen/GlobalISel/Utils.cpp:250:14
#11 0x00000000067c8509 llvm::reportGISelFailure(llvm::MachineFunction&, llvm::MachineOptimizationRemarkEmitter&, llvm::MachineOptimizationRemarkMissed&) /home/fedora/src/llvm-project/main/llvm/lib/CodeGen/GlobalISel/Utils.cpp:264:1
#12 0x00000000067c8731 llvm::reportGISelFailure(llvm::MachineFunction&, llvm::MachineOptimizationRemarkEmitter&, char const*, llvm::StringRef, llvm::MachineInstr const&) /home/fedora/src/llvm-project/main/llvm/lib/CodeGen/GlobalISel/Utils.cpp:278:1
#13 0x000000000672b5e0 llvm::Legalizer::runOnMachineFunction(llvm::MachineFunction&) /home/fedora/src/llvm-project/main/llvm/lib/CodeGen/GlobalISel/Legalizer.cpp:360:12
#14 0x00000000039c1fad llvm::MachineFunctionPass::runOnFunction(llvm::Function&) /home/fedora/src/llvm-project/main/llvm/lib/CodeGen/MachineFunctionPass.cpp:112:30
#15 0x0000000004213827 llvm::FPPassManager::runOnFunction(llvm::Function&) /home/fedora/src/llvm-project/main/llvm/lib/IR/LegacyPassManager.cpp:1398:20
#16 0x0000000004213a89 llvm::FPPassManager::runOnModule(llvm::Module&) /home/fedora/src/llvm-project/main/llvm/lib/IR/LegacyPassManager.cpp:1444:13
#17 0x0000000004213e94 (anonymous namespace)::MPPassManager::runOnModule(llvm::Module&) /home/fedora/src/llvm-project/main/llvm/lib/IR/LegacyPassManager.cpp:1513:20
#18 0x000000000420f656 llvm::legacy::PassManagerImpl::run(llvm::Module&) /home/fedora/src/llvm-project/main/llvm/lib/IR/LegacyPassManager.cpp:531:13
#19 0x00000000042146b1 llvm::legacy::PassManager::run(llvm::Module&) /home/fedora/src/llvm-project/main/llvm/lib/IR/LegacyPassManager.cpp:1641:1
#20 0x0000000004f77cb2 (anonymous namespace)::EmitAssemblyHelper::RunCodegenPipeline(clang::BackendAction, std::unique_ptr>&, std::unique_ptr>&) /home/fedora/src/llvm-project/main/clang/lib/CodeGen/BackendUtil.cpp:1278:9
#21 0x0000000004f77ef5 (anonymous namespace)::EmitAssemblyHelper::emitAssembly(clang::BackendAction, std::unique_ptr>, clang::BackendConsumer*) /home/fedora/src/llvm-project/main/clang/lib/CodeGen/BackendUtil.cpp:1303:7
#22 0x0000000004f78f5c clang::emitBackendOutput(clang::CompilerInstance&, clang::CodeGenOptions&, llvm::StringRef, llvm::Module*, clang::BackendAction, llvm::IntrusiveRefCntPtr, std::unique_ptr>, clang::BackendConsumer*) /home/fedora/src/llvm-project/main/clang/lib/CodeGen/BackendUtil.cpp:1473:25
#23 0x0000000005a4b6d6 clang::CodeGenAction::ExecuteAction() /home/fedora/src/llvm-project/main/clang/lib/CodeGen/CodeGenAction.cpp:1183:20
#24 0x0000000005d9441e clang::FrontendAction::Execute() /home/fedora/src/llvm-project/main/clang/lib/Frontend/FrontendAction.cpp:1317:38
#25 0x0000000005caf02a clang::CompilerInstance::ExecuteAction(clang::FrontendAction&) /home/fedora/src/llvm-project/main/clang/lib/Frontend/CompilerInstance.cpp:1007:42
#26 0x0000000005f70118 clang::ExecuteCompilerInvocation(clang::CompilerInstance*) /home/fedora/src/llvm-project/main/clang/lib/FrontendTool/ExecuteCompilerInvocation.cpp:310:38
#27 0x000000000040fbc8 cc1_main(llvm::ArrayRef, char const*, void*) /home/fedora/src/llvm-project/main/clang/tools/driver/cc1_main.cpp:304:40
#28 0x00000000004024e7 ExecuteCC1Tool(llvm::SmallVectorImpl&, llvm::ToolContext const&, llvm::IntrusiveRefCntPtr) /home/fedora/src/llvm-project/main/clang/tools/driver/driver.cpp:226:20
#29 0x00000000004026ea clang_main(int, char**, llvm::ToolContext const&)::'lambda'(llvm::SmallVectorImpl&)::operator()(llvm::SmallVectorImpl&) const /home/fedora/src/llvm-project/main/clang/tools/driver/driver.cpp:376:26
#30 0x0000000000403c9c int llvm::function_ref&)>::callback_fn&)>(long, llvm::SmallVectorImpl&) /home/fedora/src/llvm-project/main/llvm/include/llvm/ADT/STLFunctionalExtras.h:48:3
#31 0x0000000005b0984b llvm::function_ref&)>::operator()(llvm::SmallVectorImpl&) const /home/fedora/src/llvm-project/main/llvm/include/llvm/ADT/STLFunctionalExtras.h:70:3
#32 0x0000000005b08728 clang::driver::CC1Command::Execute(llvm::ArrayRef>, std::__cxx11::basic_string, std::allocator>*, bool*) const::'lambda'()::operator()() const /home/fedora/src/llvm-project/main/clang/lib/Driver/Job.cpp:442:32
#33 0x0000000005b08b96 void llvm::function_ref::callback_fn>, std::__cxx11::basic_string, std::allocator>*, bool*) const::'lambda'()>(long) /home/fedora/src/llvm-project/main/llvm/include/llvm/ADT/STLFunctionalExtras.h:47:40
#34 0x000000000369c1a2 llvm::function_ref::operator()() const /home/fedora/src/llvm-project/main/llvm/include/llvm/ADT/STLFunctionalExtras.h:69:62
#35 0x0000000004a93aed llvm::CrashRecoveryContext::RunSafely(llvm::function_ref) /home/fedora/src/llvm-project/main/llvm/lib/Support/CrashRecoveryContext.cpp:427:10
#36 0x0000000005b08921 clang::driver::CC1Command::Execute(llvm::ArrayRef>, std::__cxx11::basic_string, std::allocator>*, bool*) const /home/fedora/src/llvm-project/main/clang/lib/Driver/Job.cpp:442:7
#37 0x0000000005aa4d68 clang::driver::Compilation::ExecuteCommand(clang::driver::Command const&, clang::driver::Command const*&, bool) const /home/fedora/src/llvm-project/main/clang/lib/Driver/Compilation.cpp:196:22
#38 0x0000000005aa507b clang::driver::Compilation::ExecuteJobs(clang::driver::JobList const&, llvm::SmallVectorImpl>&, bool) const /home/fedora/src/llvm-project/main/clang/lib/Driver/Compilation.cpp:246:62
#39 0x0000000005abac12 clang::driver::Driver::ExecuteCompilation(clang::driver::Compilation&, llvm::SmallVectorImpl>&) /home/fedora/src/llvm-project/main/clang/lib/Driver/Driver.cpp:2265:28
#40 0x0000000000403878 clang_main(int, char**, llvm::ToolContext const&) /home/fedora/src/llvm-project/main/clang/tools/driver/driver.cpp:414:39
#41 0x0000000000432db9 main /home/fedora/src/llvm-project/main/build/tools/clang/tools/driver/clang-driver.cpp:17:20
#42 0x00007f4391924575 __libc_start_call_main (/lib64/libc.so.6+0x3575)
#43 0x00007f4391924628 __libc_start_main@GLIBC_2.2.5 (/lib64/libc.so.6+0x3628)
#44 0x00000000004018e5 _start (/home/fedora/src/llvm-project/main/build/bin/clang-23+0x4018e5)
@kwk that's a failure to select the '@llvm.canonicalize' intrinsic. That issue is orthogonal to both the memory consumption issue and the ASAN issue. I saw a related draft, #178439, that tries to solve the issue you mention.
…them (llvm#178143)

In the SPIRV backend, the `SPIRVEmitIntrinsics::processGlobalValue` function adds intrinsic calls for every global variable of the module, on every function.

These intrinsics are used to keep track of global variables, their types and initializers.

In SPIRV everything is an instruction (even globals/constants). We currently represent these global entities as individual instructions on every function. Later, the `SPIRVModuleAnalysis` collects these entities and maps function _local_ registers to _global_ registers. The `SPIRVAsmPrinter` is in charge of mapping back the _local_ registers to the appropriate _global_ register.

Emitting these instructions for global entities in functions that do not reference them leads to a bloated intermediate representation and high memory consumption (as happened in llvm#170339).

Consider this example:

```cpp
int A[1024] = { 0, 1, 2, ..., 1023 };

int get_a(int i) {
  return A[i];
}

void say_hi() {
  puts("hi!\n");
}
```

Although `say_hi` does not reference `A`, we would generate machine instructions for the declaration of `A`, for the constant array that initializes it, and for the 1024 constant elements of `A`.

This patch doesn't fix the underlying issue, but it mitigates it by emitting the SPIRV intrinsics for a global variable only in the functions that use that global. If the global is not referenced by any function, we just pick the first function definition.

With this patch, the example in llvm#170339 drops from ~33 GB of maximum resident set size to under ~590 MB, and compile time goes from 2:09 min to ~5 s.

The changes in the tests are due to changes in the order in which the instructions appear.

In `fun-with-aggregate-arg-in-const-init.ll` 2 duplicated `OpConstantComposite` are emitted anymore.
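The selection rule described in this commit message can be illustrated with a small, self-contained sketch. This is a toy model and not the code from the patch: `ToyFunction`, `assignGlobalsToFunctions` and `countEmissionSites` are hypothetical names, and globals and functions are plain strings rather than LLVM IR objects. It only shows the rule "emit in every function that references the global, otherwise fall back to the first function definition".

```cpp
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for a function definition: its name plus the
// names of the globals that its body references.
struct ToyFunction {
  std::string Name;
  std::set<std::string> ReferencedGlobals;
};

// For each global, list the functions that would receive its intrinsics:
// every function that references it, or the first function definition
// when no function references it at all.
std::map<std::string, std::vector<std::string>>
assignGlobalsToFunctions(const std::vector<std::string> &Globals,
                         const std::vector<ToyFunction> &Functions) {
  std::map<std::string, std::vector<std::string>> Sites;
  for (const std::string &G : Globals) {
    std::vector<std::string> &Users = Sites[G];
    for (const ToyFunction &F : Functions)
      if (F.ReferencedGlobals.count(G) != 0)
        Users.push_back(F.Name);
    // Fallback for globals that nobody references.
    if (Users.empty() && !Functions.empty())
      Users.push_back(Functions.front().Name);
  }
  return Sites;
}

// Total number of (global, function) pairs that get intrinsics.
std::size_t countEmissionSites(
    const std::map<std::string, std::vector<std::string>> &Sites) {
  std::size_t Count = 0;
  for (const auto &Entry : Sites)
    Count += Entry.second.size();
  return Count;
}
```

With globals `A`, `.str` (standing in for the string literal) and an unreferenced `Unused`, and functions `get_a` (references `A`) and `say_hi` (references `.str`), the old scheme would emit at 3 × 2 = 6 sites, while this rule emits at 3: `A` in `get_a`, `.str` in `say_hi`, and `Unused` in `get_a` as the first definition.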
…erences them (llvm#178143 (llvm#179268)

This reverts commit 1daef59.

From the ASAN buildbot:

```bash
FAIL: LLVM :: CodeGen/SPIRV/extensions/SPV_INTEL_function_pointers/fun-ptr-addrcast.ll (46596 of 94488)
******************** TEST 'LLVM :: CodeGen/SPIRV/extensions/SPV_INTEL_function_pointers/fun-ptr-addrcast.ll' FAILED ********************
Exit Code: 2

Command Output (stdout):
--
# RUN: at line 5
/home/b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm_build_asan/bin/llc -verify-machineinstrs -O0 -mtriple=spirv32-unknown-unknown /home/b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/llvm/test/CodeGen/SPIRV/extensions/SPV_INTEL_function_pointers/fun-ptr-addrcast.ll -o - --spirv-ext=+SPV_INTEL_function_pointers | /home/b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm_build_asan/bin/FileCheck /home/b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/llvm/test/CodeGen/SPIRV/extensions/SPV_INTEL_function_pointers/fun-ptr-addrcast.ll
# executed command: /home/b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm_build_asan/bin/llc -verify-machineinstrs -O0 -mtriple=spirv32-unknown-unknown /home/b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/llvm/test/CodeGen/SPIRV/extensions/SPV_INTEL_function_pointers/fun-ptr-addrcast.ll -o - --spirv-ext=+SPV_INTEL_function_pointers
# note: command had no output on stdout or stderr
# error: command failed with exit status: 1
# executed command: /home/b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm_build_asan/bin/FileCheck /home/b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/llvm/test/CodeGen/SPIRV/extensions/SPV_INTEL_function_pointers/fun-ptr-addrcast.ll
# .---command stderr------------
# | FileCheck error: '<stdin>' is empty.
# | FileCheck command line: /home/b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm_build_asan/bin/FileCheck /home/b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/llvm/test/CodeGen/SPIRV/extensions/SPV_INTEL_function_pointers/fun-ptr-addrcast.ll
# `-----------------------------
# error: command failed with exit status: 2
```

I have not confirmed that this commit introduces the issue, but since it's likely related I'm reverting it until I either confirm everything is OK or fix it.
@jmmartinez were you able to identify or fix the ASAN issue? Apparently you can reproduce the issue without this PR.
Hi. Not yet. From what I understand at the moment, it looks like the |
This test is reduced from a failure detected after attempting to merge llvm#178143. Without ASAN the test passes. `spirv-val` fails likely due to an unimplemented feature in `spirv-val` (assuming our implementation is correct). stack-info: PR: llvm#182550, branch: users/jmmartinez/spirv/memory-issues-2
Our backend implementation is currently very impractical. We keep a dictionary that maps registers to their type. If the register type and the instruction result type do not match, `validatePtrTypes` will insert bitcasts and fix the situation.

However, when no instruction references the type in the dictionary, we may remove it, even if it is needed later.

This patch prevents this from happening for `OpPointerType`.

This is far from a good solution... But it gives us time to:

* reland llvm#178143 and fix llvm#170339, which unblocks some of the backend's users
* explore a better solution than the hidden inconsistent global state that we currently have

stack-info: PR: llvm#182551, branch: users/jmmartinez/spirv/memory-issues-3
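The failure mode described in this commit message can be pictured with a toy model. The sketch below is only an illustration under stated assumptions: `ToyType`, `ToyTypeKind` and `pruneUnreferencedTypes` are hypothetical names, the dictionary is a plain `std::map` keyed by register number, and the real backend's data structures are certainly more involved. It shows a cleanup step that drops type entries no instruction references, with a flag that mimics the workaround of always keeping pointer types.

```cpp
#include <map>

// Hypothetical type kinds for the toy model.
enum class ToyTypeKind { Int, Pointer };

// Hypothetical dictionary entry: the kind of type held by a register and
// whether any instruction currently references it.
struct ToyType {
  ToyTypeKind Kind;
  bool ReferencedByInstruction;
};

// Drop entries that no instruction references. When KeepPointerTypes is
// true, pointer types survive even if unreferenced, mimicking the
// workaround described above.
std::map<unsigned, ToyType>
pruneUnreferencedTypes(std::map<unsigned, ToyType> Types,
                       bool KeepPointerTypes) {
  for (auto It = Types.begin(); It != Types.end();) {
    const ToyType &T = It->second;
    const bool Keep =
        T.ReferencedByInstruction ||
        (KeepPointerTypes && T.Kind == ToyTypeKind::Pointer);
    if (Keep)
      ++It;
    else
      It = Types.erase(It); // erase returns the iterator to the next entry
  }
  return Types;
}
```

In this model, an unreferenced pointer type disappears when `KeepPointerTypes` is false, so a later lookup for that register would fail; with the flag set, the entry is retained at the cost of possibly keeping types that are never used.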