When lowering MLIR to LLVM, memrefs are lowered through LLVM structs that hold the descriptor info, and the allocas for these structs can exhaust stack space when there are calls with memref args inside a loop. Here's an example snippet:
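The input snippet isn't reproduced above; a minimal sketch of the kind of input that triggers this, with hypothetical names and shapes (the element types are inferred from the descriptor structs in the dump below), would be:

// Hypothetical caller: each iteration calls @foo with three memref arguments.
func @caller(%A: memref<?x?xf32>, %B: memref<?x?xvector<8xf32>>, %C: memref<?x?xvector<8xf32>>, %N: index) {
  affine.for %arg6 = 0 to %N {
    call @foo(%A, %B, %C, %arg6) : (memref<?x?xf32>, memref<?x?xvector<8xf32>>, memref<?x?xvector<8xf32>>, index) -> ()
  }
  return
}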
The call to @foo will be preceded by three allocas corresponding to the memrefs passed. The lowered LLVM dialect snippet is below; given a typical number of %arg6 iterations, it will run out of stack space (with 8 MB stacks).
^bb19(%151: !llvm.i64): // 2 preds: ^bb18, ^bb20
%152 = llvm.icmp "slt" %151, %149 : !llvm.i64
llvm.cond_br %152, ^bb20, ^bb21
^bb20: // pred: ^bb19
%153 = llvm.mlir.constant(1 : index) : !llvm.i64
%154 = llvm.alloca %153 x !llvm<"{ float*, i64, [2 x i64], [2 x i64] }"> : (!llvm.i64) -> !llvm<"{ float*, i64, [2 x i64], [2 x i64] }*">
llvm.store %32, %154 : !llvm<"{ float*, i64, [2 x i64], [2 x i64] }*">
%155 = llvm.mlir.constant(1 : index) : !llvm.i64
%156 = llvm.alloca %155 x !llvm<"{ <8 x float>*, i64, [2 x i64], [2 x i64] }"> : (!llvm.i64) -> !llvm<"{ <8 x float>*, i64, [2 x i64], [2 x i64] }*">
llvm.store %102, %156 : !llvm<"{ <8 x float>*, i64, [2 x i64], [2 x i64] }*">
%157 = llvm.mlir.constant(1 : index) : !llvm.i64
%158 = llvm.alloca %157 x !llvm<"{ <8 x float>*, i64, [2 x i64], [2 x i64] }"> : (!llvm.i64) -> !llvm<"{ <8 x float>*, i64, [2 x i64], [2 x i64] }*">
llvm.store %2, %158 : !llvm<"{ <8 x float>*, i64, [2 x i64], [2 x i64] }*">
llvm.call @foo(%154, %156, %158, %151) : (!llvm<"{ float*, i64, [2 x i64], [2 x i64] }*">, !llvm<"{ <8 x float>*, i64, [2 x i64], [2 x i64] }*">, !llvm<"{ <8 x float>*, i64, [2 x i64], [2 x i64] }*">, !llvm.i64) -> ()
%159 = llvm.add %151, %150 : !llvm.i64
llvm.br ^bb19(%159 : !llvm.i64)
^bb21: // pred: ^bb19
Increasing the stack size is a stopgap and obviously sidesteps the issue here. I think this issue calls for the same approach used for block-local variables in C/C++ (say, large structs with loop-body scope), where the allocas are emitted once in the function's entry block. Another solution is to insert these allocas at the highest level, i.e., right after the descriptors are defined (%32, %102, and %2 above).
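For concreteness, here is a hand-edited sketch of the latter placement (not actual compiler output; %c1, %d0, %d1, %d2 are made-up names): the allocas and stores are emitted once, right after the descriptors are built, and the loop body keeps only the call:

// Descriptor allocas emitted once, immediately after %32, %102, and %2 are defined.
%c1 = llvm.mlir.constant(1 : index) : !llvm.i64
%d0 = llvm.alloca %c1 x !llvm<"{ float*, i64, [2 x i64], [2 x i64] }"> : (!llvm.i64) -> !llvm<"{ float*, i64, [2 x i64], [2 x i64] }*">
llvm.store %32, %d0 : !llvm<"{ float*, i64, [2 x i64], [2 x i64] }*">
%d1 = llvm.alloca %c1 x !llvm<"{ <8 x float>*, i64, [2 x i64], [2 x i64] }"> : (!llvm.i64) -> !llvm<"{ <8 x float>*, i64, [2 x i64], [2 x i64] }*">
llvm.store %102, %d1 : !llvm<"{ <8 x float>*, i64, [2 x i64], [2 x i64] }*">
%d2 = llvm.alloca %c1 x !llvm<"{ <8 x float>*, i64, [2 x i64], [2 x i64] }"> : (!llvm.i64) -> !llvm<"{ <8 x float>*, i64, [2 x i64], [2 x i64] }*">
llvm.store %2, %d2 : !llvm<"{ <8 x float>*, i64, [2 x i64], [2 x i64] }*">
...
^bb20: // pred: ^bb19 -- loop body now contains only the call
llvm.call @foo(%d0, %d1, %d2, %151) : (!llvm<"{ float*, i64, [2 x i64], [2 x i64] }*">, !llvm<"{ <8 x float>*, i64, [2 x i64], [2 x i64] }*">, !llvm<"{ <8 x float>*, i64, [2 x i64], [2 x i64] }*">, !llvm.i64) -> ()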
On a separate note, hoisting such allocas out is valid here; however, LICM won't do it since allocas have side effects. Moreover, it can't be done without knowing what's inside @foo, even if there were a utility to hoist allocas. In a way, the meaning / special property of these alloca'ed descriptors is hard to recover later if you don't exploit it at the time you generate them.