Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
[MLIR][OpenMP] Correctly handle branching within target captures (#217)
This patch improves the detection of captured OpenMP constructs inside of an `omp.target` operation by also considering potential branches. If a nested OpenMP construct can be executed in a loop or optionally omitted by means of explicit MLIR control flow, then it's not supposed to be captured. The following Fortran example results in such a case: ```f90 !$omp target teams do i = 1, n !$omp distribute parallel do do j = 1, n ... end do !$omp end distribute parallel do end do !$omp end target teams ``` The result of lowering that code to MLIR is the creation of multiple blocks and branches inside of the `omp.teams` operation's region. Without this change, it is identified as an SPMD kernel during translation to LLVM IR due to the nesting of operations, but it is a generic kernel, so it causes a compiler crash. This is because it tries to get host-evaluated loop bounds that do not exist to calculate the inner loop's trip count. ```mlir omp.target map_entries(...) { ... omp.teams { %233 = llvm.trunc %227 : i64 to i32 llvm.br ^bb1(%233, %225 : i32, i64) ^bb1(%234: i32, %235: i64): // 2 preds: ^bb0, ^bb2 %236 = llvm.icmp "sgt" %235, %226 : i64 llvm.cond_br %236, ^bb2, ^bb3 ^bb2: // pred: ^bb1 llvm.store %234, %arg5 : i32, !llvm.ptr omp.parallel ... { ... } {omp.composite} ... llvm.br ^bb1(%239, %240 : i32, i64) ^bb3: // pred: ^bb1 llvm.store %234, %arg5 : i32, !llvm.ptr omp.terminator } omp.terminator } ```
- Loading branch information