Description
git version: 2172a5e
system: Ubuntu 18.04.6 LTS
Description:
I am experiencing an inconsistent result when executing the same MLIR program with and without linalg-specialize-generic-ops.
Steps to Reproduce:
1. MLIR Program (a.mlir):
module {
  func.func private @printMemrefI32(tensor<*xi32>)
  func.func private @printMemrefF32(tensor<*xf32>)
  func.func @main() -> () {
    %arg0 = index.constant 0
    %6 = "tosa.const"() <{values = dense<-132> : tensor<1x2x1xi32>}> : () -> tensor<1x2x1xi32>
    %11 = tosa.cast %6 : (tensor<1x2x1xi32>) -> tensor<1x2x1xf32>
    %15 = "tosa.const"() <{values = dense<0> : tensor<1x2x1xi32>}> : () -> tensor<1x2x1xi32>
    %16 = tosa.while_loop (%arg1 = %15) : (tensor<1x2x1xi32>) -> tensor<1x2x1xi32> {
      %40 = "tosa.const"() <{values = dense<3> : tensor<1x2x1xi32>}> : () -> tensor<1x2x1xi32>
      %41 = tosa.greater %40, %arg1 : (tensor<1x2x1xi32>, tensor<1x2x1xi32>) -> tensor<1x2x1xi1>
      %extracted = tensor.extract %41[%arg0, %arg0, %arg0] : tensor<1x2x1xi1>
      %from_elements = tensor.from_elements %extracted : tensor<i1>
      tosa.yield %from_elements : tensor<i1>
    } do {
    ^bb0(%arg1: tensor<1x2x1xi32>):
      %45 = "tosa.const"() <{values = dense<1> : tensor<1x2x1xi32>}> : () -> tensor<1x2x1xi32>
      %46 = tosa.add %arg1, %45 : (tensor<1x2x1xi32>, tensor<1x2x1xi32>) -> tensor<1x2x1xi32>
      tosa.yield %46 : tensor<1x2x1xi32>
    }
    %17 = tosa.clamp %16 {max_val = 1073741823 : i32, min_val = -1073741824 : i32} : (tensor<1x2x1xi32>) -> tensor<1x2x1xi32>
    %18 = tosa.clamp %16 {max_val = 1073741823 : i32, min_val = -1073741824 : i32} : (tensor<1x2x1xi32>) -> tensor<1x2x1xi32>
    %19 = tosa.sub %17, %18 : (tensor<1x2x1xi32>, tensor<1x2x1xi32>) -> tensor<1x2x1xi32>
    %cast19 = tensor.cast %19 : tensor<1x2x1xi32> to tensor<*xi32>
    call @printMemrefI32(%cast19) : (tensor<*xi32>) -> ()
    return
  }
}
2. Command to run without linalg-specialize-generic-ops:
/data/szy/MLIR/llvm-release/llvm-project/build/bin/mlir-opt /data/szy/workspace/mlir-inconsistent/a.mlir -pass-pipeline="builtin.module(func.func(tosa-to-linalg-named))" \
| /data/szy/MLIR/llvm-release/llvm-project/build/bin/mlir-opt -tosa-to-scf \
| /data/szy/MLIR/llvm-release/llvm-project/build/bin/mlir-opt -pass-pipeline="builtin.module(func.func(tosa-to-linalg))" \
| /data/szy/MLIR/llvm-release/llvm-project/build/bin/mlir-opt -tosa-to-tensor -tosa-to-arith -convert-scf-to-cf -convert-math-to-llvm \
--linalg-fuse-elementwise-ops --cse --linalg-generalize-named-ops -convert-arith-to-llvm \
-one-shot-bufferize="bufferize-function-boundaries" --linalg-fold-unit-extent-dims -finalize-memref-to-llvm --expand-strided-metadata \
-convert-linalg-to-affine-loops -convert-index-to-llvm -finalize-memref-to-llvm -lower-affine -convert-scf-to-cf \
-convert-arith-to-llvm -convert-cf-to-llvm -finalize-memref-to-llvm -convert-func-to-llvm -reconcile-unrealized-casts \
| timeout 10 /data/szy/MLIR/llvm-release/llvm-project/build/bin/mlir-cpu-runner -e main -entry-point-result=void \
--shared-libs=/data/szy/MLIR/llvm-release/llvm-project/build/lib/libmlir_c_runner_utils.so \
--shared-libs=/data/szy/MLIR/llvm-release/llvm-project/build/lib/libmlir_runner_utils.so \
--shared-libs=/data/szy/MLIR/llvm-release/llvm-project/build/lib/libmlir_async_runtime.so
3. Output without linalg-specialize-generic-ops:
[[[0],
[0]]]
4. Command to run with linalg-specialize-generic-ops:
/data/szy/MLIR/llvm-release/llvm-project/build/bin/mlir-opt /data/szy/workspace/mlir-inconsistent/a.mlir -pass-pipeline="builtin.module(func.func(tosa-to-linalg-named))" \
| /data/szy/MLIR/llvm-release/llvm-project/build/bin/mlir-opt -tosa-to-scf \
| /data/szy/MLIR/llvm-release/llvm-project/build/bin/mlir-opt -pass-pipeline="builtin.module(func.func(tosa-to-linalg))" \
| /data/szy/MLIR/llvm-release/llvm-project/build/bin/mlir-opt -tosa-to-tensor -tosa-to-arith -convert-scf-to-cf -convert-math-to-llvm \
--linalg-fuse-elementwise-ops --cse --linalg-generalize-named-ops -convert-arith-to-llvm \
-one-shot-bufferize="bufferize-function-boundaries" --linalg-fold-unit-extent-dims -finalize-memref-to-llvm --expand-strided-metadata --linalg-specialize-generic-ops \
-convert-linalg-to-affine-loops -convert-index-to-llvm -finalize-memref-to-llvm -lower-affine -convert-scf-to-cf \
-convert-arith-to-llvm -convert-cf-to-llvm -finalize-memref-to-llvm -convert-func-to-llvm -reconcile-unrealized-casts \
| timeout 10 /data/szy/MLIR/llvm-release/llvm-project/build/bin/mlir-cpu-runner -e main -entry-point-result=void \
--shared-libs=/data/szy/MLIR/llvm-release/llvm-project/build/lib/libmlir_c_runner_utils.so \
--shared-libs=/data/szy/MLIR/llvm-release/llvm-project/build/lib/libmlir_runner_utils.so \
--shared-libs=/data/szy/MLIR/llvm-release/llvm-project/build/lib/libmlir_async_runtime.so
5. Output with linalg-specialize-generic-ops:
[[[3],
[3]]]
6. Analysis for this case:
I debugged this issue and found that the faulty pass is linalg-specialize-generic-ops.
The input IR (the IR before running linalg-specialize-generic-ops) can be found in input.txt.
The output IR (the IR after running linalg-specialize-generic-ops) can be found in output.txt.
Please rename the files from .txt to .mlir before using them.
This MLIR program should print [0, 0] for %19 = tosa.sub %17, %18, because %17 and %18 are identical clamps of %16 (the loop result 3 lies inside the clamp range, so both equal %16). Instead, it prints [3, 3], which is the value of %16.
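The expected result can be cross-checked with a small reference model. The following is only a sketch in plain Python, not part of the original report: nested lists stand in for tensor<1x2x1xi32>, and the helper names (splat, elementwise, clamp, reference_main) are hypothetical. The unused tosa.cast (%11) is omitted because it does not affect the printed value.

```python
# Reference model of a.mlir in plain Python.
# Assumption: nested lists of shape 1x2x1 stand in for tensor<1x2x1xi32>;
# all helper names here are illustrative, not MLIR or TOSA APIs.

def splat(value):
    """Build a 1x2x1 nested list filled with `value`, like a dense<value> constant."""
    return [[[value], [value]]]

def elementwise(fn, *tensors):
    """Apply fn element by element across 1x2x1 nested lists."""
    return [[[fn(*(t[0][row][0] for t in tensors))] for row in range(2)]]

def clamp(tensor, min_val, max_val):
    """Model tosa.clamp with integer bounds."""
    return elementwise(lambda v: max(min_val, min(max_val, v)), tensor)

def reference_main():
    acc = splat(0)  # %15
    # Condition region: keep looping while 3 > acc[0][0][0]
    # (the tensor.extract at [0, 0, 0] of the tosa.greater result).
    while 3 > acc[0][0][0]:
        acc = elementwise(lambda a, b: a + b, acc, splat(1))  # %46
    loop_result = acc  # %16
    lhs = clamp(loop_result, -1073741824, 1073741823)  # %17
    rhs = clamp(loop_result, -1073741824, 1073741823)  # %18
    diff = elementwise(lambda a, b: a - b, lhs, rhs)   # %19
    return loop_result, diff

if __name__ == "__main__":
    loop_result, diff = reference_main()
    print("%16 =", loop_result)
    print("%19 =", diff)
```

Under these assumptions the model gives [[[3], [3]]] for %16 and [[[0], [0]]] for %19, which matches the output of the pipeline without linalg-specialize-generic-ops.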
To debug this issue, I printed the IR after each pass and found that the IR is still correct immediately before the --linalg-specialize-generic-ops pass runs (input.txt). As shown in the first image, %reinterpret_cast_24 (the final result) is assigned the value of %9, a constant with the value 0. After running --linalg-specialize-generic-ops (output.txt), however, the second image shows that the linalg.generic operation is incorrectly rewritten into a linalg.copy, which propagates the value 3 from %reinterpret_cast_19 to %reinterpret_cast_24 and produces the incorrect final result.

