
[numeric] Numeric error for Conv operator with quantize/dequantize #19416

Open
pdhirajkumarprasad opened this issue Dec 9, 2024 · 2 comments
Labels
bug 🐞 Something isn't working

Comments

@pdhirajkumarprasad

pdhirajkumarprasad commented Dec 9, 2024

What happened?

For the given IR:

module {
  func.func @main_graph(%arg0: !torch.vtensor<[1,3,224,224],f32>, %arg1: !torch.vtensor<[1,24,112,112],f32>) -> !torch.vtensor<[1,24,112,112],f32> attributes {torch.onnx_meta.ir_version = 8 : si64, torch.onnx_meta.opset_version = 21 : si64, torch.onnx_meta.opset_versions = {ai.onnx.contrib = 1 : si64, ai.onnx.ml = 4 : si64, ai.onnx.preview.training = 1 : si64, ai.onnx.training = 1 : si64, com.microsoft = 1 : si64, com.microsoft.experimental = 1 : si64, com.microsoft.nchwc = 1 : si64, org.pytorch.aten = 1 : si64}, torch.onnx_meta.producer_name = "vai_q_onnx", torch.onnx_meta.producer_version = "1.17.0+43059a7"} {
    %12 = torch.operator "onnx.Constant"() {torch.onnx.value = dense<0> : tensor<si8>} : () -> !torch.vtensor<[],si8> 
    %13 = torch.operator "onnx.Constant"() {torch.onnx.value = dense<5.000000e-01> : tensor<f32>} : () -> !torch.vtensor<[],f32> 
    %14 = torch.operator "onnx.Constant"() {torch.onnx.value = dense<1.000000e+00> : tensor<f32>} : () -> !torch.vtensor<[],f32> 
    %15 = torch.operator "onnx.Constant"() {torch.onnx.value = dense<0> : tensor<si8>} : () -> !torch.vtensor<[],si8> 
    %16 = torch.operator "onnx.Constant"() {torch.onnx.value = dense_resource<_onnx__Conv_1060_quantized> : tensor<24x1x3x3xsi8>} : () -> !torch.vtensor<[24,1,3,3],si8> 
    %17 = torch.operator "onnx.Constant"() {torch.onnx.value = dense<2.500000e-01> : tensor<f32>} : () -> !torch.vtensor<[],f32> 
    %18 = torch.operator "onnx.Constant"() {torch.onnx.value = dense<0> : tensor<si8>} : () -> !torch.vtensor<[],si8> 
    %19 = torch.operator "onnx.Constant"() {torch.onnx.value = dense_resource<_onnx__Conv_1061_quantized> : tensor<24xsi8>} : () -> !torch.vtensor<[24],si8> 
    %24 = torch.operator "onnx.DequantizeLinear"(%16, %14, %15) : (!torch.vtensor<[24,1,3,3],si8>, !torch.vtensor<[],f32>, !torch.vtensor<[],si8>) -> !torch.vtensor<[24,1,3,3],f32> 
    %25 = torch.operator "onnx.DequantizeLinear"(%19, %17, %18) : (!torch.vtensor<[24],si8>, !torch.vtensor<[],f32>, !torch.vtensor<[],si8>) -> !torch.vtensor<[24],f32> 
    %35 = torch.operator "onnx.QuantizeLinear"(%arg1, %13, %12) : (!torch.vtensor<[1,24,112,112],f32>, !torch.vtensor<[],f32>, !torch.vtensor<[],si8>) -> !torch.vtensor<[1,24,112,112],si8> 
    %36 = torch.operator "onnx.DequantizeLinear"(%35, %13, %12) : (!torch.vtensor<[1,24,112,112],si8>, !torch.vtensor<[],f32>, !torch.vtensor<[],si8>) -> !torch.vtensor<[1,24,112,112],f32> 
    %37 = torch.operator "onnx.Conv"(%36, %24, %25) {torch.onnx.auto_pad = "NOTSET", torch.onnx.dilations = [1 : si64, 1 : si64], torch.onnx.group = 24 : si64, torch.onnx.kernel_shape = [3 : si64, 3 : si64], torch.onnx.pads = [1 : si64, 1 : si64, 1 : si64, 1 : si64], torch.onnx.strides = [1 : si64, 1 : si64]} : (!torch.vtensor<[1,24,112,112],f32>, !torch.vtensor<[24,1,3,3],f32>, !torch.vtensor<[24],f32>) -> !torch.vtensor<[1,24,112,112],f32> 
    return %37 : !torch.vtensor<[1,24,112,112],f32>
  }
}

{-#
  dialect_resources: {
    builtin: {
      _onnx__Conv_1060_quantized: "0x0800000000000000FF00000000000000000000000000000000000000000000000000000000000000FBE208EAA4F91B7A0100000000FE0000000000010000010000FE0000320700CEF703FD0200000000FF0000020003F9FDF529FCFEFB0200000001FF0000000000000000000000000000000000020000000000000000000000010000000000010000000000000000000000000000000000000000000000000000FC0100000300000000010000000000000000030000000000000000FF00000000FF0001FE000200000000000000FF00000000000000000000000000",
      _onnx__Conv_1061_quantized: "0x0800000012044E020B59F50B030B0B0F020114FBFE0800FE040B1014"
    }
  }
#-}

I'm getting a numeric error:

EXEC @main_graph
[FAILED] result[0]: element at index 50176 (3) does not match the expected (2.75); expected that the view is equal to contents of a view of 1x24x112x112xf32
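
For reference, here is a minimal numpy sketch of the ONNX QuantizeLinear/DequantizeLinear semantics applied in the input round trip (%35/%36 above, scale 0.5, zero point 0). The sample value 2.75 is hypothetical, chosen to match the expected element in the failure; it only illustrates that the round trip snaps values to multiples of the scale (2.75 maps to 3.0), which is the kind of discrepancy the failure reports.

import numpy as np

def quantize_linear(x, scale, zero_point):
    # ONNX QuantizeLinear: round half to even, then saturate to int8.
    q = np.rint(x / scale) + zero_point
    return np.clip(q, -128, 127).astype(np.int8)

def dequantize_linear(q, scale, zero_point):
    # ONNX DequantizeLinear: (q - zero_point) * scale.
    return (q.astype(np.float32) - zero_point) * scale

x = np.float32(2.75)                 # expected output element from the error above
q = quantize_linear(x, 0.5, 0)       # round(5.5) -> 6 (round half to even)
print(dequantize_linear(q, 0.5, 0))  # 3.0 -- values snap to multiples of 0.5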

Steps to reproduce your issue

Commands:

iree-compile model.torch_onnx.mlir --iree-hal-target-backends=llvm-cpu --iree-llvmcpu-target-cpu=host -o compiled_model.vmfb 

iree-run-module --module='compiled_model.vmfb' --device=local-task --function='main_graph' --input='[email protected]' --input='[email protected]'  --output=@'output.0.bin' --expected_output='1x24x112x112xf32=@golden_output.0.bin'

Version: IREE compiler version 3.1.0rc20241208 @ 39c56de

golden_output.0.bin.txt
input.0.bin.txt
input.1.bin.txt
model.torch_onnx.mlir.txt

Models impacted: fbnetv3_d.ra2_in1k* and others; 50+ models in total.

What component(s) does this issue relate to?

Runtime

Version information

No response

Additional context

No response

@pdhirajkumarprasad pdhirajkumarprasad added the bug 🐞 Something isn't working label Dec 9, 2024
@pdhirajkumarprasad pdhirajkumarprasad changed the title [numeric] Numeric error for HardSigmoid with Conv operator [numeric] Numeric error for Conv operator with quantize/dequantize Dec 9, 2024
@zjgarvey
Contributor

zjgarvey commented Dec 9, 2024

What model is this coming from? The fact that the bias quantization scale is not the same as the product of weight and input scales is concerning. I would not expect us to have comparable numerics for such an example without some additional work in the TorchFuseQuantizedOps pass.
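
Reading the scales off the constants in the IR makes the mismatch concrete. A minimal check (variable names are illustrative; the values come from %13, %14, and %17 above):

# Scales from the reported IR:
#   input scale (%13) = 0.5, weight scale (%14) = 1.0, bias scale (%17) = 0.25
input_scale = 0.5
weight_scale = 1.0
bias_scale = 0.25

# The usual quantized-conv convention expects the bias to be quantized at
# input_scale * weight_scale so it can be added directly to the accumulator.
expected_bias_scale = input_scale * weight_scale
print(expected_bias_scale)                 # 0.5
print(bias_scale == expected_bias_scale)   # False -- the mismatch noted above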

@zjgarvey
Contributor

zjgarvey commented Dec 9, 2024

See https://github.com/zjgarvey/SHARK-TestSuite/tree/conv_numerics_repro for operator-level tests in the test suite.

You can run both of the examples with:

python run.py -m cl-onnx-iree -v -t qconv_numerics

If all of our models are failing because of bias, weight, and input scale mismatches, we should add support for this behavior at the torch-mlir level.
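
A minimal sketch of what that support might look like, assuming the fix rescales the quantized bias to the conventional accumulator scale input_scale * weight_scale; the helper name is hypothetical and this is not an existing torch-mlir API:

import numpy as np

def rescale_bias(bias_q, bias_scale, input_scale, weight_scale):
    # Dequantize the int8 bias at its own scale (zero point is 0 in the IR),
    # then requantize to int32 at the accumulator scale expected by the conv.
    acc_scale = input_scale * weight_scale
    bias_fp = bias_q.astype(np.float32) * bias_scale
    return np.rint(bias_fp / acc_scale).astype(np.int32)

# First few bytes of _onnx__Conv_1061_quantized from the IR above.
bias_q = np.array([0x12, 0x04, 0x4E], dtype=np.int8)
print(rescale_bias(bias_q, 0.25, 0.5, 1.0))   # [ 9  2 39]

Note that when the original bias grid is finer than the accumulator grid (as here, 0.25 vs. 0.5), the rescale rounds and can change the bias by up to half an accumulator step.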
