
[numeric] Numeric error for Conv operator with quantize/dequantize #19416

Open
pdhirajkumarprasad opened this issue Dec 9, 2024 · 2 comments
Labels
bug 🐞 Something isn't working

Comments

@pdhirajkumarprasad

pdhirajkumarprasad commented Dec 9, 2024

What happened?

For the given IR:

module {
  func.func @main_graph(%arg0: !torch.vtensor<[1,3,224,224],f32>, %arg1: !torch.vtensor<[1,24,112,112],f32>) -> !torch.vtensor<[1,24,112,112],f32> attributes {torch.onnx_meta.ir_version = 8 : si64, torch.onnx_meta.opset_version = 21 : si64, torch.onnx_meta.opset_versions = {ai.onnx.contrib = 1 : si64, ai.onnx.ml = 4 : si64, ai.onnx.preview.training = 1 : si64, ai.onnx.training = 1 : si64, com.microsoft = 1 : si64, com.microsoft.experimental = 1 : si64, com.microsoft.nchwc = 1 : si64, org.pytorch.aten = 1 : si64}, torch.onnx_meta.producer_name = "vai_q_onnx", torch.onnx_meta.producer_version = "1.17.0+43059a7"} {
    %12 = torch.operator "onnx.Constant"() {torch.onnx.value = dense<0> : tensor<si8>} : () -> !torch.vtensor<[],si8> 
    %13 = torch.operator "onnx.Constant"() {torch.onnx.value = dense<5.000000e-01> : tensor<f32>} : () -> !torch.vtensor<[],f32> 
    %14 = torch.operator "onnx.Constant"() {torch.onnx.value = dense<1.000000e+00> : tensor<f32>} : () -> !torch.vtensor<[],f32> 
    %15 = torch.operator "onnx.Constant"() {torch.onnx.value = dense<0> : tensor<si8>} : () -> !torch.vtensor<[],si8> 
    %16 = torch.operator "onnx.Constant"() {torch.onnx.value = dense_resource<_onnx__Conv_1060_quantized> : tensor<24x1x3x3xsi8>} : () -> !torch.vtensor<[24,1,3,3],si8> 
    %17 = torch.operator "onnx.Constant"() {torch.onnx.value = dense<2.500000e-01> : tensor<f32>} : () -> !torch.vtensor<[],f32> 
    %18 = torch.operator "onnx.Constant"() {torch.onnx.value = dense<0> : tensor<si8>} : () -> !torch.vtensor<[],si8> 
    %19 = torch.operator "onnx.Constant"() {torch.onnx.value = dense_resource<_onnx__Conv_1061_quantized> : tensor<24xsi8>} : () -> !torch.vtensor<[24],si8> 
    %24 = torch.operator "onnx.DequantizeLinear"(%16, %14, %15) : (!torch.vtensor<[24,1,3,3],si8>, !torch.vtensor<[],f32>, !torch.vtensor<[],si8>) -> !torch.vtensor<[24,1,3,3],f32> 
    %25 = torch.operator "onnx.DequantizeLinear"(%19, %17, %18) : (!torch.vtensor<[24],si8>, !torch.vtensor<[],f32>, !torch.vtensor<[],si8>) -> !torch.vtensor<[24],f32> 
    %35 = torch.operator "onnx.QuantizeLinear"(%arg1, %13, %12) : (!torch.vtensor<[1,24,112,112],f32>, !torch.vtensor<[],f32>, !torch.vtensor<[],si8>) -> !torch.vtensor<[1,24,112,112],si8> 
    %36 = torch.operator "onnx.DequantizeLinear"(%35, %13, %12) : (!torch.vtensor<[1,24,112,112],si8>, !torch.vtensor<[],f32>, !torch.vtensor<[],si8>) -> !torch.vtensor<[1,24,112,112],f32> 
    %37 = torch.operator "onnx.Conv"(%36, %24, %25) {torch.onnx.auto_pad = "NOTSET", torch.onnx.dilations = [1 : si64, 1 : si64], torch.onnx.group = 24 : si64, torch.onnx.kernel_shape = [3 : si64, 3 : si64], torch.onnx.pads = [1 : si64, 1 : si64, 1 : si64, 1 : si64], torch.onnx.strides = [1 : si64, 1 : si64]} : (!torch.vtensor<[1,24,112,112],f32>, !torch.vtensor<[24,1,3,3],f32>, !torch.vtensor<[24],f32>) -> !torch.vtensor<[1,24,112,112],f32> 
    return %37 : !torch.vtensor<[1,24,112,112],f32>
  }
}

{-#
  dialect_resources: {
    builtin: {
      _onnx__Conv_1060_quantized: "0x0800000000000000FF00000000000000000000000000000000000000000000000000000000000000FBE208EAA4F91B7A0100000000FE0000000000010000010000FE0000320700CEF703FD0200000000FF0000020003F9FDF529FCFEFB0200000001FF0000000000000000000000000000000000020000000000000000000000010000000000010000000000000000000000000000000000000000000000000000FC0100000300000000010000000000000000030000000000000000FF00000000FF0001FE000200000000000000FF00000000000000000000000000",
      _onnx__Conv_1061_quantized: "0x0800000012044E020B59F50B030B0B0F020114FBFE0800FE040B1014"
    }
  }
#-}

I'm getting a numeric error:

EXEC @main_graph
[FAILED] result[0]: element at index 50176 (3) does not match the expected (2.75); expected that the view is equal to contents of a view of 1x24x112x112xf32
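
For reference, here is a minimal numpy sketch of the ONNX QuantizeLinear/DequantizeLinear semantics applied in the input round trip (%35/%36 above, scale 0.5, zero point 0). The sample value 2.75 is hypothetical, chosen to match the expected element in the failure; it only illustrates that the round trip snaps values to multiples of the scale (2.75 maps to 3.0), which is the kind of discrepancy the failure reports.

import numpy as np

def quantize_linear(x, scale, zero_point):
    # ONNX QuantizeLinear: round half to even, then saturate to int8.
    q = np.rint(x / scale) + zero_point
    return np.clip(q, -128, 127).astype(np.int8)

def dequantize_linear(q, scale, zero_point):
    # ONNX DequantizeLinear: (q - zero_point) * scale.
    return (q.astype(np.float32) - zero_point) * scale

x = np.float32(2.75)                 # expected output element from the error above
q = quantize_linear(x, 0.5, 0)       # round(5.5) -> 6 (round half to even)
print(dequantize_linear(q, 0.5, 0))  # 3.0 -- values snap to multiples of 0.5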

Steps to reproduce your issue

Commands:

iree-compile model.torch_onnx.mlir --iree-hal-target-backends=llvm-cpu --iree-llvmcpu-target-cpu=host -o compiled_model.vmfb 

iree-run-module --module='compiled_model.vmfb' --device=local-task --function='main_graph' --input='[email protected]' --input='[email protected]'  --output=@'output.0.bin' --expected_output='1x24x112x112xf32=@golden_output.0.bin'

Version: IREE compiler version 3.1.0rc20241208 @ 39c56de

golden_output.0.bin.txt
input.0.bin.txt
input.1.bin.txt
model.torch_onnx.mlir.txt

Models impacted: fbnetv3_d.ra2_in1k* and others; 50+ models in total.

What component(s) does this issue relate to?

Runtime

Version information

No response

Additional context

No response

@pdhirajkumarprasad pdhirajkumarprasad added the bug 🐞 Something isn't working label Dec 9, 2024
@pdhirajkumarprasad pdhirajkumarprasad changed the title [numeric] Numeric error for HardSigmoid with Conv operator [numeric] Numeric error for Conv operator with quantize/dequantize Dec 9, 2024
@zjgarvey
Contributor

zjgarvey commented Dec 9, 2024

What model is this coming from? The fact that the bias quantization scale is not the same as the product of weight and input scales is concerning. I would not expect us to have comparable numerics for such an example without some additional work in the TorchFuseQuantizedOps pass.
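
Reading the scales off the constants in the IR makes the mismatch concrete. A minimal check (variable names are illustrative; the values come from %13, %14, and %17 above):

# Scales from the reported IR:
#   input scale (%13) = 0.5, weight scale (%14) = 1.0, bias scale (%17) = 0.25
input_scale = 0.5
weight_scale = 1.0
bias_scale = 0.25

# The usual quantized-conv convention expects the bias to be quantized at
# input_scale * weight_scale so it can be added directly to the accumulator.
expected_bias_scale = input_scale * weight_scale
print(expected_bias_scale)                 # 0.5
print(bias_scale == expected_bias_scale)   # False -- the mismatch noted above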

@zjgarvey
Contributor

zjgarvey commented Dec 9, 2024

See https://github.com/zjgarvey/SHARK-TestSuite/tree/conv_numerics_repro for operator-level tests in the test suite.

You can run both of the examples with:

python run.py -m cl-onnx-iree -v -t qconv_numerics

If all of our models are failing because of bias, weight, and input scale mismatches, we should add support for this behavior at the torch-mlir level.
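
A minimal sketch of what that support might look like, assuming the fix rescales the quantized bias to the conventional accumulator scale input_scale * weight_scale; the helper name is hypothetical and this is not an existing torch-mlir API:

import numpy as np

def rescale_bias(bias_q, bias_scale, input_scale, weight_scale):
    # Dequantize the int8 bias at its own scale (zero point is 0 in the IR),
    # then requantize to int32 at the accumulator scale expected by the conv.
    acc_scale = input_scale * weight_scale
    bias_fp = bias_q.astype(np.float32) * bias_scale
    return np.rint(bias_fp / acc_scale).astype(np.int32)

# First few bytes of _onnx__Conv_1061_quantized from the IR above.
bias_q = np.array([0x12, 0x04, 0x4E], dtype=np.int8)
print(rescale_bias(bias_q, 0.25, 0.5, 1.0))   # [ 9  2 39]

Note that when the original bias grid is finer than the accumulator grid (as here, 0.25 vs. 0.5), the rescale rounds and can change the bias by up to half an accumulator step.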
