
Conversation

@guoyu-wang
Contributor

Description: [NNAPI EP] Add uint8 support for Transpose/Concat/MaxPool, and add support for QLinearSigmoid

Motivation and Context

  • This is for one of our 1P models (Office Lens).
  • NNAPI runs more efficiently when the majority of the model (or the entire model) is quantized, so that the model can run on the NPU and avoid NPU/GPU/CPU context switches.
  • Add support for uint8 input to Transpose/Concat/MaxPool when these nodes do not consume graph inputs. NNAPI requires explicit scale and zero point values, and we can only get these values from the output of another node in the graph/partition.
  • Add QLinearSigmoid support.

@guoyu-wang guoyu-wang requested a review from a team as a code owner February 2, 2021 06:14
// Not running using quantized input
if (input_type == ONNX_NAMESPACE::TensorProto_DataType_FLOAT)
return true;

Contributor


Does this mean quantization isn't relevant here, and maybe this check should be outside of IsInternalQuantizationSupported?

// This should not happen, but if it happens make sure this will require an impossible version
if (!GetType(*node.InputDefs()[0], input_type))
return std::numeric_limits<int32_t>::max();

Contributor


If it should never happen (type inferencing should always have populated this value), why would we not throw?

Contributor Author


I want to be on the safe side; this function does not return a Status, so we don't want this to crash the runtime.

@skottmckay
Contributor

General comment. When you have so much implicit knowledge it can be hard to recognize the lines of code that may need a quick explanation, but it would be very helpful to someone new to the whole setup if there were a few more comments explaining the 'why' behind various things.

@guoyu-wang
Contributor Author


There will be separate work to add documentation for both the NNAPI and CoreML EPs:
Task 1030676: Add documentation for NNAPI and CoreML EP

skottmckay
skottmckay previously approved these changes Feb 3, 2021
@skottmckay skottmckay left a comment

:shipit:


@guoyu-wang guoyu-wang merged commit 464dbef into master Feb 3, 2021
@guoyu-wang guoyu-wang deleted the gwang-msft/nnapi_resize_concat_use_uint8 branch February 3, 2021 21:45


3 participants