
Conversation

@guoyu-wang
Contributor

Description: [NNAPI EP] Add uint8 support for Transpose/Concat/MaxPool, and add support for QLinearSigmoid

Motivation and Context

  • This is for one of our 1P models (Office Lens).
  • NNAPI runs more efficiently when the majority of the model (or the entire model) is quantized, so that the model can run on the NPU and avoid NPU/GPU/CPU context switches.
  • Add support for uint8 input to Transpose/Concat/MaxPool when these nodes do not consume graph inputs. NNAPI requires explicit scale and zero point values, and we can only get these values from the output of another node in the graph/partition.
  • Add QLinearSigmoid support.

@guoyu-wang guoyu-wang requested a review from a team as a code owner February 2, 2021 06:14
// Not running using quantized input
if (input_type == ONNX_NAMESPACE::TensorProto_DataType_FLOAT)
return true;

Contributor


Does this mean quantization isn't relevant here, and maybe this check should be outside of IsInternalQuantizationSupported?

// This should not happen, but if it happens make sure this will require an impossible version
if (!GetType(*node.InputDefs()[0], input_type))
return std::numeric_limits<int32_t>::max();

Contributor


If it should never happen (type inferencing should always have populated this value), why would we not throw?

Contributor Author


I want to be on the safe side; this function does not return a Status, so we don't want this to crash the runtime.

@skottmckay
Contributor

General comment. When you have so much implicit knowledge it can be hard to recognize the lines of code that may need a quick explanation, but it would be very helpful to someone new to the whole setup if there were a few more comments explaining the 'why' behind various things.

@guoyu-wang
Contributor Author


There will be separate work to add documentation for both the NNAPI and CoreML EPs:
Task 1030676: Add documentation for NNAPI and CoreML EP

skottmckay
skottmckay previously approved these changes Feb 3, 2021
@skottmckay skottmckay left a comment

:shipit:


@guoyu-wang guoyu-wang merged commit 464dbef into master Feb 3, 2021
@guoyu-wang guoyu-wang deleted the gwang-msft/nnapi_resize_concat_use_uint8 branch February 3, 2021 21:45


3 participants