Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Retain qnn input kernel scales #4292

Merged
merged 14 commits into from
Nov 16, 2019

Conversation

u99127
Copy link
Contributor

@u99127 u99127 commented Nov 9, 2019

The QNN dialect loses the input tensor scale and the weight
tensor scale too early. This inhibits the work one could do
with integrating 3rd party codegen or libraries that have interfaces
that expect input tensor scale and weight tensor scales.

This patch stack fixes it up for conv2d and Dense. I cannot see
any other operators affected yet.

See here for more . https://discuss.tvm.ai/t/lowering-qnn-conv2d-tflite/4654/5

Ramana

@anijain2305 - Please review.

Copy link
Contributor

@anijain2305 anijain2305 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall LGTM. The only change I request is changing the name to input_scale and kernel_scale. I have similar naming elsewhere, so will be easy to read.

@u99127
Copy link
Contributor Author

u99127 commented Nov 9, 2019

Overall LGTM. The only change I request is changing the name to input_scale and kernel_scale. I have similar naming elsewhere, so will be easy to read.

Sure.

When I wrote it, I felt it better to keep the name distinct to indicate the difference between input_scale in Requantize vs here but I suppose `tensor' is superfluous to requirements ..

Ramana

@tqchen tqchen added status: need review status: need update need update based on feedbacks labels Nov 10, 2019
@u99127
Copy link
Contributor Author

u99127 commented Nov 11, 2019

Not sure why my testing didn't catch this - hopefully this lot satisfies the CI ...
Ramana

Copy link
Contributor

@anijain2305 anijain2305 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, with some comments.

@u99127
Copy link
Contributor Author

u99127 commented Nov 12, 2019

All updates now done ,please review and merge as appropriate .

Copy link
Contributor

@anijain2305 anijain2305 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@zhiics @yzhliu Can you please merge?

@u99127
Copy link
Contributor Author

u99127 commented Nov 13, 2019

A gentle ping for a merge.

Copy link
Member

@zhiics zhiics left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only one nitpick. Otherwise looks good to me.

python/tvm/relay/qnn/op/qnn.py Show resolved Hide resolved
@u99127
Copy link
Contributor Author

u99127 commented Nov 13, 2019

Whoops , now fixed.

Copy link
Member

@zhiics zhiics left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Ramana Radhakrishnan added 11 commits November 14, 2019 23:50
kernel_tensor_scale.

The lowering in the tflite frontend loses the input_tensor_scale
and the kernel_tensor_scale by multiplying it and putting it into
the Requantize operation. This means that any graph partitioning
passes or other passes that need to access this information no longer
have it available in the qnn dialect.

regards
Ramana
As for conv2d, the tflite frontend drops the input tensor
scale and the weight tensor scale from the relay op. Store
it as separate fields in there.
And use input_scale and kernel_scale
@u99127
Copy link
Contributor Author

u99127 commented Nov 15, 2019

Ok, it looks like there is some more work to be done here because of other bits that have landed.

  • Fix legalisations.py to consider qnn.conv2d and qnn.dense to filter out the input_scale and kernel_scale before calling nn.conv2d and nn.dense. - Still working on this one.
  • Fix qnn.op.dense to have input_scale and kernel_scale- I've done it.

@anijain2305
Copy link
Contributor

Ramana Radhakrishnan added 2 commits November 15, 2019 09:25
nn.conv2d does not contain input_scale and kernel_scale. We need
to delete it when lowering it to nn.conv2d.
@u99127 u99127 force-pushed the retain-qnn-input-kernel-scales branch from 80084dd to 46f3c23 Compare November 15, 2019 13:07
@u99127
Copy link
Contributor Author

u99127 commented Nov 15, 2019

https://github.com/apache/incubator-tvm/blob/5b9f459d638d1cf6bd820f3bdc58d7e5632d7ed7/python/tvm/relay/qnn/op/legalizations.py#L89-L90

For the first problem, you can just delete the extra scale arguments from the attrs.

Ah thanks - I hadn't spotted that last night.

Now rebased and repushed.

Ramana

@u99127
Copy link
Contributor Author

u99127 commented Nov 16, 2019

Now all done, would be nice to merge.

@zhiics zhiics merged commit 3ba9dd0 into apache:master Nov 16, 2019
@zhiics
Copy link
Member

zhiics commented Nov 16, 2019

Thanks @u99127 @anijain2305

@zhiics zhiics added status: accepted and removed status: need review status: need update need update based on feedbacks labels Nov 16, 2019
zxy844288792 pushed a commit to zxy844288792/tvm that referenced this pull request Nov 26, 2019
* Add qnn conv2d attributes for input_tensor_scale and
kernel_tensor_scale.

The lowering in the tflite frontend loses the input_tensor_scale
and the kernel_tensor_scale by multiplying it and putting it into
the Requantize operation. This means that any graph partitioning
passes or other passes that need to access this information no longer
have it available in the qnn dialect.

regards
Ramana

* Store input tensor scale and Weight tensor scale for Dense as well

As for conv2d, the tflite frontend drops the input tensor
scale and the weight tensor scale from the relay op. Store
it as separate fields in there.

* Fix unintentional tab

* Rename input_tensor_scale to input_scale and kernel_tensor_scale
to kernel_scale for conv2d.

* input_tensor_scale -> input_scale weight_tensor_scale->weight_scale

* Rework dense testcase

And use input_scale and kernel_scale

* Be consistent in use of input_scale and kernel_scale values

* Fixup qnn conv2d tests for input_scale and kernel_scale

* Make pydoc identical between conv2d and dense for weight_tensor

* Fix up conv2d parameters to be in the same order between C++ and python

* Fix ordering of parameters for dense.

* Add input_scale and output_scale to try and satisfy ci gods

* Delete input_scale and kernel_scale.

nn.conv2d does not contain input_scale and kernel_scale. We need
to delete it when lowering it to nn.conv2d.

* Add input_scale and kernel_scale for qnn.conv2d
yongwww pushed a commit to neo-ai/tvm that referenced this pull request Nov 26, 2019
* Add qnn conv2d attributes for input_tensor_scale and
kernel_tensor_scale.

The lowering in the tflite frontend loses the input_tensor_scale
and the kernel_tensor_scale by multiplying it and putting it into
the Requantize operation. This means that any graph partitioning
passes or other passes that need to access this information no longer
have it available in the qnn dialect.

regards
Ramana

* Store input tensor scale and Weight tensor scale for Dense as well

As for conv2d, the tflite frontend drops the input tensor
scale and the weight tensor scale from the relay op. Store
it as separate fields in there.

* Fix unintentional tab

* Rename input_tensor_scale to input_scale and kernel_tensor_scale
to kernel_scale for conv2d.

* input_tensor_scale -> input_scale weight_tensor_scale->weight_scale

* Rework dense testcase

And use input_scale and kernel_scale

* Be consistent in use of input_scale and kernel_scale values

* Fixup qnn conv2d tests for input_scale and kernel_scale

* Make pydoc identical between conv2d and dense for weight_tensor

* Fix up conv2d parameters to be in the same order between C++ and python

* Fix ordering of parameters for dense.

* Add input_scale and output_scale to try and satisfy ci gods

* Delete input_scale and kernel_scale.

nn.conv2d does not contain input_scale and kernel_scale. We need
to delete it when lowering it to nn.conv2d.

* Add input_scale and kernel_scale for qnn.conv2d
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants