Retain qnn input kernel scales #4292
Conversation
Overall LGTM. The only change I request is renaming to input_scale and kernel_scale. We have similar naming elsewhere, so it will be easy to read.
Sure. When I wrote it, I felt it better to keep the names distinct to indicate the difference between input_scale in Requantize and here, but I suppose `tensor` is superfluous to requirements. Ramana
Not sure why my testing didn't catch this - hopefully this lot satisfies the CI.
LGTM, with some comments.
All updates are now done; please review and merge as appropriate.
A gentle ping for a merge.
Only one nitpick. Otherwise looks good to me.
Whoops, now fixed.
LGTM
Ok, it looks like there is some more work to be done here because of other bits that have landed.
For the first problem, you can just delete the extra scale arguments from the attrs.
nn.conv2d does not contain input_scale and kernel_scale. We need to delete them when lowering to nn.conv2d.
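A minimal sketch of that lowering step, using plain Python dicts to stand in for Relay attrs (the helper name `lower_qnn_conv2d_attrs` and the dict representation are illustrative, not the actual TVM implementation):

```python
# Hypothetical sketch: when qnn.conv2d is lowered to nn.conv2d, the
# qnn-only attributes must be stripped, because nn.conv2d's attrs do
# not define them.
QNN_ONLY_ATTRS = {"input_scale", "kernel_scale",
                  "input_zero_point", "kernel_zero_point"}

def lower_qnn_conv2d_attrs(qnn_attrs):
    """Return only the attrs nn.conv2d understands."""
    return {k: v for k, v in qnn_attrs.items() if k not in QNN_ONLY_ATTRS}

qnn_attrs = {"strides": (1, 1), "padding": (0, 0),
             "input_scale": 0.5, "kernel_scale": 0.25,
             "input_zero_point": 0, "kernel_zero_point": 0}
print(lower_qnn_conv2d_attrs(qnn_attrs))  # only strides and padding remain
```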
Force-pushed from 80084dd to 46f3c23.
Ah thanks - I hadn't spotted that last night. Now rebased and repushed. Ramana
Now all done; it would be nice to merge.
Thanks @u99127 @anijain2305 |
* Add qnn conv2d attributes for input_tensor_scale and kernel_tensor_scale. The lowering in the tflite frontend loses the input_tensor_scale and the kernel_tensor_scale by multiplying them and putting the product into the Requantize operation. This means that any graph partitioning passes, or other passes that need to access this information, no longer have it available in the qnn dialect.
* Store input tensor scale and weight tensor scale for Dense as well. As for conv2d, the tflite frontend drops the input tensor scale and the weight tensor scale from the relay op. Store them as separate fields there.
* Fix unintentional tab.
* Rename input_tensor_scale to input_scale and kernel_tensor_scale to kernel_scale for conv2d.
* input_tensor_scale -> input_scale, weight_tensor_scale -> weight_scale.
* Rework dense testcase and use input_scale and kernel_scale.
* Be consistent in use of input_scale and kernel_scale values.
* Fix up qnn conv2d tests for input_scale and kernel_scale.
* Make pydoc identical between conv2d and dense for weight_tensor.
* Fix up conv2d parameters to be in the same order between C++ and Python.
* Fix ordering of parameters for dense.
* Add input_scale and output_scale to satisfy CI.
* Delete input_scale and kernel_scale. nn.conv2d does not contain input_scale and kernel_scale. We need to delete them when lowering to nn.conv2d.
* Add input_scale and kernel_scale for qnn.conv2d.
The QNN dialect loses the input tensor scale and the weight tensor scale too early. This inhibits work on integrating third-party codegen or libraries whose interfaces expect the input tensor scale and weight tensor scale.
This patch stack fixes it up for conv2d and Dense. I cannot see
any other operators affected yet.
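To illustrate the problem, here is a small numeric sketch (plain Python, not TVM code; the values are made up) of how folding the per-tensor scales into a single requantize multiplier destroys information that later passes might need:

```python
# Plain-Python illustration (not TVM code): the tflite frontend folded
# input_scale * kernel_scale into the Requantize multiplier, so only the
# product survived in the graph.
input_scale, kernel_scale, output_scale = 0.5, 0.25, 2.0

requantize_multiplier = input_scale * kernel_scale / output_scale

# A different (input_scale, kernel_scale) pair yields the same multiplier,
# so the individual factors cannot be recovered from the product:
assert requantize_multiplier == (0.25 * 0.5) / output_scale

# What this patch does instead: keep the scales as separate qnn.conv2d
# attributes, so graph partitioning passes and external codegen can
# still read both values.
qnn_conv2d_attrs = {"input_scale": input_scale, "kernel_scale": kernel_scale}
print(qnn_conv2d_attrs)
```

Retaining the scales as attributes is redundant for TVM's own lowering (which still folds them into Requantize), but it keeps the information available in the qnn dialect for anyone who needs it earlier.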
See here for more: https://discuss.tvm.ai/t/lowering-qnn-conv2d-tflite/4654/5
Ramana
@anijain2305 - Please review.