Arm backend: Int16 linear support #14258
Conversation
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/14258
Note: Links to docs will display an error until the docs builds have been completed.
❌ 2 New Failures as of commit 0bdb35c with merge base 0329a8a. The following jobs have failed:
This comment was automatically generated by Dr. CI and updates every 15 minutes.
e170183 to 0ab797f (Compare)
Unrelated failures.
```python
self.add_pass(FuseConstantArgsPass(exported_program))
self.add_pass(InsertTableOpsPass(exported_program))
# If we have a conv2d with int16 activation split up into a convolution
# and an addition, to work-around the lack of support for int48 in torch
```
Or can it be done by using torch.dtype.int64 instead and then detecting and lowering it as int48 downstream?
I was starting off in that direction, but it interferes a bit with the int64->int32 handling, so I'd rather keep it separate.
yeah given int64 is treated as radioactive :P
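For readers following the thread, here is a minimal numeric sketch of the work-around discussed above, with hypothetical helper names. An int16 x int8 convolution can accumulate beyond int32, which is why TOSA uses an int48 bias for this configuration; since torch has no int48 dtype, the convolution is run without bias, its result is rescaled down to a 32-bit range, and the bias is added as a separate int32 addition.

```python
# Illustrative sketch only; the helper below is hypothetical, not the Arm backend pass.
import torch
import torch.nn.functional as F

def int16_conv_then_add_bias(x_q, w_q, bias_q, rescale):
    # Emulate the wide (int48-like) accumulator in float64, since torch has no int48.
    acc = F.conv2d(x_q.double(), w_q.double())
    # Scale the convolution result down to an int32 range...
    acc32 = torch.round(acc * rescale).to(torch.int32)
    # ...and only then add the bias as a separate int32 addition.
    return acc32 + bias_q.to(torch.int32).view(1, -1, 1, 1)

# Integer-valued example inputs: int16 activations, int8 weights, int32 bias.
x = torch.randint(-2**15, 2**15, (1, 3, 8, 8))
w = torch.randint(-128, 128, (4, 3, 3, 3))
b = torch.randint(-2**20, 2**20, (4,))
y = int16_conv_then_add_bias(x, w, b, rescale=2**-10)
```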
This is awesome. No conv tests for 16a8w? Just curious.
5b9c54e to ac4203c (Compare)
```python
)


@common.parametrize("test_data", test_data_all_16a8w)
```
No, not yet. Tests will be added for Ethos-U with a Vela update when it's in place.
I have added tests for U55 and U85
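As an aside, here is a sketch of what such a parametrized 16a8w test can look like, using plain pytest and made-up test data rather than the repo's common.parametrize helper and test_data_all_16a8w; the fake-quant step only illustrates the int16 activation range, not the full TOSA lowering.

```python
# Illustrative sketch using plain pytest; names and test data are hypothetical.
import pytest
import torch

test_data_16a8w = {
    "rank1_ones": lambda: torch.ones(16),
    "rank1_randn": lambda: torch.randn(16),
}

@pytest.mark.parametrize(
    "make_input", test_data_16a8w.values(), ids=test_data_16a8w.keys()
)
def test_linear_16a8w(make_input):
    x = make_input()
    model = torch.nn.Linear(16, 8)
    ref = model(x)
    # int16 activation fake-quant: 65536 levels keep the error far below the
    # int8 case, which is the point of the 16a8w configuration.
    scale = float(x.abs().max()) / 32767
    x_q = torch.fake_quantize_per_tensor_affine(x, scale, 0, -32768, 32767)
    torch.testing.assert_close(model(x_q), ref, atol=1e-2, rtol=1e-2)
```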
@digantdesai has imported this pull request. If you are a Meta employee, you can view this in D82552402.
I see this test failure :( FAILED backends/arm/test/ops/test_linear.py::test_linear_16a8w_tosa_INT[model_linear_rank1_large_randn,per_channel_quant=True] - AssertionError: Output 0 does not match reference output.
ac4203c to 54ad8e2 (Compare)
I removed milestone 1.0 for now. Vela does not support this anyway, so let's take this a bit calmer.
@digantdesai, this has come out of sync with the Meta internal version and I'm not allowed to merge it. Would you like to assist?
Yeah, let me try to merge this.
@digantdesai has imported this pull request. If you are a Meta employee, you can view this in D82552402.
If it doesn't work I can give you a patch.
Signed-off-by: Per Åstrand <[email protected]> Change-Id: I2b189b559f699c7eda6921ed515c0e8a849226ca
Support quantization to 16a8w. Since the resulting TOSA operator needs to have the bias in int48, which isn't available as a type in torch, the conv2d needs to be decomposed into a conv + add, where the conv result is scaled down to 32 bit before the addition of the bias is done. Signed-off-by: Per Åstrand <[email protected]> Change-Id: Ib8cae694035796374a55a9909e501596e983abf5
For the case when the activation is 16 bit, the bias in TOSA must be an int48_t tensor. Since that can't be represented using torch.dtypes, the corresponding node.meta is set with a key 'tosa_dtype_48bit' to pass the information through to the creation of the TOSA tensor. Also make sure to distinguish between int32 and int48 tensors in the fuse constant ops pass. Signed-off-by: Per Åstrand <[email protected]> Change-Id: Iefe64f2b02f388c905c9c818ee7d2a6af40bc9e3
Signed-off-by: Per Åstrand <[email protected]> Change-Id: Ibe158d8d35a632547290f1b9a055d061ae267d77
Enable tests of int16 activation and int8 weight quantization. The test for large_rand is disabled while we sort out why it is flaky. Signed-off-by: Per Åstrand <[email protected]> Change-Id: I9de5d472f8862edebcf82c140399985db930c069
Add an enum class to handle special dtypes that can't be represented in torch (i.e. int48_t) to avoid leaking serializer types into the pass handling of the backend. Signed-off-by: Per Åstrand <[email protected]> Change-Id: I3388cec3c8a26f28790eedc3f124c336b6724cb4
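A minimal sketch of the pattern the last two commit messages describe, assuming hypothetical names for the enum and the tagging helper; only the 'tosa_dtype_48bit' meta key and the general idea are taken from the messages above.

```python
# Illustrative sketch; SpecialDtype and tag_int48_biases are hypothetical names.
from enum import Enum, auto
import torch

class SpecialDtype(Enum):
    """Dtypes required by TOSA that have no torch.dtype equivalent."""
    INT48 = auto()

def tag_int48_biases(graph_module: torch.fx.GraphModule) -> None:
    # Mark bias nodes in node.meta so the TOSA serializer can emit an int48_t
    # tensor later, without leaking serializer types into the FX passes.
    for node in graph_module.graph.nodes:
        if node.op == "placeholder" and node.name.endswith("bias"):
            node.meta["tosa_dtype_48bit"] = SpecialDtype.INT48
```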
dc7a0a7 to 18c9985 (Compare)
Failures are unrelated.
@digantdesai, we had to rebase as a file was changed, and since we changed it again we got back to the same problem and need your help again. Sorry about that.
@digantdesai has imported this pull request. If you are a Meta employee, you can view this in D82552402.
Summary
Adds 16a8w (int16 activation, int8 weight) support for linear when targeting a backend with the +int16 extension.
Fixes #13729
Test plan
Tested through unit tests.
cc @digantdesai @freddan80 @zingo @oscarandersson8218