
Conversation

@per (Collaborator) commented Sep 12, 2025

Summary

Adds support for a16w8 for linear when targeting a backend with +int16 extension.

Fixes #13729
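
For context, a minimal sketch of what a16w8 means numerically: activations are quantized to an int16 grid and weights to an int8 grid. This is plain PyTorch fake quantization for illustration only (symmetric, per-tensor for brevity), not the backend's quantizer API:

```python
import torch

def fake_quant(x: torch.Tensor, num_bits: int) -> torch.Tensor:
    # Symmetric per-tensor fake quantization: snap values to a signed
    # `num_bits` integer grid, then dequantize back to float.
    qmax = 2 ** (num_bits - 1) - 1
    scale = x.abs().max() / qmax
    return torch.clamp(torch.round(x / scale), -qmax - 1, qmax) * scale

x = torch.randn(4, 8)    # activations quantized to the int16 grid (a16)
w = torch.randn(16, 8)   # weights quantized to the int8 grid (w8)
b = torch.randn(16)

y = torch.nn.functional.linear(fake_quant(x, 16), fake_quant(w, 8), b)
```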

Test plan

Tested through unit tests.

cc @digantdesai @freddan80 @zingo @oscarandersson8218

@per requested a review from digantdesai as a code owner September 12, 2025 14:22
@pytorch-bot bot commented Sep 12, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/14258

Note: Links to docs will display an error until the docs builds have been completed.

❌ 2 New Failures as of commit 0bdb35c with merge base 0329a8a.

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@meta-cla bot added the CLA Signed label Sep 12, 2025
@per requested a review from zingo September 12, 2025 14:22
@per added the partner: arm, ciflow/trunk, and release notes: none labels Sep 12, 2025
@zingo added this to the 1.0.0 milestone Sep 12, 2025
@zingo changed the title from "Int16 linear support" to "Arm backend: Int16 linear support" Sep 12, 2025
@per force-pushed the int16_linear_support branch from e170183 to 0ab797f Compare September 15, 2025 07:08
@per (Collaborator, Author) commented Sep 15, 2025

Unrelated failures.

self.add_pass(FuseConstantArgsPass(exported_program))
self.add_pass(InsertTableOpsPass(exported_program))
# If we have a conv2d with int16 activation, split it up into a convolution
# and an addition, to work around the lack of support for int48 in torch
@digantdesai (Contributor) commented Sep 15, 2025

> and an addition, to work around the lack of support for int48 in torch

Or can it be done by using torch.dtype.int64 instead and then detecting and lowering it as int48 downstream?

@per (Collaborator, Author) replied:

I was starting off in that direction, but it interferes a bit with the int64->int32 handling, so I'd rather keep it separate.

@digantdesai (Contributor) replied:

Yeah, given int64 is treated as radioactive :P
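
To make the work-around concrete, here is a minimal sketch of the decomposition being discussed (function and parameter names are illustrative, not the actual pass): the convolution runs without its bias so the wide accumulator stays conv-only, the result is rescaled down to the int32 range, and the bias is added as a separate op.

```python
import torch

def decomposed_conv2d(x, weight, bias, rescale):
    # Sketch: with int16 activations TOSA keeps an int48 accumulator,
    # which torch cannot represent, so the bias add is split out.
    acc = torch.nn.functional.conv2d(x, weight, bias=None)  # conv alone
    acc = acc * rescale                  # scale accumulator down to int32 range
    return acc + bias.view(1, -1, 1, 1)  # bias added as a separate op
```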

@digantdesai (Contributor) left a comment

This is awesome. No conv tests for 16a8w? Just curious.

@digantdesai requested a review from Ninja91 September 15, 2025 17:10
@per force-pushed the int16_linear_support branch from 5b9c54e to ac4203c Compare September 16, 2025 14:18
)


@common.parametrize("test_data", test_data_all_16a8w)
@digantdesai (Contributor) commented:

@per - Does it work with U55 and U85?
@Ninja91 - IIRC you already have tests; seems like they haven't been merged?

@per (Collaborator, Author) replied:

No, not yet. Tests will be added for Ethos-U with a Vela update once it's in place.

@Ninja91 (Contributor) replied:

I have added tests for U55 and U85

@facebook-github-bot (Contributor) commented:

@digantdesai has imported this pull request. If you are a Meta employee, you can view this in D82552402.

@zingo (Collaborator) commented Sep 17, 2025

I see this test failure :(
test-arm-backend (test_pytest_ops_ethosu_fvp) / linux-job: https://github.com/pytorch/executorch/actions/runs/17768877335/job/50517971935?pr=14258#logs

FAILED backends/arm/test/ops/test_linear.py::test_linear_16a8w_tosa_INT[model_linear_rank1_large_randn,per_channel_quant=True] - AssertionError: Output 0 does not match reference output.
Given atol: 0.004528127912431955, rtol: 0.001.
Output tensor shape: torch.Size([20]), dtype: torch.float32
Difference: max: 0.1481781005859375, abs: 0.1481781005859375, mean abs error: 0.008467483520507812.
-- Model vs. Reference --
Numel: 20, 20
Median: 14.398289680480957, 14.398289680480957
Mean: 2.563008487224579, 2.555246722698212
Max: 69.8498764038086, 69.84635162353516
Min: -115.46151733398438, -115.60969543457031

@per force-pushed the int16_linear_support branch from ac4203c to 54ad8e2 Compare September 17, 2025 15:20
@zingo removed this from the 1.0.0 milestone Sep 18, 2025
@zingo (Collaborator) commented Sep 18, 2025

I removed the 1.0 milestone for now; Vela does not support this anyway, so let's take this a bit more calmly.

@zingo (Collaborator) commented Sep 19, 2025

@digantdesai, this has gone out of sync with the Meta-internal version and I'm not allowed to merge it. Would you like to assist?

@digantdesai (Contributor) commented:

Yeah, let me try to merge this.

@facebook-github-bot (Contributor) commented:

@digantdesai has imported this pull request. If you are a Meta employee, you can view this in D82552402.

@digantdesai (Contributor) commented:

If it doesn't work I can give you a patch.

per added 4 commits September 22, 2025 11:49
Signed-off-by: Per Åstrand <[email protected]>
Change-Id: I2b189b559f699c7eda6921ed515c0e8a849226ca
Support quantization to 16a8w. Since the resulting TOSA operator
needs its bias in int48, which isn't available as a type in torch,
the conv2d needs to be decomposed into a conv + add, where the conv
result is scaled down to 32 bits before the bias is added.

Signed-off-by: Per Åstrand <[email protected]>
Change-Id: Ib8cae694035796374a55a9909e501596e983abf5
For the case when the activation is 16-bit, the bias in TOSA must be
an int48_t tensor. Since that can't be represented using torch.dtypes,
the corresponding node.meta is set with the key 'tosa_dtype_48bit' to
pass the information through to the creation of the TOSA tensor.
Also make sure to distinguish between int32 and int48 tensors in the
fuse constant ops pass.

Signed-off-by: Per Åstrand <[email protected]>
Change-Id: Iefe64f2b02f388c905c9c818ee7d2a6af40bc9e3
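
A minimal runnable sketch of the mechanism in the commit above, assuming a torch.fx graph; only the meta key 'tosa_dtype_48bit' comes from the commit message, the surrounding code is illustrative:

```python
import torch
import torch.fx as fx

class M(torch.nn.Module):
    def forward(self, x):
        return x + 1

gm = fx.symbolic_trace(M())
for node in gm.graph.nodes:
    if node.op == "call_function":
        # torch has no int48 dtype, so the requirement that this value
        # become an int48_t TOSA tensor travels via node.meta instead.
        node.meta["tosa_dtype_48bit"] = True

# Downstream (illustrative): the TOSA serializer checks the flag.
flagged = [n.name for n in gm.graph.nodes if n.meta.get("tosa_dtype_48bit")]
print(flagged)
```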
Signed-off-by: Per Åstrand <[email protected]>
Change-Id: Ibe158d8d35a632547290f1b9a055d061ae267d77
per added 2 commits September 22, 2025 11:51
Enable tests of int16 activation and int8 weight quantization.
The test for large_rand is disabled while sorting out why it is flaky.

Signed-off-by: Per Åstrand <[email protected]>
Change-Id: I9de5d472f8862edebcf82c140399985db930c069
Add an enum class to handle special dtypes that can't be represented
in torch (i.e. int48_t), to avoid leaking serializer types into the
pass handling of the backend.

Signed-off-by: Per Åstrand <[email protected]>
Change-Id: I3388cec3c8a26f28790eedc3f124c336b6724cb4
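
A hedged sketch of what such an enum could look like; the class and member names are assumptions, not the backend's actual definition:

```python
from enum import Enum, auto

class TosaSpecialDtype(Enum):
    # Dtypes TOSA needs but torch cannot express; keeping them in a
    # backend-local enum avoids leaking serializer types into passes.
    INT48 = auto()

# Usage sketch: a pass records the enum value in node.meta, and only
# the serialization step maps it to the serializer's own type.
```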
@per force-pushed the int16_linear_support branch from dc7a0a7 to 18c9985 Compare September 22, 2025 10:06
@zingo (Collaborator) commented Sep 22, 2025

Failures are unrelated.
Note: the failure in trunk / test-arm-ootb-linux / linux-job is unrelated and is handled by another PR.

@zingo (Collaborator) commented Sep 22, 2025

@digantdesai, we had to rebase because a file changed, and since we changed it again we're back to the same problem and need your help again. Sorry about that.

@facebook-github-bot (Contributor) commented:

@digantdesai has imported this pull request. If you are a Meta employee, you can view this in D82552402.

@zingo merged commit bb81136 into pytorch:main Sep 24, 2025
359 of 365 checks passed

Labels

ciflow/trunk, CLA Signed, partner: arm, release notes: none


Development

Successfully merging this pull request may close these issues.

[Arm] Support Linear INT16 TOSA reference model run

5 participants