Add 16a4w_block QAT config #15878
Conversation
Summary: Introduce a FakeQuantizer subclass. It falls back to the LPBQ observer's `convert`; `_derived_bias_quant_spec` also looks for it to correctly derive the bias scale.

Differential Revision: D87194388
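For intuition, here is a minimal sketch of what such a subclass could look like. Apart from the `FakeQuantizer` base class, the observer's `convert`, and `_derived_bias_quant_spec` mentioned above, everything below (the import path, class name, observer wiring, and the placeholder forward) is an assumption for illustration, not the PR's actual implementation:

```python
import torch
# Assumed import path for torchao's QAT fake quantizer base class.
from torchao.quantization.qat.fake_quantizer import FakeQuantizer


class LPBQBlockFakeQuantizer(FakeQuantizer):  # hypothetical name
    """Fake quantizer that defers its final qparams to the LPBQ observer."""

    def __init__(self, config, lpbq_observer, eps: float = 1e-12):
        super().__init__(config)
        self.lpbq_observer = lpbq_observer  # assumed wiring
        self.eps = eps

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        if x.numel() == 0:
            return x
        # Placeholder per-tensor 4-bit fake quant; the real version would be
        # block-wise using the LPBQ scales.
        scale = x.detach().abs().amax().clamp_min(self.eps) / 7.0
        x_fq = torch.clamp(torch.round(x / scale), -8, 7) * scale
        # Straight-through estimator so gradients pass through the rounding.
        return x + (x_fq - x).detach()

    def convert(self):
        # Fall back to the LPBQ observer's `convert` so final scales match
        # PTQ behavior; `_derived_bias_quant_spec` can then detect this
        # subclass and derive the bias scale from those same scales.
        return self.lpbq_observer.convert()
```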
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/15878

❌ 1 New Failure, 2 Unrelated Failures as of commit eb2e9f9 with merge base 529a265.
NEW FAILURE: one job has failed on this PR.
BROKEN TRUNK: the remaining failing jobs were already present on the merge base. 👉 Rebase onto the `viable/strict` branch to avoid these failures.

This comment was automatically generated by Dr. CI and updates every 15 minutes.
@winskuo-quic can you review and approve this diff?
Reviewed code (excerpt):

```python
self.eps = eps
...
def forward(self, x: torch.Tensor) -> torch.Tensor:
    if x.numel() == 0:
```
Would it be simpler to call `torchao.quantization.quant_primitives._fake_quantize_affine` directly?
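For reference, a rough sketch of what that direct call could look like. The exact signature of `_fake_quantize_affine` below is my assumption based on torchao's affine quant primitives, so double-check it against the installed version; the shapes and ranges are illustrative only:

```python
import torch
from torchao.quantization.quant_primitives import _fake_quantize_affine

x = torch.randn(4, 64)
block_size = (1, 16)  # quantize in blocks of 16 along the last dim
n_blocks = (x.shape[0] // block_size[0], x.shape[1] // block_size[1])
scale = torch.rand(*n_blocks) + 0.01              # one scale per block
zero_point = torch.zeros(n_blocks, dtype=torch.int32)

# Quantize then dequantize in one call, staying in floating point.
x_fq = _fake_quantize_affine(
    x,
    block_size,
    scale,
    zero_point,
    torch.int8,    # container dtype
    quant_min=-8,  # 4-bit signed range
    quant_max=7,
)
```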
I think for QAT testing we can use pseudo labels generated by the FP32 model, run a few mini training steps on the fake-quant model, and then compare its outputs against the FP32 baseline (the pseudo labels) within acceptable atol/rtol thresholds, as usual. A sketch of that recipe follows.
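A minimal sketch of that testing recipe; the models, batch generator, step count, and tolerances are placeholders, not values from this PR:

```python
import torch
import torch.nn.functional as F


def qat_smoke_test(fp32_model, fq_model, make_batch, steps: int = 5):
    """Briefly train the fake-quant model on pseudo labels from the FP32
    model, then check its outputs stay close to the FP32 baseline."""
    fp32_model.eval()
    fq_model.train()
    opt = torch.optim.SGD(fq_model.parameters(), lr=1e-3)

    for _ in range(steps):
        x = make_batch()
        with torch.no_grad():
            pseudo_labels = fp32_model(x)  # FP32 outputs as targets
        loss = F.mse_loss(fq_model(x), pseudo_labels)
        opt.zero_grad()
        loss.backward()
        opt.step()

    fq_model.eval()
    x = make_batch()
    with torch.no_grad():
        torch.testing.assert_close(
            fq_model(x), fp32_model(x), atol=1e-2, rtol=1e-2
        )
```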
Summary: Introduce a FakeQuantizer subclass. It falls back to the LPBQ observer's `convert`. `_derived_bias_quant_spec` also looks for it to correctly derive the bias scale.

Open to suggestions on how to test. Naveen launched a QAT run and it seems to produce reasonable results.

Differential Revision: D87194388