Per-Tensor quantization support for Conv2D layers #438
Comments
Hi @LLNLanLeN, the default implementation for QAT follows the TF (default 8-bit) quantization spec. If you want something different, you can use a custom `QuantizeConfig`.
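A minimal sketch of such a custom config for Conv2D, adapted from the Dense-layer example in the comprehensive QAT guide — the class name and the exact quantizer parameters here are illustrative assumptions, not code from this thread:

```python
import tensorflow_model_optimization as tfmot

LastValueQuantizer = tfmot.quantization.keras.quantizers.LastValueQuantizer
MovingAverageQuantizer = tfmot.quantization.keras.quantizers.MovingAverageQuantizer


class PerTensorConvQuantizeConfig(tfmot.quantization.keras.QuantizeConfig):
    """Quantizes a Conv2D kernel per-tensor rather than per-axis (hypothetical name)."""

    def get_weights_and_quantizers(self, layer):
        # per_axis=False requests a single scale/zero-point for the whole kernel.
        return [(layer.kernel, LastValueQuantizer(
            num_bits=8, symmetric=True, narrow_range=False, per_axis=False))]

    def get_activations_and_quantizers(self, layer):
        return [(layer.activation, MovingAverageQuantizer(
            num_bits=8, symmetric=False, narrow_range=False, per_axis=False))]

    def set_quantize_weights(self, layer, quantize_weights):
        layer.kernel = quantize_weights[0]

    def set_quantize_activations(self, layer, quantize_activations):
        layer.activation = quantize_activations[0]

    def get_output_quantizers(self, layer):
        return []

    def get_config(self):
        return {}
```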
Hi @nutsiepully, thank you for getting back to me. I'm wondering if there are any examples that can help me quantize the Conv2D weights per-tensor instead of per-axis? The examples in the comprehensive QAT guide are only for Dense layers and aren't directly applicable to a Conv2D layer. I've been using the Conv2D weights quantizer (`Default8BitConvWeightsQuantizer`) referenced at line 486 in fcaa230.
I ended up modifying that line. Unfortunately, simply changing it caused a size mismatch, since some of the underlying code is based on the assumption that Conv2D is quantized per-axis.
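The size mismatch is consistent with how the default conv weights quantizer builds its state: it allocates one min/max entry per output channel, so flipping `per_axis` alone leaves per-channel-shaped variables feeding a per-tensor op. The sketch below paraphrases the registry code around the referenced line — an approximation for illustration, not a verbatim copy:

```python
import tensorflow as tf
import tensorflow_model_optimization as tfmot

LastValueQuantizer = tfmot.quantization.keras.quantizers.LastValueQuantizer


class Default8BitConvWeightsQuantizer(LastValueQuantizer):
    """Approximation of the default conv weights quantizer's shape logic."""

    def __init__(self):
        super().__init__(num_bits=8, per_axis=True,
                         symmetric=True, narrow_range=True)

    def build(self, tensor_shape, name, layer):
        # shape=(tensor_shape[-1],) creates one min/max value per output
        # channel -- the per-axis assumption that persists even if per_axis
        # is flipped to False, producing the mismatch described above.
        min_weight = layer.add_weight(
            name + '_min', shape=(tensor_shape[-1],),
            initializer=tf.keras.initializers.Constant(-6.0),
            trainable=False)
        max_weight = layer.add_weight(
            name + '_max', shape=(tensor_shape[-1],),
            initializer=tf.keras.initializers.Constant(6.0),
            trainable=False)
        return {'min_var': min_weight, 'max_var': max_weight}
```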
This is the correct way to do it.
Modifying the training to quantize per-tensor should work just fine. However, that does not guarantee conversion to TFLite. Only the default quantization spec supports conversion to TFLite. Modifications can be used to train your model against any target backend you want.
@nutsiepully I see, thank you for responding. I've managed to quantize Conv2D per-tensor by passing in a custom configuration. It turns out that, in addition to changing `per_axis`, a few other changes were needed. However, after that, when I convert the model to TFLite, I see the following. [Screenshot: tflite model using the default quantization parameters.] [Screenshot: tflite model using the Conv2D per-tensor quantization parameters (I also pass a no-op config for the BatchNorm layer here as well).] As you can see, the BatchNorm layer did not get folded properly.
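A no-op config like the one mentioned can be sketched as below (the class name is illustrative), together with the standard TFLite conversion call from the QAT guide, with `qat_model` standing for the quantize-applied model:

```python
import tensorflow as tf
import tensorflow_model_optimization as tfmot


class NoOpQuantizeConfig(tfmot.quantization.keras.QuantizeConfig):
    """Leaves a layer (e.g. BatchNormalization) unquantized."""

    def get_weights_and_quantizers(self, layer):
        return []

    def get_activations_and_quantizers(self, layer):
        return []

    def set_quantize_weights(self, layer, quantize_weights):
        pass

    def set_quantize_activations(self, layer, quantize_activations):
        pass

    def get_output_quantizers(self, layer):
        return []

    def get_config(self):
        return {}


# Standard post-QAT conversion; qat_model is assumed to be the model
# returned by quantize_apply.
converter = tf.lite.TFLiteConverter.from_keras_model(qat_model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
tflite_model = converter.convert()
```

The folding failure observed here lines up with the note above that only the default quantization spec is supported for conversion.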
Hi @LLNLanLeN,
@debapriyamaji Thank you for notifying me. I've found an alternative solution, but I'll still check the post.
Hi @LLNLanLeN,
@jackjhliu hey, I recommend you try the solution posted by @debapriyamaji above (there should be a thread leading to it). If that method works, please comment on that thread and let us know. If it still doesn't work, I can recommend another solution; it's a bit trickier to do and certainly requires a bit more time.
@LLNLanLeN
I'm afraid conversion is not guaranteed or supported for custom quantization. Conversion only works for the default quantization spec.
Hello, does conversion now support per-layer quantization for Conv2D?
@LLNLanLeN could you upload your code showing how you adjusted the Default8BitConvWeightsQuantizer, along with the actual usage? I'm having trouble doing it the way you described. Thanks.
@danielmimimi hey, I moved away from the TF framework a while ago, so I can't recall the details of the issue I ran into here.
System information
Motivation
I've been testing TF QAT features by following the tutorials and guides on the following website:
https://www.tensorflow.org/model_optimization/guide/quantization/training_comprehensive_guide
To my understanding, TF only has per-axis support for Conv2D layers and is still working on per-tensor support. Right now, I'm working with a deployment target that requires per-tensor quantization for Conv2D, and simply passing a custom QuantizeConfig class to the Conv2D layer and changing the weight quantizer's `per_axis` to `False` causes errors with the TF quantize API. Hence, I'm wondering if there are any resources or additional experimental features I can try out to perform per-tensor quantization for Conv2D layers?
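For concreteness, the usage pattern being attempted looks roughly like this, following the comprehensive guide; `PerTensorConvQuantizeConfig` is the hypothetical custom config class sketched earlier in this thread, and the model architecture is just a placeholder:

```python
import tensorflow as tf
import tensorflow_model_optimization as tfmot

quantize_annotate_layer = tfmot.quantization.keras.quantize_annotate_layer
quantize_apply = tfmot.quantization.keras.quantize_apply
quantize_scope = tfmot.quantization.keras.quantize_scope

# Annotate only the Conv2D layer with the custom per-tensor config.
annotated_model = tf.keras.Sequential([
    quantize_annotate_layer(
        tf.keras.layers.Conv2D(32, 3, input_shape=(28, 28, 1)),
        quantize_config=PerTensorConvQuantizeConfig()),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(10),
])

# Custom configs must be visible inside quantize_scope when
# quantize_apply clones and transforms the annotated model.
with quantize_scope(
        {'PerTensorConvQuantizeConfig': PerTensorConvQuantizeConfig}):
    qat_model = quantize_apply(annotated_model)
```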