
Per-Tensor quantization support for Conv2D layers #438

Closed
LLNLanLeN opened this issue Jun 24, 2020 · 13 comments
Labels: feature request
@LLNLanLeN

System information

  • TensorFlow version (you are using): TF nightly 2.3
  • Are you willing to contribute it (Yes/No): No

Motivation

I've been testing TF QAT features by following the tutorials and guides on the following website:

https://www.tensorflow.org/model_optimization/guide/quantization/training_comprehensive_guide

My understanding is that TF only has per-axis support for Conv2D layers and is still working on per-tensor support. Right now, I'm working with a deployment target that requires per-tensor quantization for Conv2D, and simply passing a custom QuantizeConfig class to the Conv2D layer with the weight quantizer's per_axis set to False causes errors with the TF quantize API.

Hence I'm wondering if there are any resources or additional experimental features that I can try out to perform per-tensor quantization for Conv2D layers?

@nutsiepully
Contributor

Hi @LLNLanLeN,

The default implementation for QAT follows the TF (default 8 bit) quantization spec.

If you want something different, you can use a custom QuantizeConfig as in the guide. However, that only ensures that you can train with QAT using per-tensor quantization. For custom configurations, you have to provide the relevant kernels yourself when executing the model.
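For reference, here is a minimal sketch of what such a custom QuantizeConfig could look like for per-tensor Conv2D weights, adapted from the Dense example in the comprehensive guide linked above. The class name PerTensorConvQuantizeConfig and the toy model are purely illustrative, not part of the library:

```python
import tensorflow as tf
import tensorflow_model_optimization as tfmot

LastValueQuantizer = tfmot.quantization.keras.quantizers.LastValueQuantizer
MovingAverageQuantizer = tfmot.quantization.keras.quantizers.MovingAverageQuantizer


class PerTensorConvQuantizeConfig(tfmot.quantization.keras.QuantizeConfig):
    """Quantizes the Conv2D kernel per-tensor instead of per-axis."""

    def get_weights_and_quantizers(self, layer):
        # per_axis=False -> a single scale/zero-point pair for the whole kernel.
        return [(layer.kernel, LastValueQuantizer(
            num_bits=8, symmetric=True, narrow_range=False, per_axis=False))]

    def get_activations_and_quantizers(self, layer):
        return [(layer.activation, MovingAverageQuantizer(
            num_bits=8, symmetric=False, narrow_range=False, per_axis=False))]

    def set_quantize_weights(self, layer, quantize_weights):
        layer.kernel = quantize_weights[0]

    def set_quantize_activations(self, layer, quantize_activations):
        layer.activation = quantize_activations[0]

    def get_output_quantizers(self, layer):
        return []

    def get_config(self):
        return {}


# Annotate the Conv2D layer with the custom config, then apply QAT.
model = tf.keras.Sequential([
    tfmot.quantization.keras.quantize_annotate_layer(
        tf.keras.layers.Conv2D(32, 3, input_shape=(28, 28, 1)),
        PerTensorConvQuantizeConfig()),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(10),
])

with tfmot.quantization.keras.quantize_scope(
        {'PerTensorConvQuantizeConfig': PerTensorConvQuantizeConfig}):
    quant_aware_model = tfmot.quantization.keras.quantize_apply(
        tfmot.quantization.keras.quantize_annotate_model(model))
```

Training with such a config should work, but as noted above, conversion to TFLite is only guaranteed for the default quantization spec.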

@LLNLanLeN
Author

Hi @nutsiepully, thank you for getting back to me. I'm wondering if there are any examples that can help me quantize the Conv2D weights per-tensor instead of per-axis? The examples in the comprehensive QAT guide are only for Dense layers and aren't directly applicable to Conv2D layers.

I've been using this configuration for Conv2D, called Default8BitConvQuantizeConfig, that I found here:

I ended up modifying the line self.weight_quantizer = default_8bit_quantizers.Default8BitConvWeightsQuantizer() (which is per-axis by default) to be per-tensor by setting the argument per_axis=False:

https://github.com/tensorflow/model-optimization/blob/fcaa2306d62a419c5bce700275748b8b08711dbc/tensorflow_model_optimization/python/core/quantization/keras/default_8bit/default_8bit_quantizers.py

Unfortunately, simply changing that causes a size mismatch, since some of the underlying code assumes that Conv2D uses per-axis quantization.

@nutsiepully
Contributor

I ended up modifying the line self.weight_quantizer = default_8bit_quantizers.Default8BitConvWeightsQuantizer() (which is per-axis by default) to be per-tensor by setting the argument per_axis=False.

This is the correct way to do it.

Unfortunately, simply changing that causes a size mismatch, since some of the underlying code assumes that Conv2D uses per-axis quantization.

Modifying the training to happen per-tensor should allow the training to work just fine. However, that does not guarantee conversion to TFLite. Only the default quantization spec supports conversion to TFLite. Modifications can be used to train your model against any target backend you want.

@LLNLanLeN
Author

LLNLanLeN commented Jun 29, 2020

@nutsiepully I see, thank you for responding. I've managed to quantize Conv2D per-tensor by passing in a custom configuration. It turns out that in addition to changing per_axis=False, I needed to change the min_weight and max_weight shapes to None as well:

https://github.com/tensorflow/model-optimization/blob/fcaa2306d62a419c5bce700275748b8b08711dbc/tensorflow_model_optimization/python/core/quantization/keras/default_8bit/default_8bit_quantizers.py
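For anyone hitting the same issue later, here is a rough sketch of the two changes described above, written as a standalone subclass rather than an in-place edit of default_8bit_quantizers.py. The class name is made up, and the constructor arguments and -6/6 initializers are assumed to mirror the stock Default8BitConvWeightsQuantizer:

```python
import tensorflow as tf
import tensorflow_model_optimization as tfmot


class PerTensorConvWeightsQuantizer(
        tfmot.quantization.keras.quantizers.LastValueQuantizer):
    """Per-tensor variant of the default Conv2D weights quantizer (sketch)."""

    def __init__(self):
        # Change 1: per_axis=False (the stock Default8BitConvWeightsQuantizer
        # passes per_axis=True here).
        super(PerTensorConvWeightsQuantizer, self).__init__(
            num_bits=8, per_axis=False, symmetric=True, narrow_range=True)

    def build(self, tensor_shape, name, layer):
        # Change 2: scalar (shape=None) min/max range variables instead of one
        # value per output channel, so the shapes match per-tensor quantization.
        min_weight = layer.add_weight(
            name + '_min',
            shape=None,
            initializer=tf.keras.initializers.Constant(-6.0),
            trainable=False)
        max_weight = layer.add_weight(
            name + '_max',
            shape=None,
            initializer=tf.keras.initializers.Constant(6.0),
            trainable=False)
        return {'min_var': min_weight, 'max_var': max_weight}
```

This quantizer can then be plugged into a custom QuantizeConfig for Conv2D in place of the default weights quantizer.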

However, after that, when I convert the model to TFLite, the BatchNorm layers, which normally get folded into the Conv2D before them, are no longer being folded. I know it can be difficult for you to guide me without more information, but if you have any idea where or why the folding didn't happen correctly, please point me to it.

This is the TFLite model using the default quantization parameters:
[screenshot: Capture_1]

This is the TFLite model using Conv2D per-tensor quantization parameters (I also pass a no-op config for the BatchNorm layers here). As you can see, the BatchNorm layer did not get folded properly:
[screenshot: Capture_2]

@debapriyamaji

Hi @LLNLanLeN,
If you are still stuck with the merging issue, please refer to this: #552

@LLNLanLeN
Author

@debapriyamaji Thank you for notifying me. I've found an alternative solution, but I'll still check out the post.

@biyoml

biyoml commented Sep 24, 2020

Hi @LLNLanLeN ,
I have the same issue. Could you please share your solution?

@LLNLanLeN
Author

@jackjhliu hey, I recommend you try the solution posted by @debapriyamaji above (there should be a thread leading to it). If that method works, please comment on that thread and let us know. If it still doesn't work, I can recommend another solution; it's a bit trickier to do and requires a bit more time for sure.

@biyoml

biyoml commented Sep 24, 2020

@LLNLanLeN
Ok, I will try. Thank you for your reply.

@nutsiepully
Contributor

Modifying the training to happen per-tensor should allow the training to work just fine. However, that does not guarantee conversion to TFLite. Only the default quantization spec supports conversion to TFLite. Modifications can be used to train your model against any target backend you want.

I'm afraid conversion is not guaranteed or supported for custom quantization. Conversion only works for the default quantization spec.

@ai1361720220000

Modifying the training to happen per-tensor should allow the training to work just fine. However, that does not guarantee conversion to TFLite. Only the default quantization spec supports conversion to TFLite. Modifications can be used to train your model against any target backend you want.

I'm afraid conversion is not guaranteed or supported for custom quantization. Conversion only works for the default quantization spec.

Hello, does conversion now support per-layer quantization for Conv2D?

@danielmimimi

@LLNLanLeN could you upload your code showing how you did the adjustment on the Default8BitConvWeightsQuantizer, and how you actually use it?

I'm having trouble doing it the way you described.

Thanks

@LLNLanLeN
Author

@danielmimimi hey, I moved away from the TF framework a while ago, so I can't recall the issue I came across here.
