Fold "FP16 weights -> Dequantize -> FP32 -> Conv/MatMul..." to "FP32 weights -> Conv/MatMul..." #35
Comments
It would be awesome if you could implement this! Unfortunately, the MediaPipe models often rely on FP16 weights.
I would second @paulgavrikov and request the fold from FP16 to FP32. Thanks. If that is not possible or would take a long time, it would be great if you could guide me to do this.
@paulgavrikov @ram95014 Thanks for your feedback, I am very glad to hear it! This should be possible and should not take too much time, but I have not been working on it. It would be great if you could help! In general, it can be divided into three steps.
I would suggest starting by adding a new operator. It would be great if we can bring this up in the next minor release. Let's prioritize it!
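For context, the data-level part of the fold is just a dtype promotion of the stored weights. A minimal illustration with placeholder shapes and values, not the converter's actual code:

```python
import numpy as np

# FP16-quantized TFLite models store weights as float16; folding the
# Dequantize op means materializing FP32 weights once at conversion time
# instead of casting at runtime.
fp16_weights = np.random.rand(64, 3, 3, 3).astype(np.float16)  # placeholder tensor
fp32_weights = fp16_weights.astype(np.float32)                 # equivalent of Dequantize
```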
@paulgavrikov @ram95014 This functionality has been enabled, please try it out with the latest code.
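For reference, a minimal conversion call, assuming tflite2onnx's `convert(tflite_path, onnx_path)` Python API and placeholder file names:

```python
import tflite2onnx

# Convert a TFLite model that carries FP16 weights; per the comment above,
# the FP16 Dequantize pattern should now be folded during conversion.
tflite2onnx.convert('face_detection.tflite', 'face_detection.onnx')
```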
Hi, should this be fixed? I am still having issues.
@mikkelmedm It has been fixed and protected by this test. What's the error you have? |
Running it on a MediaPipe model, I am getting an error like "FP16 is not tested, and might not work properly".
ONNX quantization doesn't take FP16 as a quantized data type, so nearly all FP16-quantized TFLite models are unsupported (see this FAQ).
We recommend users switch to full integer quantization. But if that is not possible (for example, no TensorFlow model is available), we can fold "FP16 weights -> Dequantize -> FP32 -> Conv/MatMul..." into "FP32 weights -> Conv/MatMul..." to work around this issue.
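A minimal sketch of that fold as an ONNX graph rewrite, assuming the dequantize step shows up as a `Cast` node feeding Conv/MatMul. The function name and that assumption are illustrative only; tflite2onnx performs the fold during TFLite-to-ONNX translation rather than as a post-processing pass like this:

```python
import numpy as np
import onnx
from onnx import numpy_helper, TensorProto

def fold_fp16_weights(model: onnx.ModelProto) -> onnx.ModelProto:
    # Fold "FP16 initializer -> Cast -> Conv/MatMul" into
    # "FP32 initializer -> Conv/MatMul" (sketch, edge cases ignored).
    graph = model.graph
    inits = {t.name: t for t in graph.initializer}
    folded = []
    for node in graph.node:
        # Only fold Cast nodes whose data input is an FP16 initializer.
        if node.op_type != 'Cast' or node.input[0] not in inits:
            continue
        weight = inits[node.input[0]]
        if weight.data_type != TensorProto.FLOAT16:
            continue
        # Materialize the FP32 weights once, at conversion time.
        fp32 = numpy_helper.from_array(
            numpy_helper.to_array(weight).astype(np.float32), weight.name)
        weight.CopyFrom(fp32)
        # Rewire consumers of the Cast output to read the weights directly.
        for consumer in graph.node:
            consumer.input[:] = [weight.name if name == node.output[0] else name
                                 for name in consumer.input]
        folded.append(node)
    for node in folded:
        graph.node.remove(node)
    return model
```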