ONNX model convert to TRT INT8 failure: fallback to FP32 #3754
Comments
Can you provide what TRT logged during the build, and possibly the build script?
Thanks, bro. I truncated the last part because it was too long.
[03/29/2024-17:37:32] [TRT] [V] Setting a default quantization params because quantization data is missing for {ForeignNode[onnx::Gather_401...(Unnamed Layer* 3201) [ElementWise]]}
The other messages are similar.
How many layers are affected? It could be a necessary reformat layer that TensorRT adds at I/O. Refer to this for more info: https://docs.nvidia.com/deeplearning/tensorrt/developer-guide/index.html#reformat-free-network-tensors Please share the whole log and the .onnx file in a Google Drive for further help. Had the same issue with a Reformat layer: #2136
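One way to count the affected layers, rather than eyeballing the console, is to capture the builder log with a custom logger. A minimal sketch, assuming the TensorRT Python API and matching on the verbose message text quoted above; this is an illustration, not part of the reporter's script:

```python
import tensorrt as trt

class FallbackCounter(trt.ILogger):
    """Counts build messages that report missing quantization data,
    i.e. regions that will run without INT8 scales."""
    def __init__(self):
        super().__init__()  # trt.ILogger subclasses must call the base init
        self.fallback_count = 0

    def log(self, severity, msg):
        if "quantization data is missing" in msg:
            self.fallback_count += 1
        if severity <= trt.ILogger.Severity.WARNING:
            print(msg)  # still surface warnings and errors on the console

logger = FallbackCounter()
builder = trt.Builder(logger)
# ... parse the ONNX model and build the engine as usual ...
# print(f"{logger.fallback_count} regions reported missing INT8 scales")
```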
I am sorry for my late response. I will check your method, thanks for the help.
What is your trtexec command?
I didn't use the trtexec command; instead, I used my own script.
Description
When I use TensorRT for INT8 quantization, some layers always fall back to FP32. Setting trt.BuilderFlag.OBEY_PRECISION_CONSTRAINTS does not solve the issue. What should I do?
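For reference, a minimal sketch of such an INT8 build script, assuming the TensorRT 8.6 Python API; `model.onnx` and `my_calibrator` are hypothetical placeholders (the calibrator would be your own `IInt8EntropyCalibrator2` implementation), not the actual script from this issue:

```python
import tensorrt as trt

logger = trt.Logger(trt.Logger.VERBOSE)  # verbose log shows which layers fall back
builder = trt.Builder(logger)
network = builder.create_network(
    1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
parser = trt.OnnxParser(network, logger)

with open("model.onnx", "rb") as f:  # placeholder path
    if not parser.parse(f.read()):
        for i in range(parser.num_errors):
            print(parser.get_error(i))
        raise SystemExit("ONNX parse failed")

config = builder.create_builder_config()
config.set_flag(trt.BuilderFlag.INT8)
config.set_flag(trt.BuilderFlag.OBEY_PRECISION_CONSTRAINTS)
config.int8_calibrator = my_calibrator  # hypothetical IInt8EntropyCalibrator2

# OBEY_PRECISION_CONSTRAINTS only enforces precisions that are explicitly set
# on individual layers; pinning layers to INT8 makes the build fail loudly
# (instead of silently falling back to FP32) wherever INT8 is unsupported.
for i in range(network.num_layers):
    layer = network.get_layer(i)
    if layer.type not in (trt.LayerType.CONSTANT, trt.LayerType.SHAPE):
        layer.precision = trt.DataType.INT8

engine_bytes = builder.build_serialized_network(network, config)
```

With the layers pinned this way, a failed build names the layer that cannot run in INT8, which helps distinguish genuine unsupported layers from the expected reformat layers at I/O.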
Environment
TensorRT Version: 8.6.16
NVIDIA GPU: A100
CUDA Version: 11.4
Operating System:
Python Version (if applicable): 3.7
PyTorch Version (if applicable): 1.12.1