Quantized Inference failure of TensorRT 8.6 when running SDXL-turbo model on GPU A10 #3710
Comments
python demo_txt2img_xl.py "a photo of an astronaut riding a horse on mars" --version xl-turbo --onnx-dir ./dreamshaper_model/dreamshaper_onnx/ --engine-dir engine-sdxl-turbo --height 1024 --width 1024 --int8
If I use width = 512 and height = 512, it can run. But the inference time is ~300 ms for the INT8 UNet versus ~250 ms for FP16.
@rajeevsrao @azhurkevich Is this expected? Thanks!
Same problem as #3724.
Can anyone help?
@ApolloRay maybe you can follow this blog post |
from utils import load_calib_prompts |
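For context, a minimal sketch of how this helper is typically used to drive INT8 calibration; the file path, batch size, and the assumed behavior (one prompt per line, grouped into batches) are illustrative assumptions, not from the original issue:

```python
# Minimal sketch of calibration-prompt loading for INT8 quantization.
# Assumption: load_calib_prompts reads one prompt per line from a text
# file and returns them grouped into batches of `batch_size`.
from utils import load_calib_prompts

batch_size = 2
# "./calib_prompts.txt" is a hypothetical path: one calibration prompt per line.
calib_prompts = load_calib_prompts(batch_size, "./calib_prompts.txt")

# Each batch would be run through the diffusion pipeline while the
# quantizer collects activation statistics; here we just inspect them.
for batch in calib_prompts:
    print(batch)
```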
Description
Environment
TensorRT Version: 8.6
NVIDIA GPU: A10
NVIDIA Driver Version: 525.147.05
CUDA Version: 12.0
CUDNN Version: 8.9
Operating System:
Python Version (if applicable):
Tensorflow Version (if applicable):
PyTorch Version (if applicable):
Baremetal or Container (if so, version):
Relevant Files
Model link: dreamshaper (turbo version)
Steps To Reproduce
Commands or scripts:
Have you tried the latest release?:
Can this model run on other frameworks? For example, run the ONNX model with ONNX-Runtime (polygraphy run <model.onnx> --onnxrt):
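As a cross-check, a minimal sketch of comparing ONNX-Runtime and TensorRT outputs with Polygraphy; the unet.onnx filename and the tolerance values are illustrative, not from the original issue:

```
# Compare final outputs between ONNX-Runtime and TensorRT
polygraphy run unet.onnx --onnxrt --trt

# Enable INT8 in the TensorRT build and loosen tolerances, since
# quantized outputs will not match the FP32 ONNX-Runtime run exactly
polygraphy run unet.onnx --onnxrt --trt --int8 --atol 0.1 --rtol 0.1
```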