Quantized Inference failure of TensorRT 8.6 when running SDXL-turbo model on GPU A10 #3710
Comments
python demo_txt2img_xl.py "a photo of an astronaut riding a horse on mars" --version xl-turbo --onnx-dir ./dreamshaper_model/dreamshaper_onnx/ --engine-dir engine-sdxl-turbo --height 1024 --width 1024 --int8
If I use width = 512 and height = 512, it can run. But the inference time is ~300 ms for the INT8 UNet versus ~250 ms for FP16.
@rajeevsrao @azhurkevich Is this expected? Thanks!
Same problem as #3724.
Can anyone help?
@ApolloRay maybe you can follow this blog post |
from utils import load_calib_prompts |
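For context, a minimal sketch of how this helper is typically used to drive INT8 calibration; the file path, batch size, and the assumed behavior (one prompt per line, grouped into batches) are illustrative assumptions, not from the original issue:

```python
# Minimal sketch of calibration-prompt loading for INT8 quantization.
# Assumption: load_calib_prompts reads one prompt per line from a text
# file and returns them grouped into batches of `batch_size`.
from utils import load_calib_prompts

batch_size = 2
# "./calib_prompts.txt" is a hypothetical path: one calibration prompt per line.
calib_prompts = load_calib_prompts(batch_size, "./calib_prompts.txt")

# Each batch would be run through the diffusion pipeline while the
# quantizer collects activation statistics; here we just inspect them.
for batch in calib_prompts:
    print(batch)
```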
Description
Environment
TensorRT Version: 8.6
NVIDIA GPU: A10
NVIDIA Driver Version: 525.147.05
CUDA Version: 12.0
CUDNN Version: 8.9
Operating System:
Python Version (if applicable):
Tensorflow Version (if applicable):
PyTorch Version (if applicable):
Baremetal or Container (if so, version):
Relevant Files
Model link: dreamshaper (turbo version)
Steps To Reproduce
Commands or scripts:
Have you tried the latest release?:
Can this model run on other frameworks? For example, run the ONNX model with ONNX-Runtime (polygraphy run <model.onnx> --onnxrt):
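As a cross-check, a minimal sketch of comparing ONNX-Runtime and TensorRT outputs with Polygraphy; the unet.onnx filename and the tolerance values are illustrative, not from the original issue:

```
# Compare final outputs between ONNX-Runtime and TensorRT
polygraphy run unet.onnx --onnxrt --trt

# Enable INT8 in the TensorRT build and loosen tolerances, since
# quantized outputs will not match the FP32 ONNX-Runtime run exactly
polygraphy run unet.onnx --onnxrt --trt --int8 --atol 0.1 --rtol 0.1
```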