
Quantized Inference failure of TensorRT 8.6 when running SDXL-turbo model on GPU A10 #3710

Open · ApolloRay opened this issue Mar 13, 2024 · 7 comments
Labels: triaged (Issue has been triaged by maintainers)

@ApolloRay

Description

Environment

TensorRT Version: 8.6

NVIDIA GPU: A10

NVIDIA Driver Version: 525.147.05

CUDA Version: 12.0

CUDNN Version: 8.9

Operating System:

Python Version (if applicable):

Tensorflow Version (if applicable):

PyTorch Version (if applicable):

Baremetal or Container (if so, version):

Relevant Files

Model link: dreamshaper (turbo version)

Steps To Reproduce

Commands or scripts:

Have you tried the latest release?:

Can this model run on other frameworks? For example run ONNX model with ONNXRuntime (polygraphy run <model.onnx> --onnxrt):

@ApolloRay (Author)

python demo_txt2img_xl.py "a photo of an astronaut riding a horse on mars" --version xl-turbo --onnx-dir ./dreamshaper_model/dreamshaper_onnx/ --engine-dir engine-sdxl-turbo --height 1024 --width 1024 --int8

Description:
Running this command on an A10 (23 GB) results in an OOM error.
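
For reference, one generic way to watch device memory around the engine build and run (assuming PyTorch is available in the demo environment; `torch.cuda.mem_get_info` queries the CUDA driver directly, so it also sees TensorRT's allocations, not just PyTorch's):

```python
import torch

def report_gpu_memory(tag: str, device: int = 0) -> None:
    # Returns (free, total) in bytes, straight from the CUDA driver.
    free_b, total_b = torch.cuda.mem_get_info(device)
    used_gib = (total_b - free_b) / 2**30
    print(f"[{tag}] GPU{device}: {used_gib:.2f} / {total_b / 2**30:.2f} GiB used")

report_gpu_memory("before engine build")
# ... build/load the TensorRT engines and run the pipeline here ...
report_gpu_memory("after engine build")
```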

@ApolloRay (Author)

If I use width = 512 and height = 512, it can run. But the inference time is ~300 ms for the int8 UNet vs. ~250 ms for the fp16 UNet, i.e. int8 is slower than fp16.
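
For context, a common way to measure per-component latency like this is with CUDA events around the UNet call (a generic sketch, not the demo's own benchmarking code; `run_unet` is a hypothetical placeholder for one engine forward pass):

```python
import torch

def time_cuda(fn, warmup: int = 10, iters: int = 50) -> float:
    """Mean latency of fn() in milliseconds, measured with CUDA events."""
    for _ in range(warmup):
        fn()  # warm up: kernel autotuning, caches, lazy initialization
    start = torch.cuda.Event(enable_timing=True)
    end = torch.cuda.Event(enable_timing=True)
    torch.cuda.synchronize()
    start.record()
    for _ in range(iters):
        fn()
    end.record()
    torch.cuda.synchronize()  # wait so elapsed_time sees the final event
    return start.elapsed_time(end) / iters

# Hypothetical usage: run_unet() wraps a single UNet forward pass
# print(f"unet latency: {time_cuda(run_unet):.1f} ms")
```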

@zerollzeng (Collaborator)

@rajeevsrao @azhurkevich Is it expected? Thanks!

@zerollzeng added the triaged label on Mar 16, 2024
@ApolloRay (Author)

Same problem as #3724.

@ApolloRay (Author)

> @rajeevsrao @azhurkevich Is it expected? Thanks!

Can anyone help?

@azhurkevich (Contributor)

@ApolloRay maybe you can follow this blog post

@ApolloRay (Author)

> @ApolloRay maybe you can follow this blog post

The blog post uses `from utils import load_calib_prompts`, but I can't find any info about this `utils` module.
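
Judging from how it is called, `load_calib_prompts` appears to just read calibration prompts from a text file and group them into batches. A minimal reconstruction (an assumption based on the call site, not the actual demo code) might look like:

```python
def load_calib_prompts(batch_size: int, calib_data_path: str) -> list[list[str]]:
    """Hypothetical reconstruction: read one prompt per line, group into batches."""
    with open(calib_data_path, encoding="utf-8") as f:
        prompts = [line.rstrip("\n") for line in f if line.strip()]
    # Chunk the flat prompt list into batch_size-sized sublists.
    return [prompts[i : i + batch_size] for i in range(0, len(prompts), batch_size)]

# Usage: batches = load_calib_prompts(2, "calib_prompts.txt")
```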
