
Why does the whisper model need 17GB of video memory? #805

Closed
paulxin001 opened this issue Jan 4, 2024 · 4 comments

@paulxin001

Why does the whisper model need 17GB of video memory? faster-whisper only needs about 4GB. I also haven't found a way to quantize whisper to int8 — is that not supported yet? The memory usage is far too high; is there any way to optimize it?

(attached screenshot: 微信图片_20240104110750)
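For context, a rough back-of-the-envelope estimate suggests the weights alone are nowhere near 17GB. Assuming Whisper large-v2 at roughly 1.55B parameters (the parameter count is an assumption, not stated in this thread), the weight memory at different precisions works out to:

```python
# Rough weight-memory estimate for Whisper at different precisions.
# PARAMS is an assumption: Whisper large-v2 has ~1.55e9 parameters.
PARAMS = 1.55e9

def weight_gib(bytes_per_param: float) -> float:
    """Weight storage in GiB for a given element size in bytes."""
    return PARAMS * bytes_per_param / 2**30

fp32 = weight_gib(4)
fp16 = weight_gib(2)
int8 = weight_gib(1)
print(f"fp32: {fp32:.1f} GiB, fp16: {fp16:.1f} GiB, int8: {int8:.1f} GiB")
# → fp32: 5.8 GiB, fp16: 2.9 GiB, int8: 1.4 GiB
```

So most of the 17GB footprint would come from activation buffers, beam-search state, and the runtime's preallocated memory pool rather than from the weights themselves.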
@kristiankielhofner

It's getting worked on.

@yuekaizhang

yuekaizhang commented Jan 10, 2024

> It's getting worked on.

Yeah, you could try the int8 weight-only quantization branch, which greatly reduces memory usage.
That said, memory usage shouldn't be a big concern here: GPU utilization is already high, so memory freed up wouldn't be usable for other tasks anyway. @paulxin001
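As a concept sketch of what int8 weight-only quantization does (this is a generic NumPy illustration of the technique, not TensorRT-LLM's actual kernels): weights are stored as int8 with one floating-point scale per output channel, and dequantized back to floating point at matmul time. Only the weights shrink; activations stay in floating point.

```python
import numpy as np

def quantize_int8(w: np.ndarray):
    """Per-output-channel symmetric int8 quantization of a [out, in] weight matrix."""
    scale = np.abs(w).max(axis=1, keepdims=True) / 127.0  # one scale per row
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: np.ndarray) -> np.ndarray:
    """Recover an approximate floating-point weight matrix."""
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.standard_normal((256, 512)).astype(np.float32)  # toy weight matrix
q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)

print("storage:", w.nbytes, "->", q.nbytes, "bytes")  # 4x smaller than fp32
print("max abs error:", float(np.abs(w - w_hat).max()))
```

This gives a 4x reduction versus fp32 (2x versus fp16) in weight storage at a small, bounded reconstruction error, which is why the branch mentioned above cuts memory usage substantially.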

@yuekaizhang

yuekaizhang commented Jan 31, 2024

@paulxin001 Would you mind removing the layernorm plugin and trying again? Thank you.

See #992

@nv-guomingz
Collaborator

Hi @paulxin001, would you please try our latest code base and see if the issue still exists?

Do you still have any further issues or questions? If not, we'll close this soon.
