Whisper in web-llm with WebGPU? #68

Open
sandorkonya opened this issue Apr 25, 2023 · 4 comments

Comments

@sandorkonya

Great Repository!

Is it within your scope to implement a WebGPU-accelerated version of Whisper?

Not sure if this helps, but there is a C port of Whisper with a CPU implementation, and as mentioned in this discussion, the main thing that needs to be offloaded to the GPU is the GGML_OP_MUL_MAT operator.

Thanks
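
To make the offloading idea concrete, here is a minimal sketch of what a WebGPU matrix-multiply kernel could look like. It is not taken from web-llm or whisper.cpp: the names `MATMUL_WGSL`, `matmulReference`, and `dispatchSize` are hypothetical, and the plain row-major f32 layout is an assumption that does not match ggml's actual tensor layout or its quantized types. The CPU function is only a reference for checking GPU output.

```typescript
// Hypothetical sketch: none of these names come from web-llm or whisper.cpp.

// WGSL compute shader for C = A * B on row-major f32 matrices (assumed layout).
// A is m x k, B is k x n, C is m x n. Each invocation computes one element of C.
const MATMUL_WGSL: string = `
struct Dims { m: u32, k: u32, n: u32 }

@group(0) @binding(0) var<uniform> dims: Dims;
@group(0) @binding(1) var<storage, read> a: array<f32>;
@group(0) @binding(2) var<storage, read> b: array<f32>;
@group(0) @binding(3) var<storage, read_write> c: array<f32>;

@compute @workgroup_size(8, 8)
fn main(@builtin(global_invocation_id) gid: vec3<u32>) {
  let row = gid.y;
  let col = gid.x;
  // Invocations past the matrix edge do nothing.
  if (row >= dims.m || col >= dims.n) { return; }
  var acc = 0.0;
  for (var i = 0u; i < dims.k; i = i + 1u) {
    acc = acc + a[row * dims.k + i] * b[i * dims.n + col];
  }
  c[row * dims.n + col] = acc;
}
`;

// CPU reference with the same indexing as the shader, for validating GPU results.
function matmulReference(
  a: Float32Array, b: Float32Array, m: number, k: number, n: number
): Float32Array {
  if (a.length !== m * k || b.length !== k * n) {
    throw new Error("shape mismatch");
  }
  const c = new Float32Array(m * n);
  for (let row = 0; row < m; row++) {
    for (let col = 0; col < n; col++) {
      let acc = 0;
      for (let i = 0; i < k; i++) {
        acc += a[row * k + i] * b[i * n + col];
      }
      c[row * n + col] = acc;
    }
  }
  return c;
}

// Number of workgroups along x (columns) and y (rows) for an 8x8 workgroup.
function dispatchSize(m: number, n: number, tile: number = 8): [number, number] {
  return [Math.ceil(n / tile), Math.ceil(m / tile)];
}
```

In a browser, the shader string would be passed to `device.createShaderModule({ code: MATMUL_WGSL })` and dispatched with the counts from `dispatchSize`. A practical kernel would also need tiling through workgroup memory and support for ggml's quantized formats, both of which this sketch leaves out.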

@tqchen
Contributor

tqchen commented Apr 25, 2023

Great suggestion. Yes, this is something that we can push for.

@sandorkonya
Author

@tqchen my ultimate goal would be to get it to run as efficiently as possible on an Android edge device.

There is already a solution in the ONNX framework, based on the recent merge, but I am not sure when it will be usable on Android.

Some people have tried GPU delegates, but without success so far.

Any idea how one could solve it on an edge (Android) device?

@DustinBrett
Contributor

There is also a demo of Whisper running via WebAssembly in that repo. https://github.com/ggerganov/whisper.cpp/tree/master/examples/talk.wasm

@sandorkonya
Author

> There is also a demo of Whisper running via WebAssembly in that repo. https://github.com/ggerganov/whisper.cpp/tree/master/examples/talk.wasm

Yes, it runs on the CPU. I hope that with a GPU version one could reach real-time inference.


3 participants