Done! Accelerate Whisper tasks, such as transcription, by multiprocessing through parallelization #432
-
The standard implementation already uses all CPU cores for inference.
-
The proposed solution gets its speed-up by chunking the audio file. There is discussion elsewhere in this forum regarding the deterioration of transcription quality when chunking, due to loss of context, which one should be aware of when taking this approach.
-
Is there a way to parallelize the transcription of multiple long-form audio files, i.e. to have two or more copies of the same Whisper model processing different files?
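Something like the following is what I have in mind; a minimal sketch, assuming each worker process can afford to load its own copy of the model (the file names, model name, and pool size are placeholders):

```python
import multiprocessing as mp

import whisper


def transcribe_file(path: str) -> str:
    # Each worker loads its own model copy, so nothing is shared across processes.
    model = whisper.load_model("base.en")
    return model.transcribe(path)["text"]


if __name__ == "__main__":
    files = ["talk1.mp3", "talk2.mp3"]  # placeholder file names
    # Two workers means two independent Whisper model copies running side by side.
    with mp.Pool(processes=2) as pool:
        for path, text in zip(files, pool.map(transcribe_file, files)):
            print(path, "->", text[:80])
```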
-
Alternatively, is there a way to allocate more resources, such as setting the VRAM, CPU cores, or GPU used?
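Since Whisper runs on PyTorch, the only knobs I am aware of are the PyTorch ones; a hedged sketch (the thread count, memory fraction, model name, and audio file are illustrative assumptions, not recommendations):

```python
import torch

import whisper

# Cap the number of CPU threads PyTorch uses for intra-op parallelism.
torch.set_num_threads(8)

# load_model accepts a device argument, so a specific GPU can be selected.
device = "cuda:0" if torch.cuda.is_available() else "cpu"
model = whisper.load_model("base.en", device=device)

# On CUDA builds, a per-process VRAM cap can be requested from PyTorch.
if torch.cuda.is_available():
    torch.cuda.set_per_process_memory_fraction(0.5, device=0)

print(model.transcribe("audio.mp3")["text"])  # "audio.mp3" is a placeholder
```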
-
I recently made my first-ever commit, which is also one of my first Python programs, and I hope it will be of benefit to others.
The program accelerates Whisper tasks such as transcription by parallelizing the work across multiple CPU processes.
No modification to Whisper is needed.
It makes use of multiple CPU cores, and the results are as follows (a simplified sketch of the idea appears after the repository link below):
The input file duration was 3706.393 seconds, i.e. 01:01:46 (H:M:S).
Using 11 of 16 CPUs for the "tiny.en" model, a transcription speed of 32.713x
Using 7 of 16 CPUs for the "base.en" model, a transcription speed of 16.416x
Using 9 of 16 CPUs for the "small.en" model, a transcription speed of 5.595x
Machine: MacBook running macOS Big Sur, with a 2.3 GHz Intel Core i9, 16 cores, and 16 GB of RAM.
Testing of the "medium.en" model was very limited because I quickly ran out of memory, so those results were not included.
https://github.com/MrEdwards007/WhisperTaskAcceleration
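The repository has the real implementation; for anyone skimming this thread, here is a much-simplified sketch of the chunk-and-parallelize idea. This is not the code from the repository, and the chunk length, model name, worker count, and file names are illustrative assumptions:

```python
import glob
import multiprocessing as mp
import subprocess

import whisper

_model = None  # one model copy per worker process, set up by init_worker


def split_audio(path: str, seconds: int = 60) -> list[str]:
    # Cut the input into fixed-length chunks with ffmpeg's segment muxer.
    subprocess.run(
        ["ffmpeg", "-y", "-i", path, "-f", "segment",
         "-segment_time", str(seconds), "-c", "copy", "chunk%04d.wav"],
        check=True,
    )
    return sorted(glob.glob("chunk*.wav"))


def init_worker() -> None:
    # Load the model once per worker instead of once per chunk.
    global _model
    _model = whisper.load_model("tiny.en")


def transcribe_chunk(chunk_path: str) -> str:
    return _model.transcribe(chunk_path)["text"]


if __name__ == "__main__":
    chunks = split_audio("lecture.wav")  # placeholder input file
    with mp.Pool(processes=4, initializer=init_worker) as pool:
        pieces = pool.map(transcribe_chunk, chunks)  # map preserves chunk order
    print(" ".join(piece.strip() for piece in pieces))
```

As noted earlier in the thread, hard chunk boundaries can cut words and drop context, so the real program has to be more careful than this sketch.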