It seems that the CPU is working most of the time while the GPU is resting. There's still a lot of room for optimization.
A 27.8-minute audio takes 62.7 minutes to transcribe...
model: ggml-model-largev2.bin
parameters: -bs 5 -bo 5
audio: diffusion2023-07-03.wav 27.8 mins
 
 
