MuLaw Audio Transcription in Whisper Model #2331
Replies: 4 comments 3 replies
-
It looks like a hallucination, which can happen when the signal-to-noise ratio of the audio is low. Maybe that's the MuLaw quality, but that's only a hypothesis without hearing your audio file.
-
@ryanheise Thank you for the reply :) I've attached the audio sample here. Please unzip it and use the file "phone-call.raw". It would be really helpful if we could transcribe this with Whisper.
-
@gongouveia @ryanheise I've attached the zip file again, which has the wav file of the mulaw audio.
-
Worked fine for me with the following options:
-
Hi Everyone :)
I need help getting my audio bytes transcribed with the Whisper model.
My audio is mu-law (MuLaw) encoded: 8-bit, 8000 Hz, stereo.
I tried converting it to 16-bit linear PCM, float32, and other formats, using libraries such as audioop and pydub.
But nothing helped.
I always get transcriptions like 'You', 'You', 'Thank you', 'Thank you for watching'.
Please help
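
For reference, here is a minimal sketch of the conversion described above, using only the Python standard library. It rests on several assumptions that are not confirmed in this thread: the input is headerless G.711 mu-law with one byte per sample, the two channels are interleaved (left, right, left, right), and the sample rate is 8000 Hz. Whisper works on 16 kHz mono float audio, so the sketch mixes both channels to mono, scales to [-1, 1], and doubles the rate with plain linear interpolation. The function names are hypothetical, and a proper resampler (for example ffmpeg) would likely give better quality.

```python
from typing import List


def mulaw_byte_to_pcm16(u: int) -> int:
    """Decode one G.711 mu-law byte to a signed 16-bit linear sample."""
    u = ~u & 0xFF                      # mu-law bytes are stored bit-inverted
    sign = u & 0x80
    exponent = (u >> 4) & 0x07
    mantissa = u & 0x0F
    magnitude = (((mantissa << 3) + 0x84) << exponent) - 0x84
    return -magnitude if sign else magnitude


# Precompute all 256 possible values once.
_MULAW_TABLE = [mulaw_byte_to_pcm16(b) for b in range(256)]


def mulaw_stereo_to_mono_16k(raw: bytes) -> List[float]:
    """Assumes interleaved stereo mu-law at 8000 Hz (an assumption).

    Returns mono float samples in [-1, 1] at 16000 Hz.
    """
    pcm = [_MULAW_TABLE[b] for b in raw]
    left, right = pcm[0::2], pcm[1::2]
    # Average the two channels and scale 16-bit integers to floats.
    mono = [(l + r) / 2.0 / 32768.0 for l, r in zip(left, right)]
    # Upsample 8000 Hz -> 16000 Hz: keep each sample and insert the
    # midpoint between it and the next one (linear interpolation).
    out: List[float] = []
    for i, s in enumerate(mono):
        nxt = mono[i + 1] if i + 1 < len(mono) else s
        out.append(s)
        out.append((s + nxt) / 2.0)
    return out
```

If the result is wrapped as `np.asarray(samples, dtype=np.float32)`, it can be passed directly to `model.transcribe(...)`, which accepts a NumPy array in place of a file path. Feeding Whisper interleaved stereo as if it were mono, or 8 kHz samples as if they were 16 kHz, is one plausible (but unverified here) cause of output like 'Thank you for watching'.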