MuLaw Audio Transcription in Whisper Model #2331

gokulhub-io · 2024-09-10T07:55:16Z

Hi Everyone :)

I need help getting my audio bytes transcribed with the Whisper model.
My audio is MuLaw-encoded: 8-bit, 8000 Hz, stereo.

I tried converting it to 16-bit linear PCM, to float32, and to other formats, using libraries such as audioop and pydub.

But nothing helped.

I always get transcriptions like 'You', 'You', 'Thank you', 'Thank you for watching'.

Please help

ryanheise · 2024-09-11T01:11:50Z

It looks like a hallucination, which can happen when the signal-to-noise ratio of the audio is low. The MuLaw quality may be the cause, but that is only a hypothesis until I hear your audio file.
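
As a rough first check on this hypothesis, the level of the decoded samples can be measured before they are handed to Whisper. The sketch below assumes the audio has already been decoded to signed 16-bit samples. The helper name `level_dbfs` is illustrative and not from the thread. It reports level only, not a true signal-to-noise ratio, but a near-silent or constantly full-scale result would point to a conversion problem rather than the codec itself.

```python
import math


def level_dbfs(samples: list[int], full_scale: int = 32768) -> tuple[float, float]:
    """Return (peak, rms) of 16-bit samples in dB relative to full scale."""
    if not samples:
        raise ValueError("no samples to measure")

    def to_db(value: float) -> float:
        # Digital silence has no finite dB value.
        return 20 * math.log10(value / full_scale) if value > 0 else float("-inf")

    peak = max(abs(s) for s in samples)
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return to_db(peak), to_db(rms)


# Hypothetical input: one second of a 440 Hz tone at half of full scale.
tone = [int(16384 * math.sin(2 * math.pi * 440 * n / 8000)) for n in range(8000)]
peak_db, rms_db = level_dbfs(tone)
print(f"peak {peak_db:.1f} dBFS, rms {rms_db:.1f} dBFS")
```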

gokulhub-io · 2024-09-11T09:04:18Z

@ryanheise Thank you for the reply :)

I've attached the audio sample here. Please unzip it and use the file "phone-call.raw".
It is in MuLaw, 8000 Hz, 2-channel format.

If we can transcribe this with Whisper, it would be really helpful.

phone-call.zip
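
Because a .raw file has no header, the format stated above is the only description of its contents. The sketch below shows the arithmetic that format implies: one byte per sample, two samples per frame, 8000 frames per second. It assumes the two channels are interleaved (L, R, L, R, ...), which the thread does not confirm. The function names are illustrative.

```python
SAMPLE_RATE = 8000     # Hz, as stated in the post
CHANNELS = 2           # stereo, assumed interleaved L, R, L, R, ...
BYTES_PER_SAMPLE = 1   # mu-law stores one byte per sample


def describe_raw(data: bytes) -> tuple[int, float]:
    """Return (frame count, duration in seconds) for headerless mu-law bytes."""
    frames = len(data) // (CHANNELS * BYTES_PER_SAMPLE)
    return frames, frames / SAMPLE_RATE


def split_channels(data: bytes) -> tuple[bytes, bytes]:
    """De-interleave stereo mu-law bytes into (left, right)."""
    usable = len(data) - len(data) % CHANNELS  # drop a trailing partial frame
    return data[0:usable:2], data[1:usable:2]


# Hypothetical payload: one second of stereo mu-law silence (0xFF decodes to 0).
one_second = bytes([0xFF]) * (SAMPLE_RATE * CHANNELS)
frames, seconds = describe_raw(one_second)
left, right = split_channels(one_second)
print(frames, seconds, len(left), len(right))
```

If the duration computed this way does not match the length of the call, the assumed sample rate or channel count is probably wrong.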

1 reply

gongouveia Sep 11, 2024

@gokulhub-io Please convert the .raw file to a WAV file and post the result here.
Do you have it in any other format that can be listened to without extra hassle?
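
For anyone following along, one way to do this conversion with only the Python standard library is sketched below. It assumes the .raw file is headerless G.711 mu-law with interleaved stereo samples at 8000 Hz, as described in the thread. The function names and file names are illustrative, not taken from the original posts. The decoding is written out by hand because `audioop` was removed from the standard library in Python 3.13.

```python
import struct
import wave


def mulaw_byte_to_pcm16(u: int) -> int:
    """Decode one G.711 mu-law byte to a signed 16-bit linear sample."""
    u = ~u & 0xFF                          # mu-law bytes are stored bit-inverted
    magnitude = ((u & 0x0F) << 3) + 0x84   # 4-bit mantissa plus the bias of 132
    magnitude <<= (u & 0x70) >> 4          # scale by the 3-bit exponent
    return 0x84 - magnitude if u & 0x80 else magnitude - 0x84


# All 256 possible byte values, decoded once.
MULAW_TABLE = [mulaw_byte_to_pcm16(b) for b in range(256)]


def mulaw_raw_to_wav(raw: bytes, wav_path: str, rate: int = 8000, channels: int = 2) -> int:
    """Write headerless mu-law bytes as a 16-bit PCM WAV; return frames written."""
    usable = len(raw) - len(raw) % channels  # ignore a trailing partial frame
    samples = [MULAW_TABLE[b] for b in raw[:usable]]
    with wave.open(wav_path, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(2)                  # 2 bytes = 16-bit samples
        wav.setframerate(rate)
        wav.writeframes(struct.pack(f"<{len(samples)}h", *samples))
    return usable // channels


# Example call (file names are placeholders):
# with open("phone-call.raw", "rb") as f:
#     mulaw_raw_to_wav(f.read(), "phone-call.wav")
```

Whisper's file loader decodes through ffmpeg and resamples to 16 kHz mono, so a WAV written this way can be passed by path without manual resampling. When an in-memory array is passed instead, the 8000 Hz audio would first need to be downmixed to mono and resampled to 16 kHz.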

gokulhub-io · 2024-09-11T16:34:24Z

@gongouveia @ryanheise I've attached a zip file again, this time containing a WAV version of the MuLaw audio.
Please let me know if we can transcribe this using Whisper.

phone-call-wav.zip

1 reply

gongouveia Sep 11, 2024

@gokulhub-io It is tricky but should be possible. Try demucs for preprocessing and enhancing the audio.
Are you planning to do batch inference? Contact me if you need further help.

ryanheise · 2024-09-12T03:47:38Z

Worked fine for me with the following options:

$ python -m whisper --model large-v2 --language English --word_timestamps True --suppress_tokens "" --append_punctuation "" --prepend_punctuation "" phone-call.wav
[00:00.000 --> 00:05.400]  Good morning, XYZ Incorporation, Agent 1 speaking. How can I help you today?
[00:05.400 --> 00:07.760]  Good morning, I'm calling to renew my service.
[00:07.880 --> 00:11.660]  Okay, I can help you with that. Can I just confirm your name and account number?
[00:11.660 --> 00:14.800]  Sure, it's Gokul, and my account number is 8615309.
[00:14.800 --> 00:18.700]  Okay, I have your account up here. Just give me a second to renew it.
[00:18.760 --> 00:24.680]  So, I have the renewal here, but it says since you're a long-time customer, you're eligible for a 10% discount.
[00:24.680 --> 00:30.420]  If you're interested in taking advantage of that, I'll just need to get a manager to join us on the call to approve.
[00:30.420 --> 00:35.040]  Awesome, that sounds great. Great, please just hold a moment while I grab them.
[00:35.520 --> 00:36.920]  No problem.
[00:37.520 --> 00:43.140]  So, your account has been renewed and the discount applied. Is there anything else we can help you with today?
[00:43.180 --> 00:44.680]  Thanks, that's perfect.
[00:44.720 --> 00:47.260]  Great, please have yourself a great day.
[00:47.260 --> 00:48.040]  Goodbye.
[00:48.520 --> 00:49.360]  Goodbye.

1 reply

ryanheise Sep 12, 2024

Also tested v3:

[00:00.000 --> 00:05.060]  Good morning, XYZ Incorporation, Agent 1 speaking. How can I help you today?
[00:05.180 --> 00:07.480]  Good morning, I'm calling to renew my service.
[00:07.680 --> 00:11.320]  Okay, I can help you with that. Can I just confirm your name and account number?
[00:11.520 --> 00:14.780]  Sure, it's Gokul and my account number is 8615309.
[00:14.780 --> 00:18.280]  Okay, I have your account up here. Just give me a second to renew it.
[00:18.560 --> 00:24.240]  So, I have the renewal here, but it says since you're a long-time customer, you're eligible for a 10% discount.
[00:24.240 --> 00:30.040]  If you're interested in taking advantage of that, I'll just need to get a manager to join us on the call to approve.
[00:30.200 --> 00:34.520]  Awesome, that sounds great. Great, please just hold a moment while I grab them.
[00:35.460 --> 00:36.440]  No problem.
[00:37.460 --> 00:42.660]  So, your account has been renewed and the discount applied. Is there anything else we can help you with today?
[00:43.000 --> 00:44.180]  Thanks, that's perfect.
[00:44.340 --> 00:46.840]  Great, please have yourself a great day.
[00:46.980 --> 00:47.720]  Goodbye.
[00:48.180 --> 00:48.800]  Goodbye.

And tiny:

[00:00.580 --> 00:06.140]  Good morning, actually the incorporation agent one speaking. How can I help you today? Good morning.
[00:06.140 --> 00:11.820]  I'm calling to running my service. Okay. I can help you with that. Can I just come from your name and a call number?
[00:11.880 --> 00:15.180]  Sure. It's Google or my account number is 8655309.
[00:15.280 --> 00:19.020]  Okay. I have your account up here. Just give me a second to run you with it.
[00:19.020 --> 00:24.800]  So I have the renewal here, but it says since you're a long time customer, you're eligible for a 10% discount.
[00:24.880 --> 00:29.980]  If you're interested in taking advantage of that, I'll just need to get a manager to join us on the call to approve.
[00:29.980 --> 00:36.160]  Awesome. That's some great. Great. Please just hold a moment while I grab them.
[00:36.160 --> 00:37.060]  No problem.
[00:38.080 --> 00:43.260]  So your account has been renewed and the discount applied. Is there anything else we can help you with today?
[00:43.360 --> 00:49.380]  Thanks. That's perfect. Great. Please have yourselves a great day. Goodbye. Goodbye.
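
The gap between the models can be put into rough numbers by comparing lines word by word. The sketch below uses the similarity ratio from Python's `difflib`, which is not a formal word error rate, on one line copied from the large-v2 and tiny outputs above. The helper names are illustrative.

```python
import difflib
import re


def words(text: str) -> list[str]:
    """Lowercase and drop punctuation so only word choice is compared."""
    return re.findall(r"[a-z0-9%']+", text.lower())


def word_similarity(reference: str, candidate: str) -> float:
    """Similarity in [0, 1] from difflib's sequence matcher over word lists."""
    matcher = difflib.SequenceMatcher(None, words(reference), words(candidate))
    return matcher.ratio()


# The same utterance as transcribed by large-v2 and by tiny.
large = "Sure, it's Gokul, and my account number is 8615309."
tiny = "Sure. It's Google or my account number is 8655309."

print(round(word_similarity(large, large), 2))
print(round(word_similarity(large, tiny), 2))
```

On this line the two transcripts differ in the name and the account number, which are the details that matter most in a call like this one.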
