Can't read Bengali year ১৯৫৪ সাল। কালো রাত। [Bug] #3815

Closed
khandakershahi opened this issue Jul 8, 2024 · 6 comments
Labels: bug (Something isn't working), wontfix (This will not be worked on but feel free to help.)

Comments

@khandakershahi

Describe the bug

I was testing the Bengali voice model and it failed to pronounce Bengali numerals. The Bengali digits and their ASCII equivalents are:
০ ১ ২ ৩ ৪ ৫ ৬ ৭ ৮ ৯
0 1 2 3 4 5 6 7 8 9

১৯৫৪ সাল। কালো রাত। (roughly, "The year 1954. A dark night.") The first part should be read aloud in Bengali as the year 1954.

log:

['১৯৫৪ সাল। কালো রাত।']
১৯৫৪ সাল। কালো রাত।
 [!] Character '৯' not found in the vocabulary. Discarding it.
 > Processing time: 1.444657564163208
 > Real-time factor: 0.46246659828395376

The log shows that the character '৯' was not found in the model's vocabulary and was discarded.
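The warning can be anticipated before synthesis by checking the input for Bengali digits. The following is a minimal sketch using only the Python standard library; the names `BENGALI_DIGITS`, `find_bengali_digits`, and `to_ascii_digits` are illustrative and are not part of the TTS package.

```python
# Illustrative helpers; not part of the TTS package.
BENGALI_DIGITS = "০১২৩৪৫৬৭৮৯"
ASCII_DIGITS = "0123456789"

# Translation table mapping each Bengali digit to its ASCII counterpart.
_TO_ASCII = str.maketrans(BENGALI_DIGITS, ASCII_DIGITS)


def find_bengali_digits(text: str) -> list[str]:
    """Return the Bengali digits in `text`, in order of appearance."""
    return [ch for ch in text if ch in BENGALI_DIGITS]


def to_ascii_digits(text: str) -> str:
    """Replace Bengali digits with ASCII digits; other characters are unchanged."""
    return text.translate(_TO_ASCII)


sample = "১৯৫৪ সাল। কালো রাত।"
print(find_bengali_digits(sample))  # ['১', '৯', '৫', '৪']
print(to_ascii_digits(sample))      # 1954 সাল। কালো রাত।
```

This only detects or transliterates the digits. Whether the model can pronounce ASCII digits is a separate question, so spelling numbers out as words before synthesis is likely the safer route.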

To Reproduce

pip install TTS

main.py

import torch
from TTS.api import TTS
import gradio as gr

device = "cuda" if torch.cuda.is_available() else "cpu"

def generate_audio(text="তুমি কেমন আছো?"):
    tts = TTS(model_name='tts_models/bn/custom/vits-male').to(device)
    tts.tts_to_file(text=text, file_path="outputs/output.wav")
    return "outputs/output.wav"

demo = gr.Interface(
    fn=generate_audio,
    inputs=[gr.Text(label="Text"),],
    outputs=[gr.Audio(label="Audio"),],
    )

demo.launch()

Expected behavior

No response

Logs

No response

Environment

{
    "CUDA": {
        "GPU": [],
        "available": false,
        "version": "12.1"
    },
    "Packages": {
        "PyTorch_debug": false,
        "PyTorch_version": "2.3.1+cu121",
        "TTS": "0.22.0",
        "numpy": "1.26.4"
    },
    "System": {
        "OS": "Linux",
        "architecture": [
            "64bit",
            "ELF"
        ],
        "processor": "",
        "python": "3.11.2",
        "version": "#1 SMP PREEMPT_DYNAMIC Debian 6.1.69-1 (2023-12-30)"
    }
}

Additional context

No response

@saifulislam79

@khandakershahi Use the pybangla normalizer to normalize numbers:

https://pypi.org/project/pybangla/#description

@eginhard
Contributor

@khandakershahi A Bengali phonemizer/normalizer is also included directly in Coqui TTS, you can use it as follows:

from TTS.tts.utils.text.phonemizers import BN_Phonemizer
bn = BN_Phonemizer()
bn.phonemize("১৯৫৪ সাল। কালো রাত।")

(resulting in এক হাজার নয় শত চুয়ান্ন সাল।কালো রাত।।)

@khandakershahi
Author

@saifulislam79 Thank you. I am just a regular user and don't know Python or TTS.

I tried, but I wasn't able to figure out how to use it with my main.py. Could you give me an updated version of my main.py that works with your package, or anything else that lets me keep using the GUI?

@eginhard Thank you. I wasn't able to figure out how to use it with my main.py either. Could you give me an updated version of my main.py that works with BN_Phonemizer, or anything else that lets me keep using the GUI?

@eginhard
Contributor

@khandakershahi Try this:

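import os

import torch
from TTS.api import TTS
from TTS.tts.utils.text.phonemizers import BN_Phonemizer
import gradio as gr

device = "cuda" if torch.cuda.is_available() else "cpu"

# Load the normalizer and the model once at startup, not on every request.
bn = BN_Phonemizer()
tts = TTS(model_name='tts_models/bn/custom/vits-male').to(device)

# Make sure the output directory exists before writing to it.
os.makedirs("outputs", exist_ok=True)

def generate_audio(text="তুমি কেমন আছো?"):
    # Spell out digits as Bengali words so they are not discarded as unknown characters.
    text = bn.phonemize(text)
    tts.tts_to_file(text=text, file_path="outputs/output.wav")
    return "outputs/output.wav"

demo = gr.Interface(
    fn=generate_audio,
    inputs=[gr.Text(label="Text"),],
    outputs=[gr.Audio(label="Audio"),],
    )

demo.launch()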

@khandakershahi
Author

@eginhard Thank you, it works now. One issue remains: it reads the value as a plain number, not as a year.
For example, "১৯৫৪ সাল" should be read as "উনিশশ চুয়ান্ন সাল" or "উনিশশত চুয়ান্ন সাল।"
The reading "এক হাজার নয় শত চুয়ান্ন সাল।" is not how years are said in Bengali.

Many many thanks.
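One way to get the year reading described above is to rewrite four-digit years before calling the normalizer. The following is a minimal sketch, not part of Coqui TTS: `rewrite_years`, `YEAR_PATTERN`, and `TWO_DIGIT_WORDS` are hypothetical names, the word table is deliberately limited to the two values that appear in this thread, and years whose halves are not in the table are left unchanged. It also assumes the normalizer passes already spelled-out words through untouched.

```python
import re

# Hypothetical, deliberately incomplete table: only the two values used in
# this thread. A real table would need every entry from ০০ to ৯৯.
TWO_DIGIT_WORDS = {
    "১৯": "উনিশ",
    "৫৪": "চুয়ান্ন",
}

# Exactly four Bengali digits, not preceded by another Bengali digit,
# followed by whitespace and the word "সাল" (year).
YEAR_PATTERN = re.compile(r"(?<![০-৯])([০-৯]{2})([০-৯]{2})(?=\s+সাল)")


def rewrite_years(text: str) -> str:
    """Spell years as "<century>শ <remainder>" when both halves are known."""

    def replace(match: re.Match) -> str:
        century, remainder = match.group(1), match.group(2)
        if century in TWO_DIGIT_WORDS and remainder in TWO_DIGIT_WORDS:
            return f"{TWO_DIGIT_WORDS[century]}শ {TWO_DIGIT_WORDS[remainder]}"
        return match.group(0)  # leave unknown years for the normalizer

    return YEAR_PATTERN.sub(replace, text)


print(rewrite_years("১৯৫৪ সাল। কালো রাত।"))  # উনিশশ চুয়ান্ন সাল। কালো রাত।
```

In the `generate_audio` function from the earlier comment, `rewrite_years(text)` would be called just before `bn.phonemize(text)`. Years such as ২০০৫ follow a different spoken pattern, which is one reason the sketch falls back to the normalizer instead of guessing.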


stale bot commented Nov 10, 2024

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. You might also look at our discussion channels.

@stale stale bot added the wontfix This will not be worked on but feel free to help. label Nov 10, 2024
@stale stale bot closed this as completed Dec 8, 2024