Does the synthesizer only work on "middle length" (+/- 20 words) sentences? #636

Closed
akrokodile opened this issue Jan 21, 2021 · 2 comments

Comments

@akrokodile

Beyond 20 words, it seems to talk a blue streak.
Below 10, it produces pauses and inhuman noises.
I've tried padding the short sentences with words consisting of the single letter "s" and then subtracting int(0.25 * Synthesizer.sample_rate) * padding_size from the b_ends array. It works, but only to an extent (sometimes it cuts too much, sometimes it leaves in a bit of the padding).
Is there any better way to teach the synth to process shorter sentences?
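
For readers who want to try the padding workaround described above, here is a minimal sketch. It assumes roughly 0.25 seconds of audio per filler word, as in the comment. For simplicity it trims the estimated filler from the end of the waveform rather than adjusting the b_ends array. The function names, the min_words threshold, and the sample-sequence representation are all hypothetical and not part of the project's API.

```python
def pad_short_text(text, min_words=10, pad_token="s"):
    """Append filler words until the text has at least min_words words.

    Returns the padded text and the number of filler words added.
    """
    words = text.split()
    padding_size = max(0, min_words - len(words))
    padded = " ".join(words + [pad_token] * padding_size)
    return padded, padding_size


def estimated_padding_samples(padding_size, sample_rate, seconds_per_pad=0.25):
    """Estimate how many trailing samples the filler words occupy.

    seconds_per_pad is a guess (0.25 s per filler word); it is the reason
    the trim is only approximate.
    """
    return int(seconds_per_pad * sample_rate) * padding_size


def trim_padding(wav, padding_size, sample_rate, seconds_per_pad=0.25):
    """Drop the estimated filler audio from the end of a sample sequence."""
    cut = estimated_padding_samples(padding_size, sample_rate, seconds_per_pad)
    end = max(0, len(wav) - cut)
    return wav[:end]
```

Because the filler duration is a fixed estimate, the trim can overshoot or undershoot, which matches the behaviour reported above (sometimes it cuts too much, sometimes a bit of padding remains).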

@ghost

ghost commented Jan 21, 2021

Poor performance on short inputs results from bad training data. You can try the alternative model in #538 and see if it gets better.

Problems with long inputs are caused by a failure of the attention mechanism. The solution is to implement a better one and retrain the model, which is much easier said than done.

@akrokodile
Author

Thank you so much! The alternative model does get the short sentences right (also gets rid of weird pauses).
Re the long ones: I've got a working method of handling them (basically processing them in chunks of 20 words and splitting on commas where I can).
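
The chunking approach mentioned above was not posted, so the following is only a sketch of one way it might look, not the author's actual code. It splits on whitespace, caps each chunk at 20 words, and prefers to break after a comma. The function name and the min_words guard (which avoids creating very short chunks, since those synthesize poorly) are assumptions added for illustration.

```python
def split_long_text(text, max_words=20, min_words=10):
    """Split text into chunks of at most max_words words.

    Within each window of max_words words, break after the last word that
    ends with a comma, provided the chunk keeps at least min_words words;
    otherwise break at the word limit.
    """
    words = text.split()
    chunks = []
    while len(words) > max_words:
        cut = max_words  # Default: hard break at the word limit.
        # Search backwards for a comma that leaves a long enough chunk.
        for i in range(max_words - 1, min_words - 2, -1):
            if words[i].endswith(","):
                cut = i + 1
                break
        chunks.append(" ".join(words[:cut]))
        words = words[cut:]
    if words:
        # The final remainder may still be shorter than min_words.
        chunks.append(" ".join(words))
    return chunks
```

Each chunk would then be synthesized separately and the resulting audio concatenated. The last chunk can still end up short, so combining this with the alternative model for short inputs may be necessary.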
