Ability to configure TTS engines in the UI #151
Comments
I guess "the cloud based engines always work" isn't true if there is no internet connection. |
Some ideas from yesterday's team meeting:
@bertfrees: (see daisy/pipeline-modules#66)
|
This issue is not quite fixed yet I think. #154 fixes only a part of it, namely selecting preferred voices. |
Can you create more issues for what still needs to be done? Or is it more of keeping everyone's ideas around? |
It is more a collection of ideas. We can create some new issues with concrete things to do. |
On second thought, the first comment in this issue sums it up pretty well I think. Instead of creating a new issue with more or less the same in it, I'm gonna reword this one, and convert it into a list of tasks. |
Will there be an API to describe engines' properties? Or should I hardcode it based on the engine configuration docs? Voice preview will come from the API too, right? Is that ready? |
I think hardcoding the properties makes the most sense for now. But an API for the engines can definitely be useful too. Let's keep the idea. Voice preview will come from the API, yes. I'm not sure yet what the API should look like though. I guess it could also be a general purpose "speak" command, that could even accept SSML. That wouldn't be so hard to do. (A while back I already wrote a mock of the Google TTS API that dispatches to the available TTS engines.) |
Added credential fields for Azure and Google voices in 4d97c87. Verifying the credentials is a new issue: #164. From this convo we have now implemented all the engine settings for our current goal. |
Noting here that we also got a feature request from a tester: |
Is there anything left in the first task above that is still relevant? We have designed the settings dialog in a different way based on other convos about engine properties that we wanted to support. And the "voice preview" task is still pending engine implementation. |
The way the settings dialog looks now is great! It's very minor, but one thing that would be nice is if the status (connected or disconnected) would somehow be made even more clear. I don't know how though. By the way, this note at the top:
Is it really needed? It seems the voices are updated without closing and reopening the settings. |
Yes, I also thought of emphasizing it visually like that. That is indeed slightly better, however I don't know whether that fundamentally changes anything? 'Cause it will just be decoration, right? Perhaps the engines could be grouped by connection status? There would be two main headings with the connection status, under which the subheadings "Azure" and "Google" would go. The main headings wouldn't need to be visible for sighted users, the sections could be indicated some other way. Just thinking out loud. As I said, it is already good the way it is now. |
Ok I will commit this since it seems better. We can keep this thread going for ideas. I don't like grouping it by connection status because then doesn't it get reordered when the status changes? That's visually disruptive and probably bad accessibility. |
Yes, that's probably true. |
Voice preview team discussion: https://daisy-dev.slack.com/archives/C064GB8U9/p1719243930802729 |
Some engines need to be configured before they can work. Examples are the cloud based engines from Google and Microsoft. There should be a config panel for entering the available properties for each engine. A nice and simple way to present the settings could be to have one panel per available engine. Each engine would have a status to indicate whether it is working or not. (The GUI can check this using the API to retrieve voices.) The settings would be nicely hidden for unconfigured or disabled engines. Upon enabling an engine, the user is asked to fill in the required properties. If the properties are not filled in correctly, the engine is not enabled. The user can also disable engines that are configured correctly or that don't need configuration (e.g. the engines native to Windows and macOS).
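The enable/disable behaviour described above could be modelled roughly as follows. This is only a sketch under assumptions: the type names, the `requiredProperties` list, and the `key`/`region` property names are hypothetical and not taken from the Pipeline API. The status is derived from the number of voices the engine reports, in line with the idea that the GUI checks the engine through the API that retrieves voices.

```typescript
// Hypothetical model for the per-engine settings panel.
// None of these names come from the actual Pipeline API.
type EngineStatus = "connected" | "disconnected" | "disabled";

interface EngineDefinition {
  id: string;
  // Properties the user must fill in before the engine can be enabled.
  // Empty for engines that need no configuration (e.g. OS-native ones).
  requiredProperties: string[];
}

interface EngineState {
  enabled: boolean;
  properties: Record<string, string>;
}

// Returns the required properties that are missing or blank.
function missingProperties(def: EngineDefinition, state: EngineState): string[] {
  return def.requiredProperties.filter((key) => {
    const value = state.properties[key];
    return value === undefined || value.trim() === "";
  });
}

// Enables the engine only if every required property is filled in;
// otherwise the engine stays disabled.
function tryEnable(def: EngineDefinition, state: EngineState): EngineState {
  const ok = missingProperties(def, state).length === 0;
  return { ...state, enabled: ok };
}

// The user can always disable an engine, configured or not.
function disable(state: EngineState): EngineState {
  return { ...state, enabled: false };
}

// Derives the status shown in the panel. `voiceCount` is assumed to be
// the number of voices the voices API returned for this engine.
function engineStatus(state: EngineState, voiceCount: number): EngineStatus {
  if (!state.enabled) return "disabled";
  return voiceCount > 0 ? "connected" : "disconnected";
}

// Usage example with made-up values.
const azure: EngineDefinition = { id: "azure", requiredProperties: ["key", "region"] };
const incomplete: EngineState = { enabled: false, properties: { key: "abc", region: "" } };
const complete: EngineState = { enabled: false, properties: { key: "abc", region: "westeurope" } };

console.log(tryEnable(azure, incomplete).enabled); // false: "region" is blank
console.log(engineStatus(tryEnable(azure, complete), 12)); // "connected"
```

Note that this only checks that the properties are present. Whether the credentials are actually valid can only be known once the engine is asked for its voices, which is why the status is computed separately from enabling.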
Scripts with TTS support have a "Text-to-speech configuration file" option. For usability it would be nicer if this configuration were integrated into the UI somehow. #154 adds a config panel for selecting preferred voices.
A nice addition to the voices config panel could be a way to have a live preview of the voices (blocked on pipeline-modules#89, "Add ability to get a "preview" of a voice").