Ability to configure TTS engines in the UI #151

bertfrees · 2023-06-27T12:38:39Z

Some engines need to be configured before they can work. Examples are the cloud-based engines from Google and Microsoft. There should be a config panel for entering the available properties for each engine. A nice and simple way to present the settings could be to have one panel per available engine. Each engine would have a status to indicate whether it is working or not. (The GUI can check this using the API to retrieve voices.) The settings would be hidden for unconfigured or disabled engines. Upon enabling an engine, the user is asked to fill in the required properties. If the properties are not filled in correctly, the engine is not enabled. The user can also disable engines that are configured correctly or that don't need configuration (e.g. the engines native to Windows and macOS).
Scripts with TTS support have a "Text-to-speech configuration file" option. For usability it would be nicer if this configuration were integrated into the UI somehow. #154 adds a config panel for selecting preferred voices.
A nice addition to the voices config panel could be a live preview of the voices (blocked on daisy/pipeline-modules#89, 'Add ability to get a "preview" of a voice').
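
As a rough sketch of the status check described above (deriving each engine's status from the list of voices the API returns), something like the following could work. The `Voice` shape, the engine names, and the `engineStatuses` helper are assumptions for illustration, not the actual Pipeline API.

```typescript
// Hypothetical shape: the real voices API may return different fields.
interface Voice {
  name: string;
  engine: string; // e.g. "azure", "google"
}

type EngineStatus = "working" | "not working";

// An engine counts as working when the voices list contains at least
// one voice that belongs to it.
function engineStatuses(
  engines: string[],
  voices: Voice[]
): Map<string, EngineStatus> {
  const enginesWithVoices = new Set(voices.map((v) => v.engine));
  const statuses = new Map<string, EngineStatus>();
  for (const engine of engines) {
    statuses.set(
      engine,
      enginesWithVoices.has(engine) ? "working" : "not working"
    );
  }
  return statuses;
}

// Example data, made up for this sketch.
const exampleVoices: Voice[] = [{ name: "Example voice", engine: "azure" }];
const exampleStatuses = engineStatuses(["azure", "google"], exampleVoices);
console.log(exampleStatuses.get("azure")); // "working"
console.log(exampleStatuses.get("google")); // "not working"
```

With this approach a cloud engine that cannot be reached (for example without an internet connection) would simply report no voices and show up as not working.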

ways2read · 2023-06-28T12:55:10Z

I guess "the cloud based engines always work" isn't true if there is no internet connection.

bertfrees · 2023-07-04T10:48:38Z

Some ideas from yesterday's team meeting:

@rdeltour:

One thing that could be nice is to have a live preview of the TTS from a config page in the UI.

@bertfrees: (see daisy/pipeline-modules#66)

I want to eliminate the "Text-to-speech configuration file" options, and replace them with the following:

- a dedicated option to specify CSS style sheets (in addition to the possibility to attach style sheets to the input)
- a dedicated option to specify lexicons (in addition to the possibility to attach lexicons to the input)
- dedicated options for certain TTS properties:
  - org.daisy.pipeline.tts.log: done
  - org.daisy.pipeline.tts.mp3.bitrate: to do
  - org.daisy.pipeline.tts.lame.cli.options: has been deprecated
- it should no longer be possible to set other TTS properties dynamically (per job) (note that org.daisy.pipeline.tts.host.protection has already been deprecated)
- per-job voice configuration should be replaced by a system-wide voice configuration
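
To make the proposed split concrete, here is a minimal sketch of how a UI or validator might classify a property name under this plan. The property names and their statuses are taken from the list above; the `classifyJobProperty` helper, the verdict labels, and the placeholder name `org.daisy.pipeline.tts.example.key` are hypothetical.

```typescript
// Properties that get (or are planned to get) a dedicated per-job option.
// Note: mp3.bitrate is still marked "to do" in the list above.
const DEDICATED_OPTIONS = new Set<string>([
  "org.daisy.pipeline.tts.log",
  "org.daisy.pipeline.tts.mp3.bitrate",
]);

// Properties listed above as deprecated.
const DEPRECATED = new Set<string>([
  "org.daisy.pipeline.tts.lame.cli.options",
  "org.daisy.pipeline.tts.host.protection",
]);

type Verdict = "dedicated option" | "deprecated" | "system-wide only";

// Anything that is neither a dedicated option nor deprecated can no
// longer be set per job and has to be configured system-wide.
function classifyJobProperty(name: string): Verdict {
  if (DEDICATED_OPTIONS.has(name)) return "dedicated option";
  if (DEPRECATED.has(name)) return "deprecated";
  return "system-wide only";
}

console.log(classifyJobProperty("org.daisy.pipeline.tts.log")); // "dedicated option"
console.log(classifyJobProperty("org.daisy.pipeline.tts.example.key")); // "system-wide only"
```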

bertfrees · 2023-09-25T19:25:27Z

This issue is not quite fixed yet I think. #154 fixes only a part of it, namely selecting preferred voices.

marisademeglio · 2023-09-25T19:53:30Z

> This issue is not quite fixed yet I think. #154 fixes only a part of it, namely selecting preferred voices.

Can you create more issues for what still needs to be done? Or is it more of keeping everyone's ideas around?

bertfrees · 2023-09-26T07:38:07Z

It is more a collection of ideas. We can create some new issues with concrete things to do.

bertfrees · 2023-09-27T18:55:08Z

On second thought, the first comment in this issue sums it up pretty well I think. Instead of creating a new issue with more or less the same in it, I'm gonna reword this one, and convert it into a list of tasks.

marisademeglio · 2023-09-27T19:34:04Z

Will there be an API to describe engines' properties? Or should I hardcode it based on the engine configuration docs?

Voice preview will come from the API too, right? Is that ready?

bertfrees · 2023-09-27T19:49:40Z

I think hardcoding the properties makes the most sense for now. But an API for the engines can definitely be useful too. Let's keep the idea.

Voice preview will come from the API, yes. I'm not sure yet what the API should look like though. I guess it could also be a general purpose "speak" command, that could even accept SSML. That wouldn't be so hard to do. (A while back I already wrote a mock of the Google TTS API that dispatches to the available TTS engines.)
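
For the hardcoded route, a minimal sketch could be a static table of engine properties plus a check for missing required values before an engine is enabled. The property keys below are assumptions and should be checked against the engine configuration docs; `ENGINE_PROPERTIES` and `missingProperties` are hypothetical names.

```typescript
interface EngineProperty {
  key: string; // property name as passed to the engine
  label: string; // text shown next to the input field
  required: boolean;
}

// Assumed property keys, for illustration only.
const ENGINE_PROPERTIES: Record<string, EngineProperty[]> = {
  azure: [
    { key: "org.daisy.pipeline.tts.azure.key", label: "Key", required: true },
    { key: "org.daisy.pipeline.tts.azure.region", label: "Region", required: true },
  ],
  google: [
    { key: "org.daisy.pipeline.tts.google.apikey", label: "API key", required: true },
  ],
};

// Returns the labels of required properties that are empty or absent.
// Engines without an entry in the table need no configuration.
function missingProperties(
  engine: string,
  values: Record<string, string>
): string[] {
  const props = ENGINE_PROPERTIES[engine] ?? [];
  return props
    .filter((p) => p.required && (values[p.key] ?? "").trim() === "")
    .map((p) => p.label);
}

// Only the key is filled in, so the region is reported as missing.
const missing = missingProperties("azure", {
  "org.daisy.pipeline.tts.azure.key": "abc",
});
console.log(missing.join(", ")); // "Region"
```

If an API that describes engine properties is added later, only the table would need to be replaced by a call to that API.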

marisademeglio · 2023-10-19T04:42:27Z

Added credential fields for Azure and Google voices in 4d97c87

Verifying the credentials is a new issue: #164

From this convo, we have now implemented all the engine settings for our current goal.

marisademeglio · 2024-02-06T00:06:52Z

Noting here that we also got a feature request from a tester:
"When selecting between voices, offer a preview."

marisademeglio · 2024-04-16T17:56:40Z

Is there anything left in the first task above that is still relevant? We have designed the settings dialog in a different way based on other convos about engine properties that we wanted to support.

And the "voice preview" task is still pending engine implementation.

bertfrees · 2024-04-17T12:09:46Z

> Is there anything left in the first task above that is still relevant?

The way the settings dialog looks now is great!

It's very minor, but one thing that would be nice is if the status (connected or disconnected) would somehow be made even more clear. I don't know how though.

By the way, this note at the top:

> After configuring these engines with the required credentials, they will be available under 'Voices'. Save and reopen the settings dialog to see changes.

Is it really needed? It seems the voices are updated without closing and reopening the settings.

marisademeglio · 2024-04-17T16:43:25Z

True, that wording can be simplified as the changes now are effective immediately.

How is this for a slightly clearer connected/disconnected status?

bertfrees · 2024-04-17T18:08:41Z

Yes, I also thought of emphasizing it visually like that. That is indeed slightly better, however I don't know whether that fundamentally changes anything? 'Cause it will just be decoration, right?

Perhaps the engines could be grouped by connection status? There would be two main headings with the connection status, under which the subheadings "Azure" and "Google" would go. The main headings wouldn't need to be visible for sighted users, the sections could be indicated some other way.

Just thinking out loud. As I said, it is already good the way it is now.

marisademeglio · 2024-04-17T18:32:47Z

Ok I will commit this since it seems better.

We can keep this thread going for ideas. I don't like grouping it by connection status because then doesn't it get reordered when the status changes? That's visually disruptive and probably bad accessibility.
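
One way to read the two concerns together (status that is more than decoration, and no reordering) is to keep the engines in a fixed order and put the status into each row's text label. A minimal sketch, with a hypothetical `EngineRow` shape and `engineLabels` helper:

```typescript
interface EngineRow {
  name: string;
  connected: boolean;
}

// Rows keep their input order; only the status text changes, so nothing
// moves on screen when an engine connects or disconnects.
function engineLabels(rows: EngineRow[]): string[] {
  return rows.map(
    (r) => `${r.name}: ${r.connected ? "Connected" : "Disconnected"}`
  );
}

const labels = engineLabels([
  { name: "Azure", connected: true },
  { name: "Google", connected: false },
]);
console.log(labels.join("\n"));
// Azure: Connected
// Google: Disconnected
```

Because the status is part of the text, it reaches assistive technology without depending on color or icons.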

bertfrees · 2024-04-17T18:34:18Z

Yes, that's probably true.

marisademeglio · 2024-06-24T16:15:25Z

Voice preview team discussion:

https://daisy-dev.slack.com/archives/C064GB8U9/p1719243930802729
https://daisy-dev.slack.com/archives/C064GB8U9/p1719244054736299

marisademeglio mentioned this issue Sep 25, 2023

feat(tts): configure tts via settings dialog #154

Merged

marisademeglio closed this as completed in #154 Sep 25, 2023

bertfrees reopened this Sep 27, 2023

marisademeglio mentioned this issue Sep 29, 2023

TTS configuration UI ideas #155

Closed

marisademeglio added the "enhancement" label (New feature or request) Sep 29, 2023

marisademeglio added a commit that referenced this issue Apr 17, 2024

fix(#151): wording fix on engines settings, visual indicator of status

074f597

marisademeglio added this to the 1.6 milestone Jun 17, 2024

marisademeglio added the "engine" label (Issues that require something to change on the engine side) Jun 24, 2024

marisademeglio added a commit that referenced this issue Sep 16, 2024

feat(tts): add voice preview to TTS dialog

8a6e56f

Basic implementation of #151

bertfrees assigned marisademeglio Sep 19, 2024

marisademeglio added the "ready-for-testing" label (An implementation is ready to be tested) Oct 8, 2024