support for alignment output in tsv format #407

contentnation · 2024-02-23T16:21:10Z

Adds support for alignment data output.
Roughly matches issue #364.
Can be used as a base for #391 and #361.
Runs text-to-speech twice: once for the normal audio generation,
and a second time for each word individually.
Since the two passes produce different outputs and timings, a correction is applied.
Not perfect, but "good enough". Both passes resync after each sentence, so only slight offsets are introduced.
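
To illustrate the kind of correction described above, here is a minimal sketch, not the code from this pull request. It assumes that each word's duration is known from the per-word pass and each sentence's duration from the normal pass, and it rescales the word durations proportionally so they fill the sentence. The names `WordTiming`, `align_sentence`, and `align_text` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class WordTiming:
    word: str
    start: float  # seconds from the start of the full audio
    end: float


def align_sentence(words, word_durations, sentence_duration, offset):
    """Rescale per-word durations so they sum to the sentence duration."""
    total = sum(word_durations)
    # Assumption: the timing mismatch is spread evenly over all words.
    scale = sentence_duration / total if total > 0 else 0.0
    timings = []
    cursor = offset
    for word, duration in zip(words, word_durations):
        end = cursor + duration * scale
        timings.append(WordTiming(word, cursor, end))
        cursor = end
    return timings


def align_text(sentences, sentence_silence=0.0):
    """sentences: list of (words, word_durations, sentence_duration) tuples."""
    timings = []
    offset = 0.0
    for words, word_durations, sentence_duration in sentences:
        timings.extend(
            align_sentence(words, word_durations, sentence_duration, offset)
        )
        # Resync at the sentence boundary, including the trailing silence.
        offset += sentence_duration + sentence_silence
    return timings


if __name__ == "__main__":
    example = [
        (["Hello", "world"], [0.5, 0.5], 0.8),
        (["Bye"], [0.3], 0.4),
    ]
    for timing in align_text(example, sentence_silence=0.5):
        print(f"{timing.word}\t{timing.start:.3f}\t{timing.end:.3f}")
```

Because the offset is reset from the sentence durations at every boundary, errors in the per-word estimates cannot accumulate beyond a single sentence.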

vytskalt · 2024-04-03T11:58:13Z

I've been trying this out. Looks like when using a long text some of the last words are being skipped in the alignment file.

contentnation · 2024-04-03T12:44:58Z

@vytskalt can you provide an example so I can debug/fix it?

vytskalt · 2024-04-03T13:01:45Z

> @vytskalt can you provide an example so I can debug/fix it?

Yes, this is the command I'm running:

cat text.txt | piper --sentence-silence 0.5 -m en_US-ryan-high --output_file out.wav --alignment-data alignment.tsv

This is the text (random Reddit post): text.txt

In the alignment.tsv, 2 of the last words are missing.

contentnation · 2024-04-03T13:52:39Z

OK, the issue is not the length but the content. For example, "musical/sport" is spoken as 3 words, while "in the" is merged into one spoken word. My word/phoneme sync trips over this. It needs to be fixed; I have to find another way to sync.
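
As an illustration of the mismatch, the sketch below splits written text into units that are closer to what is spoken, treating a slash as its own unit so that "musical/sport" yields three entries. This is only an assumption about one possible pre-processing step, not the fix applied in this pull request; `split_spoken_units` is a hypothetical helper, and it does not handle merged words such as "in the".

```python
import re


def split_spoken_units(text):
    """Split text on whitespace and treat each '/' as a separate unit."""
    units = []
    for token in text.split():
        # The capture group keeps the slash itself in the result.
        for part in re.split(r"(/)", token):
            if part:
                units.append(part)
    return units


if __name__ == "__main__":
    print(split_spoken_units("a musical/sport event"))
```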


charlyhayoz · 2024-05-07T15:28:19Z

Hi,

I pulled this pull request and made a build, but the --alignment-data option is not available in the "piper" executable in the install folder.

Am I missing something to make it work?

Thanks (:

contentnation · 2024-05-07T15:30:41Z

It is only built into the Python script, not into the C++ executable.

charlyhayoz · 2024-05-08T09:46:28Z

Makes sense! Thanks (:

Sascha Nitsch added 2 commits February 23, 2024 17:11

support for alignment output in tsv

6bbce86

support for alignment output in tsv

40e6bae

fixed missing adjustment with the silence at the end

e4d1d65

fixing out-of-sync of alignment when words are combined like "in the" or split by "musical/sports". Also fixed missing sentence silence in calculation

6ce77e8

Merge branch 'rhasspy:master' into alignment_data

65f3b00
