Output .ctm of Speech Data Simulator has channel and spk_id swapped #7445

popcornell · 2023-09-15T13:25:44Z

NeMo/nemo/collections/asr/data/data_simulation.py

Line 1057 in 2cc0942

text = f"{session_name} {speaker_id} {align1} {align2} {word} 0\n"

But according to https://web.archive.org/web/20170119114252/http://www.itl.nist.gov/iad/mig/tests/rt/2009/docs/rt09-meeting-eval-plan-v2.pdf it should be:

<SOURCE><SP><CHANNEL><SP><BEG-TIME><SP><DURATION><SP><TOKEN><SP><CONF><SP><TYPE><SP><SPEAKER><NEWLINE>
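To make the difference concrete, here is a minimal sketch of a formatter that emits the eight fields in the RT-09 order quoted above. The function name `format_rt09_ctm_line` is hypothetical and is not part of NeMo; the example values (channel `1`, type `lex`) are assumptions chosen for illustration, as is the reading of `align1` and `align2` in the simulator as begin time and duration.

```python
def format_rt09_ctm_line(source, channel, beg_time, duration, token, conf, token_type, speaker):
    """Return one CTM line with fields in the RT-09 order.

    Hypothetical helper for illustration only, not the NeMo implementation.
    Order: SOURCE CHANNEL BEG-TIME DURATION TOKEN CONF TYPE SPEAKER
    """
    return (
        f"{source} {channel} {beg_time} {duration} "
        f"{token} {conf} {token_type} {speaker}\n"
    )


# Assumed example values: channel "1" and type "lex" are illustrative choices.
line = format_rt09_ctm_line("session_0", "1", 0.5, 0.25, "hello", 0, "lex", "spk1")
print(line, end="")
```

Compared with the simulator line quoted above, the speaker moves from the second field to the last one, and the channel and type fields are filled explicitly.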


github-actions · 2023-10-16T01:45:04Z

This issue is stale because it has been open for 30 days with no activity. Remove stale label or comment or this will be closed in 7 days.

tango4j · 2023-10-17T22:50:27Z

Hi, @popcornell.
This was intentional: we were creating a dataset where all channels share the same word alignment,
so we used the channel slot for the speaker.
Now that we want public users to be able to use the data simulator freely, it needs to be updated to be consistent with the CTM convention in the RT09 document.
I will keep this open until it is fixed.
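For files already generated with the six-field layout discussed in this thread (speaker in the channel slot), a conversion could look roughly like the sketch below. `legacy_to_rt09` is a hypothetical name, and the default channel `1` and type `lex` are assumptions; neither is taken from the NeMo code base.

```python
def legacy_to_rt09(line, channel="1", token_type="lex"):
    """Convert one six-field simulator line to the RT-09 field order.

    Hypothetical helper. Assumes the input layout is
    SOURCE SPEAKER BEG-TIME DURATION TOKEN CONF, as in the line quoted in this issue.
    """
    source, speaker, beg_time, duration, token, conf = line.split()
    # The speaker moves to the last field; the channel and type are filled with
    # the assumed defaults because the six-field layout does not carry them.
    return f"{source} {channel} {beg_time} {duration} {token} {conf} {token_type} {speaker}\n"


print(legacy_to_rt09("session_0 spk1 0.5 0.25 hello 0\n"), end="")
```

The timing fields are passed through as strings, so their original precision is preserved.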

popcornell · 2023-10-17T22:57:38Z

Hi Taejin, thanks for the reply.
I like the data simulator a lot; it is very fast and really helpful, and I used it in a recent work.

I can actually fix this. It is pretty easy, and I have already done so locally.
I needed consistency with the RT09 convention because I was using lhotse https://github.com/lhotse-speech/lhotse for data loading, and having .ctm files was quite handy for loading the word alignments into the manifests as well.

> This is intended since we were creating dataset where all channels share the same word alignment
> so we used channel slot as speaker.

There is a channel slot in the .ctm convention but IDK if it is what you need.
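As a rough illustration of why the field order matters to downstream loaders, here is a minimal sketch of a parser that expects the eight RT-09 fields. `CtmEntry` and `parse_rt09_ctm_line` are hypothetical names and do not reflect lhotse's or NeMo's actual API.

```python
from typing import NamedTuple


class CtmEntry(NamedTuple):
    """One CTM record in RT-09 field order (hypothetical container)."""
    source: str
    channel: str
    beg_time: float
    duration: float
    token: str
    conf: str
    token_type: str
    speaker: str


def parse_rt09_ctm_line(line):
    """Parse one whitespace-separated CTM line, rejecting other layouts."""
    fields = line.split()
    if len(fields) != 8:
        raise ValueError(f"expected 8 fields, got {len(fields)}: {line!r}")
    source, channel, beg, dur, token, conf, token_type, speaker = fields
    # Only the timing fields are converted; conf stays a string so that
    # non-numeric placeholders would not break parsing.
    return CtmEntry(source, channel, float(beg), float(dur), token, conf, token_type, speaker)


entry = parse_rt09_ctm_line("session_0 1 0.5 0.25 hello 0 lex spk1\n")
print(entry.speaker, entry.beg_time, entry.duration)
```

A six-field line in the layout reported in this issue would be rejected by the length check here. A more lenient positional parser would instead read the speaker ID as the channel.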

tango4j · 2023-10-17T23:27:58Z

Oh, I see.
This definitely needs to be updated ASAP.
Also, Piotr Zelasko joined the NVIDIA NeMo team, so
I think I could ask him to go through the PR to make sure it is compatible with lhotse.

github-actions · 2023-11-17T01:46:01Z

This issue is stale because it has been open for 30 days with no activity. Remove stale label or comment or this will be closed in 7 days.

github-actions · 2023-11-25T01:44:55Z

This issue was closed because it has been inactive for 7 days since being marked as stale.

popcornell · 2023-12-08T00:19:48Z

I have a PR that addresses this now.

github-actions bot added the stale label Oct 16, 2023

nithinraok assigned tango4j Oct 17, 2023

github-actions bot removed the stale label Oct 18, 2023

github-actions bot added the stale label Nov 17, 2023

github-actions bot closed this as not planned Nov 25, 2023

popcornell mentioned this issue Dec 8, 2023

.ctm in data simulator annotator compliant with RT-09 specification #7999

Closed

popcornell mentioned this issue Dec 9, 2023

.ctm in data simulator annotator compliant with RT-09 specification #8004

Merged

tango4j closed this as completed in #8004 Jan 8, 2024
