Tentative lhotse --> kaldi manifests conversion for multiple channels #962

popcornell · 2023-01-30T11:50:11Z

As title says, works with CHiME6 + DiPCo + Mixer6 Speech.
But may break in other circumstances.

popcornell · 2023-01-30T15:35:15Z

NOTE: isort installation fails on my side, causing the pre-commit hook to fail.

I removed it from the pre-commit hook for now, but I can add it back later once the tests here pass.

desh2608 · 2023-02-01T18:35:47Z

Tagging @jtrmal since he's been working on the Kaldi import/export stuff.

@pzelasko We need this option to support the ESPNet2 baseline for the CHiME challenge (espnet/espnet#4894). My main concern right now is that the exported manifests for multi-channel recordings won't be consistent when imported back into Lhotse, but I think the current use case is mostly limited to exporting the manifests, so perhaps we can get this checked in?

pzelasko · 2023-02-01T18:48:42Z

Thanks guys! (and nice to see you back here @popcornell :))

Does this PR introduce any regression? As long as there's no regression for the existing cases, I am OK with inconsistencies on importing back.

desh2608 · 2023-02-01T18:53:05Z

Thanks guys! (and nice to see you back here @popcornell :))

Does this PR introduce any regression? As long as there's no regression for the existing cases, I am OK with inconsistencies on importing back.

It shouldn't change anything for the existing single-channel export AFAIK, but let's wait for the tests to confirm that.

pzelasko

Cool, LGTM!

popcornell · 2023-02-01T18:57:50Z

@pzelasko Ahaha yeah, I am back, amazed at how much lhotse has grown.

I think consistency is impossible for CHiME-6-like datasets, where the multi-channel data is split over multiple single-channel files (even for the same device).
These will be dumped as single-channel Kaldi entries in wav.scp, but they are regarded as multi-channel in lhotse, no?

popcornell · 2023-02-01T18:59:00Z

lhotse/kaldi.py

-                    f"ffmpeg -threads 1 -i {source.source} -ar {sampling_rate} "
-                    f"-map_channel 0.0.{channel}  -f wav -threads 1 pipe:1 |"
-                )
+                if len(source.channels) == 1:


We may want to change this in the future, BTW.

It might be worth adding a comment here to point out the limitations. Something like NOTE (@popcornell): ....
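For readers following the diff, here is a minimal sketch of how the single-channel branch might behave. It is based only on the removed ffmpeg command above and on the commit message "if the file has one channel, return always ffmpeg 0 channels". The function name `wav_scp_command` and its parameters are hypothetical and are not the actual code in lhotse/kaldi.py.

```python
from typing import Sequence


def wav_scp_command(
    source_path: str,
    sampling_rate: int,
    channel: int,
    source_channels: Sequence[int],
) -> str:
    """Build a wav.scp pipe entry that extracts one channel with ffmpeg.

    Hypothetical helper: `source_channels` stands in for `source.channels`
    from the diff, i.e. the channel ids carried by one audio file.
    """
    # Assumption: a file that carries a single channel only has ffmpeg
    # channel 0, regardless of the channel id it has in the manifest
    # (e.g. CHiME-6, where each channel of a device is its own mono file).
    ffmpeg_channel = 0 if len(source_channels) == 1 else channel
    return (
        f"ffmpeg -threads 1 -i {source_path} -ar {sampling_rate} "
        f"-map_channel 0.0.{ffmpeg_channel}  -f wav -threads 1 pipe:1 |"
    )
```

With this reading, a mono file registered as channel 2 of a recording still maps ffmpeg channel 0, while a true multi-channel file maps the requested channel.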

pzelasko · 2023-02-01T19:12:23Z

I think consistency is impossible for CHiME-6-like datasets, where the multi-channel data is split over multiple single-channel files (even for the same device). These will be dumped as single-channel Kaldi entries in wav.scp, but they are regarded as multi-channel in lhotse, no?

That's right, Kaldi data directories don't really "know" about multi-channel recordings. We could probably work around it by adding some convention, like special recording-id suffixes that help group the relevant wav.scp lines back into a single Recording, but if nobody needs it today, there's likely no point.

BTW, don't worry about the failing test for Python 3.9: it's an RNG flake, and we can merge anyway once the rest finishes.
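The recording-id suffix idea mentioned above could look roughly like the sketch below. Nothing here exists in Lhotse or Kaldi: the `_CH<n>` suffix, the regular expression, and the `group_wav_scp` function are all assumptions made up for illustration.

```python
import re
from collections import defaultdict
from typing import Dict

# Hypothetical convention: "<recording-id>_CH<channel>" marks one channel
# of a multi-channel recording in wav.scp.
_CHANNEL_SUFFIX = re.compile(r"^(?P<rec>.+)_CH(?P<ch>\d+)$")


def group_wav_scp(wav_scp: Dict[str, str]) -> Dict[str, Dict[int, str]]:
    """Group wav.scp entries by recording id, keyed by channel index."""
    groups: Dict[str, Dict[int, str]] = defaultdict(dict)
    for key, command in wav_scp.items():
        match = _CHANNEL_SUFFIX.match(key)
        if match is None:
            # No suffix: treat the entry as a plain single-channel recording.
            groups[key][0] = command
        else:
            groups[match.group("rec")][int(match.group("ch"))] = command
    return dict(groups)
```

An importer could then build one multi-channel recording per group instead of one recording per wav.scp line. As noted in the thread, this only helps if the Kaldi recipes involved agree on the same suffix convention.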

jtrmal · 2023-02-02T10:10:34Z

As long as the tests finish, I'm fine with it. The only long-term issue would be if Kaldi recipes use different conventions for multi-channel audio files, but let's not worry about that for now.

y.

desh2608 · 2023-02-02T13:35:41Z

@popcornell is this ready to merge?

jtrmal · 2023-02-02T13:40:07Z

Would be great if you guys could add tests for the multi-channel setup.

y.

desh2608 · 2023-02-02T20:33:17Z

test/test_kaldi_dirs.py

        },
    }


-@pytest.mark.xfail(reason="multi file recordings not supported yet")


@jtrmal This multi-channel recording test which was expected to fail earlier is now passing.

popcornell · 2023-02-02T20:56:16Z

@desh2608 On my side everything seems to work right now.

pzelasko · 2023-02-03T20:24:42Z

Thanks @popcornell and @desh2608, merging!

popcornell added 5 commits January 28, 2023 21:50

Update kaldi.py

ae3d275

trying fixed for reco2dur

9e874af

handle multichannel to kaldi

a77bc10

apply linter

44acc88

run linter

b91663b

desh2608 and others added 6 commits January 31, 2023 12:05

Merge branch 'master' into master

1a80163

fix failing tests; add note about back-compatibility

86424de

Merge branch 'master' of https://github.com/lhotse-speech/lhotse into…

268c6eb

… kaldi_channel

add isort back

b066236

if the file has one channel, return always ffmpeg 0 channels

69d3464

Merge remote-tracking branch 'origin/master'

73d0f72

pzelasko added this to the v1.13 milestone Feb 1, 2023

Merge branch 'master' into master

5456716

pzelasko approved these changes Feb 1, 2023

popcornell commented Feb 1, 2023

desh2608 reviewed Feb 2, 2023

pzelasko merged commit a792260 into lhotse-speech:master Feb 3, 2023
