Add `transforms` attribute for MixedCut #1035

desh2608 · 2023-04-20T15:40:34Z

Currently, for MixedCut, any transforms are always lazily applied on the underlying tracks. This is fine for transforms such as speed perturbation, but may not be ideal for other kinds of transforms. Consider the following examples:

1. We have a MixedCut consisting of 3 tracks: 2 MonoCuts and a PaddingCut, i.e., it represents 2 utterances from a speaker with some silence in between. Suppose we want to convolve it with an RIR. In the current method, the tracks will be individually convolved, and the silence portion will not contain any residue from the previous MonoCut. Instead, we should first mix the 3 tracks and then apply the RIR on the resulting audio.
2. We have a MixedCut comprising 2 MonoCuts with some overlap, and we want to perform loudness normalization. Currently, the tracks will be normalized individually and then mixed, and the resulting mixture may not have the target loudness.
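
To make the loudness example concrete, here is a minimal sketch in plain Python. It is not lhotse code: RMS is used as a simple stand-in for a real loudness measure, and the helper names (`rms`, `normalize_rms`, `mix`) are hypothetical. It compares normalizing each track before mixing with normalizing the mixture itself.

```python
import math


def rms(x):
    """Root-mean-square level of a list of samples."""
    return math.sqrt(sum(v * v for v in x) / len(x))


def normalize_rms(x, target):
    """Scale x so that its RMS equals target (RMS stands in for loudness here)."""
    gain = target / rms(x)
    return [v * gain for v in x]


def mix(a, b):
    """Sample-wise sum of two equally long tracks."""
    return [u + v for u, v in zip(a, b)]


# Two fully overlapping toy tracks with different frequencies and levels.
n = 1000
track1 = [math.sin(2 * math.pi * 5 * i / n) for i in range(n)]
track2 = [0.3 * math.sin(2 * math.pi * 8 * i / n) for i in range(n)]
target = 0.1

# Normalize each track to the target, then mix: the mixture overshoots.
rms_per_track = rms(mix(normalize_rms(track1, target), normalize_rms(track2, target)))

# Mix first, then normalize the mixture: the result sits at the target.
rms_mix_first = rms(normalize_rms(mix(track1, track2), target))
```

In this toy setup only the mix-first variant lands on the target level; the per-track variant ends up louder because the energies of the two normalized tracks add.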

To solve these problems, we introduce a `transforms` attribute for `MixedCut`. It is analogous to the `transforms` attribute in `Recording`, and contains the transforms that should be lazily applied at the time of audio loading, but after the underlying tracks have been mixed.
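
The idea of post-mix transforms can be sketched with a toy class. Everything here is an assumption for illustration: `ToyMixedCut`, its `(offset, samples)` track representation, and the `halve` transform are hypothetical and do not reflect how lhotse actually stores tracks or transforms.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

Samples = List[float]


@dataclass
class ToyMixedCut:
    # Each track is (offset_in_samples, samples); a hypothetical stand-in for a real track.
    tracks: List[Tuple[int, Samples]]
    # Callables applied lazily in load_audio(), after the tracks have been mixed.
    transforms: List[Callable[[Samples], Samples]] = field(default_factory=list)

    def load_audio(self) -> Samples:
        # Step 1: mix all tracks into one buffer, honoring their offsets.
        length = max(offset + len(samples) for offset, samples in self.tracks)
        mixed = [0.0] * length
        for offset, samples in self.tracks:
            for i, value in enumerate(samples):
                mixed[offset + i] += value
        # Step 2: only now apply the cut-level transforms, in order.
        for transform in self.transforms:
            mixed = transform(mixed)
        return mixed


def halve(x: Samples) -> Samples:
    """Toy transform: attenuate the whole mixture by a factor of two."""
    return [0.5 * v for v in x]


cut = ToyMixedCut(tracks=[(0, [1.0, 1.0]), (1, [2.0, 2.0])], transforms=[halve])
audio = cut.load_audio()  # mixture [1.0, 3.0, 2.0] is halved as a whole
```

Nothing is computed until `load_audio()` is called, which mirrors the lazy behaviour described above.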

pzelasko · 2023-04-20T17:54:08Z

lhotse/cut/mixed.py

+        #    to all tracks. It does not make sense if all tracks belong to different speakers,
+        #    but it is useful for cases when we have a mixture of MonoCut and PaddingCut,
+        #    and we want to apply the same RIR to all of them.
+        # 2. Apply RIRs to each track separately. This is useful when we want to simulate


Your description here got me thinking -- does it really matter in which order we apply RIR and downmix? Casting aside the numerical differences, RIR reverb is a convolution, which is linear, so the sum of reverbed signals should be equal to the reverbed sum of signals, shouldn't it? Am I missing something here?

Ohh I missed the effect on track edges -- the signals are possibly not the same length and have different offsets, and before the PR we only pad them after (not before) the convolution. You mentioned this in the PR description.

Yeah I was mainly concerned about edge effects in the case of reverb.
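
The linearity point and the edge effect can both be checked on a toy signal. The sketch below is plain Python, not lhotse code, and it assumes that per-track convolution keeps each track at its original length, so the reverb tail is dropped. The `convolve`, `place`, and `add` helpers are hypothetical.

```python
def convolve(x, h):
    """Full linear convolution; output length is len(x) + len(h) - 1."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xv in enumerate(x):
        for j, hv in enumerate(h):
            y[i + j] += xv * hv
    return y


def place(samples, offset, length):
    """Put samples at offset in a zero buffer of the given length, dropping overflow."""
    out = [0.0] * length
    for i, v in enumerate(samples):
        if offset + i < length:
            out[offset + i] += v
    return out


def add(a, b):
    return [u + v for u, v in zip(a, b)]


rir = [1.0, 0.5, 0.25]            # toy impulse response
utt1, utt2 = [1.0, 1.0], [1.0, 1.0]
off1, off2, total = 0, 5, 7       # three samples of silence between the utterances

mixed = add(place(utt1, off1, total), place(utt2, off2, total))

# (a) Mix first, then convolve, keeping the mixture's length.
mix_then_rir = convolve(mixed, rir)[:total]

# (b) Convolve each track, truncate to the track length, then mix: the tail is lost.
truncated = add(
    place(convolve(utt1, rir)[: len(utt1)], off1, total),
    place(convolve(utt2, rir)[: len(utt2)], off2, total),
)

# (c) Convolve each track, keep the full tail, then mix: linearity holds.
full_tail = add(
    place(convolve(utt1, rir), off1, total),
    place(convolve(utt2, rir), off2, total),
)
```

Variant (c) matches (a) exactly, so the order of convolution and mixing does not matter as long as the tails are kept. Variant (b) leaves the silence after the first utterance empty, which is the edge effect discussed above.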

pzelasko · 2023-04-20T17:59:14Z

Cool contribution! One question: what happens if we mix another track after the reverb/loudnorm is applied? It looks to me like the behaviour wouldn't be correct (it would reverb/norm everything including the track mixed after).

desh2608 · 2023-04-20T18:19:11Z

> Cool contribution! One question: what happens if we mix another track after the reverb/loudnorm is applied? It looks to me like the behaviour wouldn't be correct (it would reverb/norm everything including the track mixed after).

Yeah, I think perhaps the right approach is to not allow mixing in new tracks once any of the mix_first transforms have been applied.

pzelasko · 2023-04-20T18:30:56Z

> Yeah, I think perhaps the right approach is to not allow mixing in new tracks once any of the mix_first transforms have been applied.

Would it work if we added a MixedCut as a MixTrack when it has a transform defined on it? I think that technically the mixing code only calls load_audio, so it might work as expected.

EDIT: there might be code in various places, though, that doesn't expect a MixedCut to be present in a track. Could you check if it's possible to make that work? It would be a pity to raise an exception here.

desh2608 · 2023-04-20T18:45:52Z

Actually yeah we do allow MixedCut to be present as a MixTrack. In fact, the meeting simulation workflow explicitly uses this flexibility.

So you're right, if a new track is mixed into a MixedCut containing a transform, we can just create a nested MixedCut instead of adding to the tracks.
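
A rough sketch of that nesting behaviour, using toy classes rather than lhotse's own: `ToyCut`, `ToyMixedCut`, and this `mix` method are hypothetical, and track offsets are omitted for brevity.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Union

Samples = List[float]


@dataclass
class ToyCut:
    samples: Samples

    def load_audio(self) -> Samples:
        return list(self.samples)


@dataclass
class ToyMixedCut:
    tracks: List[Union["ToyCut", "ToyMixedCut"]]
    transforms: List[Callable[[Samples], Samples]] = field(default_factory=list)

    def load_audio(self) -> Samples:
        # Mixing only relies on load_audio(), so a nested ToyMixedCut works as a track.
        audios = [track.load_audio() for track in self.tracks]
        mixed = [0.0] * max(len(a) for a in audios)
        for audio in audios:
            for i, value in enumerate(audio):
                mixed[i] += value
        for transform in self.transforms:
            mixed = transform(mixed)
        return mixed

    def mix(self, other: ToyCut) -> "ToyMixedCut":
        if self.transforms:
            # Nest: existing transforms stay scoped to the tracks mixed so far.
            return ToyMixedCut(tracks=[self, other])
        # No post-mix transforms yet, so the new track can join the flat list.
        return ToyMixedCut(tracks=[*self.tracks, other])


def double(x: Samples) -> Samples:
    """Toy transform standing in for reverb or loudness normalization."""
    return [2.0 * v for v in x]


base = ToyMixedCut(tracks=[ToyCut([1.0, 1.0])], transforms=[double])
combined = base.mix(ToyCut([10.0, 10.0]))
audio = combined.load_audio()
```

Here the `double` transform affects only the original track; the newly mixed track is added untouched, which is the behaviour the nested approach is meant to guarantee.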

pzelasko

LGTM

desh2608 · 2023-04-24T17:13:13Z

> LGTM

Cool, thanks. Merging.

desh2608 added 2 commits April 20, 2023 11:26

add transform attribute for MixedCut

ce0f5c1

add mix_first option in normalize_loudness

ab18682

pzelasko reviewed Apr 20, 2023

pzelasko added this to the v1.14 milestone Apr 20, 2023

desh2608 added 3 commits April 20, 2023 15:40

handle the case when mix is called on MixedCut with existing transforms

e4bca74

add test for mixing with transformed MixedCut

71a9236

Merge branch 'master' into mixed_cut_transform

7529919

pzelasko approved these changes Apr 24, 2023

desh2608 merged commit 1b31bf2 into lhotse-speech:master Apr 24, 2023

desh2608 deleted the mixed_cut_transform branch April 24, 2023 17:14
