
[ASR] Mask-based dereverb algorithm #5693

Merged (1 commit) on Jan 22, 2023



@anteju anteju commented Dec 21, 2022

What does this PR do?

This PR adds a mask-based dereverberation algorithm with multichannel input and multichannel output, based on weighted prediction error (WPE).
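
For readers unfamiliar with WPE, the textbook formulation is summarized below. This is a sketch of the general method, not a description of the code in this PR. The symbols are assumptions introduced here: M is the number of channels, K the filter length (assumed to correspond to filter_length), Δ the prediction delay (assumed to correspond to prediction_delay), and the update of λ is repeated for a fixed number of iterations (assumed to correspond to num_iterations). How the mask enters the weights in MaskBasedDereverbWPE is not spelled out in this description, so it is left out.

```latex
% Per frequency bin f, with multichannel STFT coefficients x_{t,f} in C^M
\begin{align}
\tilde{\mathbf{x}}_{t-\Delta,f} &= \left[\mathbf{x}_{t-\Delta,f}^{T}, \dots, \mathbf{x}_{t-\Delta-K+1,f}^{T}\right]^{T} \in \mathbb{C}^{MK}, \\
\mathbf{d}_{t,f} &= \mathbf{x}_{t,f} - \mathbf{G}_{f}^{H}\, \tilde{\mathbf{x}}_{t-\Delta,f}, \\
\mathbf{G}_{f} &= \arg\min_{\mathbf{G}} \sum_{t} \frac{\left\| \mathbf{x}_{t,f} - \mathbf{G}^{H}\, \tilde{\mathbf{x}}_{t-\Delta,f} \right\|_2^2}{\lambda_{t,f}} = \mathbf{R}_{f}^{-1} \mathbf{P}_{f}, \\
\mathbf{R}_{f} &= \sum_{t} \frac{\tilde{\mathbf{x}}_{t-\Delta,f}\, \tilde{\mathbf{x}}_{t-\Delta,f}^{H}}{\lambda_{t,f}}, \qquad
\mathbf{P}_{f} = \sum_{t} \frac{\tilde{\mathbf{x}}_{t-\Delta,f}\, \mathbf{x}_{t,f}^{H}}{\lambda_{t,f}}, \\
\lambda_{t,f} &= \frac{1}{M} \left\| \mathbf{d}_{t,f} \right\|_2^2 .
\end{align}
```

The filter estimate and the power estimate λ are updated alternately, starting from the power of the observed signal.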

Collection: ASR

Changelog

  • New: Dereverberation algorithm implemented in MaskBasedDereverbWPE
  • Added unit tests for handling of convolution matrices
  • Input length is now optional for AudioToSpectrogram
  • Updated relevant unit tests
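
To illustrate what a convolution matrix means in this context, here is a minimal single-channel NumPy sketch. The function name `convolution_matrix` and its signature are hypothetical and chosen for illustration; they are not the helpers added in this PR. The matrix stacks delayed copies of a sequence so that multiplying it by a vector of filter taps gives a delayed causal convolution, which is the linear-prediction term used in WPE.

```python
import numpy as np


def convolution_matrix(x, filter_length, delay=0):
    """Build a (num_frames, filter_length) matrix of delayed copies of x.

    Hypothetical helper for illustration only.
    Entry [t, k] equals x[t - delay - k], or zero if that index is before the start.
    """
    num_frames = x.shape[0]
    conv = np.zeros((num_frames, filter_length), dtype=x.dtype)
    for k in range(filter_length):
        shift = delay + k
        if shift < num_frames:
            # Column k is x delayed by `shift` frames, zero-padded at the start
            conv[shift:, k] = x[: num_frames - shift]
    return conv


# Small example: 5 frames, 3 taps, prediction delay of 2 frames
x = np.arange(1.0, 6.0)
taps = np.array([0.5, -0.25, 0.125])
A = convolution_matrix(x, filter_length=3, delay=2)

# Prediction from delayed past frames: pred[t] = sum_k taps[k] * x[t - 2 - k]
pred = A @ taps
```

In the multichannel case the same construction would be applied per channel and per frequency bin, with the columns of all channels stacked side by side.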

Usage

An example script with a test vector is attached to this PR.

Example script

pr_5693_example_script.zip

Code example

The algorithm can be used as follows:

import os

import soundfile as sf
import torch

from nemo.collections.asr.modules.audio_modules import MaskBasedDereverbWPE
from nemo.collections.asr.modules.audio_preprocessing import AudioToSpectrogram, SpectrogramToAudio
from nemo.collections.asr.parts.preprocessing.segment import AudioSegment

# Parameters
fft_length = 512
hop_length = fft_length // 2  # 50% overlap between frames
filter_length = 8
delay = 2
num_iterations = 3

# Load audio
# NOTE: placeholder path, replace it with a multichannel audio file
filename = 'path/to/multichannel_input.wav'
sample_rate = 16000
# Transpose the loaded samples so that channels come first, shape (C, T)
x = AudioSegment.from_file(filename, target_sr=sample_rate).samples.T

# Add batch dimension, shape (B, C, T)
x = x[None, ...]

# Prepare analysis and synthesis transforms
stft = AudioToSpectrogram(fft_length=fft_length, hop_length=hop_length)
istft = SpectrogramToAudio(fft_length=fft_length, hop_length=hop_length)

# Analysis transform (input length is optional, so it is omitted here)
X, _ = stft(input=torch.tensor(x))

# Prepare dereverb instance
dereverb = MaskBasedDereverbWPE(filter_length=filter_length, prediction_delay=delay, num_iterations=num_iterations)

# Processing (no mask is provided in this example)
Y, _ = dereverb(input=X, mask=None)

# Synthesis transform: shape (B, C, T)
y, _ = istft(input=Y)

# Save audio: drop the batch dimension and transpose to (T, C)
os.makedirs('output', exist_ok=True)
sf.write('output/dereverb_output.wav', y.cpu().numpy()[0, ...].T, sample_rate)
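
To get a feel for the parameter values in the example above, the following sketch converts them into milliseconds. It assumes that filter_length and prediction_delay are counted in STFT frames (the usual convention for WPE) and that the transform is one-sided; neither is stated in this description. All variable names below are introduced for illustration only.

```python
# Values taken from the example above
sample_rate = 16000
fft_length = 512
hop_length = fft_length // 2
filter_length = 8
delay = 2

# Duration of one analysis window and one hop, in milliseconds
window_ms = 1000 * fft_length / sample_rate
hop_ms = 1000 * hop_length / sample_rate

# Number of frequency bins, assuming a one-sided STFT
num_subbands = fft_length // 2 + 1

# Assumption: delay and filter length are in frames
# The newest frame used for prediction is `delay` hops in the past
newest_tap_ms = delay * hop_ms
# The oldest frame used for prediction is `delay + filter_length - 1` hops in the past
oldest_tap_ms = (delay + filter_length - 1) * hop_ms

print(f'window: {window_ms} ms, hop: {hop_ms} ms, subbands: {num_subbands}')
print(f'prediction uses frames between {newest_tap_ms} ms and {oldest_tap_ms} ms in the past')
```

Under these assumptions the window is 32 ms long with a 16 ms hop, and the prediction filter draws on frames between 32 ms and 144 ms before the current frame.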

Signal example

Below we show spectrograms of the input signal and the processed output signal. Signals are available in the attached example. Only the first channel is shown.

[Figure: spectrograms of the first channel. Left: input signal (dereverb_input_ch0). Right: output signal (dereverb_output_ch0).]

Before your PR is "Ready for review"

Pre checks:

  • Make sure you read and followed Contributor guidelines
  • Did you write any new necessary tests?
  • Did you add or update any necessary documentation?
  • Does the PR affect components that are optional to install? (Ex: Numba, Pynini, Apex etc)
    • Reviewer: Does the PR have correct import guards for all optional libraries?

PR Type:

  • New Feature
  • Bugfix
  • Documentation

If you haven't finished some of the above items you can still open "Draft" PR.

Who can review?

Anyone in the community is free to review the PR once the checks have passed.
Contributor guidelines contains specific people who can review PRs to various areas.

Additional Information

  • Related to # (issue)

@github-actions github-actions bot added the ASR label Dec 21, 2022
@anteju anteju force-pushed the dev/multichannel-dereverb branch 2 times, most recently from cf576fc to 64ea05f Compare December 22, 2022 00:23
@anteju anteju force-pushed the dev/multichannel-dereverb branch 6 times, most recently from e75dd3a to e35fd98 Compare January 10, 2023 17:43
@anteju anteju force-pushed the dev/multichannel-dereverb branch 3 times, most recently from 89e889b to 9adc6ce Compare January 18, 2023 17:23

@jbalam-nv jbalam-nv left a comment


LGTM

@jbalam-nv jbalam-nv merged commit 0336b8d into NVIDIA:main Jan 22, 2023
Kipok pushed a commit to Kipok/NeMo that referenced this pull request Jan 31, 2023
Signed-off-by: Ante Jukić <[email protected]>

ericharper pushed a commit that referenced this pull request Jan 31, 2023
Signed-off-by: Ante Jukić <[email protected]>

ericharper pushed a commit that referenced this pull request Jan 31, 2023
Signed-off-by: Ante Jukić <[email protected]>

Kipok pushed a commit to Kipok/NeMo that referenced this pull request Jan 31, 2023
Signed-off-by: Ante Jukić <[email protected]>

titu1994 pushed a commit to titu1994/NeMo that referenced this pull request Mar 24, 2023
Signed-off-by: Ante Jukić <[email protected]>
