Merged
69 commits
43a3921
add and delete files
Aug 31, 2018
2bc97b9
add files for TDNN training and modify some files
Sep 13, 2018
d816ad1
modified run.sh and add local/score.sh, get_reslts.sh
Sep 24, 2018
13ce9d0
update and clean up scripts
Oct 2, 2018
53fc524
minor modification
Oct 2, 2018
3a648fc
minor modification
Oct 2, 2018
078137e
Merge branch 'master' into reverb
Szu-JuiChen Oct 3, 2018
2b112e8
parameter update
Oct 3, 2018
04b42f4
Merge branch 'reverb' of https://github.com/Szu-JuiChen/kaldi into re…
Oct 3, 2018
8145d3c
Update run.sh
Szu-JuiChen Oct 3, 2018
37960b2
Update run_tdnn_1a.sh
Szu-JuiChen Oct 3, 2018
cb026a1
Update run_ivector_common.sh
Szu-JuiChen Oct 3, 2018
e775685
fix nan value issue
Oct 4, 2018
6d04afc
Added WPE
sas91 Oct 4, 2018
147cade
Merge pull request #1 from sas91/reverb
Szu-JuiChen Oct 4, 2018
c9e5f6a
Remove partial tag
Oct 6, 2018
327aabb
change naming chime5 to reverb
Oct 9, 2018
623a212
Added Beamformit
sas91 Oct 11, 2018
ccebba1
Updated GMM WPE and Beamformit Results
sas91 Oct 11, 2018
a4b6fb7
Included the beamforming script
sas91 Oct 11, 2018
c9e29fe
Added Beamformit config file
sas91 Oct 11, 2018
1797446
Merge pull request #2 from sas91/reverb
Szu-JuiChen Oct 11, 2018
1c771b7
Store 1ch and 2ch WPE wavefiles in separate directories
sas91 Oct 15, 2018
c2108ce
rm check_tools.sh and bug fixed in run.sh
Oct 15, 2018
d1e5998
remove clean room in recog lists, add wpe only in recog set
sas91 Oct 16, 2018
7b72e54
Merge branch 'reverb' of https://github.com/Szu-JuiChen/kaldi into re…
sas91 Oct 16, 2018
3dc5bb8
added wpe recog sets in run.sh
sas91 Oct 16, 2018
ecda4c1
Added 1ch without WPE also to recog sets
sas91 Oct 16, 2018
d1947df
bug fix for code refactoring in previous commit
sas91 Oct 16, 2018
21f2337
change the data storage place on the grid
Oct 17, 2018
3afa9ed
Merge pull request #3 from sas91/reverb
Szu-JuiChen Oct 19, 2018
f53468e
Small changes for Chime-5
vimalmanohar Oct 23, 2018
270a785
Small changes
vimalmanohar Oct 23, 2018
31f82f1
Added dereverberation measures, cln evaluation and updated RESULTS
sas91 Nov 7, 2018
3fb2981
Minor modification in scoring script
sas91 Nov 7, 2018
03ffe53
Merge pull request #4 from sas91/reverb
Szu-JuiChen Nov 7, 2018
f453337
Added patch files
sas91 Nov 7, 2018
9c0887d
Merge pull request #5 from sas91/reverb
Szu-JuiChen Nov 8, 2018
53b3259
Updated RESULTS according to Shinji's comments
sas91 Nov 8, 2018
bc0e0f7
Merge pull request #6 from sas91/reverb
Szu-JuiChen Nov 8, 2018
10d4713
Enabled SE computation by default and added flag to enable PESQ
sas91 Nov 15, 2018
7734371
Merge pull request #7 from sas91/reverb
Szu-JuiChen Nov 15, 2018
8d23156
update RESULTS and fix error in compute_se_scores.sh
Nov 18, 2018
5caf1ca
minor fix
Nov 18, 2018
fc0edd5
update RESULTS and fix error in compute_se_scores.sh
Nov 18, 2018
69659e6
remove some useless comment lines
Nov 18, 2018
135494b
1) removed unnecessary files 2) Add the shebang header 3) Add option…
sw005320 Nov 20, 2018
447cdea
delete unused config files
sw005320 Nov 22, 2018
21bdf1e
update reverb README.txt and some chime5 stuff
Nov 29, 2018
796e0ac
Chime-5 new recipe
vimalmanohar Nov 30, 2018
da479ce
Merge remote-tracking branch 'vimal/chime5' into reverb
Dec 1, 2018
91da6a9
Minor fixes
vimalmanohar Dec 2, 2018
31ea5d7
add nara_wpe basic version (no batching)
Jan 10, 2019
3023538
Update README.txt
Szu-JuiChen Jan 10, 2019
8b0b56e
Update cmd.sh
Szu-JuiChen Jan 10, 2019
03f495b
Merge pull request #17 from Szu-JuiChen/reverb
vimalmanohar Jan 13, 2019
741c0bf
Adding noise creating scripts
vimalmanohar Jan 13, 2019
0f80203
Merge branch 'chime5' of github.com:vimalmanohar/kaldi into chime5
vimalmanohar Jan 13, 2019
5b6a85d
modify mem requirement for run_wpe.sh
Jan 14, 2019
482f05e
modify mem requirement for run_wpe.sh
Jan 14, 2019
c377b88
Merge pull request #18 from Szu-JuiChen/reverb
vimalmanohar Jan 14, 2019
256bbdd
Updating run.sh
vimalmanohar Jan 25, 2019
e7a665f
Merge branch 'chime5' of github.com:vimalmanohar/kaldi into chime5
vimalmanohar Jan 25, 2019
e333165
Redoing multicondition training
vimalmanohar Jan 28, 2019
3b0cff1
Updating Chime5 recipe
vimalmanohar May 4, 2019
acf1cba
merging golden/master
vimalmanohar May 7, 2019
5537817
Adding checks and removing u05 from beamforming
vimalmanohar May 7, 2019
2ea4b8a
Adding some comments to run_wpe.py
vimalmanohar May 8, 2019
3754984
adding some explanations
vimalmanohar May 9, 2019
2 changes: 1 addition & 1 deletion egs/chime5/s5/cmd.sh
@@ -10,6 +10,6 @@
# conf/queue.conf in http://kaldi-asr.org/doc/queue.html for more information,
# or search for the string 'default_config' in utils/queue.pl or utils/slurm.pl.

-export train_cmd="queue.pl --mem 2G"
+export train_cmd="retry.pl queue.pl --mem 2G"
export decode_cmd="queue.pl --mem 4G"

15 changes: 0 additions & 15 deletions egs/chime5/s5/local/chain/tuning/run_tdnn_1a.sh
@@ -24,21 +24,16 @@ decode_iter=
# training options
# training chunk-options
chunk_width=140,100,160
-# we don't need extra left/right context for TDNN systems.
-chunk_left_context=0
-chunk_right_context=0
common_egs_dir=
xent_regularize=0.1

# training options
srand=0
remove_egs=true
-reporting_email=

#decode options
test_online_decoding=false # if true, it will run the last decoding stage.


# End configuration section.
echo "$0 $@" # Print the command line for logging

@@ -176,7 +171,6 @@ EOF
steps/nnet3/xconfig_to_configs.py --xconfig-file $dir/configs/network.xconfig --config-dir $dir/configs/
fi


if [ $stage -le 14 ]; then
if [[ $(hostname -f) == *.clsp.jhu.edu ]] && [ ! -d $dir/egs/storage ]; then
utils/create_split_dir.pl \
@@ -204,15 +198,10 @@ if [ $stage -le 14 ]; then
--trainer.num-chunk-per-minibatch=256,128,64 \
--trainer.optimization.momentum=0.0 \
--egs.chunk-width=$chunk_width \
-    --egs.chunk-left-context=$chunk_left_context \
-    --egs.chunk-right-context=$chunk_right_context \
-    --egs.chunk-left-context-initial=0 \
-    --egs.chunk-right-context-final=0 \
--egs.dir="$common_egs_dir" \
--egs.opts="--frames-overlap-per-eg 0" \
--cleanup.remove-egs=$remove_egs \
--use-gpu=true \
-    --reporting.email="$reporting_email" \
--feat-dir=$train_data_dir \
--tree-dir=$tree_dir \
--lat-dir=$lat_dir \
@@ -235,10 +224,6 @@ if [ $stage -le 16 ]; then
(
steps/nnet3/decode.sh \
--acwt 1.0 --post-decode-acwt 10.0 \
-    --extra-left-context $chunk_left_context \
-    --extra-right-context $chunk_right_context \
-    --extra-left-context-initial 0 \
-    --extra-right-context-final 0 \
--frames-per-chunk $frames_per_chunk \
--nj 8 --cmd "$decode_cmd" --num-threads 4 \
--online-ivector-dir exp/nnet3${nnet3_affix}/ivectors_${data}_hires \
2 changes: 1 addition & 1 deletion egs/chime5/s5/local/nnet3/run_ivector_common.sh
@@ -23,7 +23,7 @@ nnet3_affix=_train_worn_u100k
gmm_dir=exp/${gmm}
ali_dir=exp/${gmm}_ali_${train_set}_sp

-for f in data/${train_set}/feats.scp ${gmm_dir}/final.mdl; do
+for f in data/${train_set}/utt2spk ${gmm_dir}/final.mdl; do
if [ ! -f $f ]; then
echo "$0: expected file $f to exist"
exit 1
54 changes: 54 additions & 0 deletions egs/chime5/s5/local/run_wpe.py
@@ -0,0 +1,54 @@
#!/usr/bin/env python
# Copyright 2018 Johns Hopkins University (Author: Aswin Shanmugam Subramanian)
# Apache 2.0
# Works with both python2 and python3

import numpy as np
import soundfile as sf
import time
import os, errno
from tqdm import tqdm
import argparse

from nara_wpe.wpe import wpe
from nara_wpe.utils import stft, istft
from nara_wpe import project_root

parser = argparse.ArgumentParser()
parser.add_argument('--files', '-f', nargs='+')
args = parser.parse_args()

input_files = args.files[:len(args.files)//2]
output_files = args.files[len(args.files)//2:]
out_dir = os.path.dirname(output_files[0])
try:
    os.makedirs(out_dir)
except OSError as e:
    if e.errno != errno.EEXIST:
        raise

stft_options = dict(
    size=512,
    shift=128,
    window_length=None,
    fading=True,
    pad=True,
    symmetric_window=False
)

sampling_rate = 16000
delay = 3
iterations = 5
taps = 10

signal_list = [
    sf.read(f)[0]
    for f in input_files
]
y = np.stack(signal_list, axis=0)
Y = stft(y, **stft_options).transpose(2, 0, 1)
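# Pass taps and delay explicitly so that the settings defined above take effect.
Z = wpe(Y, taps=taps, delay=delay, iterations=iterations, statistics_mode='full').transpose(1, 2, 0)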
z = istft(Z, size=stft_options['size'], shift=stft_options['shift'])

for d in range(len(signal_list)):
    sf.write(output_files[d], z[d,:], sampling_rate)
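
Note on the argument handling in run_wpe.py above: the script takes one flat `--files` list and treats the first half as input paths and the second half as output paths. A minimal sketch of that convention follows. The helper name `split_io_files`, the length check, and the file names are assumptions for illustration and are not part of the PR.

```python
def split_io_files(files):
    """Split [in_1, ..., in_n, out_1, ..., out_n] into (inputs, outputs)."""
    # The original script does not validate the list; this check is an addition.
    if len(files) == 0 or len(files) % 2 != 0:
        raise ValueError(
            "expected a non-empty, even number of paths, got %d" % len(files))
    half = len(files) // 2
    return files[:half], files[half:]


# Hypothetical paths: one "input output" pair, as found on each line of wavfiles.list.
inputs, outputs = split_io_files(["in/S02_U01.CH1.wav", "out/S02_U01.CH1.wav"])
print(inputs, outputs)
```

Because run_wpe.sh passes a single line of wavfiles.list per call, each invocation receives exactly one input path and one output path.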
85 changes: 85 additions & 0 deletions egs/chime5/s5/local/run_wpe.sh
@@ -0,0 +1,85 @@
#!/bin/bash
# Copyright 2018 Johns Hopkins University (Author: Aswin Shanmugam Subramanian)
# Apache 2.0

. ./cmd.sh
. ./path.sh

# Config:
nj=4
cmd=run.pl

. utils/parse_options.sh || exit 1;

if [ $# != 3 ]; then
echo "Wrong #arguments ($#, expected 3)"
echo "Usage: local/run_wpe.sh [options] <wav-in-dir> <wav-out-dir> <array-id>"
echo "main options (for others, see top of script file)"
echo " --cmd <cmd> # Command to run in parallel with"
echo " --nj <nj> # number of jobs for parallel processing (default: 4)"
exit 1;
fi

sdir=$1
odir=$2
array=$3
task=`basename $sdir`
expdir=exp/wpe/${task}_${array}
# Set bash to strict mode; the script will exit on:
# -e 'error', -u 'undefined variable', -o pipefail 'error in pipeline'
set -e
set -u
set -o pipefail

miniconda_dir=$HOME/miniconda3/
if [ ! -d $miniconda_dir ]; then
echo "$miniconda_dir does not exist. Please run '../../../tools/extras/install_miniconda.sh' and '../../../tools/extras/install_wpe.sh';"
fi

# check if WPE is installed
result=`$HOME/miniconda3/bin/python -c "\
try:
    import nara_wpe
    print('1')
except ImportError:
    print('0')"`

if [ "$result" == "1" ]; then
echo "WPE is installed"
else
echo "WPE is not installed. Please run ../../../tools/extras/install_wpe.sh"
exit 1
fi

mkdir -p $odir
mkdir -p $expdir/log

# wavfiles.list can be used as the name of the output files
output_wavfiles=$expdir/wavfiles.list
find -L ${sdir} | grep -i ${array} > $expdir/channels_input
cat $expdir/channels_input | awk -F '/' '{print $NF}' | sed "s@S@$odir\/S@g" > $expdir/channels_output
paste -d" " $expdir/channels_input $expdir/channels_output > $output_wavfiles

# split the list for parallel processing
split_wavfiles=""
for n in `seq $nj`; do
split_wavfiles="$split_wavfiles $output_wavfiles.$n"
done
utils/split_scp.pl $output_wavfiles $split_wavfiles || exit 1;

echo -e "Dereverberation - $task - $array\n"
# making a shell script for each job
for n in `seq $nj`; do
cat <<-EOF > $expdir/log/wpe.$n.sh
while read line; do
$HOME/miniconda3/bin/python local/run_wpe.py \
--file \$line
done < $output_wavfiles.$n
EOF
done

chmod a+x $expdir/log/wpe.*.sh
$cmd JOB=1:$nj $expdir/log/wpe.JOB.log \
$expdir/log/wpe.JOB.sh

echo "`basename $0` Done."
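
Note on the parallel split in run_wpe.sh above: the file list is divided into `nj` parts and each job processes one part. The sketch below illustrates an even contiguous split. It is not a port of utils/split_scp.pl, whose exact chunking may differ, and the function name `split_evenly` is hypothetical.

```python
def split_evenly(items, num_parts):
    """Split items into num_parts contiguous chunks whose sizes differ by at most one."""
    if num_parts <= 0:
        raise ValueError("num_parts must be positive")
    base, extra = divmod(len(items), num_parts)
    parts = []
    start = 0
    for i in range(num_parts):
        # The first `extra` chunks receive one additional item.
        size = base + (1 if i < extra else 0)
        parts.append(items[start:start + size])
        start += size
    return parts


# Ten list entries spread over nj=4 jobs.
for job, chunk in enumerate(split_evenly(list(range(10)), 4), start=1):
    print(job, chunk)
```

If `nj` exceeds the number of list entries, some chunks are empty, so a job count no larger than the number of files avoids idle jobs.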
33 changes: 33 additions & 0 deletions egs/chime5/s5b/RESULTS
@@ -0,0 +1,33 @@

# tri2
%WER 76.40 [ 44985 / 58881, 3496 ins, 17652 del, 23837 sub ] exp/tri2/decode_dev_worn/wer_13_1.0
%WER 93.56 [ 55091 / 58881, 2132 ins, 35555 del, 17404 sub ] exp/tri2/decode_dev_beamformit_ref/wer_17_1.0

# tri3
%WER 72.81 [ 42869 / 58881, 3629 ins, 15998 del, 23242 sub ] exp/tri3/decode_dev_worn/wer_15_1.0
%WER 91.73 [ 54013 / 58881, 3519 ins, 27098 del, 23396 sub ] exp/tri3/decode_dev_beamformit_ref/wer_17_1.0

# nnet3 tdnn+chain
%WER 47.91 [ 28212 / 58881, 2843 ins, 8957 del, 16412 sub ] exp/chain_train_worn_u100k_cleaned/tdnn1a_sp/decode_dev_worn/wer_9_0.0
%WER 81.28 [ 47859 / 58881, 4210 ins, 27511 del, 16138 sub ] exp/chain_train_worn_u100k_cleaned/tdnn1a_sp/decode_dev_beamformit_ref/wer_9_0.5

# result with the challenge submission format (July 9, 2018)
# before the fix of speaker ID across arrays
session S02 room DINING: #words 8288, #errors 6593, wer 79.54 %
session S02 room KITCHEN: #words 12696, #errors 11096, wer 87.39 %
session S02 room LIVING: #words 15460, #errors 12219, wer 79.03 %
session S09 room DINING: #words 5766, #errors 4651, wer 80.66 %
session S09 room KITCHEN: #words 8911, #errors 7277, wer 81.66 %
session S09 room LIVING: #words 7760, #errors 6023, wer 77.61 %
overall: #words 58881, #errors 47859, wer 81.28 %

# result with the challenge submission format (July 9, 2018)
# after the fix of speaker ID across arrays
==== development set ====
session S02 room DINING: #words 8288, #errors 6556, wer 79.10 %
session S02 room KITCHEN: #words 12696, #errors 11096, wer 87.39 %
session S02 room LIVING: #words 15460, #errors 12182, wer 78.79 %
session S09 room DINING: #words 5766, #errors 4648, wer 80.61 %
session S09 room KITCHEN: #words 8911, #errors 7277, wer 81.66 %
session S09 room LIVING: #words 7760, #errors 6022, wer 77.60 %
overall: #words 58881, #errors 47781, wer 81.14 %
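
Note on reading the `%WER` lines in RESULTS above: the bracketed error count is the sum of insertions, deletions and substitutions, and the percentage is that count divided by the number of reference words. The sketch below checks this against the tri2 `dev_worn` line; the function name `wer_percent` is hypothetical. The per-session lines in the challenge submission format come from a different script and may be rounded differently.

```python
def wer_percent(ins, dele, sub, num_words):
    """Word error rate in percent: (insertions + deletions + substitutions) / words."""
    return 100.0 * (ins + dele + sub) / num_words


# tri2, decode_dev_worn: 3496 ins, 17652 del, 23837 sub over 58881 words.
errors = 3496 + 17652 + 23837
print(errors, "%.2f" % wer_percent(3496, 17652, 23837, 58881))
```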
15 changes: 15 additions & 0 deletions egs/chime5/s5b/cmd.sh
@@ -0,0 +1,15 @@
# you can change cmd.sh depending on what type of queue you are using.
# If you have no queueing system and want to run on a local machine, you
# can change all instances 'queue.pl' to run.pl (but be careful and run
# commands one by one: most recipes will exhaust the memory on your
# machine). queue.pl works with GridEngine (qsub). slurm.pl works
# with slurm. Different queues are configured differently, with different
# queue names and different ways of specifying things like memory;
# to account for these differences you can create and edit the file
# conf/queue.conf to match your queue's configuration. Search for
# conf/queue.conf in http://kaldi-asr.org/doc/queue.html for more information,
# or search for the string 'default_config' in utils/queue.pl or utils/slurm.pl.

export train_cmd="retry.pl queue.pl --mem 2G"
export decode_cmd="queue.pl --mem 4G"

50 changes: 50 additions & 0 deletions egs/chime5/s5b/conf/beamformit.cfg
@@ -0,0 +1,50 @@
#BeamformIt sample configuration file for AMI data (http://groups.inf.ed.ac.uk/ami/download/)

# scrolling size to compute the delays
scroll_size = 250

# cross correlation computation window size
window_size = 500

#amount of maximum points for the xcorrelation taken into account
nbest_amount = 4

#flag whether to apply an automatic noise thresholding
do_noise_threshold = 1

#Percentage of frames with lower xcorr taken as noisy
noise_percent = 10

######## acoustic modelling parameters

#transition probabilities weight for multichannel decoding
trans_weight_multi = 25
trans_weight_nbest = 25

###

#flag whether to print the features after setting them, or not
print_features = 1

#flag whether to use the bad frames in the sum process
do_avoid_bad_frames = 1

#flag to use the best channel (SNR) as a reference
#defined from command line
do_compute_reference = 1

#flag whether to use a uem file or not (process the whole file)
do_use_uem_file = 0

#flag whether to use an adaptive weights scheme or fixed weights
do_adapt_weights = 1

#flag whether to output the sph files or just run the system to create the auxiliary files
do_write_sph_files = 1

####directories where to store/retrieve info####
#channels_file = ./cfg-files/channels

#show needs to be passed as argument normally, here a default one is given just in case
#show_id = Ttmp

2 changes: 2 additions & 0 deletions egs/chime5/s5b/conf/mfcc.conf
@@ -0,0 +1,2 @@
--use-energy=false
--sample-frequency=16000
10 changes: 10 additions & 0 deletions egs/chime5/s5b/conf/mfcc_hires.conf
@@ -0,0 +1,10 @@
# config for high-resolution MFCC features, intended for neural network training.
# Note: we keep all cepstra, so it has the same info as filterbank features,
# but MFCC is more easily compressible (because less correlated) which is why
# we prefer this method.
--use-energy=false # use average of log energy, not energy.
--sample-frequency=16000
--num-mel-bins=40
--num-ceps=40
--low-freq=40
--high-freq=-400
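
Note on `--high-freq=-400` in mfcc_hires.conf above: assuming Kaldi's documented convention that a non-positive `--high-freq` is an offset from the Nyquist frequency, the upper mel cutoff for 16 kHz audio works out to 7600 Hz. A small sketch of that arithmetic follows; the function name `effective_high_freq` is hypothetical.

```python
def effective_high_freq(sample_frequency, high_freq):
    """Resolve a high cutoff, treating non-positive values as offsets from Nyquist."""
    nyquist = sample_frequency / 2.0
    return high_freq if high_freq > 0 else nyquist + high_freq


# Values from mfcc_hires.conf: --sample-frequency=16000, --high-freq=-400.
print(effective_high_freq(16000, -400))
```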
1 change: 1 addition & 0 deletions egs/chime5/s5b/conf/online_cmvn.conf
@@ -0,0 +1 @@
# configuration file for apply-cmvn-online, used in the script ../local/run_online_decoding.sh
1 change: 1 addition & 0 deletions egs/chime5/s5b/local/chain/run_tdnn.sh