New chime-5 recipe #2893
Merged
69 commits
43a3921
add and delete files
2bc97b9
add files for TDNN training and modify some files
d816ad1
modified run.sh and add local/score.sh, get_reslts.sh
13ce9d0
update and clean up scripts
53fc524
minor modification
3a648fc
minor modification
078137e
Merge branch 'master' into reverb
Szu-JuiChen 2b112e8
parameter update
04b42f4
Merge branch 'reverb' of https://github.com/Szu-JuiChen/kaldi into re…
8145d3c
Update run.sh
Szu-JuiChen 37960b2
Update run_tdnn_1a.sh
Szu-JuiChen cb026a1
Update run_ivector_common.sh
Szu-JuiChen e775685
fix nan value issue
6d04afc
Added WPE
sas91 147cade
Merge pull request #1 from sas91/reverb
Szu-JuiChen c9e5f6a
Remove partial tag
327aabb
change naming chime5 to reverb
623a212
Added Beamformit
sas91 ccebba1
Updated GMM WPE and Beamformit Results
sas91 a4b6fb7
Included the beamforming script
sas91 c9e29fe
Added Beamformit config file
sas91 1797446
Merge pull request #2 from sas91/reverb
Szu-JuiChen 1c771b7
Store 1ch and 2ch WPE wavefiles in separate directories
sas91 c2108ce
rm check_tools.sh and bug fixed in run.sh
d1e5998
remove clean room in recog lists, add wpe only in recog set
sas91 7b72e54
Merge branch 'reverb' of https://github.com/Szu-JuiChen/kaldi into re…
sas91 3dc5bb8
added wpe recog sets in run.sh
sas91 ecda4c1
Added 1ch without WPE also to recog sets
sas91 d1947df
bug fix for code refactoring in previous commit
sas91 21f2337
change the data storage place on the grid
3afa9ed
Merge pull request #3 from sas91/reverb
Szu-JuiChen f53468e
Small changes for Chime-5
vimalmanohar 270a785
Small changes
vimalmanohar 31f82f1
Added dereverberation measures, cln evaluation and updated RESULTS
sas91 3fb2981
Minor modification in scoring script
sas91 03ffe53
Merge pull request #4 from sas91/reverb
Szu-JuiChen f453337
Added patch files
sas91 9c0887d
Merge pull request #5 from sas91/reverb
Szu-JuiChen 53b3259
Updated RESULTS according to Shinji's comments
sas91 bc0e0f7
Merge pull request #6 from sas91/reverb
Szu-JuiChen 10d4713
Enabled SE computation by default and added flag to enable PESQ
sas91 7734371
Merge pull request #7 from sas91/reverb
Szu-JuiChen 8d23156
update RESULTS and fix error in compute_se_scores.sh
5caf1ca
minor fix
fc0edd5
update RESULTS and fix error in compute_se_scores.sh
69659e6
remove some useless comment lines
135494b
1) removed unnecessary files 2) Add the shebang header 3) Add option…
sw005320 447cdea
delete unused config files
sw005320 21bdf1e
update reverb README.txt and some chime5 stuff
796e0ac
Chime-5 new recipe
vimalmanohar da479ce
Merge remote-tracking branch 'vimal/chime5' into reverb
91da6a9
Minor fixes
vimalmanohar 31ea5d7
add nara_wpe basic version (no batching)
3023538
Update README.txt
Szu-JuiChen 8b0b56e
Update cmd.sh
Szu-JuiChen 03f495b
Merge pull request #17 from Szu-JuiChen/reverb
vimalmanohar 741c0bf
Adding noise creating scripts
vimalmanohar 0f80203
Merge branch 'chime5' of github.com:vimalmanohar/kaldi into chime5
vimalmanohar 5b6a85d
modify mem requirement for run_wpe.sh
482f05e
modify mem requirement for run_wpe.sh
c377b88
Merge pull request #18 from Szu-JuiChen/reverb
vimalmanohar 256bbdd
Updating run.sh
vimalmanohar e7a665f
Merge branch 'chime5' of github.com:vimalmanohar/kaldi into chime5
vimalmanohar e333165
Redoing multicondition training
vimalmanohar 3b0cff1
Updating Chime5 recipe
vimalmanohar acf1cba
merging golden/master
vimalmanohar 5537817
Adding checks and removing u05 from beamforming
vimalmanohar 2ea4b8a
Adding some comments to run_wpe.py
vimalmanohar 3754984
adding some explanations
vimalmanohar
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,54 @@ | ||
| #!/usr/bin/env python | ||
| # Copyright 2018 Johns Hopkins University (Author: Aswin Shanmugam Subramanian) | ||
| # Apache 2.0 | ||
| # Works with both python2 and python3 | ||
|
|
||
| import numpy as np | ||
| import soundfile as sf | ||
| import time | ||
| import os, errno | ||
| from tqdm import tqdm | ||
| import argparse | ||
|
|
||
| from nara_wpe.wpe import wpe | ||
| from nara_wpe.utils import stft, istft | ||
| from nara_wpe import project_root | ||
|
|
||
| parser = argparse.ArgumentParser() | ||
| parser.add_argument('--files', '-f', nargs='+') | ||
| args = parser.parse_args() | ||
|
|
||
| input_files = args.files[:len(args.files)//2] | ||
| output_files = args.files[len(args.files)//2:] | ||
| out_dir = os.path.dirname(output_files[0]) | ||
| try: | ||
|     os.makedirs(out_dir) | ||
| except OSError as e: | ||
|     if e.errno != errno.EEXIST: | ||
|         raise | ||
|
|
||
| stft_options = dict( | ||
|     size=512, | ||
|     shift=128, | ||
|     window_length=None, | ||
|     fading=True, | ||
|     pad=True, | ||
|     symmetric_window=False | ||
| ) | ||
|
|
||
| sampling_rate = 16000 | ||
| delay = 3 | ||
| iterations = 5 | ||
| taps = 10 | ||
|
|
||
| signal_list = [ | ||
|     sf.read(f)[0] | ||
|     for f in input_files | ||
| ] | ||
| y = np.stack(signal_list, axis=0) | ||
| Y = stft(y, **stft_options).transpose(2, 0, 1) | ||
| Z = wpe(Y, taps=taps, delay=delay, iterations=iterations, statistics_mode='full').transpose(1, 2, 0) | ||
| z = istft(Z, size=stft_options['size'], shift=stft_options['shift']) | ||
|
|
||
| for d in range(len(signal_list)): | ||
|     sf.write(output_files[d], z[d, :], sampling_rate) | ||
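The script above takes every path through a single `--files` list and treats the first half as inputs and the second half as outputs. A minimal sketch of that convention follows; the helper name `split_io`, the even-length check, and the file names are illustrative assumptions and are not part of the recipe.

```python
def split_io(files):
    """Split a flat path list into (inputs, outputs): first half, second half."""
    # Assumption for this sketch: reject odd-length lists instead of
    # silently producing lists of unequal length.
    if len(files) % 2 != 0:
        raise ValueError("expected an even number of paths, got %d" % len(files))
    half = len(files) // 2
    return files[:half], files[half:]


# Hypothetical paths, for illustration only.
paths = [
    "in/S02_U01.CH1.wav", "in/S02_U01.CH2.wav",
    "out/S02_U01.CH1.wav", "out/S02_U01.CH2.wav",
]
inputs, outputs = split_io(paths)
for src, dst in zip(inputs, outputs):
    print(src, "->", dst)
```

With one input and one output path per call, the same split yields a single-channel job.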
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,85 @@ | ||
| #!/bin/bash | ||
| # Copyright 2018 Johns Hopkins University (Author: Aswin Shanmugam Subramanian) | ||
| # Apache 2.0 | ||
|
|
||
| . ./cmd.sh | ||
| . ./path.sh | ||
|
|
||
| # Config: | ||
| nj=4 | ||
| cmd=run.pl | ||
|
|
||
| . utils/parse_options.sh || exit 1; | ||
|
|
||
| if [ $# != 3 ]; then | ||
| echo "Wrong #arguments ($#, expected 3)" | ||
| echo "Usage: local/run_wpe.sh [options] <wav-in-dir> <wav-out-dir> <array-id>" | ||
| echo "main options (for others, see top of script file)" | ||
| echo " --cmd <cmd> # Command to run in parallel with" | ||
| echo " --nj <nj> # number of jobs for parallel processing (default: 4)" | ||
| exit 1; | ||
| fi | ||
|
|
||
| sdir=$1 | ||
| odir=$2 | ||
| array=$3 | ||
| task=`basename $sdir` | ||
| expdir=exp/wpe/${task}_${array} | ||
| # Set bash to strict mode; it will exit on: | ||
| # -e 'error', -u 'undefined variable', -o pipefail 'error in pipeline' | ||
| set -e | ||
| set -u | ||
| set -o pipefail | ||
|
|
||
| miniconda_dir=$HOME/miniconda3/ | ||
| if [ ! -d $miniconda_dir ]; then | ||
|   echo "$miniconda_dir does not exist. Please run '../../../tools/extras/install_miniconda.sh' and '../../../tools/extras/install_wpe.sh';" | ||
|   exit 1 | ||
| fi | ||
|
|
||
| # check if WPE is installed | ||
| result=`$HOME/miniconda3/bin/python -c "\ | ||
| try: | ||
|     import nara_wpe | ||
|     print('1') | ||
| except ImportError: | ||
|     print('0')"` | ||
|
|
||
| if [ "$result" == "1" ]; then | ||
| echo "WPE is installed" | ||
| else | ||
| echo "WPE is not installed. Please run ../../../tools/extras/install_wpe.sh" | ||
| exit 1 | ||
| fi | ||
|
|
||
| mkdir -p $odir | ||
| mkdir -p $expdir/log | ||
|
|
||
| # wavfiles.list can be used as the name of the output files | ||
| output_wavfiles=$expdir/wavfiles.list | ||
| find -L ${sdir} | grep -i ${array} > $expdir/channels_input | ||
| cat $expdir/channels_input | awk -F '/' '{print $NF}' | sed "s@^S@$odir/S@" > $expdir/channels_output | ||
| paste -d" " $expdir/channels_input $expdir/channels_output > $output_wavfiles | ||
|
|
||
| # split the list for parallel processing | ||
| split_wavfiles="" | ||
| for n in `seq $nj`; do | ||
| split_wavfiles="$split_wavfiles $output_wavfiles.$n" | ||
| done | ||
| utils/split_scp.pl $output_wavfiles $split_wavfiles || exit 1; | ||
|
|
||
| echo -e "Dereverberation - $task - $array\n" | ||
| # making a shell script for each job | ||
| for n in `seq $nj`; do | ||
| cat <<-EOF > $expdir/log/wpe.$n.sh | ||
| while read line; do | ||
| $HOME/miniconda3/bin/python local/run_wpe.py \ | ||
| --files \$line | ||
| done < $output_wavfiles.$n | ||
| EOF | ||
| done | ||
|
|
||
| chmod a+x $expdir/log/wpe.*.sh | ||
| $cmd JOB=1:$nj $expdir/log/wpe.JOB.log \ | ||
| $expdir/log/wpe.JOB.sh | ||
|
|
||
| echo "`basename $0` Done." |
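The script above writes each input wav and its output path on one line, then divides that list into `nj` parts, one per parallel job. The sketch below mirrors the idea in Python. `pair_paths` and `split_chunks` are hypothetical helpers, the paths are invented, and the contiguous near-equal chunking is an assumption for illustration rather than a description of `utils/split_scp.pl`.

```python
import posixpath


def pair_paths(input_paths, out_dir):
    """Map each input wav to a file of the same name under out_dir."""
    return [(p, posixpath.join(out_dir, posixpath.basename(p))) for p in input_paths]


def split_chunks(items, nj):
    """Split items into nj contiguous chunks whose sizes differ by at most one."""
    base, extra = divmod(len(items), nj)
    chunks, start = [], 0
    for j in range(nj):
        size = base + (1 if j < extra else 0)
        chunks.append(items[start:start + size])
        start += size
    return chunks


# Hypothetical input list: four channels of one array plus one more file.
inputs = ["audio/dev/S02_U01.CH%d.wav" % c for c in range(1, 5)]
inputs.append("audio/dev/S09_U01.CH1.wav")

pairs = pair_paths(inputs, "wpe/dev")
jobs = split_chunks(pairs, 2)
for n, job in enumerate(jobs, start=1):
    for src, dst in job:
        print("job %d: %s %s" % (n, src, dst))
```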
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,33 @@ | ||
|
|
||
| # tri2 | ||
| %WER 76.40 [ 44985 / 58881, 3496 ins, 17652 del, 23837 sub ] exp/tri2/decode_dev_worn/wer_13_1.0 | ||
| %WER 93.56 [ 55091 / 58881, 2132 ins, 35555 del, 17404 sub ] exp/tri2/decode_dev_beamformit_ref/wer_17_1.0 | ||
|
|
||
| # tri3 | ||
| %WER 72.81 [ 42869 / 58881, 3629 ins, 15998 del, 23242 sub ] exp/tri3/decode_dev_worn/wer_15_1.0 | ||
| %WER 91.73 [ 54013 / 58881, 3519 ins, 27098 del, 23396 sub ] exp/tri3/decode_dev_beamformit_ref/wer_17_1.0 | ||
|
|
||
| # nnet3 tdnn+chain | ||
| %WER 47.91 [ 28212 / 58881, 2843 ins, 8957 del, 16412 sub ] exp/chain_train_worn_u100k_cleaned/tdnn1a_sp/decode_dev_worn/wer_9_0.0 | ||
| %WER 81.28 [ 47859 / 58881, 4210 ins, 27511 del, 16138 sub ] exp/chain_train_worn_u100k_cleaned/tdnn1a_sp/decode_dev_beamformit_ref/wer_9_0.5 | ||
|
|
||
| # result with the challenge submission format (July 9, 2018) | ||
| # before the fix of speaker ID across arrays | ||
| session S02 room DINING: #words 8288, #errors 6593, wer 79.54 % | ||
| session S02 room KITCHEN: #words 12696, #errors 11096, wer 87.39 % | ||
| session S02 room LIVING: #words 15460, #errors 12219, wer 79.03 % | ||
| session S09 room DINING: #words 5766, #errors 4651, wer 80.66 % | ||
| session S09 room KITCHEN: #words 8911, #errors 7277, wer 81.66 % | ||
| session S09 room LIVING: #words 7760, #errors 6023, wer 77.61 % | ||
| overall: #words 58881, #errors 47859, wer 81.28 % | ||
|
|
||
| # result with the challenge submission format (July 9, 2018) | ||
| # after the fix of speaker ID across arrays | ||
| ==== development set ==== | ||
| session S02 room DINING: #words 8288, #errors 6556, wer 79.10 % | ||
| session S02 room KITCHEN: #words 12696, #errors 11096, wer 87.39 % | ||
| session S02 room LIVING: #words 15460, #errors 12182, wer 78.79 % | ||
| session S09 room DINING: #words 5766, #errors 4648, wer 80.61 % | ||
| session S09 room KITCHEN: #words 8911, #errors 7277, wer 81.66 % | ||
| session S09 room LIVING: #words 7760, #errors 6022, wer 77.60 % | ||
| overall: #words 58881, #errors 47781, wer 81.14 % |
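Each row above reports `#errors / #words` as a percentage, and the overall row pools the counts across sessions and rooms. The short sketch below recomputes the pooled figure from the per-room counts in the "after the fix" table. The dictionary layout and variable names are illustrative and are not taken from the scoring script.

```python
# (words, errors) per session and room, copied from the "after the fix" table.
rooms = {
    ("S02", "DINING"): (8288, 6556),
    ("S02", "KITCHEN"): (12696, 11096),
    ("S02", "LIVING"): (15460, 12182),
    ("S09", "DINING"): (5766, 4648),
    ("S09", "KITCHEN"): (8911, 7277),
    ("S09", "LIVING"): (7760, 6022),
}

total_words = sum(words for words, _ in rooms.values())
total_errors = sum(errors for _, errors in rooms.values())
overall_wer = 100.0 * total_errors / total_words

print("overall: #words %d, #errors %d, wer %.3f %%"
      % (total_words, total_errors, overall_wer))
```

The pooled counts reproduce the 58881 words and 47781 errors of the overall row. The ratio comes to roughly 81.148 %, so the listed 81.14 is consistent with truncating to two decimals rather than rounding.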
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,15 @@ | ||
| # you can change cmd.sh depending on what type of queue you are using. | ||
| # If you have no queueing system and want to run on a local machine, you | ||
| # can change all instances of 'queue.pl' to run.pl (but be careful and run | ||
| # commands one by one: most recipes will exhaust the memory on your | ||
| # machine). queue.pl works with GridEngine (qsub). slurm.pl works | ||
| # with slurm. Different queues are configured differently, with different | ||
| # queue names and different ways of specifying things like memory; | ||
| # to account for these differences you can create and edit the file | ||
| # conf/queue.conf to match your queue's configuration. Search for | ||
| # conf/queue.conf in http://kaldi-asr.org/doc/queue.html for more information, | ||
| # or search for the string 'default_config' in utils/queue.pl or utils/slurm.pl. | ||
|
|
||
| export train_cmd="retry.pl queue.pl --mem 2G" | ||
| export decode_cmd="queue.pl --mem 4G" | ||
|
|
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,50 @@ | ||
| #BeamformIt sample configuration file for AMI data (http://groups.inf.ed.ac.uk/ami/download/) | ||
|
|
||
| # scrolling size to compute the delays | ||
| scroll_size = 250 | ||
|
|
||
| # cross correlation computation window size | ||
| window_size = 500 | ||
|
|
||
| #amount of maximum points for the xcorrelation taken into account | ||
| nbest_amount = 4 | ||
|
|
||
| #flag whether to apply an automatic noise thresholding | ||
| do_noise_threshold = 1 | ||
|
|
||
| #Percentage of frames with lower xcorr taken as noisy | ||
| noise_percent = 10 | ||
|
|
||
| ######## acoustic modelling parameters | ||
|
|
||
| #transition probabilities weight for multichannel decoding | ||
| trans_weight_multi = 25 | ||
| trans_weight_nbest = 25 | ||
|
|
||
| ### | ||
|
|
||
| #flag whether to print the features after setting them, or not | ||
| print_features = 1 | ||
|
|
||
| #flag whether to use the bad frames in the sum process | ||
| do_avoid_bad_frames = 1 | ||
|
|
||
| #flag to use the best channel (SNR) as a reference | ||
| #defined from command line | ||
| do_compute_reference = 1 | ||
|
|
||
| #flag whether to use a uem file or not (process all the file) | ||
| do_use_uem_file = 0 | ||
|
|
||
| #flag whether to use an adaptive weights scheme or fixed weights | ||
| do_adapt_weights = 1 | ||
|
|
||
| #flag whether to output the sph files or just run the system to create the auxiliary files | ||
| do_write_sph_files = 1 | ||
|
|
||
| ####directories where to store/retrieve info#### | ||
| #channels_file = ./cfg-files/channels | ||
|
|
||
| #show needs to be passed as argument normally, here a default one is given just in case | ||
| #show_id = Ttmp | ||
|
|
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,2 @@ | ||
| --use-energy=false | ||
| --sample-frequency=16000 |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,10 @@ | ||
| # config for high-resolution MFCC features, intended for neural network training. | ||
| # Note: we keep all cepstra, so it has the same info as filterbank features, | ||
| # but MFCC is more easily compressible (because less correlated) which is why | ||
| # we prefer this method. | ||
| --use-energy=false # use average of log energy, not energy. | ||
| --sample-frequency=16000 | ||
| --num-mel-bins=40 | ||
| --num-ceps=40 | ||
| --low-freq=40 | ||
| --high-freq=-400 |
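In Kaldi's mel filterbank options, a non-positive `--high-freq` is read as an offset from the Nyquist frequency, so the settings above place the upper cutoff 400 Hz below 8000 Hz. A small sketch of that arithmetic follows; `effective_high_freq` is a hypothetical helper written for this note, not a Kaldi function.

```python
def effective_high_freq(sample_frequency, high_freq):
    """Return the upper mel cutoff in Hz.

    A positive high_freq is used as given; a value <= 0 is treated as an
    offset from the Nyquist frequency (sample_frequency / 2).
    """
    nyquist = sample_frequency / 2.0
    return high_freq if high_freq > 0 else nyquist + high_freq


# Values from the config above: 16 kHz audio, --high-freq=-400.
print(effective_high_freq(16000, -400))  # 7600.0
```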
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1 @@ | ||
| # configuration file for apply-cmvn-online, used in the script ../local/run_online_decoding.sh |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1 @@ | ||
| tuning/run_tdnn_1a.sh |