Merged

Chime6 #3755

107 commits
3ebf1d0
first commit for chime6 track 2
desh2608 Nov 12, 2019
1496fd2
added David's diarization system in pipeline
desh2608 Nov 15, 2019
2cf6649
added David's diarization system to pipeline
desh2608 Nov 16, 2019
ff274a5
'Adding chime6 with paderborn gss'
aarora8 Nov 17, 2019
ab3c16c
removed unnecessary files
desh2608 Nov 17, 2019
d2b629f
minor modifications
desh2608 Nov 18, 2019
f005997
removed unnecessary file
desh2608 Nov 18, 2019
30f3556
modification from review: adding comments, removing multicondition sc…
aarora8 Nov 18, 2019
d137a1e
made changes as per @david-ryan-snyder 's comments
desh2608 Nov 18, 2019
dcea748
added soft link to Callhome diarization
desh2608 Nov 18, 2019
ec31077
keep original text, utt2spk, segments files for dev/eval but renamed …
desh2608 Nov 18, 2019
687b774
Merge pull request #2 from chimechallenge/chime6pullrequest
sw005320 Nov 18, 2019
87c0781
Merge pull request #3 from chimechallenge/sad
sw005320 Nov 18, 2019
823d5af
fixed some bugs in previous PR for track 2
desh2608 Nov 19, 2019
09f30ca
added audio sync generation
sw005320 Nov 19, 2019
5cea419
add sync data generation script
sw005320 Nov 19, 2019
524d110
modify scripts for CHiME-6 sync audio
sw005320 Nov 19, 2019
79ed5d2
remove a debug line
sw005320 Nov 19, 2019
984278e
recover S12_U03 array
sw005320 Nov 20, 2019
d9e9e5c
Merge branch 'master' of https://github.com/chimechallenge/kaldi_chim…
desh2608 Nov 20, 2019
dfaa2ef
preparation for gss
sw005320 Nov 21, 2019
1bf57e9
fix to use chime6 version gss
sw005320 Nov 21, 2019
f4a2eb0
Merge pull request #8 from chimechallenge/audio_sync
sw005320 Nov 22, 2019
25bfed1
remove the evaluation set scoring
sw005320 Nov 22, 2019
0d7d5d9
fix some typos
sw005320 Nov 22, 2019
c2a84f8
multispk modifications
aarora8 Nov 23, 2019
84a148c
adding augmentation and asr part in track2
aarora8 Nov 23, 2019
bd3cd09
removing extra files from track2
aarora8 Nov 23, 2019
9c3dafe
use the original number of jobs in decoding
sw005320 Nov 23, 2019
3a9a3e4
Merge pull request #10 from chimechallenge/fix_track1
sw005320 Nov 23, 2019
13f2da4
reverting augmentation changes
aarora8 Nov 23, 2019
197a211
Merge pull request #7 from chimechallenge/sad
sw005320 Nov 23, 2019
1b9f7e2
fixing merge conflict
aarora8 Nov 23, 2019
2c02e31
modification from review
aarora8 Nov 23, 2019
a8eec21
modification from review
aarora8 Nov 23, 2019
18b283f
modification from review
aarora8 Nov 23, 2019
78dda56
creating links to track2 from track1
aarora8 Nov 23, 2019
c3b08ad
adding audio sync changes in track2
aarora8 Nov 23, 2019
9c8bb52
fixing merge conflict
aarora8 Nov 23, 2019
5a788be
minor change
aarora8 Nov 23, 2019
1cbe234
fixing cmd.sh
aarora8 Nov 23, 2019
5fbcdc0
making run.sh and decode.sh similar to track2
aarora8 Nov 24, 2019
317f86f
fix from test
aarora8 Nov 25, 2019
da52528
Merge pull request #12 from chimechallenge/sync_track2
sw005320 Nov 25, 2019
2ea6ac0
fixing merge conflict, fixing scoring, add gss parameters
aarora8 Nov 25, 2019
a765b19
modification from review
aarora8 Nov 25, 2019
d4d5800
modification from the review
aarora8 Nov 25, 2019
5dfbe6b
modification from review
aarora8 Nov 26, 2019
de30705
add an error comment
sw005320 Nov 26, 2019
0b074df
Merge branch 'multispkscr' of https://github.com/chimechallenge/kaldi…
sw005320 Nov 26, 2019
118c073
modification from review
aarora8 Nov 26, 2019
22ee61f
Merge branch 'multispkscr' of https://github.com/chimechallenge/kaldi…
aarora8 Nov 26, 2019
43c17c9
minor modifications
sw005320 Nov 26, 2019
99815bb
Merge branch 'multispkscr' of https://github.com/chimechallenge/kaldi…
sw005320 Nov 26, 2019
428d851
modification from review
aarora8 Nov 27, 2019
a81e8f9
Merge pull request #11 from chimechallenge/multispkscr
sw005320 Nov 27, 2019
dad627b
adding eval data
aarora8 Nov 27, 2019
8263538
1) fix disk access issues in local/generate_chime6_data.sh 2) fix mem…
sw005320 Nov 27, 2019
b15c823
Merge branch 'multispkscr' of https://github.com/chimechallenge/kaldi…
sw005320 Nov 27, 2019
587444c
add comments for wpe_v8
sw005320 Nov 27, 2019
7fe188e
fix local/prepare_data.sh for gss
boeddeker Nov 28, 2019
29b8b70
support relative path argument for local/prepare_data.sh
boeddeker Nov 28, 2019
b3ce942
Merge pull request #15 from chimechallenge/fix_prepare_data_for_gss
sw005320 Nov 28, 2019
a104358
Merge remote-tracking branch 'origin/master' into multispkscr
sw005320 Nov 28, 2019
512fd02
add comments
sw005320 Nov 28, 2019
89bc3ab
minor fix
aarora8 Nov 28, 2019
4b60568
minor fix
aarora8 Nov 28, 2019
7f9a252
modification from test
aarora8 Nov 28, 2019
896b3ac
add comments that we skipped the inconsistent time-stamp files
sw005320 Nov 28, 2019
ce360e7
Merge branch 'multispkscr' of https://github.com/chimechallenge/kaldi…
sw005320 Nov 28, 2019
f983e81
Merge branch 'master' of https://github.com/chimechallenge/kaldi_chim…
desh2608 Nov 29, 2019
910d480
fixed SAD training to be compatible with ASR in track 2
desh2608 Nov 29, 2019
f572947
minor edit
desh2608 Nov 29, 2019
11bea97
use 13-dim MFCCs in SAD for compatibility with GMM model
desh2608 Nov 29, 2019
d24d0ee
minor fix
desh2608 Nov 29, 2019
e59432e
added comments to download pretrained models
desh2608 Dec 1, 2019
63dd09c
minor fix, thanks @nateanl
aarora8 Dec 1, 2019
0d436e6
Merge branch 'multispkscr' of https://github.com/chimechallenge/kaldi…
aarora8 Dec 1, 2019
62140d4
updating comment
aarora8 Dec 1, 2019
2315754
Merge pull request #14 from chimechallenge/multispkscr
sw005320 Dec 2, 2019
1e08e73
Merge branch 'master' into sad
desh2608 Dec 2, 2019
6b80a71
Merge pull request #17 from chimechallenge/sad
sw005320 Dec 2, 2019
14c2d65
added chime6 data prep for decode in track 2
desh2608 Dec 2, 2019
5ea72d4
minor fix
desh2608 Dec 2, 2019
ce449b4
minor fix
desh2608 Dec 2, 2019
902aa9f
minor fix
aarora8 Dec 2, 2019
8c8eb6e
cosmetic fixes and modification for running on single array
aarora8 Dec 3, 2019
eede690
minor fix
aarora8 Dec 3, 2019
290d641
minor fix
aarora8 Dec 3, 2019
cd8d084
performing diarization on single array
aarora8 Dec 3, 2019
7f795c5
minor fix
aarora8 Dec 3, 2019
d8c9545
minor fix
aarora8 Dec 3, 2019
3397872
minor fix
aarora8 Dec 3, 2019
68f1944
added comment for data prep for eval
desh2608 Dec 3, 2019
13e0aae
Merge branch 'sad' of https://github.com/chimechallenge/kaldi_chime6 …
desh2608 Dec 3, 2019
b80268d
removed '_ref' from data dir name
desh2608 Dec 3, 2019
1350623
updating results for multiarray, using same time conversion routine a…
aarora8 Dec 3, 2019
df725a1
adding comments
aarora8 Dec 3, 2019
5f39b3c
minor fix
aarora8 Dec 3, 2019
e6a73ce
skipping printing WER message on commandline
aarora8 Dec 4, 2019
4caf7a6
removing creation of lores features
aarora8 Dec 4, 2019
bd35e24
not adding array ID to speaker ID for GSS, as it is not available
aarora8 Dec 4, 2019
ed6ac49
minor fix
aarora8 Dec 4, 2019
279209f
change nj if only decoding 1 array
desh2608 Dec 4, 2019
10a97b0
Merge branch 'sad' of https://github.com/chimechallenge/kaldi_chime6 …
desh2608 Dec 4, 2019
4eed953
added RESULTS file for track 2
desh2608 Dec 4, 2019
01a3516
Merge pull request #18 from chimechallenge/sad
sw005320 Dec 4, 2019
2 changes: 2 additions & 0 deletions .gitignore
@@ -77,6 +77,8 @@ GSYMS
/egs/*/*/plp
/egs/*/*/exp
/egs/*/*/data
/egs/*/*/wav
/egs/*/*/enhan

# /tools/
/tools/pocolm/
1 change: 1 addition & 0 deletions egs/chime5/s5b/local/nnet3/compare_wer.sh
100755 → 100644
@@ -130,3 +130,4 @@ done
echo

echo

4 changes: 3 additions & 1 deletion egs/chime5/s5b/local/nnet3/decode.sh
@@ -35,6 +35,8 @@ post_decode_acwt=1.0 # important to change this when using chain models
extra_left_context_initial=0
extra_right_context_final=0

graph_affix=

score_opts="--min-lmwt 6 --max-lmwt 13"

. ./cmd.sh
@@ -94,7 +96,7 @@ if [ $stage -le 2 ]; then
fi
fi

-decode_dir=$dir/decode_${data_set}${affix}
+decode_dir=$dir/decode${graph_affix}_${data_set}${affix}
# generate the lattices
if [ $stage -le 3 ]; then
echo "Generating lattices, stage 1"
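The renamed decode directory is built from four variables. The following is a minimal sketch of that composition only, not part of the recipe; the helper name `decode_dir_name` and the sample values (including the `_3g` graph affix) are hypothetical.

```shell
# decode_dir_name <dir> <data_set> <affix> <graph_affix>
# Mirrors: decode_dir=$dir/decode${graph_affix}_${data_set}${affix}
decode_dir_name() {
  dir=$1
  data_set=$2
  affix=$3
  graph_affix=$4
  echo "${dir}/decode${graph_affix}_${data_set}${affix}"
}

# Empty graph_affix reproduces the old directory layout.
decode_dir_name exp/chain/tdnn1b_sp dev_gss _2stage ""
# A non-empty graph_affix is placed right after "decode".
decode_dir_name exp/chain/tdnn1b_sp dev_gss _2stage _3g
```

With an empty `graph_affix` the name is unchanged from the previous behaviour, so existing decode directories should keep their paths.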
62 changes: 27 additions & 35 deletions egs/chime5/s5b/local/run_recog.sh
@@ -28,8 +28,8 @@ json_dir=${chime5_corpus}/transcriptions
audio_dir=${chime5_corpus}/audio

# training and test data
-train_set=train_worn_u100k
-test_sets="eval_${enhancement}_ref"
+train_set=train_worn_simu_u400k
+test_sets="eval_${enhancement}_dereverb_ref"

# This script also needs the phonetisaurus g2p, srilm, beamformit
./local/check_tools.sh || exit 1
@@ -38,18 +38,27 @@ if [ $stage -le 4 ]; then
# Beamforming using reference arrays
# enhanced WAV directory
enhandir=enhan
dereverb_dir=${PWD}/wav/wpe/
for dset in eval; do
for mictype in u01 u02 u03 u04 u05 u06; do
local/run_beamformit.sh --cmd "$train_cmd" \
local/run_wpe.sh --nj 4 --cmd "$train_cmd --mem 120G" \
${audio_dir}/${dset} \
${dereverb_dir}/${dset} \
${mictype}
done
done
for dset in dev eval; do
for mictype in u01 u02 u03 u04 u05 u06; do
local/run_beamformit.sh --cmd "$train_cmd" \
${dereverb_dir}/${dset} \
${enhandir}/${dset}_${enhancement}_${mictype} \
${mictype}
done
done

for dset in eval; do
local/prepare_data.sh --mictype ref "$PWD/${enhandir}/${dset}_${enhancement}_u0*" \
-${json_dir}/${dset} data/${dset}_${enhancement}_ref
+${json_dir}/${dset} data/${dset}_${enhancement}_dereverb_ref
done
fi

@@ -92,28 +101,13 @@ if [ $stage -le 7 ]; then
done
fi

if [ $stage -le 17 ]; then
nnet3_affix=_${train_set}_cleaned
for datadir in ${test_sets}; do
utils/copy_data_dir.sh data/$datadir data/${datadir}_hires
done
for datadir in ${test_sets}; do
steps/make_mfcc.sh --nj 20 --mfcc-config conf/mfcc_hires.conf \
--cmd "$train_cmd" data/${datadir}_hires || exit 1;
steps/compute_cmvn_stats.sh data/${datadir}_hires || exit 1;
utils/fix_data_dir.sh data/${datadir}_hires || exit 1;
done
for data in $test_sets; do
steps/online/nnet2/extract_ivectors_online.sh --cmd "$train_cmd" --nj 20 \
data/${data}_hires exp/nnet3${nnet3_affix}/extractor \
exp/nnet3${nnet3_affix}/ivectors_${data}_hires
done
fi
nnet3_affix=_${train_set}_cleaned_rvb

lm_suffix=

if [ $stage -le 18 ]; then
# First the options that are passed through to run_ivector_common.sh
# (some of which are also used in this script directly).
lm_suffix=

# The rest are configs specific to this script. Most of the parameters
# are just hardcoded at this level, in the commands below.
@@ -138,16 +132,14 @@ if [ $stage -le 18 ]; then

for data in $test_sets; do
(
steps/nnet3/decode.sh \
--acwt 1.0 --post-decode-acwt 10.0 \
--extra-left-context $chunk_left_context \
--extra-right-context $chunk_right_context \
--extra-left-context-initial 0 \
--extra-right-context-final 0 \
--frames-per-chunk $frames_per_chunk \
--nj 8 --cmd "$decode_cmd" --num-threads 4 \
--online-ivector-dir exp/nnet3${nnet3_affix}/ivectors_${data}_hires \
$tree_dir/graph${lm_suffix} data/${data}_hires ${dir}/decode${lm_suffix}_${data} || exit 1
local/nnet3/decode.sh --affix 2stage --pass2-decode-opts "--min-active 1000" \
--acwt 1.0 --post-decode-acwt 10.0 \
--frames-per-chunk 150 --nj $decode_nj \
--ivector-dir exp/nnet3${nnet3_affix} \
--graph-affix ${lm_suffix} \
data/${data} data/lang${lm_suffix} \
$tree_dir/graph${lm_suffix} \
exp/chain${nnet3_affix}/tdnn1b_sp
) || touch $dir/.error &
done
wait
@@ -159,6 +151,6 @@ if [ $stage -le 20 ]; then
# please specify both dev and eval set directories so that the search parameters
# (insertion penalty and language model weight) will be tuned using the dev set
local/score_for_submit.sh \
---dev exp/chain_${train_set}_cleaned/tdnn1a_sp/decode_dev_${enhancement}_ref \
---eval exp/chain_${train_set}_cleaned/tdnn1a_sp/decode_eval_${enhancement}_ref
+--dev exp/chain${nnet3_affix}/tdnn1b_sp/decode${lm_suffix}_dev_${enhancement}_dereverb_ref_2stage \
+--eval exp/chain${nnet3_affix}/tdnn1b_sp/decode${lm_suffix}_eval_${enhancement}_dereverb_ref_2stage
fi
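The stage-4 change in this file chains two enhancement steps: WPE dereverberation writes into `${dereverb_dir}`, and BeamformIt then reads from that directory instead of the raw audio. The dry run below is only a sketch that prints the resulting command plan; `audio_dir` and `enhancement` are placeholder values, the `--cmd`/`--nj` options are omitted, and `plan_enhancement` is a hypothetical helper.

```shell
# Placeholder settings (assumptions, not taken from the recipe).
audio_dir=/path/to/CHiME5/audio
enhancement=beamformit
enhandir=enhan
dereverb_dir=$PWD/wav/wpe

# Print the commands instead of running them.
plan_enhancement() {
  # Step 1: WPE dereverberation, looped over eval only (as in the diff).
  for dset in eval; do
    for mictype in u01 u02 u03 u04 u05 u06; do
      echo "local/run_wpe.sh ${audio_dir}/${dset} ${dereverb_dir}/${dset} ${mictype}"
    done
  done
  # Step 2: BeamformIt on the dereverberated audio, looped over dev and eval.
  for dset in dev eval; do
    for mictype in u01 u02 u03 u04 u05 u06; do
      echo "local/run_beamformit.sh ${dereverb_dir}/${dset} ${enhandir}/${dset}_${enhancement}_${mictype} ${mictype}"
    done
  done
}

plan_enhancement
```

The plan makes one detail visible: the BeamformIt loop covers `dev` while the WPE loop does not, so the dereverberated `dev` audio presumably has to exist already from an earlier run.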
3 changes: 2 additions & 1 deletion egs/chime5/s5b/local/run_wpe.sh
@@ -33,7 +33,8 @@ set -o pipefail

miniconda_dir=$HOME/miniconda3/
if [ ! -d $miniconda_dir ]; then
-echo "$miniconda_dir does not exist. Please run '../../../tools/extras/install_miniconda.sh' and '../../../tools/extras/install_wpe.sh';"
+echo "$miniconda_dir does not exist. Please run '$KALDI_ROOT/tools/extras/install_miniconda.sh'."
+exit 1
fi

# check if WPE is installed
6 changes: 6 additions & 0 deletions egs/chime6/README.txt
@@ -0,0 +1,6 @@
This is a Kaldi recipe for the 6th CHiME Speech Separation and Recognition Challenge (CHiME-6).

See http://spandh.dcs.shef.ac.uk/chime_challenge/ for more detailed information.

s5_track1 : Track 1 of the challenge (oracle segments and speaker labels are provided)
s5_track2 : Track 2 of the challenge (only raw audio is provided)
21 changes: 21 additions & 0 deletions egs/chime6/s5_track1/RESULTS
@@ -0,0 +1,21 @@

# tri2
%WER 88.52 [ 52121 / 58881, 2023 ins, 30285 del, 19813 sub ] exp/tri2/decode_dev_gss/wer_17_0.5

# tri3
%WER 85.72 [ 50471 / 58881, 3079 ins, 23787 del, 23605 sub ] exp/tri3/decode_dev_gss/wer_17_0.5

# nnet3 tdnn+chain
%WER 41.21 [ 24267 / 58881, 2428 ins, 7606 del, 14233 sub ] exp/chain_train_worn_simu_u400k_cleaned_rvb/tdnn1b_sp/decode_dev_worn_2stage/wer_11_0.0
%WER 51.76 [ 30474 / 58881, 2665 ins, 11749 del, 16060 sub ] exp/chain_train_worn_simu_u400k_cleaned_rvb/tdnn1b_sp/decode_dev_gss_multiarray_2stage/wer_10_0.0

# result with the challenge submission format (Nov 17, 2019)
# after the fix of speaker ID across arrays
==== development set ====
session S02 room DINING: #words 8288, #errors 4459, wer 53.80 %
session S02 room KITCHEN: #words 12696, #errors 7170, wer 56.47 %
session S02 room LIVING: #words 15460, #errors 7388, wer 47.78 %
session S09 room DINING: #words 5766, #errors 3100, wer 53.76 %
session S09 room KITCHEN: #words 8911, #errors 4483, wer 50.30 %
session S09 room LIVING: #words 7760, #errors 3874, wer 49.92 %
overall: #words 58881, #errors 30474, wer 51.75 %
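
The overall figure can be re-derived from the per-session rows: summing the word and error counts gives 58881 words and 30474 errors. The sketch below repeats that arithmetic; the `wer` helper is hypothetical and not part of the recipe. Rounded to two decimals the ratio is 51.76, matching the `%WER` line for `decode_dev_gss_multiarray_2stage`; the 51.75 printed in the summary may come from truncation in the scoring script, which is an assumption here.

```shell
# Per-session counts copied from the development-set table above.
words=$((8288 + 12696 + 15460 + 5766 + 8911 + 7760))
errors=$((4459 + 7170 + 7388 + 3100 + 4483 + 3874))

# wer <errors> <words>: word error rate in percent, rounded to two decimals.
wer() {
  awk -v e="$1" -v n="$2" 'BEGIN { printf "%.2f\n", 100 * e / n }'
}

echo "words=$words errors=$errors wer=$(wer "$errors" "$words")"
```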
15 changes: 15 additions & 0 deletions egs/chime6/s5_track1/cmd.sh
@@ -0,0 +1,15 @@
# you can change cmd.sh depending on what type of queue you are using.
# If you have no queueing system and want to run on a local machine, you
# can change all instances of 'queue.pl' to 'run.pl' (but be careful and run
# commands one by one: most recipes will exhaust the memory on your
# machine). queue.pl works with GridEngine (qsub). slurm.pl works
# with slurm. Different queues are configured differently, with different
# queue names and different ways of specifying things like memory;
# to account for these differences you can create and edit the file
# conf/queue.conf to match your queue's configuration. Search for
# conf/queue.conf in http://kaldi-asr.org/doc/queue.html for more information,
# or search for the string 'default_config' in utils/queue.pl or utils/slurm.pl.

export train_cmd="retry.pl queue.pl --mem 2G"
export decode_cmd="queue.pl --mem 4G"
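
As the comment above describes, a machine without a queueing system can use `run.pl` in place of `queue.pl`. One possible way to switch automatically is sketched below; detecting GridEngine through the presence of `qsub` is an assumption of this sketch, not something the recipe does.

```shell
# Pick a job launcher depending on whether GridEngine's qsub is on PATH.
if command -v qsub >/dev/null 2>&1; then
  # Same settings as the recipe's cmd.sh.
  export train_cmd="retry.pl queue.pl --mem 2G"
  export decode_cmd="queue.pl --mem 4G"
else
  # Local fallback: jobs run on this machine, so watch memory usage.
  export train_cmd="run.pl"
  export decode_cmd="run.pl"
fi

echo "train_cmd=$train_cmd"
echo "decode_cmd=$decode_cmd"
```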

50 changes: 50 additions & 0 deletions egs/chime6/s5_track1/conf/beamformit.cfg
@@ -0,0 +1,50 @@
#BeamformIt sample configuration file for AMI data (http://groups.inf.ed.ac.uk/ami/download/)

# scrolling size to compute the delays
scroll_size = 250

# cross correlation computation window size
window_size = 500

#amount of maximum points for the xcorrelation taken into account
nbest_amount = 4

#flag whether to apply an automatic noise thresholding
do_noise_threshold = 1

#Percentage of frames with lower xcorr taken as noisy
noise_percent = 10

######## acoustic modelling parameters

#transition probabilities weight for multichannel decoding
trans_weight_multi = 25
trans_weight_nbest = 25

###

#flag whether to print the features after setting them, or not
print_features = 1

#flag whether to use the bad frames in the sum process
do_avoid_bad_frames = 1

#flag to use the best channel (SNR) as a reference
#defined from command line
do_compute_reference = 1

#flag whether to use a uem file or not (process all the file)
do_use_uem_file = 0

#flag whether to use an adaptive weights scheme or fixed weights
do_adapt_weights = 1

#flag whether to output the sph files or just run the system to create the auxiliary files
do_write_sph_files = 1

####directories where to store/retrieve info####
#channels_file = ./cfg-files/channels

#show needs to be passed as argument normally, here a default one is given just in case
#show_id = Ttmp

2 changes: 2 additions & 0 deletions egs/chime6/s5_track1/conf/mfcc.conf
@@ -0,0 +1,2 @@
--use-energy=false
--sample-frequency=16000
10 changes: 10 additions & 0 deletions egs/chime6/s5_track1/conf/mfcc_hires.conf
@@ -0,0 +1,10 @@
# config for high-resolution MFCC features, intended for neural network training.
# Note: we keep all cepstra, so it has the same info as filterbank features,
# but MFCC is more easily compressible (because less correlated) which is why
# we prefer this method.
--use-energy=false # use average of log energy, not energy.
--sample-frequency=16000
--num-mel-bins=40
--num-ceps=40
--low-freq=40
--high-freq=-400
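
A non-positive `--high-freq` is interpreted by Kaldi's mel filterbank as an offset from the Nyquist frequency, so at 16 kHz the value -400 corresponds to an upper cutoff of 7600 Hz. The arithmetic can be sketched as follows; `effective_high_freq` is a hypothetical helper written for illustration.

```shell
# effective_high_freq <sample_frequency_hz> <high_freq_option>
# A value <= 0 is treated as an offset from Nyquist (sample_frequency / 2).
effective_high_freq() {
  nyquist=$(( $1 / 2 ))
  if [ "$2" -le 0 ]; then
    echo $(( nyquist + $2 ))
  else
    echo "$2"
  fi
}

# Values from mfcc_hires.conf above.
effective_high_freq 16000 -400
```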
1 change: 1 addition & 0 deletions egs/chime6/s5_track1/conf/online_cmvn.conf
@@ -0,0 +1 @@
# configuration file for apply-cmvn-online, used in the script ../local/run_online_decoding.sh
10 changes: 10 additions & 0 deletions egs/chime6/s5_track1/conf/queue.conf
@@ -0,0 +1,10 @@
command qsub -v PATH -cwd -S /bin/bash -j y -l arch=*64*
option mem=* -l mem_free=$0,ram_free=$0
option mem=0 # Do not add anything to qsub_opts
option num_threads=* -pe smp $0
option num_threads=1 # Do not add anything to qsub_opts
option max_jobs_run=* -tc $0
default gpu=0
option gpu=0 -q all.q -l hostname='!b19*'
option gpu=* -l gpu=$0 -q g.q -l hostname='!b19*'
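
Each `option name=*` line is a template in which `$0` stands for the value passed on the command line, so `--mem 4G` becomes `-l mem_free=4G,ram_free=4G` in the `qsub` call. The sketch below only imitates that substitution with `sed`; `expand_option` is a hypothetical helper and is not how `queue.pl` is implemented.

```shell
# expand_option <template> <value>
# Replaces every literal "$0" in the template with the given value.
expand_option() {
  template=$1
  value=$2
  printf '%s\n' "$template" | sed "s/\\\$0/${value}/g"
}

# Templates copied from the queue.conf lines above.
expand_option '-l mem_free=$0,ram_free=$0' 4G
expand_option '-pe smp $0' 4
expand_option '-l gpu=$0 -q g.q' 1
```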

37 changes: 37 additions & 0 deletions egs/chime6/s5_track1/local/add_location_to_uttid.sh
@@ -0,0 +1,37 @@
#!/bin/bash
# Author: Ashish Arora
# Apache 2.0

. ./cmd.sh
. ./path.sh

enhancement=gss
. utils/parse_options.sh || exit 1;

if [ $# != 3 ]; then
  echo "Wrong #arguments ($#, expected 3)"
  echo "Usage: local/add_location_to_uttid.sh [options] <json-transcription-in-dir>"
  echo " <perutt-in-dir> <uttid-location-mapping-out-file>"
  echo "main options (for others, see top of script file)"
  echo " --enhancement # enhancement type (gss or beamformit)"
  exit 1;
fi

jdir=$1
puttdir=$2
utt_loc_file=$3

# Set bash to strict mode, it will exit on:
# -e 'error', -u 'undefined variable', -o pipefail 'error in pipeline'
set -e
set -u
set -o pipefail

if [[ ${enhancement} == *gss* ]]; then
  local/get_location.py $jdir > $utt_loc_file
  local/replace_uttid.py $utt_loc_file $puttdir/per_utt > $puttdir/per_utt_loc
fi

if [[ ${enhancement} == *beamformit* ]]; then
  cat $puttdir/per_utt > $puttdir/per_utt_loc
fi
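
The two tests above are independent substring matches on `${enhancement}`, so a value such as `gss_multiarray` takes the first branch, and a value matching neither pattern leaves `per_utt_loc` unwritten. The sketch below reports which branches would run for a given value; it uses POSIX `case`, which should behave like the `[[ ... == *pattern* ]]` glob match here, and `branches_for` is a hypothetical helper.

```shell
# branches_for <enhancement>: list which of the two branches would execute.
branches_for() {
  taken=""
  case "$1" in
    *gss*) taken="gss" ;;
  esac
  case "$1" in
    *beamformit*) taken="${taken:+$taken,}beamformit" ;;
  esac
  echo "${taken:-none}"
}

for e in gss gss_multiarray beamformit worn; do
  echo "$e -> $(branches_for "$e")"
done
```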