Merged
95 commits
6501e9c
Further work on batch normalization
danpovey Mar 30, 2017
5f87237
Fix a bug for option --test-mode
tomkocse Apr 1, 2017
9c9adbe
Minor bug-fixes
danpovey Apr 2, 2017
3bde952
Proposed fix for how combination works with batch-norm (to be tested …
danpovey Apr 10, 2017
04a5645
Adding better nnet3 support for convnets, and addressing various issu…
danpovey Mar 22, 2017
ec03bf9
Merge branch 'nnet3_conv_rebase' into batch_norm_and_conv
danpovey Apr 10, 2017
5777c27
Fix some compilation issues arising from merge of convnets with batch…
danpovey Apr 11, 2017
57648d0
idct: Added idct-layer to xconfigs
vimalmanohar Apr 11, 2017
07ab874
Merge branch 'nnet3_conv_rebase' into batch_norm_and_conv
danpovey Apr 11, 2017
0fc7810
Change how we reverse order of nnet*-combine args, for more clarity.
danpovey Apr 11, 2017
5126ea7
Merge branch 'batch_norm' into batch_norm_and_conv
danpovey Apr 11, 2017
fed8e70
Merge branch 'master' of github.com:kaldi-asr/kaldi into idct
vimalmanohar Apr 12, 2017
9d1976a
Merge branch 'master' into idct
vimalmanohar Apr 12, 2017
94a0224
Initial draft of xconfig layer for convolution
danpovey Apr 13, 2017
cf91115
Merge branch 'idct' of https://github.com/vimalmanohar/kaldi into bat…
danpovey Apr 13, 2017
dbe122e
Adding first example script for CNN+TDNN.
danpovey Apr 13, 2017
da281e1
Fix bug in vectorization of CNNs
danpovey Apr 13, 2017
a39df46
[src,scripts] Add example script for CNN+TDNN, with results; remove u…
danpovey Apr 13, 2017
b0dbcf6
Merge remote-tracking branch 'upstream/master' into batch_norm_and_conv
danpovey Apr 15, 2017
0799b4d
[src] WaveData::Read unspecified streams; unsupport 8 and 32 bit form…
kkm000 Apr 17, 2017
1a97aba
[src] Make it possible to do 'egs' extraction without un-compressing …
danpovey Apr 17, 2017
a0b6015
Adding results for using batchnorm components instead of renorm
tomkocse Apr 19, 2017
f833f79
Removing old results in AMI
tomkocse Apr 20, 2017
fafb07f
Merge pull request #1557 from tomkocse/add_batch_result
danpovey Apr 20, 2017
4ffe5ea
[build] Slight change to how tests are reported, to figure out which …
danpovey Apr 20, 2017
3d011de
Setting up basic structure for CIFAR directory. (#1554)
danpovey Apr 21, 2017
bf1f34b
[egs,scripts] Some small fixes/changes to CIFAR setup
danpovey Apr 21, 2017
2fdf739
Merge remote-tracking branch 'upstream/master' into kaldi_52
danpovey Apr 21, 2017
0857a0f
[src] Fixes to prevent nnet3 test failures-- chiefly, bug-fix for loo…
danpovey Apr 22, 2017
e4e69f8
Merge remote-tracking branch 'upstream/master' into kaldi_52
danpovey Apr 22, 2017
bd55e1f
Revert "[src] WaveData::Read unspecified streams; unsupport 8 and 32 …
danpovey Apr 22, 2017
cad1388
[src,egs,scripts] Add code from @YiwenShaoStephen; add template for n…
danpovey Apr 22, 2017
2a18085
Simplifying python nnet3-training scripts, removing discriminative pr…
danpovey Apr 18, 2017
456a440
[scripts,egs] Add back make_tdnn_configs.py (used in some old scripts…
danpovey Apr 23, 2017
6e53b88
[scripts] put back components.py, used in steps/nnet3/lstm/make_confi…
danpovey Apr 23, 2017
043d8db
[egs] Removing some old BABEL scripts that would no longer work with …
danpovey Apr 23, 2017
9d6c306
[src] Add previously missing file nnet3-get-egs-simple.cc
danpovey Apr 23, 2017
5e45374
[src,scripts,egs] Image-recognition-related changes; code and script …
danpovey Apr 23, 2017
05d1177
[egs] Un-delete BABEL example script that was mistakenly deleted as u…
danpovey Apr 23, 2017
650d824
[src,egs,scripts] Some partially completed work relating mostly to im…
danpovey Apr 23, 2017
c222acc
[egs,scripts] Getting first experiment working on CIFAR-10.
danpovey Apr 24, 2017
60d5f87
[src,scripts] Some convolution-related fixes: now learning rates are …
danpovey Apr 25, 2017
526301b
[src,egs,scripts] Minor fixes; new example script with CNNs (no resul…
danpovey Apr 25, 2017
90fa5e3
[scripts,egs] Various nnet3/chain script simplifications; new cifar e…
danpovey Apr 25, 2017
5c16793
[scripts] Bug-fix to previous commit about simplifying nnet3 scripts
danpovey Apr 25, 2017
729fe3d
[scripts] image recognition: add script to transform matrix to png im…
YiwenShaoStephen Apr 25, 2017
ed94788
[scripts] Another fix to previously checked-in simplification of nnet…
danpovey Apr 25, 2017
2217015
[src] Fix test failure in nnet-derivative-test.cc
danpovey Apr 26, 2017
0a5f635
[egs] some fixes re CIFAR data preparation (#1579)
hhadian Apr 26, 2017
e17c1ee
[src] Possible fix for bug found by Gaofeng Cheng
danpovey Apr 26, 2017
4fd7f00
[src,scripts] Add 'test mode' to dropout component (#1578)
freewym Apr 26, 2017
6d948a2
[egs] cifar egs: fixes re CIFAR image to matrix (#1580)
hhadian Apr 26, 2017
780f5b5
[scripts] Add select_image_in_egs.py to easily see augmented images (…
hhadian Apr 26, 2017
3e1d69b
[egs] Clarifying comment.
danpovey Apr 26, 2017
3786beb
[scripts] Image recognition: debugging-related scripts.
YiwenShaoStephen Apr 26, 2017
1d5d9de
[scripts] Add image-augmentation option to train_raw_dnn.py (#1585)
hhadian Apr 27, 2017
932c88b
[src] Add dropout to Xconfig (#1589)
hhadian Apr 28, 2017
f8f64cf
[src,scripts,egs] Change how combination works with batch-norm; add b…
danpovey Apr 28, 2017
4e783ac
[egs] Add new CIFAR example script
danpovey Apr 28, 2017
e4a5af1
[src] Changing how batch-norm stats are recomputed so that training c…
danpovey Apr 28, 2017
cfe1ecc
[egs] Fix mode on cifar script
danpovey Apr 29, 2017
fd76df6
[src] Remove the GetLinearSymbolSequences() function (#1594)
vdp Apr 29, 2017
48bb3e6
[src] Small fix to how combination+batch-norm works (RE output-xent b…
danpovey Apr 29, 2017
6a4c50e
[scripts] Improving how shell commands are called from python (#1595)
danpovey Apr 30, 2017
0dea4ad
[egs] add results for cifar100 in run_cnn_1e.sh (#1599)
keli78 May 1, 2017
868c044
Merge remote-tracking branch 'upstream/master' into kaldi_52
danpovey May 1, 2017
81b2444
Merge remote-tracking branch 'upstream/master' into kaldi_52
danpovey May 1, 2017
810b35d
[src,egs] Changes to image augmentation code; example script in CIFAR…
hhadian May 1, 2017
3834fbd
[egs] adding more results/docs for experiments
danpovey May 2, 2017
a57c169
[scripts] Remove some deprecated options
danpovey May 2, 2017
010c556
[src] Refactor wave-reading code, un-support non-16-bit PCM, support …
kkm000 May 3, 2017
6193015
[egs,src] Add another augmentation experiment script; ad final-combin…
hhadian May 3, 2017
f29da2a
[scripts] Fix subtle threading problem causing python script to wait …
danpovey May 4, 2017
3eaae47
Merge remote-tracking branch 'upstream/master' into kaldi_52
danpovey May 4, 2017
223a835
[scripts] Script fix w.r.t idct layer in xconfigs, only affects when …
GaofengCheng May 5, 2017
18648cd
Revert "[scripts] Script fix w.r.t idct layer in xconfigs, only affec…
danpovey May 5, 2017
60b522d
[scripts] Fix bug in xconfig scripts when idct and lda are both used …
GaofengCheng May 5, 2017
13b08e4
[src] Add --fill-mode option in nnet3-egs-augment-image.cc (#1602)
YiwenShaoStephen May 5, 2017
e593f73
Merge remote-tracking branch 'upstream/master' into kaldi_52
danpovey May 5, 2017
d4c2fb4
[src] Minor fixes to nnet3-egs-augment-image.cc
danpovey May 7, 2017
7201337
[src] Fix bug found by Hainan and Yiming regarding looped computation…
danpovey May 11, 2017
d2d0738
[src,scripts,egs] Implement and tune resnets. (#1620)
danpovey May 13, 2017
480ea35
Merging master into kaldi_52 (#1621)
danpovey May 13, 2017
32dc7fe
[egs,scripts] Add, and use the --proportional-shrink option (approxim…
danpovey May 17, 2017
e727fa0
Changing proportional-shrink from 120 to 150 in mini-librispeech exam…
danpovey May 17, 2017
a18a40d
[src,egs,scripts] Add SVHN example; fix asymmetry in image-augmentati…
danpovey May 18, 2017
da179a1
[egs] Adding --proportional-shrink example for WSJ.
danpovey May 19, 2017
5934056
[egs] Further tuning of --proportional-shrink in WSJ
danpovey May 20, 2017
ec8dec6
[src,scripts,egs] Merge master into kaldi_52 (#1628)
danpovey May 24, 2017
6922c15
[src] Add extra diagnostic in nnet3-show-progress
danpovey May 25, 2017
c9d7ccf
[scripts] python3 compatibility: decode the output of get_command_std…
osadj May 26, 2017
21d11ff
[egs] adding proportional-shrink scripts to AMI (#1654)
GaofengCheng May 29, 2017
b68c428
Merge remote-tracking branch 'upstream/master' into kaldi_52
danpovey May 29, 2017
1de7f9e
Merge pull request #1656 from danpovey/kaldi_52_merge_master
danpovey May 29, 2017
393ef73
[build] Upgrade .version (this is official start of kaldi 5.2)
danpovey May 29, 2017
14 changes: 5 additions & 9 deletions egs/ami/s5b/RESULTS_ihm
@@ -54,24 +54,20 @@
%WER 22.4 | 12643 89977 | 80.3 12.5 7.2 2.7 22.4 53.6 | -0.503 | exp/ihm/nnet3_cleaned/lstm_bidirectional_sp/decode_eval/ascore_10/eval_hires.ctm.filt.sys

############################################
# cleanup + chain TDNN model.
# cleanup + chain TDNN model
# local/chain/run_tdnn.sh --mic ihm --stage 4 &
# for d in exp/ihm/chain_cleaned/tdnn1d_sp_bi/decode_*; do grep Sum $d/*sc*/*ys | utils/best_wer.sh; done
%WER 21.7 | 13098 94488 | 81.1 10.4 8.4 2.8 21.7 54.4 | 0.096 | exp/ihm/chain_cleaned/tdnn1d_sp_bi/decode_dev/ascore_10/dev_hires.ctm.filt.sys
%WER 22.1 | 12643 89979 | 80.5 12.1 7.4 2.6 22.1 52.8 | 0.185 | exp/ihm/chain_cleaned/tdnn1d_sp_bi/decode_eval/ascore_10/eval_hires.ctm.filt.sys
# for d in exp/ihm/chain_cleaned/tdnn1e_sp_bi/decode_*; do grep Sum $d/*sc*/*ys | utils/best_wer.sh; done
%WER 21.4 | 13098 94487 | 81.4 10.1 8.5 2.8 21.4 53.7 | 0.090 | exp/ihm/chain_cleaned/tdnn1e_batch_sp_bi/decode_dev/ascore_10/dev_hires.ctm.filt.sys
%WER 21.5 | 12643 89977 | 81.0 11.8 7.2 2.5 21.5 52.4 | 0.168 | exp/ihm/chain_cleaned/tdnn1e_batch_sp_bi/decode_eval/ascore_10/eval_hires.ctm.filt.sys

# cleanup + chain TDNN model. Uses LDA instead of PCA for ivector features.
# local/chain/tuning/run_tdnn_1b.sh --mic ihm --stage 4 &
# for d in exp/ihm/chain_cleaned/tdnn1b_sp_bi/decode_*; do grep Sum $d/*sc*/*ys | utils/best_wer.sh; done
%WER 22.0 | 13098 94488 | 80.8 10.2 9.0 2.8 22.0 54.7 | 0.102 | exp/ihm/chain_cleaned/tdnn1b_sp_bi/decode_dev/ascore_10/dev_hires.ctm.filt.sys
%WER 22.2 | 12643 89968 | 80.3 12.1 7.6 2.6 22.2 52.9 | 0.170 | exp/ihm/chain_cleaned/tdnn1b_sp_bi/decode_eval/ascore_10/eval_hires.ctm.filt.sys

# local/chain/run_tdnn.sh --mic ihm --train-set train --gmm tri3 --nnet3-affix "" --stage 4
# chain TDNN model without cleanup [note: cleanup helps very little on this IHM data.]
# for d in exp/ihm/chain/tdnn1d_sp_bi/decode_*; do grep Sum $d/*sc*/*ys | utils/best_wer.sh; done
%WER 21.8 | 13098 94484 | 80.7 9.7 9.6 2.5 21.8 54.2 | 0.114 | exp/ihm/chain/tdnn1d_sp_bi/decode_dev/ascore_10/dev_hires.ctm.filt.sys
%WER 22.1 | 12643 89965 | 80.2 11.5 8.3 2.3 22.1 52.5 | 0.203 | exp/ihm/chain/tdnn1d_sp_bi/decode_eval/ascore_10/eval_hires.ctm.filt.sy


# local/chain/multi_condition/run_tdnn.sh --mic ihm
# cleanup + chain TDNN model + IHM reverberated data
# for d in exp/ihm/chain_cleaned_rvb/tdnn_sp_bi/decode_*; do grep Sum $d/*sc*/*ys | utils/best_wer.sh; done
25 changes: 13 additions & 12 deletions egs/ami/s5b/RESULTS_mdm
@@ -54,20 +54,13 @@
%WER 41.6 | 13964 89980 | 62.7 23.1 14.2 4.3 41.6 65.6 | 0.649 | exp/mdm8/nnet3/tdnn_sp_ihmali/decode_eval/ascore_12/eval_hires_o4.ctm.filt.sys


################

# local/chain/run_tdnn.sh --mic mdm8 --stage 11 &
# cleanup + chain TDNN model, alignments from mdm8 data itself.
# for d in exp/mdm8/chain_cleaned/tdnn_sp_bi/decode_*; do grep Sum $d/*sc*/*ys | utils/best_wer.sh; done
%WER 37.9 | 14471 94512 | 65.9 17.4 16.6 3.8 37.9 67.4 | 0.625 | exp/mdm8/chain_cleaned/tdnn_sp_bi/decode_dev/ascore_9/dev_hires_o4.ctm.filt.sys
%WER 41.3 | 13696 89959 | 62.0 18.6 19.4 3.3 41.3 67.2 | 0.591 | exp/mdm8/chain_cleaned/tdnn_sp_bi/decode_eval/ascore_9/eval_hires_o4.ctm.filt.sys


############################################
# cleanup + chain TDNN model, alignments from IHM data (IHM alignments help).
# local/chain/run_tdnn.sh --mic mdm8 --use-ihm-ali true --stage 12 &
# for d in exp/mdm8/chain_cleaned/tdnn1d_sp_bi_ihmali/decode_*; do grep Sum $d/*sc*/*ys | utils/best_wer.sh; done
%WER 36.4 | 15140 94513 | 67.3 17.5 15.2 3.6 36.4 63.2 | 0.613 | exp/mdm8/chain_cleaned/tdnn1d_sp_bi_ihmali/decode_dev/ascore_9/dev_hires_o4.ctm.filt.sys
%WER 39.7 | 13835 89969 | 63.2 18.4 18.4 3.0 39.7 65.7 | 0.584 | exp/mdm8/chain_cleaned/tdnn1d_sp_bi_ihmali/decode_eval/ascore_9/eval_hires_o4.ctm.filt.sys
# for d in exp/mdm8/chain_cleaned/tdnn1e_sp_bi/decode_*; do grep Sum $d/*sc*/*ys | utils/best_wer.sh; done
%WER 36.0 | 14597 94517 | 67.8 17.7 14.5 3.8 36.0 64.9 | 0.623 | exp/mdm8/chain_cleaned/tdnn1e_sp_bi_ihmali/decode_dev/ascore_9/dev_hires_o4.ctm.filt.sys
%WER 39.3 | 13872 89973 | 63.9 19.0 17.1 3.2 39.3 65.1 | 0.594 | exp/mdm8/chain_cleaned/tdnn1e_sp_bi_ihmali/decode_eval/ascore_9/eval_hires_o4.ctm.filt.sys


# local/chain/run_tdnn.sh --use-ihm-ali true --mic mdm8 --train-set train --gmm tri3 --nnet3-affix "" --stage 12 &
# chain TDNN model-- no cleanup, but IHM alignments.
@@ -76,6 +69,14 @@
%WER 36.9 | 15282 94502 | 67.1 18.5 14.4 4.1 36.9 62.5 | 0.635 | exp/mdm8/chain/tdnn1d_sp_bi_ihmali/decode_dev/ascore_8/dev_hires_o4.ctm.filt.sys
%WER 40.2 | 13729 89992 | 63.3 19.8 17.0 3.5 40.2 66.4 | 0.608 | exp/mdm8/chain/tdnn1d_sp_bi_ihmali/decode_eval/ascore_8/eval_hires_o4.ctm.filt.sys


# local/chain/run_tdnn.sh --mic mdm8 --stage 11 &
# cleanup + chain TDNN model, alignments from mdm8 data itself.
# for d in exp/mdm8/chain_cleaned/tdnn_sp_bi/decode_*; do grep Sum $d/*sc*/*ys | utils/best_wer.sh; done
%WER 37.9 | 14471 94512 | 65.9 17.4 16.6 3.8 37.9 67.4 | 0.625 | exp/mdm8/chain_cleaned/tdnn_sp_bi/decode_dev/ascore_9/dev_hires_o4.ctm.filt.sys
%WER 41.3 | 13696 89959 | 62.0 18.6 19.4 3.3 41.3 67.2 | 0.591 | exp/mdm8/chain_cleaned/tdnn_sp_bi/decode_eval/ascore_9/eval_hires_o4.ctm.filt.sys


# local/chain/multi_condition/run_tdnn.sh --mic mdm8 --use-ihm-ali true --train-set train_cleaned --gmm tri3_cleaned
# cleanup + chain TDNN model, MDM original + IHM reverberated data, alignments from IHM data
# for d in exp/mdm8/chain_cleaned_rvb/tdnn_sp_rvb_bi_ihmali/decode_*; do grep Sum $d/*sc*/*ys | utils/best_wer.sh; done
18 changes: 7 additions & 11 deletions egs/ami/s5b/RESULTS_sdm
@@ -52,33 +52,29 @@
%WER 37.9 | 15953 94512 | 66.7 22.0 11.3 4.7 37.9 58.9 | 0.734 | exp/sdm1/nnet3_cleaned/lstm_bidirectional_sp_ihmali/decode_dev/ascore_12/dev_hires_o4.ctm.filt.sys
%WER 41.2 | 13271 89635 | 62.9 23.8 13.2 4.2 41.2 67.8 | 0.722 | exp/sdm1/nnet3_cleaned/lstm_bidirectional_sp_ihmali/decode_eval/ascore_11/eval_hires_o4.ctm.filt.sys

# =========================

############################################
# cleanup + chain TDNN model, alignments from IHM data (IHM alignments help)
# local/chain/run_tdnn.sh --mic sdm1 --use-ihm-ali true --stage 12 &
# for d in exp/sdm1/chain_cleaned/tdnn1e_sp_bi/decode_*; do grep Sum $d/*sc*/*ys | utils/best_wer.sh; done
%WER 39.1 | 14457 94509 | 64.6 19.7 15.7 3.7 39.1 66.5 | 0.585 | exp/sdm1/chain_cleaned/tdnn1e_sp_bi_ihmali/decode_dev/ascore_9/dev_hires_o4.ctm.filt.sys
%WER 43.2 | 13551 89981 | 60.3 20.9 18.8 3.5 43.2 67.1 | 0.554 | exp/sdm1/chain_cleaned/tdnn1e_sp_bi_ihmali/decode_eval/ascore_9/eval_hires_o4.ctm.filt.sys

# local/chain/run_tdnn.sh --mic sdm1 --stage 12 &

# cleanup + chain TDNN model, alignments from sdm1 data itself.
# for d in exp/sdm1/chain_cleaned/tdnn_sp_bi/decode_*; do grep Sum $d/*sc*/*ys | utils/best_wer.sh; done
%WER 41.6 | 14357 94500 | 62.0 19.2 18.8 3.6 41.6 68.4 | 0.592 | exp/sdm1/chain_cleaned/tdnn_sp_bi/decode_dev/ascore_9/dev_hires_o4.ctm.filt.sys
%WER 45.4 | 12886 89960 | 58.1 21.0 20.9 3.5 45.4 71.9 | 0.558 | exp/sdm1/chain_cleaned/tdnn_sp_bi/decode_eval/ascore_9/eval_hires_o4.ctm.filt.sys



# cleanup + chain TDNN model, alignments from IHM data (IHM alignments help).
# local/chain/run_tdnn.sh --mic sdm1 --use-ihm-ali true --stage 12 &
# cleanup + chain TDNN model, cleaned data and alignments from ihm data.
# for d in exp/sdm1/chain_cleaned/tdnn1d_sp_bi_ihmali/decode_*; do grep Sum $d/*sc*/*ys | utils/best_wer.sh; done
%WER 39.5 | 14280 94503 | 64.0 19.3 16.7 3.5 39.5 67.7 | 0.582 | exp/sdm1/chain_cleaned/tdnn1d_sp_bi_ihmali/decode_dev/ascore_9/dev_hires_o4.ctm.filt.sys
%WER 43.9 | 13566 89961 | 59.3 20.9 19.9 3.1 43.9 67.9 | 0.547 | exp/sdm1/chain_cleaned/tdnn1d_sp_bi_ihmali/decode_eval/ascore_9/eval_hires_o4.ctm.filt.sys


# no-cleanup + chain TDNN model, IHM alignments.
# A bit worse than with cleanup [+0.3, +0.4].
# local/chain/run_tdnn.sh --use-ihm-ali true --mic sdm1 --train-set train --gmm tri3 --nnet3-affix "" --stage 12
# for d in exp/sdm1/chain/tdnn1d_sp_bi_ihmali/decode_*; do grep Sum $d/*sc*/*ys | utils/best_wer.sh; done
%WER 39.8 | 15384 94535 | 64.4 21.0 14.6 4.2 39.8 62.8 | 0.610 | exp/sdm1/chain/tdnn1d_sp_bi_ihmali/decode_dev/ascore_8/dev_hires_o4.ctm.filt.sys
%WER 44.3 | 14046 90002 | 59.6 23.1 17.3 3.9 44.3 65.6 | 0.571 | exp/sdm1/chain/tdnn1d_sp_bi_ihmali/decode_eval/ascore_8/eval_hires_o4.ctm.filt.sys


# local/chain/multi_condition/run_tdnn.sh --mic sdm1 --use-ihm-ali true --train-set train_cleaned --gmm tri3_cleaned
# cleanup + chain TDNN model, SDM original + IHM reverberated data, alignments from ihm data.
# for d in exp/sdm1/chain_cleaned_rvb/tdnn_sp_rvb_bi_ihmali/decode_*; do grep Sum $d/*sc*/*ys | utils/best_wer.sh; done
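
A note on reading the RESULTS entries above: each `%WER` line carries the word error rate as its second field, so two systems can be compared by pulling that field out and computing a relative reduction. The sketch below is an illustration only; `wer_of` and `rel_reduction` are hypothetical helper names that are not part of this PR, and the two sample lines are copied from the RESULTS_ihm diff (tdnn1d versus the batchnorm-based tdnn1e, dev set).

```shell
#!/bin/sh
# Hypothetical helpers for illustration; they are not part of Kaldi or this PR.

# Print the WER (second whitespace-separated field) of a "%WER ..." summary line.
wer_of() {
  printf '%s\n' "$1" | awk '$1 == "%WER" { print $2; exit }'
}

# Print the relative WER reduction, in percent, going from a baseline to a new system.
rel_reduction() {
  awk -v base="$1" -v new="$2" 'BEGIN { printf "%.1f\n", 100 * (base - new) / base }'
}

# Sample lines copied from the RESULTS_ihm diff (dev set).
base_line='%WER 21.7 | 13098 94488 | 81.1 10.4 8.4 2.8 21.7 54.4 | 0.096 | exp/ihm/chain_cleaned/tdnn1d_sp_bi/decode_dev/ascore_10/dev_hires.ctm.filt.sys'
new_line='%WER 21.4 | 13098 94487 | 81.4 10.1 8.5 2.8 21.4 53.7 | 0.090 | exp/ihm/chain_cleaned/tdnn1e_batch_sp_bi/decode_dev/ascore_10/dev_hires.ctm.filt.sys'

rel_reduction "$(wer_of "$base_line")" "$(wer_of "$new_line")"   # prints 1.4
```

Relative rather than absolute reduction is used here only because it makes the IHM, MDM and SDM conditions, which have very different baseline WERs, easier to compare side by side.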
2 changes: 1 addition & 1 deletion egs/ami/s5b/local/chain/run_tdnn.sh
266 changes: 266 additions & 0 deletions egs/ami/s5b/local/chain/tuning/run_tdnn_1e.sh
@@ -0,0 +1,266 @@
#!/bin/bash

# same as 1b but uses batchnorm components instead of renorm

# Results on 03/27/2017:
# local/chain/compare_wer_general.sh ihm tdnn1b_sp_bi tdnn1e_sp_bi
# System tdnn1b_sp_bi tdnn1e_sp_bi
# WER on dev 21.9 21.4
# WER on eval 22.2 21.5
# Final train prob -0.0906771 -0.0857669
# Final valid prob -0.126942 -0.124401
# Final train prob (xent) -1.4427 -1.37837
# Final valid prob (xent) -1.60284 -1.5634

set -e -o pipefail
# First the options that are passed through to run_ivector_common.sh
# (some of which are also used in this script directly).
stage=0
mic=ihm
nj=30
min_seg_len=1.55
use_ihm_ali=false
train_set=train_cleaned
gmm=tri3_cleaned # the gmm for the target data
ihm_gmm=tri3 # the gmm for the IHM system (if --use-ihm-ali true).
num_threads_ubm=32
ivector_transform_type=pca
nnet3_affix=_cleaned # cleanup affix for nnet3 and chain dirs, e.g. _cleaned

# The rest are configs specific to this script. Most of the parameters
# are just hardcoded at this level, in the commands below.
train_stage=-10
tree_affix= # affix for tree directory, e.g. "a" or "b", in case we change the configuration.
tdnn_affix=1e # affix for TDNN directory, e.g. "a" or "b", in case we change the configuration.
common_egs_dir= # you can set this to use previously dumped egs.

# End configuration section.
echo "$0 $@" # Print the command line for logging

. ./cmd.sh
. ./path.sh
. ./utils/parse_options.sh


if ! cuda-compiled; then
cat <<EOF && exit 1
This script is intended to be used with GPUs but you have not compiled Kaldi with CUDA.
If you want to use GPUs (and have them), go to src/, and configure and make on a machine
where "nvcc" is installed.
EOF
fi

local/nnet3/run_ivector_common.sh --stage $stage \
--mic $mic \
--nj $nj \
--min-seg-len $min_seg_len \
--train-set $train_set \
--gmm $gmm \
--num-threads-ubm $num_threads_ubm \
--ivector-transform-type "$ivector_transform_type" \
--nnet3-affix "$nnet3_affix"

# Note: the first stage of the following script is stage 8.
local/nnet3/prepare_lores_feats.sh --stage $stage \
--mic $mic \
--nj $nj \
--min-seg-len $min_seg_len \
--use-ihm-ali $use_ihm_ali \
--train-set $train_set

if $use_ihm_ali; then
gmm_dir=exp/ihm/${ihm_gmm}
ali_dir=exp/${mic}/${ihm_gmm}_ali_${train_set}_sp_comb_ihmdata
lores_train_data_dir=data/$mic/${train_set}_ihmdata_sp_comb
tree_dir=exp/$mic/chain${nnet3_affix}/tree_bi${tree_affix}_ihmdata
lat_dir=exp/$mic/chain${nnet3_affix}/${gmm}_${train_set}_sp_comb_lats_ihmdata
dir=exp/$mic/chain${nnet3_affix}/tdnn${tdnn_affix}_sp_bi_ihmali
# note: the distinction between when we use the 'ihmdata' suffix versus
# 'ihmali' is pretty arbitrary.
else
gmm_dir=exp/${mic}/$gmm
ali_dir=exp/${mic}/${gmm}_ali_${train_set}_sp_comb
lores_train_data_dir=data/$mic/${train_set}_sp_comb
tree_dir=exp/$mic/chain${nnet3_affix}/tree_bi${tree_affix}
lat_dir=exp/$mic/chain${nnet3_affix}/${gmm}_${train_set}_sp_comb_lats
dir=exp/$mic/chain${nnet3_affix}/tdnn${tdnn_affix}_sp_bi
fi

train_data_dir=data/$mic/${train_set}_sp_hires_comb
train_ivector_dir=exp/$mic/nnet3${nnet3_affix}/ivectors_${train_set}_sp_hires_comb
final_lm=$(cat data/local/lm/final_lm)
LM=$final_lm.pr1-7


for f in $gmm_dir/final.mdl $lores_train_data_dir/feats.scp \
$train_data_dir/feats.scp $train_ivector_dir/ivector_online.scp; do
[ ! -f $f ] && echo "$0: expected file $f to exist" && exit 1
done


if [ $stage -le 11 ]; then
if [ -f $ali_dir/ali.1.gz ]; then
echo "$0: alignments in $ali_dir appear to already exist. Please either remove them "
echo " ... or use a later --stage option."
exit 1
fi
echo "$0: aligning perturbed, short-segment-combined data"
steps/align_fmllr.sh --nj $nj --cmd "$train_cmd" \
${lores_train_data_dir} data/lang $gmm_dir $ali_dir
fi

[ ! -f $ali_dir/ali.1.gz ] && echo "$0: expected $ali_dir/ali.1.gz to exist" && exit 1

if [ $stage -le 12 ]; then
echo "$0: creating lang directory with one state per phone."
# Create a version of the lang/ directory that has one state per phone in the
# topo file. [note, it really has two states.. the first one is only repeated
# once, the second one has zero or more repeats.]
if [ -d data/lang_chain ]; then
if [ data/lang_chain/L.fst -nt data/lang/L.fst ]; then
echo "$0: data/lang_chain already exists, not overwriting it; continuing"
else
echo "$0: data/lang_chain already exists and seems to be older than data/lang..."
echo " ... not sure what to do. Exiting."
exit 1;
fi
else
cp -r data/lang data/lang_chain
silphonelist=$(cat data/lang_chain/phones/silence.csl) || exit 1;
nonsilphonelist=$(cat data/lang_chain/phones/nonsilence.csl) || exit 1;
# Use our special topology. Note that later on we may have to tune this
# topology.
steps/nnet3/chain/gen_topo.py $nonsilphonelist $silphonelist >data/lang_chain/topo
fi
fi

if [ $stage -le 13 ]; then
# Get the alignments as lattices (gives the chain training more freedom).
# Note: this uses a fixed 100 jobs rather than the $nj used for the alignments.
steps/align_fmllr_lats.sh --nj 100 --cmd "$train_cmd" ${lores_train_data_dir} \
data/lang $gmm_dir $lat_dir
rm $lat_dir/fsts.*.gz # save space
fi

if [ $stage -le 14 ]; then
# Build a tree using our new topology. We know we have alignments for the
# speed-perturbed data (local/nnet3/run_ivector_common.sh made them), so use
# those.
if [ -f $tree_dir/final.mdl ]; then
echo "$0: $tree_dir/final.mdl already exists, refusing to overwrite it."
exit 1;
fi
steps/nnet3/chain/build_tree.sh --frame-subsampling-factor 3 \
--context-opts "--context-width=2 --central-position=1" \
--leftmost-questions-truncate -1 \
--cmd "$train_cmd" 4200 ${lores_train_data_dir} data/lang_chain $ali_dir $tree_dir
fi

xent_regularize=0.1

if [ $stage -le 15 ]; then
echo "$0: creating neural net configs using the xconfig parser";

num_targets=$(tree-info $tree_dir/tree |grep num-pdfs|awk '{print $2}')
learning_rate_factor=$(echo "print(0.5/$xent_regularize)" | python)

mkdir -p $dir/configs
cat <<EOF > $dir/configs/network.xconfig
input dim=100 name=ivector
input dim=40 name=input

# Please note that it is important to have the input layer with name=input
# as the layer immediately preceding the fixed-affine-layer, to enable
# the use of short notation for the descriptor.
fixed-affine-layer name=lda input=Append(-1,0,1,ReplaceIndex(ivector, t, 0)) affine-transform-file=$dir/configs/lda.mat

# the first splicing is moved before the lda layer, so no splicing here
relu-batchnorm-layer name=tdnn1 dim=450
relu-batchnorm-layer name=tdnn2 input=Append(-1,0,1) dim=450
relu-batchnorm-layer name=tdnn3 input=Append(-1,0,1) dim=450
relu-batchnorm-layer name=tdnn4 input=Append(-3,0,3) dim=450
relu-batchnorm-layer name=tdnn5 input=Append(-3,0,3) dim=450
relu-batchnorm-layer name=tdnn6 input=Append(-3,0,3) dim=450
relu-batchnorm-layer name=tdnn7 input=Append(-3,0,3) dim=450

## adding the layers for chain branch
relu-batchnorm-layer name=prefinal-chain input=tdnn7 dim=450 target-rms=0.5
output-layer name=output include-log-softmax=false dim=$num_targets max-change=1.5

# adding the layers for xent branch
# This block prints the configs for a separate output that will be
# trained with a cross-entropy objective in the 'chain' models. This
# has the effect of regularizing the hidden parts of the model. We use
# 0.5 / xent_regularize as the learning-rate factor: this is suitable
# because it means the xent final layer learns at a rate independent
# of the regularization constant, and the 0.5 was tuned so as to make
# the relative progress similar in the xent and regular final layers.
relu-batchnorm-layer name=prefinal-xent input=tdnn7 dim=450 target-rms=0.5
output-layer name=output-xent dim=$num_targets learning-rate-factor=$learning_rate_factor max-change=1.5

EOF

steps/nnet3/xconfig_to_configs.py --xconfig-file $dir/configs/network.xconfig --config-dir $dir/configs/
fi

if [ $stage -le 16 ]; then
if [[ $(hostname -f) == *.clsp.jhu.edu ]] && [ ! -d $dir/egs/storage ]; then
utils/create_split_dir.pl \
/export/b0{5,6,7,8}/$USER/kaldi-data/egs/ami-$(date +'%m_%d_%H_%M')/s5b/$dir/egs/storage $dir/egs/storage
fi

steps/nnet3/chain/train.py --stage $train_stage \
--cmd "$decode_cmd" \
--feat.online-ivector-dir $train_ivector_dir \
--feat.cmvn-opts "--norm-means=false --norm-vars=false" \
--chain.xent-regularize $xent_regularize \
--chain.leaky-hmm-coefficient 0.1 \
--chain.l2-regularize 0.00005 \
--chain.apply-deriv-weights false \
--chain.lm-opts="--num-extra-lm-states=2000" \
--egs.dir "$common_egs_dir" \
--egs.opts "--frames-overlap-per-eg 0" \
--egs.chunk-width 150 \
--trainer.num-chunk-per-minibatch 128 \
--trainer.frames-per-iter 1500000 \
--trainer.num-epochs 4 \
--trainer.optimization.num-jobs-initial 2 \
--trainer.optimization.num-jobs-final 12 \
--trainer.optimization.initial-effective-lrate 0.001 \
--trainer.optimization.final-effective-lrate 0.0001 \
--trainer.max-param-change 2.0 \
--cleanup.remove-egs true \
--feat-dir $train_data_dir \
--tree-dir $tree_dir \
--lat-dir $lat_dir \
--dir $dir
fi


graph_dir=$dir/graph_${LM}
if [ $stage -le 17 ]; then
# Note: it might appear that this data/lang_${LM} directory is mismatched, and it is as
# far as the 'topo' is concerned, but this script doesn't read the 'topo' from
# the lang directory.
utils/mkgraph.sh --self-loop-scale 1.0 data/lang_${LM} $dir $graph_dir
fi

if [ $stage -le 18 ]; then
rm $dir/.error 2>/dev/null || true
for decode_set in dev eval; do
(
steps/nnet3/decode.sh --acwt 1.0 --post-decode-acwt 10.0 \
--nj $nj --cmd "$decode_cmd" \
--online-ivector-dir exp/$mic/nnet3${nnet3_affix}/ivectors_${decode_set}_hires \
--scoring-opts "--min-lmwt 5 " \
$graph_dir data/$mic/${decode_set}_hires $dir/decode_${decode_set} || exit 1;
) || touch $dir/.error &
done
wait
if [ -f $dir/.error ]; then
echo "$0: something went wrong in decoding"
exit 1
fi
fi
exit 0
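
A note on the `learning_rate_factor` computed in stage 15 of the script above: the xconfig comment says the factor `0.5 / xent_regularize` makes the xent final layer learn at a rate independent of the regularization constant. One way to read this (an assumption on my part, not something the script states) is that the product of `xent_regularize` and the factor stays at 0.5 whatever constant is chosen. The sketch below checks that arithmetic with `awk`, which also avoids depending on a particular Python version; `lr_factor` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper for illustration; the script itself pipes a print
# statement to python instead.
lr_factor() {
  awk -v x="$1" 'BEGIN { print 0.5 / x }'
}

# Assumption: the effective scale seen by the xent output layer is the
# product xent_regularize * learning-rate-factor, which should stay at 0.5.
for x in 0.05 0.1 0.2; do
  f=$(lr_factor "$x")
  awk -v x="$x" -v f="$f" \
    'BEGIN { printf "xent_regularize=%s factor=%s product=%.2f\n", x, f, x * f }'
done
```

With the script's default of `xent_regularize=0.1` this gives a factor of 5, which is the value passed as `learning-rate-factor` on the `output-xent` layer.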