add reverberated TDNN+LSTM recipe on AMI; a fix to aspire recipe #1314
vijayaditya merged 3 commits into kaldi-asr:master
Conversation
Will do.
Vijay
…On Jan 3, 2017 6:58 PM, "Daniel Povey" ***@***.***> wrote:
@vijayaditya, do you have time to review this?
On Tue, Jan 3, 2017 at 6:36 PM, Tom Ko ***@***.***> wrote:
> ------------------------------
> You can view, comment on, or merge this pull request online at:
>
> #1314
> Commit Summary
>
> - add reverberated TDNN+LSTM recipe on AMI; a fix to aspire recipe
>
> File Changes
>
> - *M* egs/ami/s5b/RESULTS_ihm
> <https://github.com/kaldi-asr/kaldi/pull/1314/files#diff-0> (9)
> - *M* egs/ami/s5b/RESULTS_mdm
> <https://github.com/kaldi-asr/kaldi/pull/1314/files#diff-1> (9)
> - *M* egs/ami/s5b/RESULTS_sdm
> <https://github.com/kaldi-asr/kaldi/pull/1314/files#diff-2> (14)
> - *M* egs/ami/s5b/local/chain/multi_condition/run_tdnn.sh
> <https://github.com/kaldi-asr/kaldi/pull/1314/files#diff-3> (6)
> - *A* egs/ami/s5b/local/chain/multi_condition/run_tdnn_lstm.sh
> <https://github.com/kaldi-asr/kaldi/pull/1314/files#diff-4> (334)
> - *M* egs/aspire/s5/local/chain/tuning/run_tdnn_7b.sh
> <https://github.com/kaldi-asr/kaldi/pull/1314/files#diff-5> (5)
>
> Patch Links:
>
> - https://github.com/kaldi-asr/kaldi/pull/1314.patch
> - https://github.com/kaldi-asr/kaldi/pull/1314.diff
>
In egs/ami/s5b/RESULTS_mdm:

# local/chain/multi_condition/run_tdnn_lstm.sh --mic mdm8 --use-ihm-ali true --train-set train_cleaned --gmm tri3_cleaned
# cleanup + chain TDNN+LSTM model, MDM original + IHM reverberated data, alignments from IHM data
# *** best system ***
%WER 31.8 | 14488 94497 | 71.8 15.4 12.8 3.5 31.8 62.7 | 0.698 | exp/mdm8/chain_cleaned_rvb/tdnn_lstm1i_sp_rvb_bi_ihmali/decode_dev/ascore_10/dev_hires_o4.ctm.filt.sys
@tomkocse Very impressive results. Please add a version number to the script which generates these results (e.g. run_tdnn_lstm_1a.sh), as we might replace run_tdnn_lstm.sh with a better TDNN+LSTM architecture in the future.
Move this script to the tuning directory as run_tdnn_lstm_1b.sh and softlink it from run_tdnn_lstm.sh.
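The suggested rename-and-softlink can be sketched as follows. This is a demonstration in a scratch directory with a stand-in file; in the real repo the script lives under egs/ami/s5b/local/chain/multi_condition/, and the relative-path symlink is the convention used by Kaldi recipes.

```shell
# Demo of the move + softlink pattern in a throwaway directory.
cd "$(mktemp -d)"
touch run_tdnn_lstm.sh            # stand-in for the recipe script
mkdir -p tuning
mv run_tdnn_lstm.sh tuning/run_tdnn_lstm_1b.sh
# relative symlink so the link stays valid wherever the repo is checked out
ln -s tuning/run_tdnn_lstm_1b.sh run_tdnn_lstm.sh
readlink run_tdnn_lstm.sh         # -> tuning/run_tdnn_lstm_1b.sh
```

Invoking run_tdnn_lstm.sh then transparently runs the versioned tuning script.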
In egs/ami/s5b/RESULTS_sdm:
%WER 40.9 | 13807 89961 | 62.4 20.0 17.6 3.3 40.9 65.7 | 0.612 | exp/sdm1/chain_cleaned/tdnn_lstm1i_sp_bi_ihmali_ld5/decode_eval/ascore_10/eval_hires_o4.ctm.filt.sys

# local/chain/multi_condition/run_tdnn_lstm.sh --mic sdm1 --use-ihm-ali true --train-set train_cleaned --gmm tri3_cleaned
Same comment as above; start versioning your run*.sh scripts.
In egs/ami/s5b/local/chain/multi_condition/run_tdnn_lstm.sh:

@@ -0,0 +1,334 @@
#!/bin/bash

# This is a chain-training script with TDNN+LSTM neural networks.
Not checking this in detail, as I assume it is similar to multi_condition/run_tdnn.sh except for the xconfig and some LSTM-specific parameters.
# check steps/libs/nnet3/xconfig/lstm.py for the other options and defaults
lstmp-layer name=lstm1 cell-dim=1024 recurrent-projection-dim=256 non-recurrent-projection-dim=256 delay=-3
relu-renorm-layer name=tdnn4 input=Append(-3,0,3) dim=1024
Try increasing the parameters further, as you have more data than in the SDM/MDM-only case. Though we might not ultimately recommend a large-parameter model, it is better to know the lowest possible WER achievable with this architecture and simulated multi-condition data. You can try increasing the relu-renorm-layer dim in increments of 256 or 128.
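As a sketch, one +256 increment of that suggestion might look like this in the xconfig. The widened dims (1024 -> 1280) are illustrative, untuned values, not numbers from the recipe:

```
# hypothetical wider TDNN layers; the lstmp-layer is left as in the recipe
relu-renorm-layer name=tdnn1 dim=1280
relu-renorm-layer name=tdnn2 input=Append(-1,0,1) dim=1280
relu-renorm-layer name=tdnn3 input=Append(-1,0,1) dim=1280
lstmp-layer name=lstm1 cell-dim=1024 recurrent-projection-dim=256 non-recurrent-projection-dim=256 delay=-3
relu-renorm-layer name=tdnn4 input=Append(-3,0,3) dim=1280
```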
--egs.chunk-right-context $chunk_right_context \
--trainer.num-chunk-per-minibatch 64 \
--trainer.frames-per-iter 1500000 \
--trainer.num-epochs 4 \
Check if you can increase the number of epochs.
In egs/aspire/s5/local/chain/tuning/run_tdnn_7b.sh:

  data/train data/train_temp_for_lats
utils/data/combine_short_segments.sh \
  data/train_temp_for_lats $min_seg_len data/train_min${min_seg_len}
steps/compute_cmvn_stats.sh data/train_min${min_seg_len} exp/make_mfcc/train_min${min_seg_len} mfcc || exit 1;
I recommend using the new style of not specifying logdirs.
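For reference, a sketch of the newer calling style, assuming steps/compute_cmvn_stats.sh picks default log and CMVN directories under the data directory when they are omitted:

```
# old style: logdir and feature dir passed explicitly
steps/compute_cmvn_stats.sh data/train_min${min_seg_len} exp/make_mfcc/train_min${min_seg_len} mfcc || exit 1;
# new style: only the data directory; logs and stats go to defaults under it
steps/compute_cmvn_stats.sh data/train_min${min_seg_len} || exit 1;
```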
Vijay, wouldn't the training be rather slow with more epochs? IIRC he's
already doubled the amount of data versus the regular 3x-perturbed data, so
that affects the meaning of epochs.
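Dan's point can be illustrated with a back-of-envelope calculation. The 2x multiplier below is an assumption (original plus one reverberated copy); the exact ratio in the recipe may differ:

```python
def effective_passes(num_epochs, data_multiplier):
    """Passes over the base (3x-perturbed) data implied by num_epochs,
    when the training set has been enlarged by data_multiplier."""
    return num_epochs * data_multiplier

# 4 epochs on doubled data cost about as much as 8 passes over the base set
print(effective_passes(4, 2.0))  # 8.0
```

So raising --trainer.num-epochs from 4 would grow training time proportionally on the already-enlarged set.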
…On Tue, Jan 3, 2017 at 7:39 PM, Vijayaditya Peddinti ***@***.***> wrote:
***@***.**** requested changes on this pull request.
Could you also move the softlink run_tdnn_lstm.sh in the chain/ directory to the tdnn_lstm_1i.sh script.
No description provided.