
@vijayaditya
Contributor

The TDNN+LSTM was found to be as good as the BLSTM architecture in previous experiments (not checked in). This PR adds scripts that specify these models via xconfig.

I will add the results in a day.
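For a rough idea, here is a minimal sketch of what such an xconfig specification might look like (the layer sizes, splicing offsets, and delays are illustrative, not the exact values from these scripts):

# hypothetical TDNN+LSTM xconfig sketch
input dim=40 name=input
# TDNN layers splice the previous layer's output over nearby time offsets
relu-renorm-layer name=tdnn1 dim=512 input=Append(-2,-1,0,1,2)
relu-renorm-layer name=tdnn2 dim=512 input=Append(-1,0,1)
# projected LSTM layer; delay=-3 means the recurrence looks 3 frames back
lstmp-layer name=lstm1 cell-dim=512 recurrent-projection-dim=128 non-recurrent-projection-dim=128 delay=-3
relu-renorm-layer name=tdnn3 dim=512 input=Append(-3,0,3)
lstmp-layer name=lstm2 cell-dim=512 recurrent-projection-dim=128 non-recurrent-projection-dim=128 delay=-3
output-layer name=output input=lstm2 dim=$num_targets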

@danpovey
Contributor

This is great news, as it means we can do online decoding with less latency.
@tomkocse, you could, if you want, use this setup to test the 'looped' decoding PR.


@vijayaditya
Contributor Author

The latency of the TDNN+LSTM models is currently 21 frames, less than half that of the BLSTMs (51 frames). Further, the TDNN layer outputs can be shared across neighboring chunks, unlike in the BLSTM. However, the best-performing TDNN+LSTM model has the same number of parameters as the BLSTM; reducing the parameters to 75% or 50% increased the WER.

The current set of experiments is designed to find the optimal combination of TDNN and LSTM layers.


@vijayaditya
Contributor Author

I added the tdnn_7i recipe, which has almost twice the number of parameters of our current best TDNN, tdnn_7h. It does not, however, give better results, even though the log-probabilities (both train and CV) look better. The local/chain/run_tdnn.sh script still points to tdnn_7h.sh.

[plot: train and CV log-probabilities for tdnn_7h vs. tdnn_7i]

@vijayaditya
Contributor Author

I added the results for the TDNN+LSTM setup. As expected, they look better than the TDNN and LSTM results. Further, TDNN+LSTM without pretraining works better than TDNN+LSTM with it. (Note that we saw improvements from removing layer-wise pretraining in TDNNs, but not in LSTMs.)

#System                    lstm_6j     tdnn_7h     this(old)   this(new)
#WER on train_dev(tg)        14.66       13.84       13.88       13.42
#WER on train_dev(fg)        13.42       12.84       12.99       12.42
#WER on eval2000(tg)         16.8        16.5        16.0        15.7
#WER on eval2000(fg)         15.4        14.8        14.5        14.2
#Final train prob         -0.0824531  -0.0889771  -0.0515623  -0.0538088
#Final valid prob         -0.0989325  -0.113102   -0.0784436  -0.0800484
#Final train prob (xent)  -1.15506    -1.2533     -0.782815   -0.7603
#Final valid prob (xent)  -1.24364    -1.36743    -0.946914   -0.949909

BLSTM training is still running; I might have those results in a day. The plots below compare the log-likelihood values for the TDNN+LSTM and BLSTM setups.

[plot: log-likelihood curves for the TDNN+LSTM and BLSTM setups]

self.config = { 'input' : '[-1]',
                'dim' : -1,
                'max-change' : 0.75,
                'bias-stddev' : 0,
@vijayaditya
Contributor Author

We now support ng-affine-options, so it is no longer necessary to explicitly support *stddev options.
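For illustration, those settings could instead be passed through to the affine component in a single option string, along these lines (the layer name and dimension here are hypothetical):

relu-renorm-layer name=tdnn1 dim=512 ng-affine-options="bias-stddev=0"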

@@ -0,0 +1,110 @@
# Copyright 2016 Johns Hopkins University (Dan Povey)
@vijayaditya
Contributor Author

@danpovey, any comments?

@danpovey
Contributor

OK with me.


1. Added a TDNN+LSTM recipe which performs similarly to the BLSTM
model with significantly smaller latency (21 frames vs. 51 frames).
2. Added BLSTM results in the xconfig setup, without layer-wise
discriminative pre-training (2.7% rel. improvement).
3. Added an example TDNN recipe which uses a subset of the feature vectors
from neighboring time steps (results pending).

xconfig: Added a tdnn layer which can deal with the subset-dim option.
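As a sketch, that option might appear in an xconfig file along these lines (the layer name, dimensions, and splice offsets are hypothetical; only the option name comes from the commit message):

relu-renorm-layer name=tdnn4 dim=512 subset-dim=256 input=Append(-3,0,3)

i.e., only a 256-dimensional subset of the spliced inputs from the neighboring time steps would be appended, rather than the full layer output.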
@vijayaditya
Contributor Author

Ready for review.

@vijayaditya changed the title from "WIP: Added the basic TDNN+LSTM scripts" to "Added the basic TDNN+LSTM scripts" on Nov 23, 2016
@danpovey
Contributor

Looks good to me, merge when you want.

@vijayaditya merged commit a3d4430 into kaldi-asr:master on Nov 23, 2016