nnet1: redesigning LSTM, BLSTM code, #950
Conversation
- introducing the interface 'MultistreamComponent', which handles stream-lengths and stream-resets (a hedged sketch follows below),
- rewritten most of the training tools: 'nnet-train-lstm-streams', 'nnet-train-blstm-streams',
- introducing 'RecurrentComponent' with a simple forward recurrence,
- the LSTM/BLSTM components have clipping presets we recently found helpful for a BLSTM-CTC system,
- renaming tools and components (removing 'streams' from the names),
- updating the scripts for generating lstm/blstm prototypes,
- updating the 'rm' lstm/blstm examples.
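For orientation, here is a minimal sketch of the shape such an interface could have. Only 'SetSeqLengths' is taken from the reviewed diff; the class body and 'NumStreams' are illustrative assumptions, not the actual code from the patch:

```c++
#include <cstdint>
#include <vector>

typedef std::int32_t int32;  // Kaldi defines this in base/kaldi-types.h.

// Illustrative sketch of a 'MultistreamComponent' interface.
class MultistreamComponent {
 public:
  virtual ~MultistreamComponent() {}
  // Announce the per-stream sequence lengths of the current mini-batch,
  // so the recurrence knows where each stream's valid (non-padded)
  // frames end and where its state should be reset.
  virtual void SetSeqLengths(const std::vector<int32>& sequence_lengths) {
    sequence_lengths_ = sequence_lengths;
  }
  int32 NumStreams() const {
    return static_cast<int32>(sequence_lengths_.size());
  }
 protected:
  std::vector<int32> sequence_lengths_;
};
```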
```c++
// number of frames we'll pack as the streams,
std::vector<int32> frame_num_utt;
...
// pack the parallel data,
```
It seems to me like the code would be easier to follow if you broke some things out into functions.
This is way past the "soft limit" of 40 lines per function.
Hi, okay, I rewrote this part. I understand that short functions are easier to understand than the 'script-like' code of a binary with many dependencies. The benefit of 'script-like' code is that it is easy to modify. As a compromise, I used brace-blocks with a comment in some places. Yes, that particular place was tedious... Thank you for pointing it out.
What happens in the 'nnet-train-multistream' code is that, first, whole sentences are read into a 'vector<Matrix<>>', from which they are sliced into a multi-stream mini-batch with an interleaved layout of frames (a sketch of the packing follows the layout below):
```
[feaRow1_stream1,
 feaRow1_stream2,
 ...
 feaRow1_streamN,
 feaRow2_stream1,
 feaRow2_stream2,
 ...
 feaRow2_streamN]
```
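A minimal sketch of this packing, in plain C++ (not the actual Kaldi code; the name 'PackMultistream' is made up, and the padding of short utterances is omitted for brevity):

```c++
#include <cstddef>
#include <vector>

typedef std::vector<float> Row;   // one feature frame
typedef std::vector<Row> Matrix;  // one utterance (rows = frames)

// Interleave the frames of N streams into one mini-batch:
// output row order is [t0_s0, t0_s1, ..., t0_sN-1, t1_s0, ...].
Matrix PackMultistream(const std::vector<Matrix>& streams,
                       std::size_t frames_per_stream) {
  Matrix batch;
  batch.reserve(frames_per_stream * streams.size());
  for (std::size_t t = 0; t < frames_per_stream; ++t) {
    for (std::size_t s = 0; s < streams.size(); ++s) {
      // Assumes every stream has >= frames_per_stream rows; shorter
      // utterances would need padding here.
      batch.push_back(streams[s][t]);
    }
  }
  return batch;
}
```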
A similar approach can be found in EESEN, where the 'batch' is formed from whole utterances of roughly the same length. Multi-stream training with whole sentences is then implemented in 'nnet-train-multistream-perutt'.
[The compilation bug is fixed, and the backward compatibility of models is preserved.]
Karel, regarding back-compatibility, I'm not sure if it can read the old models -- if not, can you at least make sure it dies with an informative error message?
Also, notice that the build failed.
```c++
// pass the info about padding,
nnet.SetSeqLengths(frame_num_utt);
// Show the 'utt' lengths in the VLOG[2],
if (kaldi::g_kaldi_verbose_level >= 2) {
```
You're supposed to check the verbose level with GetVerboseLevel().
- using GetVerboseLevel(),
- avoiding 'WriteIntegerVector' for writing to KALDI_LOG, by introducing 'operator<< (std::ostream&, const std::vector<T>&)' in kaldi-error.h (a sketch follows below).
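A minimal sketch of what such an overload could look like, together with the verbosity check at the reviewed call site. This is illustrative, not the exact patch (the real overload lives in kaldi-error.h inside the kaldi namespace):

```c++
#include <cstddef>
#include <ostream>
#include <vector>

// Print a std::vector<T> space-separated, so vectors can be streamed
// into KALDI_LOG / KALDI_VLOG directly instead of calling
// WriteIntegerVector() on a log stream.
template <typename T>
std::ostream& operator<<(std::ostream& os, const std::vector<T>& v) {
  for (std::size_t i = 0; i < v.size(); ++i) {
    os << v[i] << ' ';
  }
  return os;
}

// Usage at the call site, now checking verbosity via GetVerboseLevel():
//   if (kaldi::GetVerboseLevel() >= 2) {
//     KALDI_VLOG(2) << "utt-lengths : " << frame_num_utt;
//   }
```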
Any volunteer for the review?