changed the definition of deriv-truncate-margin option. Now the margi… #1224
Conversation
I wonder if this will affect what the optimal deriv-truncate option is, e.g. from 10 to 5? Do you plan to do some experimentation before this is merged? I assume it would change the behavior of the existing setups.
Since the model contexts include the input splicing (usually [-2,-1,0,1,2]), the {min|max}-deriv-time values are now effectively offset by 2. If we want to make them the same as before, we just need to set deriv-truncate-margin=8.
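To make that offset concrete, here is a minimal sketch of the arithmetic being discussed; the names are illustrative, not the identifiers actually used in train.py.

# Illustrative arithmetic only; names are hypothetical, not those in train.py.
chunk_width = 150       # output frames per chunk
splice_context = 2      # input splicing [-2,-1,0,1,2] adds 2 frames per side

def deriv_times_old(margin):
    # Old definition: margin measured from the output-chunk boundaries.
    return -margin, chunk_width - 1 + margin

def deriv_times_new(margin):
    # New definition: margin measured outside the model context, so the
    # splicing offset is added on top of it.
    return -(margin + splice_context), chunk_width - 1 + margin + splice_context

print(deriv_times_old(10))  # (-10, 159)
print(deriv_times_new(8))   # (-10, 159) -> margin=8 matches the old margin=10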
changed the definition of deriv-truncate-margin option. Now the margin is defined on top of the model contexts, which applies to more general network archs
| "e.g., During BLSTM model training if the chunk-width=150 and deriv-truncate-margin=5, then the derivative will be " | ||
| "backpropagated up to t=-5 and t=154 in the forward and backward LSTM sequence respectively; " | ||
| "e.g., if chunk-width=150, model-left-context=2, model-right-context=10 and deriv-truncate-margin=5, " | ||
| "then the derivative will be backpropagated up to t=-5-2=7 and t=149+5+10=164 to left and right respectively; " |
These lines are too long. Also, -5-2 != 7. I think this option is unclear, though. It might be better to say:
(Relevant only for recurrent models). If specified, gives the margin (in input frames) around
the 'required' part of each chunk that the derivatives are propagated to. If unset, the
derivatives are propagated all the way to the boundaries of the input data. E.g. 10 is a
reasonable setting. Note: the 'required' part of the chunk is defined by the
model's {left,right}-context.
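As an illustration of the semantics proposed above, here is a hedged sketch, assuming the margin is counted outside the model's {left,right}-context; this is not the actual nnet3 training code.

# Hedged sketch of the proposed semantics; not the actual implementation.
def compute_deriv_time_limits(chunk_width, model_left_context,
                              model_right_context, deriv_truncate_margin=None):
    """Return (min_deriv_time, max_deriv_time) in input-frame indices."""
    if deriv_truncate_margin is None:
        # Unset: derivatives are propagated all the way to the input boundaries.
        return None, None
    # The 'required' part of the chunk is the output frames plus the model's
    # left/right context; the margin extends beyond that on each side.
    min_deriv_time = -deriv_truncate_margin - model_left_context
    max_deriv_time = (chunk_width - 1 + deriv_truncate_margin
                      + model_right_context)
    return min_deriv_time, max_deriv_time

# e.g. chunk_width=150, left=2, right=10, margin=5 -> (-7, 164)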
…max deriv time is consistent with their xent counterparts and the previous chain recipes
I changed the margin to 8. Now it is equivalent to the previous margin=10. I think we could change it in the future if we find a better margin.
OK, I guess this is pretty harmless. Merging.
@vimalmanohar you might also need to make a minor change according to the change made in train.py here.
I think it shows up as a merge conflict-- I know because I merged this
commit into my branch that has Vimal's changes. But it's straightforward to
resolve.
Hmm... the same option should probably appear in train_rnn.py and
train_raw_rnn.py, but it doesn't. Vimal, you should probably take care of
that (if you think that option should be there, which I think makes sense).
I was also a little confused by this code (from Vimal's branch)... I assume
it's for back compatibility?
parser.add_argument("--trainer.num-chunk-per-minibatch",
                    "--trainer.rnn.num-chunk-per-minibatch",
                    type=int,
                    dest='num_chunk_per_minibatch', default=512,
                    help="Number of sequences to be processed in "
                         "parallel every minibatch")
Right now train_rnn.py uses --trainer.rnn.num-bptt-steps to determine {min,max}-deriv-time, which assumes the network is "pure" recurrent; if necessary I can add the option --trainer.deriv-truncate-margin and deprecate num-bptt-steps.
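A hypothetical sketch of what such a deprecation could look like; the option wiring below is illustrative only, not the change actually proposed in this PR.

import argparse, warnings

parser = argparse.ArgumentParser()
parser.add_argument("--trainer.deriv-truncate-margin", type=int,
                    dest='deriv_truncate_margin', default=None,
                    help="Margin (in input frames) beyond the model context "
                         "up to which derivatives are back-propagated.")
parser.add_argument("--trainer.rnn.num-bptt-steps", type=int,
                    dest='num_bptt_steps', default=None,
                    help="Deprecated; use --trainer.deriv-truncate-margin.")

args = parser.parse_args()
if args.num_bptt_steps is not None:
    # Keep accepting the old option but tell the user it is going away.
    warnings.warn("--trainer.rnn.num-bptt-steps is deprecated; "
                  "use --trainer.deriv-truncate-margin instead.")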
I'm concerned that any changes you make to currently-checked-in code will
clash with Vimal's PR. Is it possible for you to make a patch against
Vimal's PR, and create a WIP PR that contains Vimal's PR plus your changes
on top of it? Then Vimal or I can pull changes from that branch.
Also, Vimal, can you please merge master into your branch? It will make my
life easier in figuring out what changes I have in my LSTM-integration
branch, versus your branch. There may be some bug fixes from me that
should be included.
BTW, one option I am considering is to merge your changes along with my
fast-lstm changes, since I'm working on top of your PR.
OK. + @vimalmanohar in case you didn't get notified from here.