changed default value of zeroing-threshold in BackpropTruncationCompo… #1240
…nent to 15; updated the results on AMI
-  BaseFloat clipping_threshold = 15.0;
-  BaseFloat zeroing_threshold = 2.0;
+  BaseFloat clipping_threshold = 30.0;
+  BaseFloat zeroing_threshold = 15.0;
Larger values of these quantities are more dangerous, i.e. more likely to lead to instability.
I don't think it's sufficient to just test this on one setup, because it's the potential for divergence that this is supposed to guard against. Have you done any other tests?
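To make the role of the two thresholds concrete, here is a minimal Python sketch of per-row clipping and zeroing of a derivative. It is an illustration under stated assumptions, not Kaldi's actual BackpropTruncationComponent code: the function name `truncate_backprop_row` and the `at_zeroing_boundary` flag are hypothetical, and the rule is simplified to a single row.

```python
import math


def truncate_backprop_row(deriv, clipping_threshold=30.0,
                          zeroing_threshold=15.0,
                          at_zeroing_boundary=False):
    """Apply a simplified clip-or-zero rule to one row of derivatives.

    Assumption: this is a per-row simplification for illustration only;
    the real component works on whole matrices and sequences.
    Returns (new_row, action), where action is one of
    "zeroed", "clipped" or "unchanged".
    """
    norm = math.sqrt(sum(x * x for x in deriv))

    # Zeroing (hypothetical rule): at designated boundary rows, drop the
    # derivative entirely when its norm exceeds the zeroing threshold.
    if at_zeroing_boundary and norm > zeroing_threshold:
        return [0.0] * len(deriv), "zeroed"

    # Clipping: rescale the row so that its norm equals the threshold.
    if norm > clipping_threshold:
        scale = clipping_threshold / norm
        return [x * scale for x in deriv], "clipped"

    return list(deriv), "unchanged"
```

In this sketch, raising either threshold lets larger derivatives pass through untouched, which is why larger defaults give less protection against divergence.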
Also, this PR would need to be on top of the 'fast_lstm' branch-- there are
other LSTM config-generation objects there that would have to be changed.
But I want you to test it in that setup. And I'd be more comfortable with
smaller thresholds, like 5 and 15 or 5 and 20, instead of 20 and 30, if
there is no clear difference in results. It's safer in situations where
divergence is a possibility. The WER improvements you had in the RESULTS
file were rather unimpressive.
Dan
…On Thu, Dec 1, 2016 at 11:42 PM, Yiming Wang ***@***.***> wrote:
Commit Summary
- changed default value of zeroing-threshold in
BackpropTruncationComponent to 15; updated the results on AMI
File Changes
- *M* egs/ami/s5b/RESULTS_ihm
<https://github.com/kaldi-asr/kaldi/pull/1240/files#diff-0> (5)
- *M* egs/ami/s5b/RESULTS_sdm
<https://github.com/kaldi-asr/kaldi/pull/1240/files#diff-1> (5)
- *M* egs/wsj/s5/steps/libs/nnet3/xconfig/lstm.py
<https://github.com/kaldi-asr/kaldi/pull/1240/files#diff-2> (16)
- *M* egs/wsj/s5/steps/nnet3/components.py
<https://github.com/kaldi-asr/kaldi/pull/1240/files#diff-3> (4)
- *M* egs/wsj/s5/steps/nnet3/lstm/make_configs.py
<https://github.com/kaldi-asr/kaldi/pull/1240/files#diff-4> (2)
- *M* src/nnet3/nnet-general-component.cc
<https://github.com/kaldi-asr/kaldi/pull/1240/files#diff-5> (4)
Patch Links:
- https://github.com/kaldi-asr/kaldi/pull/1240.patch
- https://github.com/kaldi-asr/kaldi/pull/1240.diff
This PR is already on top of fast_lstm. The old WERs reported in RESULTS were obtained without zeroing (i.e. using ClipGradientComponent, as the comment said).
The results of tuning zeroing-threshold on ihm are:
I have not tuned it on sdm1.
After the fix of max-deriv-time, the gradient explosion did not happen even on the babel georgian multicondition data (which had the most severe problem before the fix): when I disabled the zeroing, the clipped-proportion was at most ~0.004.
I also tuned the zeroing-threshold on swbd blstm_6i:
The reason I chose 15.0 as the threshold rather than 5 or 10 is mainly based on the WER on swbd. I am not sure how much variation there could be between different runs with the same settings.
OK I will think about it.
FYI, I added the zeroed-proportion stats of the 1st layer at the last iteration, which show how often zeroing was activated, for ami ihm and swbd.
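As a rough illustration of what statistics like clipped-proportion and zeroed-proportion measure, the following Python sketch counts how often each threshold would be exceeded. The function `truncation_stats`, its inputs, and the choice of denominators (all rows for clipping, boundary rows only for zeroing) are assumptions made for this example and are not taken from the Kaldi source.

```python
def truncation_stats(row_norms, boundary_flags,
                     clipping_threshold=30.0, zeroing_threshold=15.0):
    """Return hypothetical clipped/zeroed proportions for a batch of rows.

    row_norms:      derivative norm of each row.
    boundary_flags: True where a row is a (hypothetical) zeroing boundary.
    Assumption: clipping is counted over all rows, and zeroing is counted
    only over boundary rows.
    """
    if len(row_norms) != len(boundary_flags):
        raise ValueError("row_norms and boundary_flags must have equal length")

    num_rows = len(row_norms)
    num_boundaries = sum(1 for flag in boundary_flags if flag)

    # Rows whose norm exceeds the clipping threshold.
    num_clipped = sum(1 for norm in row_norms if norm > clipping_threshold)

    # Boundary rows whose norm exceeds the zeroing threshold.
    num_zeroed = sum(1 for norm, flag in zip(row_norms, boundary_flags)
                     if flag and norm > zeroing_threshold)

    return {
        "clipped-proportion": num_clipped / num_rows if num_rows else 0.0,
        "zeroed-proportion": (num_zeroed / num_boundaries
                              if num_boundaries else 0.0),
    }
```

A proportion close to zero means the corresponding threshold was rarely hit during training.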
OK, I'll merge this.