fix rnnlm related code style#1283
Conversation
|
Sure.
On Wed, Dec 21, 2016 at 5:01 PM Daniel Povey <notifications@github.com>
wrote:
@hainan-xv, can you please check over this?
On Wed, Dec 21, 2016 at 12:19 PM, Ke Li ***@***.***> wrote:
> fix rnnlm related code style in Kaldi src/lm/
> ------------------------------
> You can view, comment on, or merge this pull request online at:
>
> #1283
> Commit Summary
>
> - fix rnnlm related code style
>
> File Changes
>
> - *M* src/lm/arpa-file-parser-test.cc
> <https://github.com/kaldi-asr/kaldi/pull/1283/files#diff-0> (30)
> - *M* src/lm/arpa-file-parser.h
> <https://github.com/kaldi-asr/kaldi/pull/1283/files#diff-1> (8)
> - *M* src/lm/arpa-lm-compiler-test.cc
> <https://github.com/kaldi-asr/kaldi/pull/1283/files#diff-2> (5)
> - *M* src/lm/const-arpa-lm.cc
> <https://github.com/kaldi-asr/kaldi/pull/1283/files#diff-3> (2)
> - *M* src/lm/mikolov-rnnlm-lib.cc
> <https://github.com/kaldi-asr/kaldi/pull/1283/files#diff-4> (174)
> - *M* src/lm/mikolov-rnnlm-lib.h
> <https://github.com/kaldi-asr/kaldi/pull/1283/files#diff-5> (2)
>
> Patch Links:
>
> - https://github.com/kaldi-asr/kaldi/pull/1283.patch
> - https://github.com/kaldi-asr/kaldi/pull/1283.diff
>
> —
> You are receiving this because you are subscribed to this thread.
> Reply to this email directly, view it on GitHub
> <#1283>, or mute the thread
> <
https://github.com/notifications/unsubscribe-auth/ADJVu2_g6aKyv7fUd1vT9OtQ8YknYPwsks5rKYm9gaJpZM4LTVgI
>
> .
>
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#1283 (comment)>, or mute
the thread
<https://github.com/notifications/unsubscribe-auth/AFMCDlLXvHemglXO1FY9t_udCtUtiBazks5rKaG-gaJpZM4LTVgI>
.
--
Sent from Gmail Mobile
|
| // if (mpos.first != ngrams_.end()) | ||
| // KALDI_ERR << "Maismatch at index " << mpos.first - ngrams_.begin(); | ||
| //TODO:auto above requres C++11, and I cannot spell out the type!!! | ||
| // TODO: auto above requres C++11, and I cannot spell out the type!!! |
There was a problem hiding this comment.
I don't think there is need to change this file
| -0.2 <s> a </s>\n\ | ||
| -0.3 <s> a \xCE\xB2\n\ | ||
| -0.2 <s> a </s>\n\ | ||
| \\end\\"; |
There was a problem hiding this comment.
definitely don't make changes like this - you're changing the behavior of the code by doing this
| kRaiseError, ///< Abort on OOV words | ||
| kAddToSymbols, ///< Add novel words to the symbol table. | ||
| kReplaceWithUnk, ///< Replace OOV words with <unk>. | ||
| kReplaceWithUnk, ///< Replace OOV words with <unk>. |
There was a problem hiding this comment.
Not very sure about this, but I think there should be a space after the comment // thing?
//this is bad
// this is good
There was a problem hiding this comment.
I just followed the rule "two spaces between code and comments". I'll modify it as what you suggested.
| #include <string.h> | ||
| #include <math.h> | ||
| #include "lm/mikolov-rnnlm-lib.h" | ||
| #include "util/table-types.h" |
There was a problem hiding this comment.
you should probably move this also
|
|
||
| vocab_hash_size = 100000000; | ||
| vocab_hash = (int *)calloc(vocab_hash_size, sizeof(int)); | ||
| vocab_hash = reinterpret_cast<int *>(calloc(vocab_hash_size, sizeof(int))); |
There was a problem hiding this comment.
@danpovey I am not sure about this... it seems we don't really use calloc() much and use new instead. Not sure what is the right thing to do here...
There was a problem hiding this comment.
@danpovey I think everything is fine except here, which I am not sure of...
| real CRnnLM::random(real min, real max) { | ||
| return rand()/(real)RAND_MAX*(max-min)+min; | ||
| return rand() / (real)RAND_MAX * (max-min) + min; | ||
| } |
src/lm/mikolov-rnnlm-lib.cc
Outdated
| for (a = 1; a < bptt + bptt_block; a++) { | ||
| bptt_history[a] = 0; | ||
| } | ||
| for (a = bptt + bptt_block-1; a > 1; a--) { |
src/lm/mikolov-rnnlm-lib.cc
Outdated
| b += vocab[i].cn; | ||
| } | ||
| for (i = 0; i < vocab_size; i++) { | ||
| df+= vocab[i].cn / static_cast<double>(b); |
hainan-xv
left a comment
There was a problem hiding this comment.
It'd be good to try to test the code on real data to see if it breaks anything.
|
@hainan-xv do you consider this checked and ready to merge? |
|
it's OK, she just changed the cast style...
…On Wed, Dec 28, 2016 at 11:54 AM, hainan-xv ***@***.***> wrote:
***@***.**** commented on this pull request.
------------------------------
In src/lm/mikolov-rnnlm-lib.cc
<#1283>:
> @@ -147,7 +147,7 @@ CRnnLM::CRnnLM() {
srand(rand_seed);
vocab_hash_size = 100000000;
- vocab_hash = (int *)calloc(vocab_hash_size, sizeof(int));
+ vocab_hash = reinterpret_cast<int *>(calloc(vocab_hash_size, sizeof(int)));
@danpovey <https://github.com/danpovey> I think everything is fine except
here, which I am not sure of...
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#1283>, or mute the thread
<https://github.com/notifications/unsubscribe-auth/ADJVu6PKgrSA-_O3SiXtE2Fi97NYnizmks5rMr55gaJpZM4LTVgI>
.
|
|
I did a rescoring test using the trained rnnlm (Mikolov's) on eval92 and dev93 (nnet2 online) and the results after rescoring are better than before. So I think the style modifications should not be wrong. |
|
OK this PR should be ready to merge then. |
src/lm/arpa-file-parser-test.cc
Outdated
| ngram 1=4\n\ | ||
| ngram 2=2\n\ | ||
| ngram 3=2\n\ | ||
| ngram 1 = 4\n\ |
There was a problem hiding this comment.
you should revert these changes.... the rules don't apply inside strings (would be surprised if cpplint.py noticed this).... and the test code should test the normal arpa format which the old string did.
There was a problem hiding this comment.
Sure. will do that soon.
fix rnnlm related code style in Kaldi src/lm/