WIP : original LM from TedliumRelease2 #1164
Conversation
|
@david-ryan-snyder, if you were going to test out the ivector change thing, perhaps you could do it on this setup and kill 2 birds with one stone, making sure this PR runs smoothly? |
|
@danpovey, sure, no problem. |
|
@david-ryan-snyder, don't forget to test this! |
|
On it now. Will update ASAP. |
| @@ -59,7 +59,8 @@ if [ $stage -le 0 ]; then |
| rm ${dir}/data/text/* 2>/dev/null || true |
| |
| # cantab-TEDLIUM is the larger data source. gzip it. |
The comment on line 61 is out of date, isn't it?
| # cantab-TEDLIUM is the larger data source. gzip it. |
| sed 's/ <\/s>//g' < db/cantab-TEDLIUM/cantab-TEDLIUM.txt | gzip -c > ${dir}/data/text/train.txt.gz |
| gunzip db/TEDLIUM_release2/LM/*.en.gz |
| cat db/TEDLIUM_release2/LM/*.en | sed 's/ <\/s>//g' | local/join_suffix.py | gzip -c > ${dir}/data/text/train.txt.gz |
Do you need to unzip these files on the disk? If not, I think you should replace it with this:
gunzip -c db/TEDLIUM_release2/LM/*.en.gz | sed 's/ <\/s>//g' | local/join_suffix.py | gzip -c > ${dir}/data/text/train.txt.gz
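For anyone following along: plain gunzip decompresses the files in place and deletes the .gz originals, while gunzip -c streams the decompressed text to stdout and leaves the archives untouched, which is why it can be folded into the single pipeline above. A minimal sketch of the difference, using a hypothetical foo.txt.gz:

gunzip foo.txt.gz             # replaces foo.txt.gz with foo.txt on disk
gunzip -c foo.txt.gz | wc -l  # streams the contents; foo.txt.gz is left as it was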
| # get_data_prob.py: log-prob of data/local/local_lm/data/real_dev_set.txt given model data/local/local_lm/data/wordlist_4.pocolm was -5.13902242865 per word [perplexity = 170.514153159] over 18290.0 words. |
| # even older results, before adding min-counts: |
| # get_data_prob.py: log-prob of data/local/local_lm/data/real_dev_set.txt given model data/local/local_lm/data/lm_4 was -5.10576291033 per word [perplexity = 164.969879761] over 18290.0 words. |
| #[perplexity = 157.87] over 18290.0 words |
The deleted lines appear to have provided a more detailed comment, about the logprob and the model being used. Is it possible to do the same with your updates?
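As an aside, the two numbers in each of those comments carry essentially the same information: the perplexity is (up to rounding and how the word count is taken) the exponential of the negative per-word log-prob. A quick sanity check on the first deleted line:

awk 'BEGIN { lp = -5.13902242865; printf("perplexity ~= %.1f\n", exp(-lp)) }'   # prints ~170.5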
| # This will only work if you have GPUs on your system (and note that it requires |
| # you to have the queue set up the right way... see kaldi-asr.org/doc/queue.html) |
| local/chain/run_tdnn.sh |
| local/chain/run_tdnn.sh --train-set train --gmm tri3 --nnet3-affix "" |
In your RESULTS file, you say that
This is about 0.6% worse than the corresponding results with cleanup.
If that's the case, shouldn't the version with cleanup be the default here (like it was before)?
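For context, the two variants being discussed differ only in which training set, GMM directory and affix the chain script points at. A hedged sketch of the two invocations (the expanded defaults for the cleaned variant are an assumption about the script's default option values, not something shown in this diff):

# cleaned data (previously the default invocation; assumed to expand to the *_cleaned directories)
local/chain/run_tdnn.sh    # roughly: --train-set train_cleaned --gmm tri3_cleaned --nnet3-affix _cleaned
# un-cleaned data (the invocation this change makes the default)
local/chain/run_tdnn.sh --train-set train --gmm tri3 --nnet3-affix ""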
|
I did something dumb with the PCA vs LDA test I have to rerun something. In the meantime, I added some comments to the pull request that I hope will be helpful in getting it through. |
|
I will fix the first 2 comments. |
|
@david-ryan-snyder, is this ready to merge? |
|
@danpovey, not quite. @vince62s said he didn't run the cleaned version with his changes. I ran both on the CLSP grid. Hopefully @vince62s can decide what he needs from the following results to complete his RESULTS file. %WER 27.8 | 507 17783 | 75.7 17.5 6.8 3.4 27.8 96.6 | 0.071 | exp/tri1/decode_nosp_dev/score_10_0.0/ctm.filt.filt.sys |
|
@danpovey , the previous results used the normal LDA features for the ivector. Here are results using PCA. %WER 9.5 (vs 9.8) | 507 17783 | 91.8 5.8 2.4 1.3 9.5 76.9 | 0.075 | exp/chain_cleaned/tdnn_sp_bi/decode_dev/score_10_0.0/ctm.filt.filt.sys |
|
not easy to compare as they look different... are the results about the
same? how do they differ?
…On Fri, Nov 25, 2016 at 6:40 PM, david-ryan-snyder wrote:
@danpovey <https://github.com/danpovey>, the previous results used the normal LDA features for the ivector. Here are results using PCA.
%WER 9.5 | 507 17783 | 91.8 5.8 2.4 1.3 9.5 76.9 | 0.075 | exp/chain_cleaned/tdnn_sp_bi/decode_dev/score_10_0.0/ctm.filt.filt.sys
%WER 8.8 | 507 17783 | 92.4 5.3 2.3 1.3 8.8 76.3 | 0.092 | exp/chain_cleaned/tdnn_sp_bi/decode_dev_rescore/score_10_0.0/ctm.filt.filt.sys
%WER 9.9 | 1155 27500 | 91.4 6.0 2.6 1.3 9.9 74.2 | 0.129 | exp/chain_cleaned/tdnn_sp_bi/decode_test/score_10_0.0/ctm.filt.filt.sys
%WER 9.3 | 1155 27500 | 91.9 5.6 2.5 1.3 9.3 72.3 | 0.100 | exp/chain_cleaned/tdnn_sp_bi/decode_test_rescore/score_10_0.0/ctm.filt.filt.sys
%WER 10.0 | 507 17783 | 91.3 5.9 2.8 1.3 10.0 78.1 | 0.101 | exp/chain/tdnn_sp_bi/decode_dev/score_9_0.0/ctm.filt.filt.sys
%WER 9.2 | 507 17783 | 92.1 5.3 2.6 1.3 9.2 75.5 | 0.061 | exp/chain/tdnn_sp_bi/decode_dev_rescore/score_9_0.0/ctm.filt.filt.sys
%WER 10.2 | 1155 27500 | 91.0 6.0 3.0 1.2 10.2 75.5 | 0.063 | exp/chain/tdnn_sp_bi/decode_test/score_9_0.0/ctm.filt.filt.sys
%WER 9.8 | 1155 27500 | 91.2 5.4 3.4 1.1 9.8 73.9 | 0.095 | exp/chain/tdnn_sp_bi/decode_test_rescore/score_10_0.0/ctm.filt.filt.sys
|
|
I added a (vs XX.X) in each line so that you can see the corresponding results with LDA. Bottom line: PCA is better in 4 of the 8 results, LDA is better in 3, and they're the same in 1. After averaging all 8 results, PCA is 9.59% and LDA is 9.63%. |
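A quick way to reproduce that kind of average from WER lines like the ones quoted above (pca_results.txt is a hypothetical file holding just the eight PCA lines):

grep '^%WER' pca_results.txt | awk '{ sum += $2; n++ } END { printf("mean WER = %.2f%%\n", sum / n) }'

On the eight PCA numbers above this prints 9.59%, matching the figure quoted.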
|
Is this the first time you tested it after your fix?
Maybe we should test on 1 other setup before committing.
|
|
No, the PCA vs LDA stuff.
…On Fri, Nov 25, 2016 at 7:02 PM, david-ryan-snyder wrote:
@danpovey <https://github.com/danpovey> are you referring to the @vince62s <https://github.com/vince62s> stuff or the PCA vs LDA stuff?
|
|
There was no fix. The PCA vs LDA results in #1123 use the same setup as here. |
|
@danpovey Since we want to try the PCA vs LDA thing on more recipes, is it OK if we allow @vince62s to finish up the PR without that change? Since we already know the results, it won't need to be rerun later, when we decide to add in PCA for ivectors. @vince62s I think the main thing is that you need to update your RESULTS file with the cleaned results I posted in an earlier comment. Also, since the cleaned results are better, I imagine you will want to make them the default in the run.sh file (like it was before). |
|
Yes that's fine.
|
|
Well, what should we do? Just remove all the old LM results, or leave them for reference? Also, in @david-ryan-snyder's results there are no "standard" nnet3 results; do I just omit/remove them from the file? |
|
I think that's an @danpovey question. I can run the other nnet3 results if we need them. |
|
The RESULTS file can have both the new and old results if you want, just
make sure the new ones are displayed more prominently, e.g. first have the
new results, and then say, "below this point are the results before we
changed the LM [describe what you changed]; these old results are more
complete and include neural net results."
|
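A minimal sketch of the layout being suggested here (the wording and divider are illustrative, not taken from the actual RESULTS file):

# Results with the new LM built from the TEDLIUM release 2 text (this PR).
# <new GMM and chain results go here, most prominent>
#
# --------------------------------------------------------------------
# Results below this point are from before the LM change described above;
# they are more complete and include the nnet3 results.
# <old results, kept for reference>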
egs/tedlium/s5_r2/RESULTS
Outdated
| # local/chain/run_tdnn.sh --train-set train --gmm tri3 --nnet3-affix "" |
| # for d in exp/chain/tdnn_sp_bi/decode_*; do grep Sum $d/*/*ys | utils/best_wer.sh; done |
| # This is about 0.6% worse than the corresponding results with cleanup. |
| AFTER MAX-CHANGE PER COMPONENT |
Please remove this "AFTER MAX-CHANGE PER COMPONENT" line... that's history now.
Right, and the 0.6% worse is no longer 0.6%; I'll fix it.
| --chain.l2-regularize 0.00005 \ |
| --chain.apply-deriv-weights false \ |
| --chain.lm-opts="--num-extra-lm-states=2000" \ |
| --chain.lm-opts="--ngram-order=5 --num-extra-lm-states=2000" \ |
Did you really find that --ngram-order=5 was better? By how much?
In general I don't like too much tuning that's specific to specific egs directories. People copy them to other setups, and I prefer to have settings that will work everywhere.
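For readers wondering what this option actually tunes: the string passed via --chain.lm-opts is forwarded to the phone-language-model estimation step inside the chain training script, i.e. it controls the phone LM used to build the denominator graph, not the word-level LM. Roughly speaking (the alignment-directory paths below are illustrative, not taken from this PR), it ends up on a call like:

chain-est-phone-lm --ngram-order=5 --num-extra-lm-states=2000 \
  "ark:gunzip -c exp/chain/tree_sp/ali.*.gz | ali-to-phones exp/chain/tree_sp/final.mdl ark:- ark:- |" \
  exp/chain/tdnn_sp_bi/phone_lm.fst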
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, I don't recall exactly. It was very slightly better, but I'm not sure by how much.
Also, I am wondering if this is a change I may have reset after my first commit.
@david-ryan-snyder: in your run, was it --ngram-order=5, or just not specified, i.e. the default of 4?
Here is where I mentioned what it was: https://groups.google.com/forum/#!topic/kaldi-help/N4NeQ0g4B7Y
baseline: phone LM order 4, --no-prune-ngram-order 3, extra 2000: 10.9 - 10.4 - 10.6 - 10.2
phone LM order 5, --no-prune-ngram-order 3, extra 2000: 10.5 - 10.2 - 10.5 - 10.1
I ran it with what you had in the PR, --ngram-order=5.
|
I tested that option on another setup and did not find it helpful (may have
been a little worse), so decided not to go for that. Can you please change
it back to 4 and we'll rerun that part?
|
|
I can rerun it on the CLSP grid without the --ngram-order=5. |
|
yes go ahead, we'll see how it goes. I just pushed the change. |
|
@danpovey, below are the results with ngram-order=4 on the CLSP grid. The average WER (across both the cleaned and regular versions) is 9.43% with ngram-order=4 and 9.63% with ngram-order=5. @vince62s, should we update the RESULTS file to reflect this? %WER 9.8 | 507 17783 | 91.6 6.0 2.4 1.5 9.8 80.1 | -0.038 | exp/chain/tdnn_sp_bi/decode_dev/score_8_0.0/ctm.filt.filt.sys |
|
I will update it accordingly, but it is annoying how consistently you seem to get better results with n-gram order 4, while on my single server n-gram order 5 was better. |
|
ok I think now it should be ok to merge. |
| --chain.l2-regularize 0.00005 \ |
| --chain.apply-deriv-weights false \ |
| --chain.lm-opts="--num-extra-lm-states=2000" \ |
| --chain.lm-opts="--ngram-order=4 --num-extra-lm-states=2000" \ |
Can you please remove the --ngram-order=4, since it's the default? Then it should be ready to merge.
|
Thanks! Merging. |
This is a modification of the s5_r2 recipe to take into account the LM from the TEDLIUM 2 paper.
It gives better results, and much better results than those reported in the paper.
Please review.
Vincent