AMI TDNN Results Update by david-ryan-snyder · Pull Request #1545 · kaldi-asr/kaldi

david-ryan-snyder · 2017-04-14T16:06:55Z

This is a continuation of Issue #1123 (which can't be closed yet). The main point of this is to update more of the results as we continue to switch to using PCA (instead of LDA) for the ivector transform.

Update to the TDNN results in RESULTS_{ihm,sdm,mdm}. [All of the results improve slightly.]
Change to egs/ami/s5b/local/chain/tuning/run_tdnn_1{a,b,c,d}.sh so that the file $dir/egs/.nodelete is not created by default. User should now run with --cleanup false to prevent the egs from being deleted after training.
Bug fix in egs/ami/s5b/run.sh. Previously, if BeamformIt was not installed, the user was instructed to go to ../../../tools/ and run make beamformit. However, this rule doesn't exist in the Makefile. I replaced the instructions with ../../../tools/; extras/install_beamformit.sh; cd -. [Alternatively, we could add just add a rule to the Makefile to do that, but I didn't see many rules involving extras/ in the Makefile.]

…ectors used in ASR. Results are reported in the default TDNN recipe in AMI. Updating steps/online/nnet2/{train_diag_ubm.sh,train_ivector_extractor.sh} so that they now backup the contents of their destination directory if it already exists.

…tor-pca2

… (tdnn1d). Fixing minor bug in egs/ami/s5b/local/chain/tuning/run_tdnn_*.sh scripts.

danpovey · 2017-04-14T18:52:32Z

egs/ami/s5b/local/chain/tuning/run_tdnn_1b.sh

  fi

- touch $dir/egs/.nodelete # keep egs around when that run dies.
+  mkdir -p $dir/egs && touch $dir/egs/.nodelete # keep egs around when that run dies.


Actually, would you mind just deleting these lines? The 'proper' way now to stop the egs being removed is to run with --cleanup false. Anyway it isn't necessarily a good default to keep the egs around.

danpovey · 2017-04-14T19:12:38Z

@david-ryan-snyder (and this isn't specifically about this commit but about your changes to how the iVector extraction is done in the example scripts),
What kind of WER differences, if any, are you seeing by changing LDA->PCA for the i-vector space?
What I am wondering is whether they are substantial enough that we should ask people like @GaofengCheng and @tomkocse who may be adding new results in the near future, to run the script from the start so that iVectors get re-computed.

david-ryan-snyder · 2017-04-14T19:38:01Z

@danpovey,

I don't think they need to. The difference is usually very small (or no difference) and the recipes they are working on are probably still using LDA by default anyway.

At the moment, the only recipe that uses PCA instead of LDA features by default is egs/ami/s5b/local/chain/run_tdnn.sh . All of recipes in AMI, and all other examples continue to use the LDA transform as before (at least until we update them, which we are in the process of).

An apples-to-apples comparison in AMI ihm gives a WER of 22.0/22.2 for PCA and 21.9/22.3 for LDA (dev/eval). So they're basically the same. Looking at some of my older results for this (which did not get committed yet), it was slightly worse for Tedlium (a difference of about 0.08% absolute), and slightly better on WSJ.

…eping egs

david-ryan-snyder · 2017-04-14T19:46:53Z

@danpovey, also I removed the "$dir/egs/.nodelete" line.

david-ryan-snyder · 2017-04-14T19:47:02Z

Thanks!

danpovey · 2017-04-14T19:47:19Z

OK, merged.
My feeling is it's probably best to be consistent and do this everywhere-- the results changes are probably just noise.

david-ryan-snyder · 2017-04-14T20:06:55Z

@danpovey,

I can make this the default wherever possible. But, I think it will still require targeted updates for each recipe, since different recipes use different ivector extractor training scripts. So, there's an aim for consistency, but I think in the short term there will have to settle for some discrepancies...

The problem with making all of the changes simultaneously in one PR is that by the time I test all the recipes, there will be updates to at least a few of them that make the new changes out of date...

danpovey · 2017-04-14T20:07:50Z

Not urgent at all, I just mean eventually.

…

On Fri, Apr 14, 2017 at 1:06 PM, david-ryan-snyder ***@***.*** > wrote: @danpovey <https://github.com/danpovey>, I can make that this the default wherever possible. But, I think it will still require targeted updates for each recipe, since different recipes use different ivector extractor training scripts. So, there's an aim for consistency, but I think in the short term there will have to settle for some discrepancies... The problem with making all of the changes simultaneously in one PR is that by the time I test all the recipes, there will be updates to at least a few of them that make the new changes out of date... — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub <#1545 (comment)>, or mute the thread <https://github.com/notifications/unsubscribe-auth/ADJVu4RbkihgbFNdWiNXfbZyQVhSKHy-ks5rv9HhgaJpZM4M93uq> .

david-ryan-snyder · 2017-04-14T20:09:18Z

Oh, got it. Yeah, that's the plan.

@osadjadi

* 'master' of https://github.com/kaldi-asr/kaldi: (21 commits) [egs] bug-fix in egs/ami/s5/run_ihm.sh (kaldi-asr#1577) [src] Minor bug-fixes in compute-wer-bootci and WSJ run.sh. Thanks: @osadjadi [egs] Add soft link for mini-librispeech setup [egs] adding results and cleanup in mini-librispeech [egs] Add mini-librispeech example scripts [intended as a sanity-checker/tutorial setup] (kaldi-asr#1566) [src] Fix to testing code signal-test.cc, change threshold to resolve failure (kaldi-asr#1565) [src] Add documentation for dropout function. [src,scripts,egs] Add dropout for nnet3 LSTMs, with recipes. (kaldi-asr#1537) [src] nnet3 online silence weighting - adding frame subsampling factor (kaldi-asr#1559) [doc] Small edit to hmm.dox, clarifying something [egs] Added check for kaldi_lm being installed in fisher_swbd recipe. (kaldi-asr#1558) Update travis.yml so PRs to kaldi_52 are built [srcipts] steps/nnet3/report/generate_plots.py: plot 5,50,95th percentile of value and derivative instead of mean+-stddev (kaldi-asr#1472) [egs] AMI TDNN Results Update (kaldi-asr#1545) [src] add template instantiations for ConvertStringToReal, address issue kaldi-asr#1544 [egs,scripts,src] SID and LID tools and scripts: cosmetic improvements, better error-handling, and various minor fixes; results unchanged. (kaldi-asr#1543) [src] Change ConvertStringToReal to be locale-independent (i.e. always-US). Fixes android issue. (kaldi-asr#1513) [scripts] nnet3 : fix issue where LDA estimation failed for LSTMs with label delay (kaldi-asr#1540) [scripts] fix to get_egs_targets.sh (thanks: David Pye) [src] Fix copy-feats for using the --write-num-frames and --compress true flags at the same time (kaldi-asr#1541) ...

* [scripts,egs] Adding options for using PCA instead of LDA+MLLT for ivectors used in ASR. Results are reported in the default TDNN recipe in AMI. Updating steps/online/nnet2/{train_diag_ubm.sh,train_ivector_extractor.sh} so that they now backup the contents of their destination directory if it already exists. * [egs,scripts] Updating AMI TDNN results to reflect the current recipe (tdnn1d). Fixing minor bug in egs/ami/s5b/local/chain/tuning/run_tdnn_*.sh scripts. * [egs] Updating chain scripts in AMI so that they do not default to keeping egs

David Snyder and others added 5 commits April 1, 2017 21:03

Merge branch 'master' of https://github.com/kaldi-asr/kaldi into ivec…

3122357

…tor-pca2

Merge branch 'master' of https://github.com/kaldi-asr/kaldi into ivec…

fc5e895

…tor-pca2

Merge branch 'master' of https://github.com/kaldi-asr/kaldi into ivec…

7d9fca2

…tor-pca2

[egs,scripts] Updating AMI TDNN results to reflect the current recipe…

0e56d0a

… (tdnn1d). Fixing minor bug in egs/ami/s5b/local/chain/tuning/run_tdnn_*.sh scripts.

danpovey reviewed Apr 14, 2017

View reviewed changes

[egs] Updating chain scripts in AMI so that they do not default to ke…

10ec717

…eping egs

danpovey merged commit 8891750 into kaldi-asr:master Apr 14, 2017

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

AMI TDNN Results Update#1545

AMI TDNN Results Update#1545
danpovey merged 6 commits intokaldi-asr:masterfrom
david-ryan-snyder:ivector-pca2

david-ryan-snyder commented Apr 14, 2017 •

edited

Loading

Uh oh!

danpovey Apr 14, 2017

Uh oh!

danpovey commented Apr 14, 2017

Uh oh!

david-ryan-snyder commented Apr 14, 2017 •

edited

Loading

Uh oh!

david-ryan-snyder commented Apr 14, 2017

Uh oh!

david-ryan-snyder commented Apr 14, 2017

Uh oh!

danpovey commented Apr 14, 2017

Uh oh!

david-ryan-snyder commented Apr 14, 2017 •

edited

Loading

Uh oh!

danpovey commented Apr 14, 2017 via email

Uh oh!

david-ryan-snyder commented Apr 14, 2017

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

david-ryan-snyder commented Apr 14, 2017 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

danpovey Apr 14, 2017

Choose a reason for hiding this comment

Uh oh!

danpovey commented Apr 14, 2017

Uh oh!

david-ryan-snyder commented Apr 14, 2017 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

david-ryan-snyder commented Apr 14, 2017

Uh oh!

david-ryan-snyder commented Apr 14, 2017

Uh oh!

danpovey commented Apr 14, 2017

Uh oh!

david-ryan-snyder commented Apr 14, 2017 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

danpovey commented Apr 14, 2017 via email

Uh oh!

david-ryan-snyder commented Apr 14, 2017

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

david-ryan-snyder commented Apr 14, 2017 •

edited

Loading

david-ryan-snyder commented Apr 14, 2017 •

edited

Loading

david-ryan-snyder commented Apr 14, 2017 •

edited

Loading