linking run_tdnn.sh (see https://groups.google.com/d/msg/kaldi-help/UAKh81Oapyw/etHsBG13BAAJ) #3056
|
OK, but you need to add the script that it points to as well. I'll wait till you have WERs for that, though. |
|
sure, started with a clean setup - will update the numbers once all done
(running on a small-ish machine with a single GPU only - so will take a
while)
about linking, run_tdnn.sh -> tuning/run_tdnn_1a.sh
I just added run_tdnn.sh which points to this script:
tuning/run_tdnn_1a.sh and it's already there - so nothing is missing as
far as I can see
~/kaldi/egs/tedlium/s5_r3/local/chain$ ls -la
total 16
drwxr-xr-x 3 morrie morrie 4096 Feb 26 11:08 .
drwxr-xr-x 5 morrie morrie 4096 Feb 26 11:07 ..
-rwxr-xr-x 1 morrie morrie 3334 Feb 26 11:07 compare_wer_general.sh
lrwxrwxrwx 1 morrie morrie 21 Feb 26 11:07 run_tdnnf.sh -> tuning/run_tdnn_1b.sh
lrwxrwxrwx 1 morrie morrie 21 Feb 26 11:08 run_tdnn.sh -> tuning/run_tdnn_1a.sh
drwxr-xr-x 2 morrie morrie 4096 Feb 26 11:08 tuning
…On Tue, 26 Feb 2019 at 19:15, Daniel Povey ***@***.***> wrote:
OK, but you need to add the script that it points to as well. I'll wait
till you have WERs for that, though.
|
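The request above is that a link such as run_tdnn.sh must come with the script it points to. A minimal sketch of how one might check that, assuming a Linux filesystem: `broken_symlinks` is a hypothetical helper (not part of Kaldi), and the demo builds a throwaway directory in which the run_tdnn_1b.sh target is deliberately left out to show what a dangling link looks like.

```python
import os
import tempfile


def broken_symlinks(directory):
    """Return names of symlinks in `directory` whose targets do not exist."""
    broken = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        # os.path.exists() follows symlinks, so it is False for a dangling link.
        if os.path.islink(path) and not os.path.exists(path):
            broken.append(name)
    return broken


# Demo in a temporary directory that loosely mimics local/chain.
with tempfile.TemporaryDirectory() as chain_dir:
    os.mkdir(os.path.join(chain_dir, "tuning"))
    open(os.path.join(chain_dir, "tuning", "run_tdnn_1a.sh"), "w").close()
    os.symlink("tuning/run_tdnn_1a.sh", os.path.join(chain_dir, "run_tdnn.sh"))
    # Assumption for the demo only: this target is intentionally not created.
    os.symlink("tuning/run_tdnn_1b.sh", os.path.join(chain_dir, "run_tdnnf.sh"))
    dangling = broken_symlinks(chain_dir)
    print(dangling)
```

Running the same function on a real checkout's local/chain directory would list any link whose target was not committed.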
|
No, that script run_tdnn_1a.sh is not the right one, it is an older version
of the broken one in run_tdnnf.sh.
You should take a script from s5_r2.
Run with the number of jobs in the script, and --use-gpu wait, so it will
use only one GPU but give the same results.
…On Tue, Feb 26, 2019 at 2:23 PM jyhnnhyj ***@***.***> wrote:
sure, started with a clean setup - will update the numbers once all done
(running on a small-ish machine with a single GPU only - so will take a
while)
about linking, run_tdnn.sh -> tuning/run_tdnn_1a.sh
I just added run_tdnn.sh which points to this script:
tuning/run_tdnn_1a.sh and it's already there - so nothing is missing as
far as I can see
~/kaldi/egs/tedlium/s5_r3/local/chain$ ls -la
total 16
drwxr-xr-x 3 morrie morrie 4096 Feb 26 11:08 .
drwxr-xr-x 5 morrie morrie 4096 Feb 26 11:07 ..
-rwxr-xr-x 1 morrie morrie 3334 Feb 26 11:07 compare_wer_general.sh
lrwxrwxrwx 1 morrie morrie 21 Feb 26 11:07 run_tdnnf.sh -> tuning/run_tdnn_1b.sh
lrwxrwxrwx 1 morrie morrie 21 Feb 26 11:08 run_tdnn.sh -> tuning/run_tdnn_1a.sh
drwxr-xr-x 2 morrie morrie 4096 Feb 26 11:08 tuning
|
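The advice above is to leave the job counts as the script has them and add --use-gpu wait, so that jobs queue for the single GPU. A rough sketch of how that might look when assembling a call to steps/nnet3/chain/train.py: `build_train_command` is a hypothetical helper, the job counts 3 and 12 are placeholder values (use whatever the recipe script specifies), and a real invocation needs many more options than are shown here.

```python
def build_train_command(num_jobs_initial, num_jobs_final, exp_dir):
    """Assemble a partial argument list for steps/nnet3/chain/train.py."""
    return [
        "steps/nnet3/chain/train.py",
        # 'wait' lets each job wait for the GPU to become free,
        # so parallel jobs can share one GPU by taking turns.
        "--use-gpu=wait",
        # Keep the job counts from the recipe; they are not reduced to 1.
        "--trainer.optimization.num-jobs-initial={}".format(num_jobs_initial),
        "--trainer.optimization.num-jobs-final={}".format(num_jobs_final),
        "--dir={}".format(exp_dir),
    ]


# Placeholder job counts, for illustration only.
cmd = build_train_command(3, 12, "exp/chain/tdnn1g_sp")
print(" ".join(cmd))
```

The design point is that only the GPU option changes; the number of jobs stays as in the recipe, which is why the results should match a multi-GPU run.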
|
okay I see - sure, will use that file
will update the results here once done
…On Tue, 26 Feb 2019 at 20:36, Daniel Povey ***@***.***> wrote:
No, that script run_tdnn_1a.sh is not the right one, it is an older version
of the broken one in run_tdnnf.sh.
You should take a script from s5_r2.
Run with the number of jobs in the script, and --use-gpu wait, so it will
use only one GPU but give the same results.
|
|
a couple of questions: If I skip that step and change which seems a mismatch in params, I can try to fix these, but just wanted to double check if I should be using a different set of scripts... |
|
There are two problems here. Secondly, that run_tdnn.sh script, if just copied from s5_r2, may not be fully compatible with the setup. You will have to remove the --min-seg-len option; then do a diff with the existing 'run_tdnn.sh' script, try to figure out which differences have to do with things like a change in the directory setup of tdnn s5_r3 vs. s5_r2, or other local changes, and apply those as needed. |
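The diff step described above could be sketched with Python's standard difflib. The two script bodies below are made-up stand-ins (setup.sh, VALUE and the file labels are placeholders, not the real recipe contents); in practice one would read the s5_r2 script and the existing s5_r3 one from disk.

```python
import difflib


def script_diff(old_text, new_text, old_name, new_name):
    """Return unified-diff lines between two script bodies."""
    return list(difflib.unified_diff(
        old_text.splitlines(),
        new_text.splitlines(),
        fromfile=old_name,
        tofile=new_name,
        lineterm="",
    ))


def removed_lines_mentioning(diff_lines, option):
    """Lines present only in the old script that mention `option`."""
    return [
        line[1:]
        for line in diff_lines
        # Skip the '---' file header; keep real removals.
        if line.startswith("-") and not line.startswith("---")
        and option in line
    ]


# Hypothetical stand-in contents, for illustration only.
old_script = "setup.sh --min-seg-len VALUE --stage 0\ntrain.sh\n"
new_script = "setup.sh --stage 0\ntrain.sh\n"

diff_lines = script_diff(old_script, new_script, "s5_r2/run_tdnn.sh", "s5_r3/run_tdnn.sh")
print("\n".join(diff_lines))
hits = removed_lines_mentioning(diff_lines, "--min-seg-len")
```

Filtering the diff for one option at a time makes it easier to separate deliberate recipe changes from differences that only reflect the directory layout.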
|
re Vimal's fix, I merged with it and can confirm it solves the problem. |
|
Any update?
…On Fri, Mar 1, 2019 at 6:45 AM jyhnnhyj ***@***.***> wrote:
re Vimal's fix, I merged with it and can confirm it solves the problem.
(re Python3, that's right, but during Kaldi setup, it creates a link for
Python2.7 and I was expecting it to pick that one) - but anyway, this is now
solved. I'll continue with the rest and update the progress/issues here.
|
|
sorry for not updating earlier, was distracted by a couple of other deadlines, plan to resume this on Monday - already ran run_cleanup_segmentation.sh last week, it worked as expected, and I'll start training on Monday |
|
a quick update that I just started running the training script |
|
so the ivector part ran successfully - but after that step, when it was validating the files, it fails |
|
Just remove the _comb part of the filename, it is something that used to
exist in older recipes, that we removed.
…On Wed, Mar 13, 2019 at 6:12 AM jyhnnhyj ***@***.***> wrote:
so the ivector part ran successfully - but after that step, when it was
validating the files, it fails
local/chain/run_tdnn.sh: expected file data/train_cleaned_sp_hires_comb/feats.scp to exist
I tried to understand what this _comb thing is about - but couldn't trace
where it should have been created - any suggestions?
|
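The fix suggested above amounts to dropping the _comb suffix from the data directory name the script checks. A small sketch of that renaming, where `drop_comb_suffix` and `expected_feats` are hypothetical helpers and the directory name is the one from the error message:

```python
import os


def drop_comb_suffix(data_dir):
    """Return data_dir without a trailing '_comb', if it has one."""
    suffix = "_comb"
    if data_dir.endswith(suffix):
        return data_dir[: -len(suffix)]
    return data_dir


def expected_feats(data_dir):
    """Path of the feats.scp file that would be looked for in data_dir."""
    return os.path.join(data_dir, "feats.scp")


old_dir = "data/train_cleaned_sp_hires_comb"
new_dir = drop_comb_suffix(old_dir)
print(expected_feats(new_dir))
```

In the recipe itself the equivalent change is an edit to the variable or path that names the training data directory.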
|
so after those comb changes, it progressed and now failed complaining about config files. I checked librispeech and noticed how the new xconfig file is dumped |
|
I told you to take the script from tedlium s5_r2! I think you might be
getting it from tedlium s5!
…On Thu, Mar 14, 2019 at 8:38 AM jyhnnhyj ***@***.***> wrote:
so after those comb changes, it progressed and now failed complaining
about config files
here is the error message:
Traceback (most recent call last):
File "steps/nnet3/chain/train.py", line 625, in main
train(args, run_opts)
File "steps/nnet3/chain/train.py", line 302, in train
variables = common_train_lib.parse_generic_config_vars_file(var_file)
File "steps/libs/nnet3/train/common.py", line 352, in parse_generic_config_vars_file
"i.e. xconfig_to_configs.py.".format(field_value))
Exception: You have num_hidden_layers=7 (real meaning: your config files are intended to do discriminative pretraining). Since Kaldi 5.2, this is no longer supported --> use newer config-creation scripts, i.e. xconfig_to_configs.py.
I checked librispeech and noticed how the new xconfig file is dumped
also another one in tedlium local/chain/run_tdnnf.sh
which network config should I use?
|
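The exception in the traceback is triggered by a num_hidden_layers entry in the config variables file written by an old-style config script. The following is only an approximation of that kind of check, not the actual Kaldi code: `parse_vars` is a hypothetical function, the "greater than 1" threshold is an assumption, and the context values in the demo are placeholders.

```python
def parse_vars(lines):
    """Parse 'name=value' lines and reject old-style pretraining configs."""
    variables = {}
    for line in lines:
        if "=" not in line:
            continue  # ignore blank or malformed lines
        name, value = (part.strip() for part in line.split("=", 1))
        variables[name] = int(value)
    # Assumed rule: more than one hidden-layer stage implies the old
    # discriminative-pretraining layout, which newer scripts reject.
    if variables.get("num_hidden_layers", 1) > 1:
        raise ValueError(
            "num_hidden_layers={} looks like an old-style config; "
            "regenerate configs with xconfig_to_configs.py".format(
                variables["num_hidden_layers"]))
    return variables


# Placeholder context values, for illustration only.
ok = parse_vars(["model_left_context=10", "model_right_context=10"])

try:
    parse_vars(["num_hidden_layers=7"])
    rejected = False
except ValueError as err:
    rejected = True
    print(err)
```

Seeing this error is therefore a hint that the network configs came from a pre-xconfig recipe, which matches the reply above about the script having been taken from the wrong directory.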
|
you're right - I'm sorry, somehow when trying to fix some of the issues I re-used that old script |
|
a quick update about the training progress, it's in iteration 90/227 |
|
all done, here are the results: not sure how these compare to the previous results? so the new results seem to be slightly worse than tdnn1g_sp, but better than tdnn1f_sp_bi? |
|
OK. It's still better than the results in the current run_tdnnf.sh, but
not by as much as I had hoped.
Please show the output of steps/info/chain_dir_info.pl exp/chain/tdnn1g_sp
And also you should be calling this 1c, and naming the script
run_tdnn_1c.sh.
But that's not urgent right now. Please make sure the script is in your
PR, I want to have a look and see that
everything looks right.
…On Mon, Mar 18, 2019 at 6:08 AM jyhnnhyj ***@***.***> wrote:
all done, here are the results:
dev: %WER 8.03 [ 1428 / 17783, 255 ins, 274 del, 899 sub ]
dev_rescore: %WER 7.44 [ 1323 / 17783, 242 ins, 267 del, 814 sub ]
test: %WER 10.11 [ 2780 / 27500, 252 ins, 1083 del, 1445 sub ]
test_rescore: %WER 7.85 [ 2158 / 27500, 323 ins, 560 del, 1275 sub ]
not sure how these compare to the previous results?
In the header of the run_tdnn.sh, I can see there:
# System                 tdnn1f_sp_bi  tdnn1g_sp
# WER on dev(orig)            8.9         7.9
# WER on dev(rescored)        8.1         7.3
# WER on test(orig)           9.1         8.0
# WER on test(rescored)       8.6         7.6
so the new results seem to be slightly worse than tdnn1g_sp, but better than
tdnn1f_sp_bi ?
|
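Each %WER line reported above carries enough information to recompute the percentage: errors are insertions plus deletions plus substitutions, divided by the number of reference words. A short sketch that parses those lines and checks them, where `check_wer_line` and the regular expression are illustrative helpers rather than Kaldi tooling:

```python
import re

WER_LINE = re.compile(
    r"%WER\s+(?P<wer>[\d.]+)\s+\[\s*(?P<errors>\d+)\s*/\s*(?P<words>\d+),\s*"
    r"(?P<ins>\d+) ins,\s*(?P<dels>\d+) del,\s*(?P<sub>\d+) sub\s*\]"
)


def check_wer_line(line):
    """Recompute WER from the counts in a '%WER ...' scoring line."""
    match = WER_LINE.search(line)
    if match is None:
        raise ValueError("not a %WER line: {!r}".format(line))
    errors = int(match.group("errors"))
    words = int(match.group("words"))
    parts = sum(int(match.group(k)) for k in ("ins", "dels", "sub"))
    return {
        "reported": float(match.group("wer")),
        "recomputed": round(100.0 * errors / words, 2),
        "counts_add_up": parts == errors,
    }


lines = [
    "dev: %WER 8.03 [ 1428 / 17783, 255 ins, 274 del, 899 sub ]",
    "dev_rescore: %WER 7.44 [ 1323 / 17783, 242 ins, 267 del, 814 sub ]",
    "test: %WER 10.11 [ 2780 / 27500, 252 ins, 1083 del, 1445 sub ]",
    "test_rescore: %WER 7.85 [ 2158 / 27500, 323 ins, 560 del, 1275 sub ]",
]
results = [check_wer_line(line) for line in lines]
for line, result in zip(lines, results):
    print(line.split(":")[0], result)
```

For all four lines in this thread the recomputed value agrees with the reported one and the three error counts sum to the total.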
|
sure |
|
made the changes, but somehow messed up my kaldi fork, had to delete it and now can't push to this anymore |
|
OK, we will discuss on #3149. |
Dan's suggestion at https://groups.google.com/d/msg/kaldi-help/UAKh81Oapyw/etHsBG13BAAJ