Conversation

@s-mousmita
Contributor

@david-ryan-snyder @danpovey Adding an update.
The changes are made with reference to the old pull request.

@david-ryan-snyder
Contributor

@s-mousmita, thanks for the update.

Could you do the following:

  • Make egs/lre07/v2/lid a symbolic link to egs/lre07/v1/lid instead of a copy
  • Add your new DNN-based LID scripts to egs/lre07/v1/lid

@david-ryan-snyder
Contributor

The point is that we don't want to maintain two copies of lid/, one in egs/lre07/v1 and one in egs/lre07/v2.

See what we did in egs/sre10/v1/sid and egs/sre10/v2/sid. They both point to egs/sre08/v1/sid.
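
For readers unfamiliar with the layout being requested, here is a minimal sketch of turning v2/lid into a relative symbolic link to v1/lid. It works in a scratch directory that only mimics egs/lre07; the `scratch` variable and `example.sh` are hypothetical stand-ins, not part of the PR.

```shell
# Sketch only: build a scratch tree that mimics egs/lre07 (the real tree is not touched).
scratch=$(mktemp -d)
mkdir -p "$scratch/egs/lre07/v1/lid" "$scratch/egs/lre07/v2"
echo 'echo "hello from lid"' > "$scratch/egs/lre07/v1/lid/example.sh"  # hypothetical script

# Create a relative link from v2/lid to v1/lid so it still resolves if the checkout moves.
( cd "$scratch/egs/lre07/v2" && ln -s ../v1/lid lid )

# The link target is stored exactly as given, i.e. as a relative path.
link_target=$(readlink "$scratch/egs/lre07/v2/lid")
echo "v2/lid -> $link_target"
```

A relative target is used here because an absolute one would break as soon as the repository is cloned to a different location.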

@s-mousmita
Contributor Author

Yes. Sorry, but which script names do you mean?

Language Evaluation. The subdirectory v1 demonstrates the standard LID
system, which is an I-Vector based recipe using full covarience GMM-UBM and logistic regression model.
The subdirectory v2 demonstrates the LID sysem using a Time Delay Deep Neural Network based UBM
which is used to replace the GMM-UBM of v1. The DNN is trained using about 1800 hours of the english portion of Fisher.

Please be mindful of typos here. E.g., covarience -> covariance, and sysem -> system.

Also, please format the text so that lines don't go over 80 columns.
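
One possible way to check the 80-column request, sketched with standard tools. The temporary file and its two sample lines are hypothetical stand-ins for the README under review.

```shell
# Sketch: report lines longer than 80 characters, then rewrap them with fold.
readme=$(mktemp)   # stand-in for the README under review
printf '%s\n' \
  'short line' \
  'The subdirectory v2 demonstrates the LID system using a time-delay deep neural network based UBM.' \
  > "$readme"

# Print "line-number: length" for every line over 80 characters.
long_lines=$(awk 'length($0) > 80 { print NR ": " length($0) }' "$readme")
echo "$long_lines"

# fold -s breaks at spaces; -w 80 caps the width at 80 columns.
wrapped=$(fold -s -w 80 "$readme")
echo "$wrapped"
```

fold only splits long lines and does not rejoin short ones, so a paragraph may still need a manual pass to look even.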

@david-ryan-snyder
Contributor

@s-mousmita, please let me know when you've addressed the comments

@s-mousmita
Contributor Author

Thanks, @david-ryan-snyder. Please check the updated scripts.


nnet_feats="ark,s,cs:apply-cmvn-sliding --center=true scp:$sdata_dnn/JOB/feats.scp ark:- |"

## Set up SDC features.

Could you remove the extra '#'?

@danpovey
Contributor

danpovey commented Sep 3, 2016

I just had a look at it...
I think it would be better to move the scripts that are currently in local/dnn/ (at least the ones that were copied from steps/nnet2) to lid/nnet2/. That way, those scripts won't have to be duplicated if someone wants to create another setup using the same methods.

Also, be careful: the program pathname sometimes appears in usage messages, and it's currently wrong in some cases. Grepping for steps/nnet2 and local/dnn in those scripts should find all the places where you'd need to change the path.
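
The grep suggested above might look like the following sketch. It runs against a scratch copy with a hypothetical get_lda.sh whose usage message still names the old location; the file contents are invented for illustration.

```shell
# Sketch: find leftover references to the old locations after moving scripts.
scratch=$(mktemp -d)
mkdir -p "$scratch/lid/nnet2"
# Hypothetical script whose usage message still names the old path.
cat > "$scratch/lid/nnet2/get_lda.sh" <<'EOF'
#!/bin/bash
echo "Usage: steps/nnet2/get_lda.sh [opts] <data> <lang> <ali-dir> <exp-dir>"
EOF

# -r: recurse, -n: show line numbers, -e: one pattern per old location.
stale=$(grep -rn -e 'steps/nnet2' -e 'local/dnn' "$scratch/lid/nnet2")
echo "$stale"
```

Each hit is worth reviewing by hand rather than replacing blindly, since some references to steps/nnet2 may be legitimate calls to scripts that stay there.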

@s-mousmita
Contributor Author

I have removed local/dnn and created lid/nnet2. References to local/dnn and steps/nnet2 throughout the scripts have been changed to lid/nnet2.

@david-ryan-snyder
Contributor

david-ryan-snyder commented Sep 3, 2016

I don't think we should move everything from local/dnn to lid/nnet2. My suggestion:

  • Recipe-specific things go into local. This includes scripts that refer to datasets or to specific instantiations of nnet configurations. That means that train_dnn.sh, run_nnet2_common.sh, run_nnet2_multisplice.sh, the Fisher data preparation scripts, etc., should remain in local/dnn.
  • General-purpose scripts that can be used by any LID example should go into lid. Scripts such as get_egs2.sh, get_lda.sh, make_multisplice_configs.py, relabel_egs2.sh, etc., would go into lid/nnet2.
  • If in doubt, see the contents of egs/fisher_english/s5/local for examples of what should be in local.
  • Also, since we're refactoring things, please double-check that all the scripts you're moving around are actually needed. For example, I don't think we need a lid/nnet2/remove_egs.sh. There's probably an identical one in steps/nnet2 that can be called instead.
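
To check whether a moved script is just a copy of one that already exists in steps/nnet2, a byte-for-byte comparison is one option. The sketch below uses a hypothetical helper, `find_duplicates`, and scratch files with made-up contents.

```shell
# Sketch: list files in one directory that are byte-identical to a same-named file in another.
find_duplicates() {
  local_dir=$1
  shared_dir=$2
  for f in "$local_dir"/*; do
    [ -f "$f" ] || continue
    name=$(basename "$f")
    # cmp -s prints nothing and returns 0 only when both files have identical contents.
    if [ -f "$shared_dir/$name" ] && cmp -s "$f" "$shared_dir/$name"; then
      echo "$name"
    fi
  done
}

# Scratch demo with hypothetical contents.
scratch=$(mktemp -d)
mkdir -p "$scratch/lid/nnet2" "$scratch/steps/nnet2"
echo 'rm -f "$1"/egs.*' > "$scratch/steps/nnet2/remove_egs.sh"
cp "$scratch/steps/nnet2/remove_egs.sh" "$scratch/lid/nnet2/remove_egs.sh"
echo 'echo modified copy' > "$scratch/lid/nnet2/get_egs2.sh"
echo 'echo original'      > "$scratch/steps/nnet2/get_egs2.sh"

duplicates=$(find_duplicates "$scratch/lid/nnet2" "$scratch/steps/nnet2")
echo "$duplicates"
```

Anything the helper prints is a candidate for deletion in favour of calling the steps/nnet2 version directly.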

@s-mousmita
Contributor Author

@david-ryan-snyder Is the following correct?

  • get_egs2.sh, relabel_egs2.sh and get_lda.sh will go to lid/nnet2/
  • train_dnn.sh, fisher_data_prep.sh, fisher_prepare_dict.sh, fisher_train_lms.sh, fisher_create_test_lang.sh, remove_dup_utts.sh, run_nnet2_multisplice.sh, run_nnet2_common.sh and train_multisplice_accel2.sh will go to local/dnn/
  • get_num_frames.sh, make_multisplice_configs.py, align.sh and remove_egs.sh will be used from steps/nnet2.

@danpovey
Contributor

danpovey commented Sep 3, 2016

train_multisplice_accel2.sh is not dataset-specific, so it should go into lid/nnet2. Otherwise that looks plausible.

Dan
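
The placement agreed at this point in the thread (the proposed list, with the one correction above) can be summarised as a lookup. This is only a sketch; `destination_for` is a hypothetical helper, and later review comments question whether some of these copies are needed at all.

```shell
# Sketch: map a script name to the directory agreed for it at this point in the thread.
destination_for() {
  case "$1" in
    get_egs2.sh|relabel_egs2.sh|get_lda.sh|train_multisplice_accel2.sh)
      echo "lid/nnet2" ;;
    train_dnn.sh|fisher_*.sh|remove_dup_utts.sh|run_nnet2_multisplice.sh|run_nnet2_common.sh)
      echo "local/dnn" ;;
    get_num_frames.sh|make_multisplice_configs.py|align.sh|remove_egs.sh)
      echo "steps/nnet2" ;;
    *)
      echo "unknown" ;;
  esac
}

for script in train_multisplice_accel2.sh train_dnn.sh remove_egs.sh; do
  echo "$script -> $(destination_for "$script")"
done
```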

@s-mousmita force-pushed the LID_DNN_update branch 4 times, most recently from dae5139 to d73b43d (September 3, 2016 19:37)

@s-mousmita
Contributor Author

I have made the changes suggested above.

@danpovey
Contributor

danpovey commented Sep 3, 2016

Would you mind trying to run it one more time before we commit it?
With all the name changes, there is a strong possibility of errors.
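
Before a full rerun, a cheap static check can catch some renaming mistakes. The sketch below reports lid/nnet2/*.sh paths that a script mentions but that are missing on disk. `check_script_refs` is a hypothetical helper, and the demo train_dnn.sh contents are invented. It does not replace running the recipe.

```shell
# Sketch: report lid/nnet2/*.sh paths that a script mentions but that do not exist.
check_script_refs() {
  root=$1
  script=$2
  missing=0
  # grep -o prints each match on its own line; sort -u drops repeats.
  for ref in $(grep -o 'lid/nnet2/[A-Za-z0-9_]*\.sh' "$script" | sort -u); do
    if [ ! -f "$root/$ref" ]; then
      echo "missing: $ref"
      missing=1
    fi
  done
  return "$missing"
}

# Scratch demo: the caller names two scripts, only one of which exists.
scratch=$(mktemp -d)
mkdir -p "$scratch/lid/nnet2" "$scratch/local/dnn"
touch "$scratch/lid/nnet2/get_egs2.sh"
cat > "$scratch/local/dnn/train_dnn.sh" <<'EOF'
lid/nnet2/get_egs2.sh data exp
lid/nnet2/get_lda.sh data exp
EOF

status=0
report=$(check_script_refs "$scratch" "$scratch/local/dnn/train_dnn.sh") || status=$?
echo "$report"
```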

@david-ryan-snyder
Contributor

I agree with the script locations in the current commit.

@ngoel17
Contributor

ngoel17 commented Sep 4, 2016

Dan,
Would it make sense to run it on JHU machines so that we don't even have to change the dataset paths?

Nagendra

@danpovey
Contributor

danpovey commented Sep 4, 2016

Yes, that sounds good.

@danpovey
Contributor

I assume someone is still working on this?

@ngoel17
Contributor

ngoel17 commented Sep 21, 2016

Training is running from scratch on JHU machines.

@danpovey
Contributor

@ngoel17, any progress on this?

@s-mousmita
Contributor Author

@danpovey @ngoel17 It's doing the last steps: I-vector extraction.

@s-mousmita
Contributor Author

I have updated the scripts.

@david-ryan-snyder
Contributor

david-ryan-snyder left a comment

I think most of it looks good. The main thing I'd like to see changed is that a huge file added to the commit could instead be a symbolic link.

--transform-dir "$transform_dir" --online-ivector-dir "$online_ivector_dir" \
--iter $x $data $lang $dir $dir/ali_$time || exit 1

lid/nnet2/relabel_egs2.sh --cmd "$cmd" --iter $x $dir/ali_$time \

Are you sure we can't use steps/nnet2/relabel_egs2.sh here?

$egs_in $egs_out || exit 1
fi

echo "$0: Finished relabeling training examples"

Are you sure we need this script? Can we not use the one in steps/nnet2/relabel_egs2.sh?

export decode_cmd=run.pl
export cuda_cmd=run.pl
export mkgraph_cmd=run.pl

@david-ryan-snyder
Contributor

david-ryan-snyder commented Oct 28, 2016

The way the queue options are handled was changed in this commit: b3bbc03

Basically, you need to update the comment and the export statements to be more similar to what you see in the link above. The default commands should include queue.pl.
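
As a rough illustration of what a queue.pl-based cmd.sh could look like (the linked commit is not reproduced here): the memory figures and the `--gpu 1` option are assumptions chosen for the sketch, and `train_cmd` is assumed to exist alongside the three variables shown in the diff.

```shell
# cmd.sh sketch (assumed values; adjust to your cluster).
# With no queueing system, replace every queue.pl below with run.pl,
# keeping in mind that all jobs then run on the local machine.
export train_cmd="queue.pl --mem 4G"
export decode_cmd="queue.pl --mem 4G"
export mkgraph_cmd="queue.pl --mem 8G"
export cuda_cmd="queue.pl --gpu 1"
```

Defaulting to queue.pl matches the reviewer's request; users without a grid would switch to run.pl themselves.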

return -1;
}
}
}

For files like this, that are unlikely to ever be modified, it's better to make a symbolic link to the same file in egs/lre07/v1/local.

@david-ryan-snyder
Contributor

david-ryan-snyder commented Oct 28, 2016

You probably don't need to do that for any of the other scripts.

Language Evaluation. The subdirectory v1 demonstrates the standard
LID system, which is an I-Vector based recipe using full covariance
GMM-UBM and logistic regression model. The subdirectory v2 demonstrates
the LID system using a Time Delay Deep Neural Network based UBM

I would consider replacing "Time Delay Deep Neural Network" with "time-delay deep neural network."

@s-mousmita
Copy link
Contributor Author

@david-ryan-snyder Scripts are updated according to your recent comments.

@david-ryan-snyder
Contributor

david-ryan-snyder left a comment

I think it's ready. @danpovey, do you have any lingering doubts before accepting it?

Actually, never mind, since the check failed. Let's wait until we figure that out.

@david-ryan-snyder
Contributor

david-ryan-snyder commented Oct 30, 2016

@danpovey it looks like the error is in exp-test. Have you seen this in recent PRs? I'm thinking this is probably not related to @s-mousmita's updates.

Successfully configured OpenBLAS from /home/travis/xroot/usr.
/usr/bin/ld: cannot find -lgfortran
collect2: ld returned 1 exit status
make: *** [exp-test] Error 1
./configure: line 222: ./exp-test: No such file or directory

@danpovey
Contributor

That exp-test thing was not what was causing the build to fail; it's normal on the Travis build. The reason was: "Your test run exceeded 120 minutes." That's probably just random, since there were no source changes in the PR. I'll probably merge today.

@danpovey
Contributor

danpovey commented Nov 1, 2016

Thanks! Merging.

@danpovey merged commit 08869e3 into kaldi-asr:master on Nov 1, 2016

@s-mousmita
Contributor Author

@danpovey @david-ryan-snyder Thank you for approving the recipe.
