nnet3: removing add_lda and include_log_softmax options #1573
Changes from all commits, `egs/wsj/s5/steps/nnet3/train_dnn.py` (script name inferred from context of this diff view):
```diff
@@ -80,6 +80,7 @@ def get_args():
                         rule as accepted by the --minibatch-size option of
                         nnet3-merge-egs; run that program without args to see
                         the format.""")
     parser.add_argument("--trainer.optimization.proportional-shrink", type=float,
                         dest='proportional_shrink', default=0.0,
                         help="""If nonzero, this will set a shrinkage (scaling)
```
```diff
@@ -92,6 +93,11 @@ def get_args():
                         Unlike for train_rnn.py, this is applied unconditionally,
                         it does not depend on saturation of nonlinearities.
                         Can be used to roughly approximate l2 regularization.""")
+    parser.add_argument("--compute-average-posteriors",
+                        type=str, action=common_lib.StrToBoolAction,
+                        choices=["true", "false"], default=False,
+                        help="""If true, then the average output of the
+                        network is computed and dumped as post.final.vec""")

     # General options
     parser.add_argument("--nj", type=int, default=4,
```
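The new `--compute-average-posteriors` flag relies on a string-to-boolean argparse action. Below is a minimal sketch of how such an action can work; the real `common_lib.StrToBoolAction` lives in Kaldi's `steps/libs/common.py`, so this standalone version is an illustration of the pattern, not the actual implementation:

```python
import argparse

# Sketch of a "true"/"false" string-to-boolean argparse action (an
# approximation of common_lib.StrToBoolAction, not the Kaldi source).
class StrToBoolAction(argparse.Action):
    def __call__(self, parser, namespace, values, option_string=None):
        if values == "true":
            setattr(namespace, self.dest, True)
        elif values == "false":
            setattr(namespace, self.dest, False)
        else:
            raise argparse.ArgumentTypeError(
                "Unexpected value {0} for {1}".format(values, option_string))

parser = argparse.ArgumentParser()
parser.add_argument("--compute-average-posteriors", type=str,
                    action=StrToBoolAction, choices=["true", "false"],
                    default=False)

args = parser.parse_args(["--compute-average-posteriors", "true"])
# args.compute_average_posteriors now holds the boolean True
```

Because `choices` is checked against the raw string before the action runs, anything other than `true` or `false` is rejected by argparse itself.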
```diff
@@ -198,11 +204,7 @@ def train(args, run_opts):
     try:
         model_left_context = variables['model_left_context']
         model_right_context = variables['model_right_context']
-        if 'include_log_softmax' in variables:
-            include_log_softmax = common_lib.str_to_bool(
-                variables['include_log_softmax'])
-        else:
-            include_log_softmax = False
     except KeyError as e:
         raise Exception("KeyError {0}: Variables need to be defined in "
                         "{1}".format(str(e), '{0}/configs'.format(args.dir)))
```
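The removed block read a boolean flag out of the `vars` file that the config-generation stage writes under the experiment's `configs` directory. A hedged sketch of that pattern follows; the helper names `parse_vars_file` and `str_to_bool` here are illustrative stand-ins for the Kaldi `common_lib` helpers, and the one-`name=value`-per-line format is an assumption based on how the variables are used in this diff:

```python
def parse_vars_file(path):
    # Illustrative parser for a vars file with one "name=value" per line,
    # as the config-generation stage is assumed to write it.
    variables = {}
    with open(path) as f:
        for line in f:
            line = line.strip()
            if "=" in line and not line.startswith("#"):
                key, value = line.split("=", 1)
                variables[key] = value
    return variables

def str_to_bool(value):
    # Mirrors the usage in the removed block: only the literal strings
    # "true" and "false" are accepted.
    if value == "true":
        return True
    if value == "false":
        return False
    raise ValueError("Expected 'true' or 'false', got '{0}'".format(value))
```

With helpers like these, the deleted `include_log_softmax` lookup reduces to a dictionary check plus a `str_to_bool` call, which is exactly the code this PR deletes.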
```diff
@@ -286,7 +288,9 @@ def train(args, run_opts):
     # use during decoding
     common_train_lib.copy_egs_properties_to_exp_dir(egs_dir, args.dir)

+    if (args.stage <= -3) and os.path.exists(args.dir+"/configs/init.config"):
+        add_lda = common_train_lib.is_lda_added(config_dir)
```
**Contributor:** Are there other reasons we use init.config other than to make the LDA-like transform?

**Contributor (author):** Probably not. It seems it will be removed in the transfer learning PR.

**Contributor (author):** init.config is still used to create the initial model. The check is needed to know whether the LDA needs to be trained.

**Contributor:** Did you see my comment?

**Contributor (author):** I do not understand what needs to be done. The current solution is needed to know if LDA needs to be trained. The function is_lda_added can be changed to read init.raw if needed.
```diff
+    if (add_lda and args.stage <= -3):
         logger.info('Computing the preconditioning matrix for input features')
         train_lib.common.compute_preconditioning_matrix(
```
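The diff calls `common_train_lib.is_lda_added(config_dir)` without showing its body. The sketch below is purely hypothetical: it guesses one plausible heuristic (scanning `init.config` for an LDA-like component), and, as the review thread notes, the real function could instead be changed to inspect `init.raw`:

```python
import os

def is_lda_added(config_dir):
    # Hypothetical sketch of common_train_lib.is_lda_added; the actual
    # implementation is not shown in this diff.  Here we simply look for
    # an LDA-like line in init.config.
    init_config = os.path.join(config_dir, "init.config")
    if not os.path.exists(init_config):
        return False
    with open(init_config) as f:
        return any("lda" in line.lower() for line in f)
```

Whatever its exact form, the function's role in the new code is only to decide whether the preconditioning (LDA-like) matrix still needs to be computed at stage -3.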
```diff
@@ -417,7 +421,7 @@ def learning_rate(iter, current_num_jobs, num_archives_processed):
         common_lib.force_symlink("{0}.raw".format(num_iters),
                                  "{0}/final.raw".format(args.dir))

-    if include_log_softmax and args.stage <= num_iters + 1:
+    if compute_average_posteriors and args.stage <= num_iters + 1:
         logger.info("Getting average posterior for output-node 'output'.")
         train_lib.common.compute_average_posterior(
             dir=args.dir, iter='final', egs_dir=egs_dir,
```
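For context on the unchanged lines above: `common_lib.force_symlink` ensures `final.raw` always points at the last iteration's model even if the link already exists. A minimal sketch of that behavior (a plausible reading of the helper, assuming POSIX symlinks, not the exact Kaldi code):

```python
import os

def force_symlink(target, link_name):
    # Create link_name -> target, replacing any existing file or link at
    # link_name, so repeated runs can re-point final.raw safely.
    try:
        os.symlink(target, link_name)
    except OSError:
        os.remove(link_name)
        os.symlink(target, link_name)
```

This is why rerunning the final training stage does not fail with "file exists" when `final.raw` was created by an earlier run.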