Semi-supervised training on Fisher English #2140
…vised Travis was failing to compile (not sure why) -- I used the "Update Branch" button.
Conflicts: src/chain/chain-denominator-smbr.cc
danpovey left a comment
I just realized I had some pending comments on this that I had not submitted.
There is a conflict too.
nj=15   # This should be set to the maximum number of jobs you are
        # comfortable to run in parallel; you can increase it if your disk
        # speed is greater and you have more machines.
max_jobs_run=15  # This should be set to the maximum number of jobs you are
if we're ever using the --nj option, fix it.
change jobs -> nnet3-chain-get-egs jobs.
@@ -0,0 +1,21 @@
#!/bin/bash

# Copyright 2017 Vimal Manohar
I don't see a reference to this script or any other script, in the run.sh.
If you don't put a commented-out reference to this in the run.sh, it's not obvious in which order things should be called. This needs to be made much clearer than it is now.
If it's more than just a couple of lines, you could introduce an intermediate script, something like local/semisup/run_semisupervised_50k.sh and local/semisup/run_semisupervised_100k.sh, which you'd invoke in a comment from run.sh. But these scripts shouldn't have any variables; they should just be lists of concrete invocations of other scripts (like local/chain/tuning/run_tdnn_1a.sh and local/semisup/blah/blah) with concrete arguments. I don't want people to think of it as anything more than a piece of documentation saying in what order to call things.
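For illustration, a sketch of what such an intermediate script might look like (the exact stage order and the name of the semi-supervised tuning script are assumptions based on this PR, not verbatim recipe contents):

```bash
#!/bin/bash
# local/semisup/run_semisupervised_100k.sh  (documentation-only sketch)
# A fixed list of concrete invocations, with no variables, showing the order
# in which the pieces are meant to be called.

# 1. Train the supervised seed chain system on the supervised subset.
local/semisup/chain/tuning/run_tdnn_1a.sh

# 2. Run semi-supervised chain training using the unsupervised data.
local/semisup/chain/tuning/run_tdnn_100k_semisupervised_1a.sh
```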
# The output directory has the format of an alignment directory.
# It can optionally read alignments from a directory, in which case,
# the script gets frame-level posteriors of the pdf corresponding to those
# alignments.
make clear that the weights are output as weights.scp
# LM for decoding unsupervised data: 4gram
# Supervision: Naive split lattices

# train_set train_sup
add comment explaining which is output-0 vs output-1; same if similar things appear elsewhere.
supervised_set=train_sup
unsupervised_set=train_unsup100k_250k

sup_chain_dir=exp/semisup_100k/chain/tdnn_1a_sp  # supervised chain system
it would be nice if you could explain in a comment which of these are inputs and which are outputs.
@@ -0,0 +1,201 @@
#!/bin/bash

# Copyright 2017 Vimal Manohar
There needs to be a reference to this and run_100k.sh in the run.sh, commented out, showing at what point to run them.
And in the scripts in local/ that this calls, I think there should be a note that local/semisup/run_50k.sh shows how to call this. (Same for 100k). This was not very discoverable to me.
# which is different from run_50k.sh, which uses combined supervised +
# unsupervised set.

. ./cmd.sh
I like that this script is very simple and linear and configuration-free, but I think adding a --stage option would be helpful to users.
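For reference, the standard Kaldi pattern for such an option looks roughly like this (a minimal sketch; the stage numbers and echo messages are made up):

```bash
#!/bin/bash
stage=0

. ./cmd.sh
. utils/parse_options.sh   # picks up --stage N from the command line

if [ $stage -le 0 ]; then
  echo "$0: stage 0 -- data preparation"
fi

if [ $stage -le 1 ]; then
  echo "$0: stage 1 -- seed system training"
fi
```

Re-running the script as `local/semisup/run_50k.sh --stage 1` would then skip the stages that have already completed.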
I made the changes.
danpovey left a comment
We're making progress.
Some more comments after looking through the code a bit more carefully.
I'm asking you to merge Hossein's PR, solve the integration issues, and re-test; that will kill two birds with one stone.
# This script rescores non-compact, (possibly) undeterminized lattices with the
# ConstArpaLm format language model.
# This is similar to steps/lmrescore_const_arpa.sh, but expects
# non-compact lattices as input.
Please add this text:
If you use the option "--write-compact=false" it outputs non-compact lattices; the purpose is to add in LM scores while leaving the frame-by-frame acoustic scores in the same positions they occupied in the input, undeterminized lattices. This is important in our 'chain' semi-supervised training recipes, where it helps us to split lattices while keeping the scores at the edges of the split points correct.
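A hypothetical invocation of such a rescoring step (the --write-compact flag is what this PR adds, and the data paths are made up, so treat the details as assumptions):

```bash
# Rescore undeterminized lattices with a ConstArpaLm, writing non-compact
# lattices so the per-frame acoustic scores stay on the arcs where they were.
lattice-lmrescore-const-arpa --lm-scale=1.0 --write-compact=false \
  "ark:gunzip -c exp/chain/decode_unsup/lat.1.gz |" \
  data/lang_test_fg/G.carpa \
  "ark:| gzip -c > exp/chain/decode_unsup_fg/lat.1.gz"
```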
  this_frame_subsampling_factor=$(cat $this_alidir/frame_subsampling_factor)
fi

if (( $frame_subsampling_factor % $this_frame_subsampling_factor != 0 )); then
I tested a construct like this; it doesn't work, because 0 and 1 are != "true" or "false".
I checked that it works. Double parentheses return an exit status (success when the expression is non-zero), which `if` tests, so it behaves as true or false.
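A standalone demonstration of the construct in question (the variable values here are made up):

```bash
#!/bin/bash
# (( expr )) exits with status 0 (i.e. "true" to `if`) when expr evaluates
# to a non-zero value, so the arithmetic comparison works as a condition.
frame_subsampling_factor=3
this_frame_subsampling_factor=2

if (( frame_subsampling_factor % this_frame_subsampling_factor != 0 )); then
  echo "$frame_subsampling_factor is not divisible by $this_frame_subsampling_factor"
fi
# prints: 3 is not divisible by 2
```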
if [ $stage -le -1 ]; then
  # Convert the alignments to the new tree. Note: we likely will not use these
  # converted alignments in the CTC system directly, but they could be useful
change CTC->chain (more lava flow).
nj=15   # This should be set to the maximum number of jobs you are
        # comfortable to run in parallel; you can increase it if your disk
        # speed is greater and you have more machines.
max_jobs_run=15  # This should be set to the maximum number of jobs you are
change jobs -> nnet3-chain-get-egs jobs.
# it doesn't make sense to use different options than were used as input to the
# LDA transform). This is used to turn off CMVN in the online-nnet experiments.
lattice_lm_scale=  # If supplied, the graph/lm weight of the lattices will be
                   # used (with this scale) in generating supervisions
specify that this would normally be 0 for conventional supervised training, but may be close to 1
for the unsupervised part of the data in semi-supervised training
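A sketch of how that comment might read in the script (the wording is illustrative, not taken from the recipe):

```bash
lattice_lm_scale=  # If supplied, the graph/LM weight of the lattices will be
                   # used (with this scale) in generating supervisions.
                   # Normally unset/0 for conventional supervised training,
                   # but close to 1 for the unsupervised part of the data
                   # in semi-supervised training.
```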
src/lat/lattice-functions.h
///
/// @param [in] lat  Input lattice. Expected to be top-sorted. Otherwise the
///                  function will crash.
/// @param [out] acoustic_scores  Pointer to a map where the mapping from the
The documentation doesn't seem to be consistent with the function signature: you say it is a map
to acoustic score, but it returns a pair.
src/lat/lattice-functions.h
/// ComputeAcousticScoresMap into the lattice.
///
/// @param [in] acoustic_scores  A map from the pair (frame-index,
//pdf-id)
fix this. And you say it's a map to acoustic score: why is it a pair?
src/latbin/lattice-to-fst.cc
po.Register("project-input", &project_input,
            "Project to input labels (transition-ids); applicable only "
            "when --read-compact=false");
po.Register("project-output", &project_output,
I think you mean word-ids. But I don't think this option makes sense: the lattice would be mostly epsilons. Can you remove it if it's not necessary?
Is project-input needed? Please simplify to only what you need for this PR.
src/latbin/lattice-to-fst.cc
fst::VectorFst<StdArc> fst;
{
  Lattice lat;
  ConvertLattice(clat, &lat);  // convert to non-compact form.. won't introduce
I don't like that these multi-line comments have different indentation on different lines.
src/nnet3/nnet-example-utils.cc
  stats_.PrintStats();
}

void ScaleFst(BaseFloat scale, fst::StdVectorFst *fst) {
A function that does this already exists somewhere in fstext-utils.h. You can remove this from the header too.
danpovey left a comment
some small comments.
# Final train prob (xent)   -1.9246   -1.5926   -1.6454
# Final valid prob (xent)   -2.1873   -1.7990   -1.7107

# train_set   semisup15k_100k_250k   semisup50k_100k_250k   semisup100k_250k
are all of these results still part of the scripts? remove any that are not.
If these lines came from compare_wer.sh it would be nice if you could show the corresponding command line.
and an explanation of the naming convention would be nice too... in semisupX_Y_Z, it's not clear what X, Y and Z are.
  echo "$0: generating egs from the supervised data"
  steps/nnet3/chain/get_egs.sh --cmd "$decode_cmd" \
  --left-context $egs_left_context --right-context $egs_right_context \
fix this indentation level
# Unsupervised weight: 1.0
# Weights for phone LM (supervised, unsupervised): 3,2
# LM for decoding unsupervised data: 4gram
# Supervision: Naive split lattices
Is this accurate, that you are using naive splitting? You seem to be also using it for the other example scripts; but the paper seems to say that "smart splitting" is generally better. Can you clarify which options control the type of splitting?
Smart splitting is not committed in this pull request. That requires other binaries to be added.
ok, fine; minimal is good.
@vimalmanohar, sorry, there are conflicts now. Please resolve, and once you confirm it's good to merge I'll merge.
@vimalmanohar, don't forget about this.
I'm still testing it one more time.
I fixed all issues and conflicts.
danpovey left a comment
Noticed some small things...
src/latbin/lattice-compose.cc
// Compute a map from each (t, tid) to (sum_of_acoustic_scores, count)
unordered_map<std::pair<int32,int32>, std::pair<BaseFloat, int32>,
              PairHasher<int32> > acoustic_scores;
if (!write_compact)
I don't see why this acoustic-scores-map thing is necessary here because composition with an FST that only has weights on the graph side of the scores, will leave the acoustic scores where they were.
      this_deriv_weights(i) = (*deriv_weights)(t);
    }
    KALDI_ASSERT(output_weights.Dim() == num_frames_subsampled);
    this_deriv_weights.MulElements(output_weights);
I won't let this issue hold up merging this PR, but you might want to measure the effect of, in the case where 'deriv_weights' are supplied, just ignoring the 'output_weights' here instead of multiplying by them. Based on my previous experience, this would improve the results. Don't add options though -- too much complexity.
src/chainbin/nnet3-chain-get-egs.cc
            "and input frames.");
po.Register("deriv-weights-rspecifier", &deriv_weights_rspecifier,
            "Per-frame weights that scales a frame's gradient during "
            "backpropagation."
need a space here after the ".".
}


void ComputeAcousticScoresMap(
if it turns out you don't need this code after looking into the composition, you can remove it.
Ok, I removed it from that binary, but it is required elsewhere.
…sr#2140) Conflicts: egs/wsj/s5/steps/libs/nnet3/train/chain_objf/acoustic_model.py egs/wsj/s5/steps/nnet3/chain/train.py
…aldi-asr#2140; un-support --transform-dir. Thx: @aaror8 (kaldi-asr#2334) Conflicts: egs/wsj/s5/steps/nnet3/get_egs.sh
…aldi-asr#2140; un-support --transform-dir. Thx: @aaror8 (kaldi-asr#2334)
A simple version of semi-supervised training using lattice-free MMI on a subset of Fisher English.
Moved from vimalmanohar#14.