Commit

small Bugfix (#7079)
* fix branch

Signed-off-by: fayejf <[email protected]>

* fix typo

Signed-off-by: fayejf <[email protected]>

* fix link

Signed-off-by: fayejf <[email protected]>

---------

Signed-off-by: fayejf <[email protected]>
fayejf authored and web-flow committed Jul 20, 2023
1 parent 39aff5c commit d043305
Showing 2 changed files with 4 additions and 4 deletions.
2 changes: 1 addition & 1 deletion tutorials/asr/Offline_ASR_with_VAD_for_CTC_models.ipynb
@@ -389,7 +389,7 @@
"source": [
"# Further Reading\n",
"\n",
-"There are two ways to incorporate VAD into ASR pipeline. The first strategy is to drop the frames that are predicted as `non-speech` by VAD, as already discussed in this tutorial. The second strategy is to keep all the frames and mask the `non-speech` frames with zero-signal values. Also, instead of using segment-VAD as shown in this tutorial, we can use frame-VAD model for faster inference and better accuracy. For more information, please refer to the script [speech_to_text_with_vad.py](https://github.com/NVIDIA/NeMo/blob/stable/examples/asr_vad/speech_to_text_with_vad.py)."
+"There are two ways to incorporate VAD into ASR pipeline. The first strategy is to drop the frames that are predicted as `non-speech` by VAD, as already discussed in this tutorial. The second strategy is to keep all the frames and mask the `non-speech` frames with zero-signal values. Also, instead of using segment-VAD as shown in this tutorial, we can use frame-VAD model for faster inference and better accuracy. For more information, please refer to the script [speech_to_text_with_vad.py](https://github.com/NVIDIA/NeMo/blob/stable/examples/asr/asr_vad/speech_to_text_with_vad.py)."
]
}
],
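The two VAD strategies named in the hunk above (dropping `non-speech` frames versus masking them with zeros) can be sketched as follows. This is a minimal illustration under stated assumptions, not the NeMo implementation: the function name `apply_vad`, its parameters, and the fixed samples-per-frame layout are all hypothetical.

```python
import numpy as np


def apply_vad(audio, is_speech, frame_len, strategy="drop"):
    """Apply per-frame VAD decisions to a 1-D audio signal.

    Hypothetical helper for illustration only (not a NeMo API).
    audio:     1-D array of samples
    is_speech: one boolean per frame (True = speech)
    frame_len: number of samples per frame (assumed constant)
    strategy:  "drop" removes non-speech samples,
               "mask" keeps the length and zeroes non-speech samples
    """
    audio = np.asarray(audio, dtype=float)
    # Expand the per-frame decisions into a per-sample boolean mask.
    sample_mask = np.repeat(np.asarray(is_speech, dtype=bool), frame_len)
    sample_mask = sample_mask[: len(audio)]
    # Samples beyond the last frame are treated as non-speech.
    if len(sample_mask) < len(audio):
        sample_mask = np.pad(sample_mask, (0, len(audio) - len(sample_mask)))

    if strategy == "drop":
        return audio[sample_mask]
    if strategy == "mask":
        return np.where(sample_mask, audio, 0.0)
    raise ValueError(f"unknown strategy: {strategy!r}")


audio = np.arange(1, 9, dtype=float)       # 8 samples
decisions = [True, False, True, False]     # 4 frames of 2 samples each
dropped = apply_vad(audio, decisions, frame_len=2, strategy="drop")
masked = apply_vad(audio, decisions, frame_len=2, strategy="mask")
```

Dropping shortens the signal, so any timestamps produced downstream need to be mapped back to the original timeline; masking preserves the original length and alignment at the cost of processing silent frames.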
6 changes: 3 additions & 3 deletions tutorials/nlp/SpellMapper_English_ASR_Customization.ipynb
@@ -85,7 +85,7 @@
"# Install NeMo library. If you are running locally (rather than on Google Colab), comment out the below lines\n",
"# and instead follow the instructions at https://github.com/NVIDIA/NeMo#Installation\n",
"GITHUB_ACCOUNT = \"NVIDIA\"\n",
-"BRANCH = \"main\"\n",
+"BRANCH = \'r1.20.0\'\n",
"!python -m pip install git+https://github.com/{GITHUB_ACCOUNT}/NeMo.git@{BRANCH}#egg=nemo_toolkit[all]\n",
"\n",
"# Download local version of NeMo scripts. If you are running locally and want to use your own local NeMo code,\n",
@@ -536,7 +536,7 @@
"id": "b1K6paeee2Iu"
},
"source": [
-"As we mentioned earlier, this model pipeline is intended to work with custom vocabularies up to several thousand entries. Since the whole medical vocabulary contains 110k entries, we restrict our custom vocabulary to 5000+ terms that occured in given corpus of abstracts.\n",
+"As we mentioned earlier, this model pipeline is intended to work with custom vocabularies up to several thousand entries. Since the whole medical vocabulary contains 110k entries, we restrict our custom vocabulary to 5000+ terms that occurred in given corpus of abstracts.\n",
"\n",
"The goal of indexing our custom vocabulary is to build an index where key is a letter n-gram and value is the whole phrase. The keys are n-grams in the given user phrase and their misspelled variants taken from our collection of n-\n",
"gram mappings (see Index of custom vocabulary in Fig. 1)\n",
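The indexing idea described in this hunk (key is a letter n-gram, value is the whole phrase, with misspelled n-gram variants also used as keys) can be sketched roughly as below. This is an assumption-laden toy, not the tutorial's actual indexing code: `build_ngram_index`, the `ngram_variants` dictionary, and the `max_n` limit are hypothetical names introduced for illustration.

```python
from collections import defaultdict


def build_ngram_index(phrases, ngram_variants=None, max_n=3):
    """Map letter n-grams (and their known variants) to phrases.

    Hypothetical helper for illustration only.
    phrases:        iterable of custom-vocabulary phrases
    ngram_variants: dict mapping an n-gram to its misspelled variants
    max_n:          longest n-gram length to index
    """
    ngram_variants = ngram_variants or {}
    index = defaultdict(set)
    for phrase in phrases:
        for n in range(1, max_n + 1):
            for start in range(len(phrase) - n + 1):
                ngram = phrase[start:start + n]
                # The n-gram itself points back to the whole phrase.
                index[ngram].add(phrase)
                # So does every misspelled variant of that n-gram.
                for variant in ngram_variants.get(ngram, ()):
                    index[variant].add(phrase)
    return index


# Toy data: "th" is assumed to be sometimes misrecognized as "f".
index = build_ngram_index(["asthma"], {"th": ["f"]}, max_n=2)
```

Looking up the n-grams of an ASR hypothesis in such an index yields candidate phrases whose spelling, or a plausible misspelling, overlaps with the hypothesis.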
@@ -1273,7 +1273,7 @@
"### Filtering by Dynamic Programming(DP) score\n",
"\n",
"What else can be done?\n",
-"Given a fragment and its potential replacement, we can apply **dynamic programming** to find the most probable \"translation\" path between them. We will use the same n-gram mapping vocabulary, because its frequencies give us \"translation probability\" of each n-gram pair. The final path score can be calculated as maximum sum of log probalities of matching n-grams along this path.\n",
+"Given a fragment and its potential replacement, we can apply **dynamic programming** to find the most probable \"translation\" path between them. We will use the same n-gram mapping vocabulary, because its frequencies give us \"translation probability\" of each n-gram pair. The final path score can be calculated as maximum sum of log probabilities of matching n-grams along this path.\n",
"Let's look at an example. "
]
},
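The dynamic-programming score described in this hunk (the maximum sum of log probabilities of matching n-grams along a path) can be sketched as follows. This is a simplified illustration, not the tutorial's implementation: `best_path_score`, the `ngram_probs` dictionary keyed by `(source_ngram, target_ngram)`, and the `max_n` limit are hypothetical, and the sketch assumes every step consumes at least one letter on each side.

```python
import math


def best_path_score(src, dst, ngram_probs, max_n=3):
    """Best sum of log probabilities over segmentations of src and dst
    into aligned n-gram pairs; -inf if no complete path exists.

    Hypothetical helper for illustration only.
    ngram_probs maps (src_ngram, dst_ngram) -> probability in (0, 1].
    """
    neg_inf = float("-inf")
    # best[i][j] = best score aligning src[:i] with dst[:j]
    best = [[neg_inf] * (len(dst) + 1) for _ in range(len(src) + 1)]
    best[0][0] = 0.0
    for i in range(len(src) + 1):
        for j in range(len(dst) + 1):
            if best[i][j] == neg_inf:
                continue  # state not reachable
            # Extend the path by one n-gram pair of lengths (a, b).
            for a in range(1, min(max_n, len(src) - i) + 1):
                for b in range(1, min(max_n, len(dst) - j) + 1):
                    p = ngram_probs.get((src[i:i + a], dst[j:j + b]))
                    if not p:
                        continue  # pair not in the mapping vocabulary
                    cand = best[i][j] + math.log(p)
                    if cand > best[i + a][j + b]:
                        best[i + a][j + b] = cand
    return best[len(src)][len(dst)]


# Toy probabilities, invented for the example.
probs = {("a", "a"): 0.9, ("b", "b"): 0.8, ("ab", "ab"): 0.5}
score = best_path_score("ab", "ab", probs)
no_path = best_path_score("ab", "xy", probs)
```

Here the path `a→a, b→b` (probability 0.72) beats the single pair `ab→ab` (probability 0.5), so the score is `log(0.9) + log(0.8)`. A threshold on this score could then be used to filter out unlikely replacements.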
