Tn tutorial (NVIDIA#4090)
* refactor tn data folder, and update of measure

Signed-off-by: Yang Zhang <[email protected]>

* update jenkins

Signed-off-by: Yang Zhang <[email protected]>

* added whitelist with spaces for asr

Signed-off-by: Yang Zhang <[email protected]>

* save

Signed-off-by: Yang Zhang <[email protected]>

* save

Signed-off-by: Yang Zhang <[email protected]>

* electronic updated

Signed-off-by: Yang Zhang <[email protected]>

* added file support

Signed-off-by: Yang Zhang <[email protected]>

* add ip prompt

Signed-off-by: Yang Zhang <[email protected]>

* add missing file

Signed-off-by: Yang Zhang <[email protected]>

* update grammars

Signed-off-by: Yang Zhang <[email protected]>

* fix header

Signed-off-by: Yang Zhang <[email protected]>

* fixed electronic review added ssn

Signed-off-by: Yang Zhang <[email protected]>

* updated money

Signed-off-by: Yang Zhang <[email protected]>

* added more formats for year range

Signed-off-by: Yang Zhang <[email protected]>

* fix pytest

Signed-off-by: Yang Zhang <[email protected]>

* style fix

Signed-off-by: Yang Zhang <[email protected]>

* fix

Signed-off-by: Yang Zhang <[email protected]>

* fix

Signed-off-by: Yang Zhang <[email protected]>

* remove run predict

Signed-off-by: Yang Zhang <[email protected]>

* add images

Signed-off-by: Yang Zhang <[email protected]>

* add images

Signed-off-by: Yang Zhang <[email protected]>

* remove redundant notebook

Signed-off-by: Yang Zhang <[email protected]>

* remove readme

Signed-off-by: Yang Zhang <[email protected]>

* change requirement name for tn

Signed-off-by: Yang Zhang <[email protected]>

* updated wfst tut

Signed-off-by: Yang Zhang <[email protected]>

* added final notes

Signed-off-by: Yang Zhang <[email protected]>

* added final notes

Signed-off-by: Yang Zhang <[email protected]>

* redirect readme

Signed-off-by: Yang Zhang <[email protected]>

* remove files

Signed-off-by: Yang Zhang <[email protected]>

* deleted redundant docs

Signed-off-by: Yang Zhang <[email protected]>

* finished deployment docs

Signed-off-by: Yang Zhang <[email protected]>

* fix style

Signed-off-by: Yang Zhang <[email protected]>

* update tutorials

Signed-off-by: Yang Zhang <[email protected]>

* add more test cases

Signed-off-by: Yang Zhang <[email protected]>

* update

Signed-off-by: Yang Zhang <[email protected]>

* update

Signed-off-by: Yang Zhang <[email protected]>

* add

Signed-off-by: Yang Zhang <[email protected]>

* remove image

Signed-off-by: Yang Zhang <[email protected]>

* add link

Signed-off-by: Yang Zhang <[email protected]>

* lgtm

Signed-off-by: Yang Zhang <[email protected]>

* sort imports

Signed-off-by: Yang Zhang <[email protected]>

* changed to language support matrix

Signed-off-by: Yang Zhang <[email protected]>

* changed to language support matrix

Signed-off-by: Yang Zhang <[email protected]>

* fix branch

Signed-off-by: Yang Zhang <[email protected]>
yzhang123 authored May 6, 2022
1 parent 6da257a commit 30bda3b
Showing 33 changed files with 7,754 additions and 8,760 deletions.
8 changes: 4 additions & 4 deletions Jenkinsfile
@@ -124,12 +124,12 @@ pipeline {
parallel {
stage('En TN grammars') {
steps {
-sh 'CUDA_VISIBLE_DEVICES="" python nemo_text_processing/text_normalization/normalize.py "1" --cache_dir /home/TestData/nlp/text_norm/ci/grammars/4-26'
+sh 'CUDA_VISIBLE_DEVICES="" python nemo_text_processing/text_normalization/normalize.py --text="1" --cache_dir /home/TestData/nlp/text_norm/ci/grammars/4-26'
}
}
stage('En ITN grammars') {
steps {
-sh 'CUDA_VISIBLE_DEVICES="" python nemo_text_processing/inverse_text_normalization/inverse_normalize.py --language en "twenty" --cache_dir /home/TestData/nlp/text_norm/ci/grammars/4-26'
+sh 'CUDA_VISIBLE_DEVICES="" python nemo_text_processing/inverse_text_normalization/inverse_normalize.py --language en --text="twenty" --cache_dir /home/TestData/nlp/text_norm/ci/grammars/4-26'
}
}
stage('Test En non-deterministic TN & Run all En TN/ITN tests (restore grammars from cache)') {
@@ -153,7 +153,7 @@ pipeline {
stage('L2: Eng TN') {
steps {
sh 'cd tools/text_processing_deployment && python pynini_export.py --output=/home/TestData/nlp/text_norm/output/ --grammars=tn_grammars --cache_dir /home/TestData/nlp/text_norm/ci/grammars/4-26 --language=en && ls -R /home/TestData/nlp/text_norm/output/ && echo ".far files created "|| exit 1'
-sh 'cd nemo_text_processing/text_normalization/ && python run_predict.py --input=/home/TestData/nlp/text_norm/ci/test.txt --input_case="lower_cased" --language=en --output=/home/TestData/nlp/text_norm/output/test.pynini.txt --verbose'
+sh 'cd nemo_text_processing/text_normalization/ && python normalize.py --input_file=/home/TestData/nlp/text_norm/ci/test.txt --input_case="lower_cased" --language=en --output_file=/home/TestData/nlp/text_norm/output/test.pynini.txt --verbose'
sh 'cat /home/TestData/nlp/text_norm/output/test.pynini.txt'
sh 'cmp --silent /home/TestData/nlp/text_norm/output/test.pynini.txt /home/TestData/nlp/text_norm/ci/test_goal_py_04-14.txt || exit 1'
sh 'rm -rf /home/TestData/nlp/text_norm/output/*'
@@ -163,7 +163,7 @@ pipeline {
stage('L2: Eng ITN export') {
steps {
sh 'cd tools/text_processing_deployment && python pynini_export.py --output=/home/TestData/nlp/text_denorm/output/ --grammars=itn_grammars --cache_dir /home/TestData/nlp/text_norm/ci/grammars/4-26 --language=en && ls -R /home/TestData/nlp/text_denorm/output/ && echo ".far files created "|| exit 1'
-sh 'cd nemo_text_processing/inverse_text_normalization/ && python run_predict.py --input=/home/TestData/nlp/text_denorm/ci/test.txt --language=en --output=/home/TestData/nlp/text_denorm/output/test.pynini.txt --verbose'
+sh 'cd nemo_text_processing/inverse_text_normalization/ && python inverse_normalize.py --input_file=/home/TestData/nlp/text_denorm/ci/test.txt --language=en --output_file=/home/TestData/nlp/text_denorm/output/test.pynini.txt --verbose'
sh 'cmp --silent /home/TestData/nlp/text_denorm/output/test.pynini.txt /home/TestData/nlp/text_denorm/ci/test_goal_py.txt || exit 1'
sh 'rm -rf /home/TestData/nlp/text_denorm/output/*'
}
2 changes: 0 additions & 2 deletions docs/source/nlp/text_normalization/intro.rst
@@ -1,8 +1,6 @@
(Inverse) Text Normalization
============================

-NeMo supports Text Normalization (TN) and Inverse Text Normalization (ITN) tasks via rule-based `nemo_text_processing` python package and Neural-based TN/ITN model.
-
Rule-based (WFST) TN/ITN:

.. toctree::
11 changes: 2 additions & 9 deletions docs/source/nlp/text_normalization/wfst/intro.rst
@@ -1,22 +1,15 @@
WFST-based (Inverse) Text Normalization
=======================================

NeMo supports Text Normalization (TN) and Inverse Text Normalization (ITN) tasks via rule-based `nemo_text_processing` python package and Neural-based TN/ITN model.
NeMo supports Text Normalization (TN), audio-based TN and Inverse Text Normalization (ITN) tasks.

`nemo_text_processing` that is installed with the `nemo_toolkit`, see :doc:`NeMo Introduction <../starthere/intro>` for installation details.
Additional requirements can be found in `setup.sh <https://github.com/NVIDIA/NeMo/blob/stable/nemo_text_processing/setup.sh>`_.

Tutorials on how to get started with WFST-based NeMo text normalization could be found `tutorials/text_processing <https://github.com/NVIDIA/NeMo/tree/stable/tutorials/text_processing>`_.

Rule-based (WFST) TN/ITN:
WFST-based TN/ITN:

.. toctree::
:maxdepth: 2

wfst_text_normalization
wfst_inverse_text_normalization
wfst_text_processing_deployment
wfst_api



37 changes: 0 additions & 37 deletions docs/source/nlp/text_normalization/wfst/wfst_api.rst

This file was deleted.
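
Note on the Jenkinsfile changes above: the CI calls move from a positional input string and `run_predict.py --input/--output` to explicit `--text`, `--input_file`, and `--output_file` flags on `normalize.py` and `inverse_normalize.py`. The sketch below only assembles such a command line so the two modes (inline text versus file input) are easy to compare. `build_normalize_cmd` is a hypothetical helper, not part of NeMo; the flag names are taken from the diff, and nothing is executed.

```python
import shlex
from typing import List, Optional


def build_normalize_cmd(
    script: str = "normalize.py",
    text: Optional[str] = None,
    input_file: Optional[str] = None,
    output_file: Optional[str] = None,
    language: str = "en",
    input_case: Optional[str] = None,
    cache_dir: Optional[str] = None,
    verbose: bool = False,
) -> List[str]:
    """Hypothetical helper: build argv for the flag-style CLI shown in the diff."""
    # Exactly one input source is expected: inline text or an input file.
    if (text is None) == (input_file is None):
        raise ValueError("pass exactly one of text or input_file")

    cmd = ["python", script, f"--language={language}"]
    if text is not None:
        cmd.append(f"--text={text}")
    else:
        cmd.append(f"--input_file={input_file}")
    if input_case is not None:
        cmd.append(f"--input_case={input_case}")
    if output_file is not None:
        cmd.append(f"--output_file={output_file}")
    if cache_dir is not None:
        # The Jenkinsfile passes the cache directory as a separate argument.
        cmd.extend(["--cache_dir", cache_dir])
    if verbose:
        cmd.append("--verbose")
    return cmd


if __name__ == "__main__":
    # Inline-text mode, as in the 'En TN grammars' stage.
    print(shlex.join(build_normalize_cmd(text="1", cache_dir="/tmp/grammars")))
    # File mode, as in the 'L2: Eng TN' stage (paths here are placeholders).
    print(shlex.join(build_normalize_cmd(
        input_file="test.txt",
        output_file="test.pynini.txt",
        input_case="lower_cased",
        verbose=True,
    )))
```

Returning a list rather than a single string keeps quoting explicit, so the result can be handed to `subprocess.run` without going through a shell.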