[Chinese text normalization]Chinese TN part in text_normalization #4826
@yzhang123 I opened a new PR as a clean version.
score = (
    pynutil.insert("score: \"")
    + Cardinal().graph_cardinal
    + pynini.cross(":", "比")
    + Cardinal().graph_cardinal
    + pynutil.insert("\"")
)
Please use the symbols from the TSV file.
Done.
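The reviewer's request amounts to keeping the symbol-to-spoken-form pairs in a data file instead of hard-coding them in the grammar. In the PR this is done with `pynini.string_file`; the sketch below only illustrates the same idea in plain Python so it can run without pynini. The TSV contents, `load_symbol_map`, and `verbalize_symbols` are hypothetical and are not taken from the PR.

```python
import csv
import io

# Hypothetical contents of a score TSV: written symbol <TAB> spoken form.
# The real data/math/score.tsv is not shown in this thread.
SCORE_TSV = ":\t比\n"


def load_symbol_map(tsv_text):
    """Parse two-column TSV text into a {written: spoken} dict."""
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t")
    return {row[0]: row[1] for row in reader if len(row) >= 2}


def verbalize_symbols(text, symbol_map):
    """Replace every mapped single-character symbol with its spoken form."""
    return "".join(symbol_map.get(ch, ch) for ch in text)


score_map = load_symbol_map(SCORE_TSV)
print(verbalize_symbols("3:2", score_map))
```

With this layout, supporting another symbol means adding a row to the TSV rather than editing the grammar code. Note that this sketch maps single characters only and leaves the digits untouched, whereas the PR's graph also verbalizes the numbers through `Cardinal().graph_cardinal`.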
Add your symbol to data/math/symbol.tsv. This graph only converts a symbol to its character form;
more specific cases can be added with more detailed rules.
'''
score_sign = pynini.string_file(get_abs_path("data/math/score.tsv"))
`get_abs_path` is not defined here or in other files.
Sorry about this. It was introduced by my last change, which tried to add the TSV file in math. I checked the other files and they are correct now.
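For context, calling `get_abs_path` without importing or defining it raises a `NameError` when the module is loaded. The thread does not show the helper's body, so the following is only a guess at what such a helper typically does: resolve a data path relative to a known base directory. The `base_dir` parameter is an assumption added here for illustration and is not part of the PR.

```python
import os


def get_abs_path(rel_path, base_dir=None):
    """Return an absolute path for rel_path.

    base_dir is a hypothetical parameter for this sketch; when omitted,
    the current working directory is used as the base.
    """
    if base_dir is None:
        base_dir = os.getcwd()
    return os.path.abspath(os.path.join(base_dir, rel_path))


# Example: resolve the TSV path used in the snippet under review.
score_tsv_path = get_abs_path("data/math/score.tsv")
print(os.path.isabs(score_tsv_path))
```

Whatever the real implementation looks like, the module that calls it needs the import in place before `pynini.string_file(get_abs_path(...))` runs.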
Signed-off-by: Ubuntu <[email protected]>
Signed-off-by: nithinraok <[email protected]>
@pengzhendong Thanks for your review of this and the previous PRs. If there is nothing more from your side, I'm OK to merge. FYI: all pytest and SH tests passed.
Hope this is not too late: last week we received a document from the ASR team in Shanghai on how to handle numbers. Here is the link: http://www.moe.gov.cn/ewebeditor/uploadfile/2015/01/13/20150113091154536.pdf
…IDIA#4826)
* add zh in normalize
  Signed-off-by: Ubuntu <[email protected]>
* add "zh" in normalize.py
  Signed-off-by: Ubuntu <[email protected]>
* add zh in tools
  Signed-off-by: Ubuntu <[email protected]>
* add zh in test
  Signed-off-by: Ubuntu <[email protected]>
* fix bug in en/graph_utils.py
  Signed-off-by: Ubuntu <[email protected]>
* Update README.md
  Signed-off-by: Ubuntu <[email protected]>
* add score.tsv
  Signed-off-by: Ubuntu <[email protected]>
* update math
  Signed-off-by: Ubuntu <[email protected]>
* add kab language asr models (NVIDIA#4819)
  Signed-off-by: nithinraok <[email protected]>
  Signed-off-by: nithinraok <[email protected]>
  Signed-off-by: Ubuntu <[email protected]>
* add import in math
  Signed-off-by: Ubuntu <[email protected]>
Signed-off-by: Ubuntu <[email protected]>
Signed-off-by: nithinraok <[email protected]>
Co-authored-by: Nithin Rao <[email protected]>
Co-authored-by: Yang Zhang <[email protected]>
Signed-off-by: Matvei Novikov <[email protected]>
…IDIA#4826)
* add zh in normalize
  Signed-off-by: Ubuntu <[email protected]>
* add "zh" in normalize.py
  Signed-off-by: Ubuntu <[email protected]>
* add zh in tools
  Signed-off-by: Ubuntu <[email protected]>
* add zh in test
  Signed-off-by: Ubuntu <[email protected]>
* fix bug in en/graph_utils.py
  Signed-off-by: Ubuntu <[email protected]>
* Update README.md
  Signed-off-by: Ubuntu <[email protected]>
* add score.tsv
  Signed-off-by: Ubuntu <[email protected]>
* update math
  Signed-off-by: Ubuntu <[email protected]>
* add kab language asr models (NVIDIA#4819)
  Signed-off-by: nithinraok <[email protected]>
  Signed-off-by: nithinraok <[email protected]>
  Signed-off-by: Ubuntu <[email protected]>
* add import in math
  Signed-off-by: Ubuntu <[email protected]>
Signed-off-by: Ubuntu <[email protected]>
Signed-off-by: nithinraok <[email protected]>
Co-authored-by: Nithin Rao <[email protected]>
Co-authored-by: Yang Zhang <[email protected]>
Signed-off-by: Hainan Xu <[email protected]>
This PR continues #4543, #4638, and #4683.
What does this PR do?
Adds Chinese text normalization tools to NeMo.
Collection:
[NeMo/nemo_text_processing/text_normalization]
[NeMo/tools/text_processing_deployment]
[NeMo/tests/nemo_text_processing/zh]