[TTS] Add script for text preprocessing #6541

Merged
merged 2 commits into from
May 22, 2023

rlangman (Collaborator) commented May 2, 2023

What does this PR do?

Many people already keep a local script that does text processing and normalization before training. Doing this quickly requires multiprocessing or joblib, because our text normalizer does not support batch processing. This PR adds a script for that.

The script currently only performs normalization and optional lowercasing of the final output, but it could be extended with additional processing functionality as needed.
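
For readers unfamiliar with the pattern, here is a minimal sketch of parallel line-by-line normalization with optional lowercasing. It is not the script added in this PR. `toy_normalize`, `_process_line`, and `preprocess_lines` are hypothetical names, and `toy_normalize` is only a stand-in for a real normalizer call. It uses the standard library's `multiprocessing.Pool`, though joblib would serve the same purpose.

```python
import re
from functools import partial
from multiprocessing import Pool
from typing import Callable, List


def toy_normalize(text: str) -> str:
    """Hypothetical stand-in for a real text normalizer (NOT NeMo's)."""
    text = text.replace("&", " and ")
    # Collapse runs of whitespace left over from the replacement.
    return re.sub(r"\s+", " ", text).strip()


def _process_line(line: str, normalize_fn: Callable[[str], str], lower_case: bool) -> str:
    """Normalize one line, then optionally lowercase the final output."""
    out = normalize_fn(line)
    return out.lower() if lower_case else out


def preprocess_lines(
    lines: List[str],
    normalize_fn: Callable[[str], str] = toy_normalize,
    lower_case: bool = False,
    num_workers: int = 1,
) -> List[str]:
    """Apply normalize_fn to every line, using worker processes if requested.

    The normalizer handles one string at a time (no batch mode), so the
    speedup comes from spreading lines across processes.
    """
    # Module-level function + partial keeps the worker picklable.
    worker = partial(_process_line, normalize_fn=normalize_fn, lower_case=lower_case)
    if num_workers <= 1:
        return [worker(line) for line in lines]
    with Pool(processes=num_workers) as pool:
        # pool.map preserves input order in its result list.
        return pool.map(worker, lines)
```

To use more than one worker, call `preprocess_lines(lines, num_workers=4)` from a script protected by an `if __name__ == "__main__":` guard, and pass a module-level function as `normalize_fn` so it can be pickled and sent to the worker processes.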

Collection: [TTS]

Changelog

  • Create text preprocessing script
  • Add example config file for English text normalization

Before your PR is "Ready for review"

Pre checks:

  • Make sure you read and followed Contributor guidelines
  • Did you write any new necessary tests?
  • Did you add or update any necessary documentation?
  • Does the PR affect components that are optional to install? (Ex: Numba, Pynini, Apex etc)
    • Reviewer: Does the PR have correct import guards for all optional libraries?

PR Type:

  • New Feature
  • Bugfix
  • Documentation

redoctopus previously approved these changes May 3, 2023
@rlangman rlangman merged commit 0838fe8 into main May 22, 2023
8 checks passed
@rlangman rlangman deleted the tts_text branch May 22, 2023 18:13
hsiehjackson pushed a commit to hsiehjackson/NeMo that referenced this pull request Jun 2, 2023
* [TTS] Add script for text preprocessing

Signed-off-by: Ryan <[email protected]>

* [TTS] Use Normalizer.input_case

Signed-off-by: Ryan <[email protected]>

---------

Signed-off-by: Ryan <[email protected]>
Signed-off-by: hsiehjackson <[email protected]>

3 participants