[TTS] Add additional config to preprocess_text and compute_feature_stats #7321

Merged: 2 commits merged into main on Aug 29, 2023

rlangman
Collaborator

What does this PR do?

Add configuration options to the TTS preprocessing scripts to make them more usable.

Most importantly, this adds a batch_size parameter to the multiprocessing used for text normalization, which I have found makes it about 10x faster than joblib's default 'auto' batch size. I do not fully understand why, but the joblib documentation indicates that large batch sizes work well for small, fast tasks:

batch_size: int or 'auto', default: 'auto'
The number of atomic tasks to dispatch at once to each
worker. When individual evaluations are very fast, dispatching
calls to workers can be slower than sequential computation because
of the overhead. Batching fast computations together can mitigate
this.
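
For illustration only (this is not the code in this PR), a minimal sketch of passing an explicit batch size to joblib. `normalize_line`, `normalize_lines`, and `num_workers` are hypothetical names, and the toy normalizer only stands in for real text normalization; the default of 1000 is an arbitrary example value, not a recommendation from the PR.

```python
def normalize_line(text: str, lower_case: bool = True) -> str:
    """Toy stand-in for a text normalizer: collapse whitespace, optionally lowercase."""
    text = " ".join(text.split())
    return text.lower() if lower_case else text


def normalize_lines(lines, num_workers=4, joblib_batch_size=1000):
    """Normalize lines in parallel, dispatching `joblib_batch_size` tasks at a time."""
    # joblib is a third-party dependency; imported here so only the parallel
    # path requires it.
    from joblib import Parallel, delayed

    # An explicit integer batch_size replaces joblib's default 'auto' heuristic.
    # Each worker receives this many calls per dispatch, which spreads the
    # dispatch overhead over many very fast tasks.
    return Parallel(n_jobs=num_workers, batch_size=joblib_batch_size)(
        delayed(normalize_line)(line) for line in lines
    )
```

Calling `normalize_lines(lines, num_workers=8, joblib_batch_size=1000)` returns the normalized lines in input order. Whether a larger batch size helps depends on how cheap each task is, so it is worth timing on the actual data.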

Collection: [TTS]

Changelog

  • Add batch_size parameter to preprocess_text.py
  • Add input and output field parameters to preprocess_text.py
  • Fix boolean flag parsing for lower_case in preprocess_text.py
  • Modify compute_feature_stats.py so that it can take a list of manifests, making it easy to compute pitch/energy stats across multiple datasets.
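
The description does not say what was wrong with the lower_case flag. One common cause of this class of bug, assumed here purely for illustration, is declaring an argparse option with `type=bool`, which treats any non-empty string (including "False") as true. The sketch below contrasts that with an explicit converter; `str_to_bool`, `build_parser`, and `--lower_case_naive` are hypothetical and not taken from preprocess_text.py.

```python
import argparse


def str_to_bool(value: str) -> bool:
    """Convert common textual spellings of a boolean into a bool."""
    normalized = value.strip().lower()
    if normalized in {"true", "t", "yes", "y", "1"}:
        return True
    if normalized in {"false", "f", "no", "n", "0"}:
        return False
    raise argparse.ArgumentTypeError(f"expected a boolean, got {value!r}")


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Hypothetical preprocessing CLI")
    # Pitfall: type=bool calls bool("False"), and any non-empty string is truthy.
    parser.add_argument("--lower_case_naive", type=bool, default=False)
    # Explicit conversion parses the text of the argument instead.
    parser.add_argument("--lower_case", type=str_to_bool, default=False)
    return parser


args = build_parser().parse_args(
    ["--lower_case_naive", "False", "--lower_case", "False"]
)
print(args.lower_case_naive, args.lower_case)  # prints: True False
```

An alternative with the same effect is `action=argparse.BooleanOptionalAction` (Python 3.9+), which exposes `--lower_case` / `--no-lower_case` instead of taking a value.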

Before your PR is "Ready for review"

Pre checks:

  • Make sure you read and followed Contributor guidelines
  • Did you write any new necessary tests?
  • Did you add or update any necessary documentation?
  • Does the PR affect components that are optional to install? (Ex: Numba, Pynini, Apex etc)
    • Reviewer: Does the PR have correct import guards for all optional libraries?

PR Type:

  • New Feature
  • Bugfix
  • Documentation

@racoiaws
Collaborator

Otherwise LGTM

@rlangman merged commit f265ac4 into main on Aug 29, 2023 (15 checks passed)
@rlangman deleted the tts_preprocess branch on August 29, 2023 00:27
rohitrango pushed a commit to rohitrango/NeMo that referenced this pull request Jun 25, 2024
…ats (NVIDIA#7321)

* [TTS] Add additional config to preprocess_text and compute_feature_stats

Signed-off-by: Ryan <[email protected]>

* [TTS] Rename batch_size to joblib_batch_size

Signed-off-by: Ryan <[email protected]>

---------

Signed-off-by: Ryan <[email protected]>