Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[TTS] Separate TTS tokenization and g2p util to fix circular import #6080

Merged
merged 2 commits into from
Feb 23, 2023

Conversation

rlangman
Copy link
Collaborator

What does this PR do ?

Split the g2p data_utils into 2 util classes, one containing tokenization functions and the other g2p specific ones.

This fixes a circular import between the the text tokenizers and G2p.

Collection: [TTS]

Changelog

  • Move relevant helper methods to a new tokenizer utility to avoid tts_tokenizers importing tts g2p.
  • Update unit tests

Before your PR is "Ready for review"

Pre checks:

  • Make sure you read and followed Contributor guidelines
  • Did you write any new necessary tests?
  • Did you add or update any necessary documentation?
  • Does the PR affect components that are optional to install? (Ex: Numba, Pynini, Apex etc)
    • Reviewer: Does the PR have correct import guards for all optional libraries?

PR Type:

  • New Feature
  • Bugfix
  • Documentation

Copy link
Collaborator

@redoctopus redoctopus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is fine with me. @XuesongYang or @ekmb any comments?

XuesongYang
XuesongYang previously approved these changes Feb 22, 2023
Copy link
Collaborator

@XuesongYang XuesongYang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Please update the year to 2023 in the license header of each changed file.

@rlangman rlangman merged commit 003ef78 into main Feb 23, 2023
@XuesongYang XuesongYang deleted the tts_import branch February 23, 2023 02:43
titu1994 pushed a commit to titu1994/NeMo that referenced this pull request Mar 24, 2023
…VIDIA#6080)

* [TTS] Separate TTS tokenization and g2p util to fix circular import

Signed-off-by: Ryan <[email protected]>

* [TTS] Update year in tts headers

Signed-off-by: Ryan <[email protected]>
hsiehjackson pushed a commit to hsiehjackson/NeMo that referenced this pull request Jun 2, 2023
…VIDIA#6080)

* [TTS] Separate TTS tokenization and g2p util to fix circular import

Signed-off-by: Ryan <[email protected]>

* [TTS] Update year in tts headers

Signed-off-by: Ryan <[email protected]>
Signed-off-by: hsiehjackson <[email protected]>
tango4j added a commit to tango4j/NeMo that referenced this pull request Jun 20, 2023
[TTS] Separate TTS tokenization and g2p util to fix circular import (NVIDIA#6080)

See merge request taejinp/NeMo!3
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants