[TTS][refactor] Part 3 - nemo.collections.tts.g2p.models #6113

Merged
merged 4 commits into from
Feb 25, 2023
4 changes: 2 additions & 2 deletions Jenkinsfile
@@ -1035,7 +1035,7 @@ pipeline {
steps {
sh 'cd examples/tts/g2p && \
TIME=`date +"%Y-%m-%d-%T"` && OUTPUT_DIR=output_${TIME} && \
- python heteronym_classification_train_and_evaluate.py \
+ python g2p_heteronym_classification_train_and_evaluate.py \
train_manifest=/home/TestData/g2p/manifest.json \
validation_manifest=/home/TestData/g2p/manifest.json \
test_manifest=/home/TestData/g2p/manifest.json \
@@ -1047,7 +1047,7 @@ pipeline {
exp_manager.exp_dir=${OUTPUT_DIR} \
+exp_manager.use_datetime_version=False\
+exp_manager.version=test && \
- python heteronym_classification_inference.py \
+ python g2p_heteronym_classification_inference.py \
manifest=/home/TestData/g2p/manifest.json \
pretrained_model=${OUTPUT_DIR}/HeteronymClassification/test/checkpoints/HeteronymClassification.nemo \
output_manifest=preds.json'
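
The renamed scripts take Hydra-style `key=value` overrides, so the CI invocation in this hunk could also be assembled programmatically. The following is a minimal sketch, assuming only the script name and override keys shown above; the helper `build_command` and the example paths are hypothetical and not part of NeMo.

```python
import shlex
from typing import Dict, List


def build_command(script: str, overrides: Dict[str, str]) -> List[str]:
    """Return an argv list of the form: python <script> key=value ..."""
    return ["python", script] + [f"{key}={value}" for key, value in overrides.items()]


# Script name and keys come from the Jenkinsfile hunk; the paths are placeholders.
inference_cmd = build_command(
    "g2p_heteronym_classification_inference.py",
    {
        "manifest": "/path/to/manifest.json",
        "pretrained_model": "/path/to/HeteronymClassification.nemo",
        "output_manifest": "preds.json",
    },
)

# shlex.join quotes each argument so the string is safe to paste into a shell step.
print(shlex.join(inference_cmd))
```

Passing the list form to `subprocess.run` avoids shell quoting entirely, which sidesteps unbalanced-quote mistakes in hand-written commands.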
10 changes: 5 additions & 5 deletions docs/source/tts/g2p.rst
@@ -59,7 +59,7 @@ To train ByT5 G2P model and evaluate it after at the end of the training, run:
do_training=True \
do_testing=True

- Example of the config file: ``NeMo/examples/text_processing/g2p/conf/t5_g2p.yaml``.
+ Example of the config file: ``NeMo/examples/tts/g2p/conf/g2p_t5.yaml``.


To train G2P-Conformer model and evaluate it after at the end of the training, run:
@@ -168,7 +168,7 @@ To train the model, run:

.. code-block::

- python heteronym_classification_train_and_evaluate.py \
+ python g2p_heteronym_classification_train_and_evaluate.py \
train_manifest="<Path to train manifest file>" \
validation_manifest="<Path to validation manifest file>" \
model.wordids=<Path to wordids.tsv file, similar to https://github.com/google-research-datasets/WikipediaHomographData/blob/master/data/wordids.tsv> \
@@ -179,7 +179,7 @@ To train the model and evaluate it when the training is complete, run:

.. code-block::

- python heteronym_classification_train_and_evaluate.py \
+ python g2p_heteronym_classification_train_and_evaluate.py \
train_manifest="<Path to train manifest file>" \
validation_manifest="<Path to validation manifest file>" \
model.test_ds.dataset.manifest="<Path to test manifest file>" \
@@ -191,7 +191,7 @@ To evaluate pretrained model, run:

.. code-block::

- python heteronym_classification_train_and_evaluate.py \
+ python g2p_heteronym_classification_train_and_evaluate.py \
do_training=False \
do_testing=True \
model.test_ds.dataset.manifest="<Path to test manifest file>" \
@@ -201,7 +201,7 @@ To run inference with a pretrained HeteronymClassificationModel, run:

.. code-block::

- python heteronym_classification_inference.py \
+ python g2p_heteronym_classification_inference.py \
manifest="<Path to .json manifest>" \
pretrained_model="<Path to .nemo file or pretrained model name from list_available_models()>" \
output_manifest="<Path to .json manifest to save prediction>"
File renamed without changes.
@@ -22,7 +22,7 @@
import torch
from omegaconf import OmegaConf

- from nemo.collections.tts.g2p.models.heteronym_classification import HeteronymClassificationModel
+ from nemo.collections.tts.models.g2p_heteronym_classification import HeteronymClassificationModel
from nemo.core.config import hydra_runner
from nemo.utils import logging

@@ -34,15 +34,15 @@

Inference from manifest:

- python heteronym_classification_inference.py \
+ python g2p_heteronym_classification_inference.py \
manifest="<Path to .json manifest>" \
pretrained_model="<Path to .nemo file or pretrained model name from list_available_models()>" \
output_manifest="<Path to .json manifest to save prediction>" \
wordid_to_phonemes_file="<Path to a file with mapping from wordid predicted by the model to phonemes>"

Interactive inference:

- python heteronym_classification_inference.py \
+ python g2p_heteronym_classification_inference.py \
pretrained_model="<Path to .nemo file or pretrained model name from list_available_models()>" \
wordid_to_phonemes_file="<Path to a file with mapping from wordid predicted by the model to phonemes>" # Optional

@@ -18,7 +18,7 @@
import torch

from nemo.collections.common.callbacks import LogEpochTimeCallback
- from nemo.collections.tts.g2p.models.heteronym_classification import HeteronymClassificationModel
+ from nemo.collections.tts.models.g2p_heteronym_classification import HeteronymClassificationModel
from nemo.core.config import hydra_runner
from nemo.utils import logging
from nemo.utils.exp_manager import exp_manager
@@ -29,14 +29,14 @@
To prepare dataset, see NeMo/scripts/dataset_processing/g2p/export_wikihomograph_data_to_manifest.py

To run training:
- python heteronym_classification_train_and_evaluate.py \
+ python g2p_heteronym_classification_train_and_evaluate.py \
train_manifest="<Path to train manifest file>" \
validation_manifest="<Path to validation manifest file>" \
model.wordids="<Path to wordids.tsv file>" \
do_training=True

To run training and testing (once the training is complete):
- python heteronym_classification_train_and_evaluate.py \
+ python g2p_heteronym_classification_train_and_evaluate.py \
train_manifest="<Path to train manifest file>" \
validation_manifest="<Path to validation manifest file>" \
model.test_ds.dataset.manifest="<Path to test manifest file>" \
@@ -45,7 +45,7 @@
do_testing=True

To run testing:
- python heteronym_classification_train_and_evaluate.py \
+ python g2p_heteronym_classification_train_and_evaluate.py \
do_training=False \
do_testing=True \
model.test_ds.dataset.manifest="<Path to test manifest file>" \
@@ -60,7 +60,7 @@
"""


- @hydra_runner(config_path="conf", config_name="heteronym_classification.yaml")
+ @hydra_runner(config_path="conf", config_name="g2p_heteronym_classification.yaml")
def main(cfg):
trainer = pl.Trainer(**cfg.trainer)
exp_manager(trainer, cfg.get("exp_manager", None))
2 changes: 1 addition & 1 deletion examples/tts/g2p/g2p_inference.py
@@ -21,7 +21,7 @@
from omegaconf import OmegaConf
from utils import get_metrics

- from nemo.collections.tts.g2p.models.g2p_model import G2PModel
+ from nemo.collections.tts.models.base import G2PModel
from nemo.core.config import hydra_runner
from nemo.utils import logging

8 changes: 4 additions & 4 deletions examples/tts/g2p/g2p_train_and_evaluate.py
@@ -19,14 +19,14 @@
from utils import get_model

from nemo.collections.common.callbacks import LogEpochTimeCallback
- from nemo.collections.tts.g2p.models.g2p_model import G2PModel
+ from nemo.collections.tts.models.base import G2PModel
from nemo.core.config import hydra_runner
from nemo.utils import logging, model_utils
from nemo.utils.exp_manager import exp_manager

"""
This script supports training of G2PModels
- (for T5G2PModel use t5_g2p.yaml, for CTCG2PModel use either g2p_conformer.yaml or g2p_t5_ctc.yaml)
+ (for T5G2PModel use g2p_t5.yaml, for CTCG2PModel use either g2p_conformer.yaml or g2p_t5_ctc.yaml)

# Training T5G2PModel and evaluation at the end of training:
python examples/text_processing/g2p/g2p_train_and_evaluate.py \
@@ -38,7 +38,7 @@
do_training=True \
do_testing=True

- Example of the config file: NeMo/examples/text_processing/g2p/conf/t5_g2p.yaml
+ Example of the config file: NeMo/examples/tts/g2p/conf/g2p_t5.yaml

# Training Conformer-G2P Model and evaluation at the end of training:
python examples/text_processing/g2p/g2p_train_and_evaluate.py \
@@ -64,7 +64,7 @@
"""


- @hydra_runner(config_path="conf", config_name="t5_g2p")
+ @hydra_runner(config_path="conf", config_name="g2p_t5")
def main(cfg):
trainer = pl.Trainer(**cfg.trainer)
exp_manager(trainer, cfg.get("exp_manager", None))
4 changes: 2 additions & 2 deletions examples/tts/g2p/utils.py
@@ -15,8 +15,8 @@
import json

from nemo.collections.asr.metrics.wer import word_error_rate
- from nemo.collections.tts.g2p.models import CTCG2PModel
- from nemo.collections.tts.g2p.models.t5_g2p import T5G2PModel
+ from nemo.collections.tts.models.g2p_ctc import CTCG2PModel
+ from nemo.collections.tts.models.g2p_t5 import T5G2PModel
from nemo.utils import logging
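
Downstream code that has to run against NeMo versions from both before and after this refactor could try the new module path first and fall back to the old one. This is a sketch only: `import_first_available` is a hypothetical helper, not a NeMo API, and the two module paths are the ones shown in this hunk.

```python
import importlib
from typing import Any, Iterable, Tuple


def import_first_available(candidates: Iterable[Tuple[str, str]]) -> Any:
    """Return the first attribute that imports successfully from (module, attribute) pairs."""
    errors = []
    for module_name, attr_name in candidates:
        try:
            module = importlib.import_module(module_name)
            return getattr(module, attr_name)
        except (ImportError, AttributeError) as exc:
            # Remember why this candidate failed, then try the next one.
            errors.append(f"{module_name}.{attr_name}: {exc}")
    raise ImportError("none of the candidates could be imported:\n" + "\n".join(errors))


# New location first, pre-refactor location second (both paths appear in this diff).
T5_G2P_CANDIDATES = [
    ("nemo.collections.tts.models.g2p_t5", "T5G2PModel"),
    ("nemo.collections.tts.g2p.models.t5_g2p", "T5G2PModel"),
]

# Usage, in an environment where NeMo is installed:
# T5G2PModel = import_first_available(T5_G2P_CANDIDATES)
```

Listing the new path first keeps the common case fast and makes the fallback easy to delete once older versions no longer need support.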


13 changes: 0 additions & 13 deletions nemo/collections/tts/g2p/data/__init__.py

This file was deleted.

18 changes: 0 additions & 18 deletions nemo/collections/tts/g2p/models/__init__.py

This file was deleted.

82 changes: 0 additions & 82 deletions nemo/collections/tts/g2p/models/g2p_model.py

This file was deleted.
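
With the old `nemo/collections/tts/g2p/models/` package files deleted, imports that still point at the old module paths need updating. The sketch below rewrites the three module-level renames visible in this diff. `MODULE_RENAMES` and `rewrite_imports` are hypothetical names for illustration. The package-level import `from nemo.collections.tts.g2p.models import CTCG2PModel` is not covered, because its new module (`g2p_ctc`) depends on the imported class.

```python
import re

# Old module path -> new module path, as changed in this PR's hunks.
MODULE_RENAMES = {
    "nemo.collections.tts.g2p.models.heteronym_classification":
        "nemo.collections.tts.models.g2p_heteronym_classification",
    "nemo.collections.tts.g2p.models.g2p_model": "nemo.collections.tts.models.base",
    "nemo.collections.tts.g2p.models.t5_g2p": "nemo.collections.tts.models.g2p_t5",
}

# Match a full old path only: it must not be followed by more identifier characters or a dot.
_PATTERN = re.compile(
    "|".join(
        re.escape(old) + r"(?![\w.])"
        for old in sorted(MODULE_RENAMES, key=len, reverse=True)
    )
)


def rewrite_imports(source: str) -> str:
    """Replace every old module path in `source` with its new location."""
    return _PATTERN.sub(lambda match: MODULE_RENAMES[match.group(0)], source)


old_line = "from nemo.collections.tts.g2p.models.g2p_model import G2PModel"
print(rewrite_imports(old_line))
```

Text that contains none of the old paths passes through unchanged, so the function can be applied to whole files.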

2 changes: 1 addition & 1 deletion nemo/collections/tts/g2p/modules.py
@@ -79,7 +79,7 @@ def setup_heteronym_model(
"""

try:
- from nemo.collections.tts.g2p.models.heteronym_classification import HeteronymClassificationModel
+ from nemo.collections.tts.models.g2p_heteronym_classification import HeteronymClassificationModel

self.heteronym_model = heteronym_model
self.heteronym_model.set_wordid_to_phonemes(wordid_to_phonemes_file)