Update MagpieTTS model with latest changes #15010
Conversation
moved t5tts script to magpietts

Signed-off-by: Xuesong Yang <[email protected]>
* wip
* attn prior inference implementation
* more hacks
* minor tweaks
* clean ups and make text attention strictly monotonic at inference
* more updates
* minor tweaks
* compute head-wise attention maps
* configurable ctc prior layers during training
* log only ctc prior layers on tensorboard
* add layerwise logging
* more configurable inference
* more configs
* updated end prediction logic as per discussion with Roy
* DPO preference pair creation: add option to choose min length
* Cleanup
* handle cases where predicted codes are very small; not yet tested but should work
* undo predicted-len change since it is not needed
* clean up notebook

Signed-off-by: Paarth Neekhara <[email protected]>
Co-authored-by: Fejgin, Roy <[email protected]>
When doing Pareto ranking make sure to only compare indices that correspond to metrics.
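The constraint in this commit — dominance comparisons must look only at the columns that hold metrics — can be sketched as follows (illustrative code, not the PR's actual implementation; the function name is hypothetical, and lower-is-better is assumed for all metrics):

```python
def dominates(a, b, metric_indices):
    """True iff candidate `a` Pareto-dominates `b`, comparing only the
    columns listed in `metric_indices` (lower is better). Other columns,
    e.g. file paths or sample ids, are deliberately ignored."""
    a_m = [a[i] for i in metric_indices]
    b_m = [b[i] for i in metric_indices]
    # a dominates b if it is no worse on every metric and strictly better on at least one
    return all(x <= y for x, y in zip(a_m, b_m)) and any(x < y for x, y in zip(a_m, b_m))
```

Comparing the full rows instead of only `metric_indices` would let a non-metric column (such as an id) decide dominance, which is exactly the bug this commit guards against.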
…#52) Updated by Jason

* local transformer training tested, prediction not tested
* local transformer updates
* local transformer inference working
* aligner module
* aligner module updates
* wip
* wip
* change aligner text input to encoder output
* obtain hard alignment from t5tts decoder
* log hard attention training
* binarization method, obtain_prior_from_cross_attn fix
* added configs for local transformer and alignment encoder
* added prior window decay factors
* more configs
* config was missing
* slight modification in alignment encoder computation, pass target audio embeddings (removing bos)
* some comments
* prior prob configurable
* update yamls
* refactor inference prior code
* set prior epsilon to 0 to avoid any attention scores on unintended parts
* make prior epsilon configurable in training
* added rtf metrics and notebook, infer and evaluate changes
* turn off alignment encoder training after 50k steps

Signed-off-by: Paarth Neekhara <[email protected]>
Updated by Jason, added back inference class

* wavlm speaker eval
* connect to inference script
* bug fix
* grpo started, training seems to be working
* grpo local training seems ok
* only one generation per item in val
* allow cfg use during generation process
* fix cer threshold for 0 reward
* use kv cache for grpo generation
* remove kv cache for now
* kv cache for online po configurable
* configurable reward params
* grpo val set added in evalset
* comments update
* modify reward scaling
* moved preference optimization code and classes to a new file
* missing file
* added language option in online PO
* some updates in the script
* add reference free option
* handle corner cases
* bug fix in reference free mode and torch.load fix for new container
* added option for pesq reward
* pesq device bug fix

Signed-off-by: Shehzeen Hussain <[email protected]>
* add back missing dev files
* more bug fixes from merge
* add latest changes for rc5 docker

Signed-off-by: Jason <[email protected]>
…to make the recipe work with PTL 1.9+. (#47) Signed-off-by: Xuesong Yang <[email protected]> Co-authored-by: Xuesong Yang <[email protected]>
…. modify and make it optional. (#49) Signed-off-by: Xuesong Yang <[email protected]> Co-authored-by: Xuesong Yang <[email protected]>
…er. (#52)

* structured both loggers for train/val/test.
* enable `resume` param to ensure that resumed training logs are merged onto the previous run id.
* removed `tb_logger` func.

Signed-off-by: Xuesong Yang <[email protected]>
Co-authored-by: Xuesong Yang <[email protected]>
* [magpietts] minor fix for the usage of freezing a model.
* fixed a typo.
* Apply suggestions from code review

Signed-off-by: Xuesong Yang <[email protected]>
Co-authored-by: Jason <[email protected]>
* trainer import fix for new pytorch lightning
* handle strict prior window correctly
* disable autocasting codec model and making prior window strict

Signed-off-by: Paarth Neekhara <[email protected]>
…oader. (#54)

* [magpie][lhotse] added a lhotse dataloader for monologue tts. This is a working recipe with num_workers>0 for training and num_workers=0 for val datasets. Still faced issues when num_workers>0 during validation steps; investigating root causes.
* all contents in a batch are obtained correctly, but dtypes mismatch.
* fix dtype for text tokens and codec codes.
* [lhotse_shar_prep] add script to create shar dataset.
* with more efficient changes.
* bugfix: previously the last batch would be dropped if its size was less than the buffer size; this fixes it.
* [lhotse_dataloader] clean up commented lines.
* [lhotse_dataloader] bugfix to force spawn over fork to address CUDA initialization errors when multiple workers are used during validation.
* [lhotse_dataloader] skip setting up the tokenizer again for training since it is already set up during model initialization.
* [lhotse_dataloader] switch to setting up the tokenizer inside __getitem__ to support spawned worker processes.
* [magpietts][lhotse] fixed a bug in attach_tensor which saved a wrong numpy array; update yaml config.
* [magpie][lhotse_config] enforce quadratic_duration if using the lhotse dataloader to avoid frequent OOMs; changed yaml name to monologue.
* [magpie][example] add LR logger.
* cleanup
* [lhotse_yaml] made changes to the yaml config according to comments.
* [magpie][lhotse_dataset] added docstring for lhotse dataset.
* [magpie][lhotse_dataset] remove yamls.
* [magpie][lhotse_dataset] remove Edresson's lhotse implementations, and update yaml name.
* [magpie][lhotse_dataset] add a README showing how to create lhotse data.
* [magpie][lhotse_dataset] update MonoCut example.
* rename config

Signed-off-by: Xuesong Yang <[email protected]>
Signed-off-by: Jason <[email protected]>
Co-authored-by: Xuesong Yang <[email protected]>
Co-authored-by: Jason <[email protected]>
* add fix to infer script
* add no context option
* add nemo option to infer script
* add in latest bf16 changes from Edresson
* add comment
* enforce codec precision for now
* fix autocast bug
* another bug fix
* clean PR
* change hardcoded epsilon
* infer changes
* address review

Signed-off-by: Jason <[email protected]>
* bug fix in context text embedding initialization
* bug fixes in infer and evaluate

Signed-off-by: Paarth Neekhara <[email protected]>
Make sure to reserve enough tokens for special uses like EOS/BOS. WARNING: old models will be incompatible with the updated inference YAMLs and will need to override the num_audio_tokens_per_codebook to the value they were trained with.
Signed-off-by: Ryan <[email protected]>
#51)

* preference optimization updates, trainer updates; remove redundant datagen class
* revert model pt change, add freeze_model function
* remove redundant inference class
* remove custom freeze model function and use lightning's inbuilt freeze instead
* added a readme for magpie preference optimization
* change class name from MagpieTTSModelInference to MagpieTTSModelPrefDataGen
* update class name from MagpieTTSModelPrefDataGen to MagpieTTSModelOfflinePODataGen

Signed-off-by: Shehzeen Hussain <[email protected]>
…codes (#66)

* [magpie][wandb] add logging for pad ratios for text tokens and audio codes.
* [magpie][wandb] fix pad ratio calculation

Signed-off-by: Xuesong Yang <[email protected]>
Co-authored-by: Xuesong Yang <[email protected]>
* Bugfix: num_audio_tokens_per_codebook
Make sure to reserve enough tokens for special uses like EOS/BOS.
WARNING: old models will be incompatible with the updated inference YAMLs
and will need to override the num_audio_tokens_per_codebook to the value they were
trained with.
* Rework how number of codes and codebooks are handled (WIP)
* Reorder the code a bit for clarity
* Refactor codebook configuration
* read codec parameters from codec checkpoint; remove corresponding configuration from Magpie YAML files
* add mechanism for backward compatibility with older checkpoints:
** If using `infer_and_evaluate.py`, just set the --legacy_codebooks command line flag
** If running training or inference with the Hydra command line, override using the following flags:
```
forced_num_all_tokens_per_codebook: 2048
forced_audio_bos_id: ${sum:${model.forced_num_all_tokens_per_codebook}, -1} # 2047
forced_audio_eos_id: ${sum:${model.forced_num_all_tokens_per_codebook}, -2} # 2046
forced_context_audio_bos_id: ${sum:${model.forced_num_all_tokens_per_codebook}, -4} # 2044
forced_context_audio_eos_id: ${sum:${model.forced_num_all_tokens_per_codebook}, -3} # 2045
```
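For legacy checkpoints, the overrides above pin the special-token IDs to the top of a 2048-entry codebook. A minimal sketch of that arithmetic (the helper name is hypothetical and not NeMo API; it only mirrors the `${sum:...}` expressions shown):

```python
def legacy_special_token_ids(num_all_tokens_per_codebook: int = 2048) -> dict:
    """Legacy layout implied by the Hydra overrides above: the four special
    tokens occupy the last four slots of the codebook. Illustrative only."""
    n = num_all_tokens_per_codebook
    return {
        "audio_bos_id": n - 1,          # 2047
        "audio_eos_id": n - 2,          # 2046
        "context_audio_bos_id": n - 4,  # 2044
        "context_audio_eos_id": n - 3,  # 2045
    }
```

This is why older models break without the overrides: the refactor changes where these reserved IDs land, so a checkpoint trained with the old layout must force the old values back.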
* Add README on the codebook reorganization
... and how to load legacy checkpoints.
* Cleanup
* Cleanup and fixing typos
* Cleanup
* Cleanup
* Clarify the README on the embedding table layout
* README cleanup
* Rename an attribute for clarity:
codec_model_downsample_factor --> codec_model_samples_per_frame
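The new name makes the frames-to-samples relationship explicit. A minimal sketch of how such an attribute is typically used (the value 1024 is an illustrative assumption, not the actual codec's configuration):

```python
# Assumed value for illustration; a real codec defines this in its config.
codec_model_samples_per_frame = 1024

def frames_to_samples(num_codec_frames: int) -> int:
    """Convert a codec frame count to the corresponding audio sample count."""
    return num_codec_frames * codec_model_samples_per_frame
```

Under the old name, "downsample factor" left ambiguous which direction the factor applied; "samples per frame" reads directly as a multiplier from frames to samples.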
…nd image on the sliding bar instead of incrementing by 1. (#61)

* [magpie][wandb][bugfix] ensure consistent validation step for audio and image on the sliding bar instead of incrementing by 1.
* [magpietts][loggers] support logging metrics using multiple loggers enabled in exp_manager.
* [magpietts][lhotse_dataset] remove useless imports and functions.

Signed-off-by: Xuesong Yang <[email protected]>
Co-authored-by: Xuesong Yang <[email protected]>
* Refine the README on codebook layout updates
* Typo fix
* Bugfix: wire in the `legacy_codebooks` flag in a missing place
* add update config to infer script
* Update infer_and_evaluate.py

Signed-off-by: Jason <[email protected]>
Pull Request Overview
This PR introduces MagpieTTS, a text-to-speech model with support for training, inference, evaluation, and preference optimization. The changes include:
- Core MagpieTTS model implementation and preference optimization variants
- Comprehensive evaluation and inference scripts with metric computation (CER, WER, SSIM, UTMOSv2, FCD)
- Lhotse dataset integration for efficient data processing and sharding
- Test coverage for transformer modules, FCD metrics, and Lhotse filters
- Utility scripts for data preparation, context audio extraction, and codec processing
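Of the metrics listed above, CER is the simplest to sketch. A minimal pure-Python version via Levenshtein distance (illustrative only; the PR's evaluation scripts may rely on a library implementation instead):

```python
def cer(reference: str, hypothesis: str) -> float:
    """Character error rate: Levenshtein edit distance over reference length.
    Uses a rolling single-row DP table for O(len(hypothesis)) memory."""
    r, h = reference, hypothesis
    dist = list(range(len(h) + 1))  # distances for the empty reference prefix
    for i in range(1, len(r) + 1):
        prev, dist[0] = dist[0], i  # prev holds the diagonal cell d(i-1, j-1)
        for j in range(1, len(h) + 1):
            cur = dist[j]
            dist[j] = min(
                dist[j] + 1,                      # deletion
                dist[j - 1] + 1,                  # insertion
                prev + (r[i - 1] != h[j - 1]),    # substitution / match
            )
            prev = cur
    return dist[len(h)] / max(len(r), 1)
```

WER follows the same recurrence with the strings split into word lists instead of characters.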
Reviewed Changes
Copilot reviewed 61 out of 62 changed files in this pull request and generated 29 comments.
| File | Description |
|---|---|
| tests/collections/tts/modules/test_fcd_metric.py | Adds comprehensive unit tests for Frechet Codec Distance metric |
| tests/collections/tts/modules/test_transformer_2501.py | Updates transformer tests to include mask parameters and adds batched inference tests |
| tests/collections/common/test_lhotse_tts_filters.py | Adds tests for Lhotse dataset filters (CER, speaker similarity, validation status) |
| tests/collections/common/test_lhotse_dataloading.py | Removes duplicate test function |
| scripts/magpietts/*.py | Adds evaluation, inference, data preparation, and codec extraction scripts |
| scripts/magpietts/dpo/*.py | Adds DPO/RPO preference pair creation scripts |
| nemo/collections/tts/modules/utmosv2.py | Adds UTMOSv2 MOS estimation wrapper |
| nemo/collections/tts/modules/encodec_modules.py | Adds properties for num_codebooks and codebook_size |
| nemo/utils/nemo_logging.py | Adds stacklevel parameter to logging calls for better source location reporting |
| nemo/collections/common/tokenizers/text_to_speech/tts_tokenizers.py | Fixes typo and improves AggregatedTTSTokenizer implementation |
```python
@pytest.mark.unit
def test_codebooks_mismatch_update(self, metric, device, codec):
    """Test that the FCD metric doesn't crash when provided with incorrect number ofcodebooks."""
```

Copilot AI · Oct 30, 2025

Missing space between 'of' and 'codebooks' in the docstring.
```python
# @property
# def codebook_size(self):
#     """Returns the size of the implicit codebook."""
#     return self.codebook_size_per_group**self.num_groups
```

Copilot AI · Oct 30, 2025

This comment appears to contain commented-out code. Suggested change: delete the commented-out block.
```python
import os
import random
import re
import time
```

Copilot AI · Oct 30, 2025

Import of 'time' is not used.
```python
    'alignment_loss': alignment_loss,
}

def training_step(self, batch, batch_idx):
```

Copilot AI · Oct 30, 2025

This method is shadowed by attribute training_step in superclass ModelPT.
```python
    'batch_metrics': generated_codes_and_metrics['metrics'],
}

def training_step(self, batch, batch_idx):
```

Copilot AI · Oct 30, 2025

This method is shadowed by attribute training_step in superclass ModelPT. Suggested change: rename `training_step` to `ptl_training_step`.
```python
    'text': context_tensors['text'],
    'text_lens': context_tensors['text_lens'],
    'context_audio_codes': context_tensors['context_audio_codes'],
    'context_audio_codes_lens': context_tensors['context_audio_codes_lens'],
    'dec_context_size': dec_context_size,
    'aligner_attn_soft': aligner_attn_soft,
    'aligner_attn_hard': aligner_attn_hard,
}

def training_step(self, batch, batch_idx):
```

Copilot AI · Oct 30, 2025

This method is shadowed by attribute training_step in superclass ModelPT.
```python
print("...Making Shars")
out_shar_dir = Path(out_shar_dir)
out_shar_dir.mkdir(parents=True, exist_ok=True)
shard_size = shard_size
```

Copilot AI · Oct 30, 2025

This assignment assigns a variable to itself.
```python
num_audio_samples = num_codec_frames * self.codec_model_samples_per_frame
return num_audio_samples

def __getitem__(self, cuts: CutSet) -> Dict[str, Union[torch.Tensor, List]]:
```

Check notice — Code scanning / CodeQL: Non-standard exception raised in special method (`ValueError`).

Copilot Autofix · AI · 5 days ago

To adhere to Python conventions for `__getitem__`, change the exception type from `ValueError` to `KeyError` in `nemo/collections/tts/data/text_to_speech_dataset_lhotse.py`, within the `__getitem__` method (the item-access check at lines 230–232). No additional imports are required; `KeyError` is a built-in exception.
```diff
@@ -229,7 +229,7 @@
 for cut in cuts:
     speaker = cut.supervisions[0].speaker
     if not check_speaker_format(speaker):
-        raise ValueError(f"Invalid format in cut.supervisions[0].speaker: {speaker}")
+        raise KeyError(f"Invalid format in cut.supervisions[0].speaker: {speaker}")
 dataset_name = speaker.strip().split()[2].split(":")[-1]
 dataset_name_list.append(dataset_name)
```
nemo/collections/tts/models/magpietts_preference_optimization.py
```python
self.target_sample_rate = target_sample_rate
self.codec_model_samples_per_frame = codec_model_samples_per_frame

def __getitem__(self, cuts: CutSet) -> Optional[Dict[str, Any]]:
```

Check notice — Code scanning / CodeQL: Non-standard exception raised in special method (`ValueError`, raised in three places).
Copilot Autofix · AI · 5 days ago

Change every `raise ValueError` in the `__getitem__` method of `AudioPairLhotseDataset` to `raise KeyError` instead. This covers the branches where the required keys "shard_origin" and "context_recording" are missing from `cut.custom`, and where "shard_origin" does not match the pattern used to extract a shard index. Preserve the error messages so debugging remains clear. No additional imports are required; `KeyError` is a built-in exception.
```diff
@@ -147,17 +147,17 @@
 if not cut.has_custom("shard_origin"):
     err_msg = f"Cut {cut} is missing required key 'shard_origin'."
     logging.error(err_msg)
-    raise ValueError(err_msg)
+    raise KeyError(err_msg)
 if not cut.has_custom("context_recording"):
     err_msg = f"Cut {cut} is missing required key 'context_recording'."
     logging.error(err_msg)
-    raise ValueError(err_msg)
+    raise KeyError(err_msg)

 # Parse shard index from the custom field, handling potential errors
 origin_path = cut.custom["shard_origin"]
 match = re.search(r"cuts\.(\d+)\.jsonl\.gz$", origin_path)
 if match is None:
-    raise ValueError(f"Could not parse shard index from shard_origin: {origin_path}")
+    raise KeyError(f"Could not parse shard index from shard_origin: {origin_path}")
 shard_idx_origin = int(match.group(1))

 # audio shape: (num_channels (1), num_samples) -> (num_samples)
```
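The convention CodeQL enforces here — a failed lookup in `__getitem__` signals `KeyError`, not `ValueError` — can be illustrated with a minimal mapping-style container (a generic sketch, not NeMo code; the class name is hypothetical):

```python
class ShardedCuts:
    """Minimal mapping-style container illustrating the KeyError convention."""

    def __init__(self, shards: dict):
        self._shards = shards

    def __getitem__(self, key):
        if key not in self._shards:
            # Raising KeyError keeps the class compatible with standard mapping
            # idioms: `try/except KeyError`, membership tests, dict.get-style wrappers.
            raise KeyError(f"Unknown shard: {key}")
        return self._shards[key]
```

Callers written against the mapping protocol catch `KeyError` on missing items; raising `ValueError` instead would silently slip past such handlers.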
… RL; remove some experimental flags Signed-off-by: Jason <[email protected]>
Signed-off-by: blisc <[email protected]>
Signed-off-by: Jason <[email protected]>
…ypes from magpie and Clean up scripts Signed-off-by: Jason <[email protected]>
What does this PR do?
Updates MagpieTTS with latest dev changes.
Collection: tts
Changelog