8 changes: 8 additions & 0 deletions .gitignore
@@ -32,6 +32,14 @@ build
.venv
*.lock

# Local caches / secrets (never ship to remote via rsync)
.ssh/
.hf_cache/
.nemo_run/

# Emergent dataset artifacts (large; stored in shared data_dir instead)
nemo_skills/dataset/emergent_tts/data/

__pycache__
.ipynb_checkpoints

124 changes: 124 additions & 0 deletions nemo_skills/dataset/emergent_tts/README.md
@@ -0,0 +1,124 @@
## EmergentTTS-Eval dataset (`emergent_tts`)

This dataset integration lets you:

- **Prepare** the EmergentTTS-Eval test set under a shared `data_dir` (download baseline audios + metadata + MOS model).
- **Generate** TTS outputs with NeMo-Skills (`ns eval` via `run_tts_eval.py`).
- **Score** the generated outputs with EmergentTTS-Eval (WER/MOS/win-rate, depending on config).

### 1) Prepare the test set (requires `HF_TOKEN`)

`prepare.py` downloads the dataset and writes all required artifacts into:

- `<DATA_DIR>/emergent_tts/emergent/test.jsonl`
- `<DATA_DIR>/emergent_tts/data/emergent_tts_eval_data.jsonl`
- `<DATA_DIR>/emergent_tts/data/baseline_audios/*.wav`
- `<DATA_DIR>/emergent_tts/data/wv_mos.ckpt`

Run it from your dev machine (or any environment with network access):

```bash
cd /home/vmendelev/workspace/expressiveness/src/nemo-skills-tts-eval
. ./.venv/bin/activate

export HF_TOKEN="<your_hf_token>"

python nemo_skills/dataset/emergent_tts/prepare.py \
--output_dir "<DATA_DIR>/emergent_tts"
```
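After `prepare.py` finishes, it can help to verify that the four artifacts listed above actually exist before submitting cluster jobs. A minimal sketch (the helper name `check_prepared` is illustrative, not part of the repo):

```python
from pathlib import Path


def check_prepared(data_dir: str) -> list[str]:
    """Return a list of expected EmergentTTS-Eval artifacts that are missing."""
    root = Path(data_dir) / "emergent_tts"
    expected = [
        root / "emergent" / "test.jsonl",
        root / "data" / "emergent_tts_eval_data.jsonl",
        root / "data" / "wv_mos.ckpt",
    ]
    missing = [str(p) for p in expected if not p.exists()]
    # baseline_audios/ should be a non-empty directory of wav files
    audios = root / "data" / "baseline_audios"
    if not audios.is_dir() or not any(audios.glob("*.wav")):
        missing.append(str(audios) + "/*.wav")
    return missing
```

An empty return value means all expected files are in place.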
Comment on lines +20 to +28 (Contributor):

⚠️ Potential issue | 🟠 Major

Replace developer-specific absolute paths with generic placeholders.

The README hard-codes /home/vmendelev/workspace/expressiveness/src/nemo-skills-tts-eval (lines 21, 78, 98, and elsewhere) and /lustre/fsw/llmservice_nemo_speechlm/users/vmendelev/code (line 50) throughout every code block. These paths are specific to one developer's environment and will not work for any other contributor.

Replace them with environment variables or clearly marked placeholders, e.g. <REPO_ROOT>, <CLUSTER_WORKDIR>. The <repo_url> on line 51 also needs to be filled in with the actual EmergentTTS-Eval-public repository URL.

Also applies to: 49-52, 77-110

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@nemo_skills/dataset/emergent_tts/README.md` around lines 20 - 28, Replace all
developer-specific absolute paths in the README.md code blocks (e.g.,
`/home/vmendelev/...` and
`/lustre/fsw/llmservice_nemo_speechlm/users/vmendelev/code`) with clear
placeholders like `<REPO_ROOT>` and `<CLUSTER_WORKDIR>` and update the repo
reference `<repo_url>` to the actual EmergentTTS-Eval-public repository URL;
ensure every code block that currently contains hard-coded paths (including the
examples around the prepare.py invocation and the git clone block) uses these
placeholders or environment variables (e.g., export REPO_ROOT="/path/to/repo")
so contributors can substitute their own paths.


Optional flags:

- `--num_samples 10`: write only the first 10 samples (smoke test).
- `--overwrite`: re-download / regenerate outputs.

### 2) Configure evaluation

Use the example configs in `nemo_skills/dataset/emergent_tts/scripts/config/`.

In `scripts/config/default.yaml`, set:

- `generation.data_dir: <DATA_DIR>`
- `scoring.emergent_data_dir: <DATA_DIR>/emergent_tts/data`
- `scoring.scoring_code_path: <PATH_TO>/EmergentTTS-Eval-public` (on the cluster)
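The placeholders in those three keys can be filled from environment variables rather than hand-edited. A hypothetical sketch (the env-var names `DATA_DIR` and `EMERGENT_CODE_DIR` are assumptions, not names the repo reads):

```python
import os

# Fill the <DATA_DIR>/<PATH_TO> placeholders before writing
# scripts/config/default.yaml. Key names mirror the README above;
# the environment-variable names are illustrative.
def render_config(template: str) -> str:
    data_dir = os.environ["DATA_DIR"]
    scoring_code = os.environ.get("EMERGENT_CODE_DIR", data_dir + "/code")
    return template.replace("<DATA_DIR>", data_dir).replace("<PATH_TO>", scoring_code)


TEMPLATE = """\
generation:
  data_dir: <DATA_DIR>
scoring:
  emergent_data_dir: <DATA_DIR>/emergent_tts/data
  scoring_code_path: <PATH_TO>/EmergentTTS-Eval-public
"""
```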

### 3) Clone + patch EmergentTTS-Eval-public for NVIDIA Inference API judging
Review comment (Contributor):

⚠️ Potential issue | 🟡 Minor

Fix duplicate section number — two sections are labeled "3)".

Line 45 is ### 3) Clone + patch EmergentTTS-Eval-public and line 73 is ### 3) Run evaluation. The subsequent section (### 4) Smoke test) is also off by one. The correct numbering should be: 3 → Clone & patch, 4 → Run evaluation, 5 → Smoke test.

Also applies to: 73-73, 95-95

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@nemo_skills/dataset/emergent_tts/README.md` at line 45, The README contains
duplicate section numbering: update the three heading lines so they read "### 3)
Clone + patch EmergentTTS-Eval-public for NVIDIA Inference API judging", change
the second "### 3) Run evaluation" to "### 4) Run evaluation", and increment
"### 4) Smoke test" to "### 5) Smoke test" (these exact heading strings identify
the locations to edit) so the section numbers are sequential.


On EOS (or wherever you run scoring), clone EmergentTTS-Eval:

```bash
cd /lustre/fsw/llmservice_nemo_speechlm/users/vmendelev/code
git clone <repo_url> EmergentTTS-Eval-public
```

Then update Emergent’s judge client selection so that **Gemini models are called via NVIDIA’s OpenAI-compatible Inference API**.

Target behavior:

- **Model name** stays as: `gcp/google/gemini-2.5-pro` (or similar).
- **Base URL** is NVIDIA Inference API: `https://inference-api.nvidia.com/v1`
- **API key** comes from: `JUDGER_API_KEY` (or `NVIDIA_API_KEY`)

Minimal patch checklist inside `EmergentTTS-Eval-public`:

- In `api_clients.py` (or wherever the client is chosen), ensure `gcp/google/*` uses an **OpenAI-compatible** client (not the Google SDK client), e.g.:
- `OpenAI(base_url=<judger_base_url>, api_key=os.getenv("JUDGER_API_KEY"))`
- Thread `judger_base_url` through so calls use `https://inference-api.nvidia.com/v1` (not the full `/v1/chat/completions` endpoint).
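The checklist above amounts to routing by model-name prefix. A sketch of that selection logic, with illustrative names (Emergent's actual `api_clients.py` differs):

```python
# Sketch of the judge-client selection described above. Function and field
# names are hypothetical; only the routing rule comes from the checklist.
def judge_client_config(model_name, judger_base_url=None):
    """Route gcp/google/* models through an OpenAI-compatible endpoint."""
    if model_name.startswith("gcp/google/"):
        return {
            "client": "openai",  # OpenAI-compatible SDK, not the Google SDK client
            "base_url": judger_base_url or "https://inference-api.nvidia.com/v1",
            "api_key_env": "JUDGER_API_KEY",
        }
    return {"client": "default", "base_url": None, "api_key_env": None}
```

With the OpenAI SDK, such a config would feed `OpenAI(base_url=cfg["base_url"], api_key=os.getenv(cfg["api_key_env"]))`.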

After patching, set these in `scripts/config/default.yaml`:

- `scoring.judge_model: gcp/google/gemini-2.5-pro`
- `scoring.judger_base_url: https://inference-api.nvidia.com/v1/chat/completions`
Comment on lines +64 to +71 (Contributor):

⚠️ Potential issue | 🟡 Minor

judger_base_url prose and config example are contradictory.

Line 66 explicitly says to use https://inference-api.nvidia.com/v1 and not the full /v1/chat/completions endpoint. However, the config snippet on line 71 sets:

scoring.judger_base_url: https://inference-api.nvidia.com/v1/chat/completions

One of these must be wrong; resolve the discrepancy before merge.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@nemo_skills/dataset/emergent_tts/README.md` around lines 64 - 71, The README
has a contradiction between the prose and the example config about
judger_base_url: the prose instructs threading judger_base_url as the base URL
(e.g., https://inference-api.nvidia.com/v1) while the config example sets
scoring.judger_base_url to the full chat completions path; fix by choosing the
base-only form and updating the example and any code that consumes it (e.g.,
api_clients.py) to append the endpoint path when constructing requests; ensure
references to judger_base_url and scoring.judger_base_url consistently use the
base (no /v1/chat/completions) and that OpenAI(...) or equivalent client is
created with base_url=judger_base_url and the code appends the proper
/v1/chat/completions suffix where needed, and update the README snippet and
scripts/config/default.yaml to match.
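One defensive way to resolve this class of discrepancy is to normalize the config value so either form works. A hypothetical sketch (not code from either repo):

```python
# Hypothetical normalizer for judger_base_url: accept either the base URL or a
# full /chat/completions endpoint and always return the base form that an
# OpenAI-compatible client expects (the SDK appends the endpoint path itself).
def normalize_judger_base_url(url: str) -> str:
    url = url.rstrip("/")
    suffix = "/chat/completions"
    if url.endswith(suffix):
        url = url[: -len(suffix)]
    return url
```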


### 3) Run evaluation (generation + scoring)

From your dev machine, submit jobs to EOS:

```bash
cd /home/vmendelev/workspace/expressiveness/src/nemo-skills-tts-eval
. ./.venv/bin/activate
mkdir -p .nemo_run

export NEMORUN_HOME="$PWD/.nemo_run"
export NEMO_SKILLS_CONFIG_DIR=/home/vmendelev/workspace/expressiveness/src/ns_eval/cluster_configs
export NEMO_SKILLS_DISABLE_UNCOMMITTED_CHANGES_CHECK=1

# Required for win-rate judging (NVIDIA Inference API key)
export JUDGER_API_KEY="<your_nvidia_api_key>"

python -m nemo_skills.dataset.emergent_tts.scripts.run_tts_eval \
--config nemo_skills/dataset/emergent_tts/scripts/config/default.yaml \
--stage all \
--expname emergent_eval
```

### 4) Smoke test (10 samples, interactive)

```bash
cd /home/vmendelev/workspace/expressiveness/src/nemo-skills-tts-eval
. ./.venv/bin/activate
mkdir -p .nemo_run

export NEMORUN_HOME="$PWD/.nemo_run"
export NEMO_SKILLS_CONFIG_DIR=/home/vmendelev/workspace/expressiveness/src/ns_eval/cluster_configs
export NEMO_SKILLS_DISABLE_UNCOMMITTED_CHANGES_CHECK=1

python -m nemo_skills.dataset.emergent_tts.scripts.run_tts_eval \
--config nemo_skills/dataset/emergent_tts/scripts/config/interactive_10.yaml \
--stage generation \
--expname emergent_smoke10
```

### Outputs

NeMo-Skills generation writes:

- `<output_dir>/eval-results/emergent_tts.emergent/output.jsonl`
- `<output_dir>/eval-results/emergent_tts.emergent/audio/*.wav` (or equivalent)

Emergent scoring writes (in the same benchmark folder):

- `emergent-tts-eval_*_evaluation-predictions.jsonl`
- `emergent-tts-eval_*_evaluation-metrics.json`
- `metrics.json` (a NeMo-Skills-friendly copy of Emergent metrics)
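For post-processing, the files above can be read back with a few lines of Python. A sketch (the metric key names inside `metrics.json` depend on the scoring config and are not assumed here):

```python
import json
from pathlib import Path


# Illustrative reader for the scoring outputs listed above.
def load_results(bench_dir: str):
    bench = Path(bench_dir)
    metrics = json.loads((bench / "metrics.json").read_text(encoding="utf-8"))
    preds = []
    for path in bench.glob("emergent-tts-eval_*_evaluation-predictions.jsonl"):
        with open(path, encoding="utf-8") as f:
            preds.extend(json.loads(line) for line in f if line.strip())
    return metrics, preds
```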

Comment on lines +1 to +124 (Contributor):

⚠️ Potential issue | 🟡 Minor

Add expected evaluation results for at least one tested model.

The README documents the workflow thoroughly but does not include any sample metric output (e.g. WER, MOS, win-rate) for a reference model run. Based on learnings from CONTRIBUTING.md: "When adding new benchmarks, add documentation with example commands for how to run evaluation, expected results for tested models, and any dataset-specific details like special preparation arguments or non-standard inference arguments."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@nemo_skills/dataset/emergent_tts/README.md` around lines 1 - 124, The README
for emergent_tts is missing example evaluation results; update
nemo_skills/dataset/emergent_tts/README.md to include a short "Expected results"
section showing sample metrics (WER, MOS, win-rate) for at least one tested
model and the exact config used; reference the example config path
scripts/config/default.yaml (and interactive_10.yaml for smoke tests) and the
output filenames (eval-results/.../output.jsonl,
emergent-tts-eval_*_evaluation-metrics.json, metrics.json) so readers can
reproduce the run and compare their numbers.

6 changes: 6 additions & 0 deletions nemo_skills/dataset/emergent_tts/__init__.py
@@ -0,0 +1,6 @@
"""EmergentTTS-Eval dataset integration for NeMo-Skills.

This package contains tooling to prepare the EmergentTTS-Eval benchmark for
NeMo-Skills evaluation runs.
"""

3 changes: 3 additions & 0 deletions nemo_skills/dataset/emergent_tts/emergent/__init__.py
@@ -0,0 +1,3 @@
# EmergentTTS-Eval benchmark (NeMo-Skills)

GENERATION_ARGS = "++prompt_format=openai"
238 changes: 238 additions & 0 deletions nemo_skills/dataset/emergent_tts/prepare.py
@@ -0,0 +1,238 @@
#!/usr/bin/env python3
# Copyright (c) 2026, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

"""Prepare EmergentTTS-Eval benchmark for NeMo-Skills.

This script:
1) Downloads the EmergentTTS-Eval HF dataset
2) Saves baseline audios to wav files
3) Writes `data/emergent_tts_eval_data.jsonl` in Emergent's expected schema
4) Downloads `data/wv_mos.ckpt`
5) Writes NeMo-Skills `test.jsonl` for generation (OpenAI prompt format)

Typical usage (to create everything under your shared NeMo-Skills data dir):
python prepare.py --output_dir /lustre/.../data_dir/emergent_tts
"""

from __future__ import annotations

import argparse
import json
import os
import time
import urllib.request
from urllib.error import ContentTooShortError
from pathlib import Path


SYSTEM_MESSAGE = "You are a helpful assistant."
DEFAULT_DATASET = "bosonai/EmergentTTS-Eval"
DEFAULT_SPLIT = "train"
WV_MOS_URL = "https://zenodo.org/record/6201162/files/wav2vec2.ckpt?download=1"


def _require_deps():
try:
import numpy as np # noqa: F401
from datasets import load_dataset # noqa: F401
import librosa # noqa: F401
import soundfile # noqa: F401
from pydub import AudioSegment # noqa: F401
from tqdm import tqdm # noqa: F401
except Exception as e: # pragma: no cover
raise RuntimeError(
"Missing dependencies for EmergentTTS-Eval preparation.\n\n"
"Install into the repo venv:\n"
" cd /home/vmendelev/workspace/expressiveness/src/nemo-skills-tts-eval\n"
" . ./.venv/bin/activate\n"
" pip install datasets numpy pydub tqdm librosa soundfile\n"
) from e
Comment on lines +46 to +61 (Contributor):

⚠️ Potential issue | 🟠 Major

Hardcoded developer path in error message; catch ImportError instead of bare Exception.

Two issues:

  1. Line 58 contains a hardcoded developer-specific path (/home/vmendelev/workspace/...). This is meaningless to other users. Replace with a generic instruction.

  2. The except Exception on line 54 should be except ImportError — that's the only exception expected from failed imports. Catching broader exceptions can mask unrelated bugs (e.g., a library that imports successfully but fails during its own init for a different reason).

Proposed fix
 def _require_deps():
     try:
-        import numpy as np  # noqa: F401
-        from datasets import load_dataset  # noqa: F401
-        import librosa  # noqa: F401
-        import soundfile  # noqa: F401
-        from pydub import AudioSegment  # noqa: F401
-        from tqdm import tqdm  # noqa: F401
-    except Exception as e:  # pragma: no cover
+        import numpy as np
+        from datasets import load_dataset
+        import librosa
+        import soundfile
+        from pydub import AudioSegment
+        from tqdm import tqdm
+    except ImportError as e:  # pragma: no cover
         raise RuntimeError(
             "Missing dependencies for EmergentTTS-Eval preparation.\n\n"
-            "Install into the repo venv:\n"
-            "  cd /home/vmendelev/workspace/expressiveness/src/nemo-skills-tts-eval\n"
-            "  . ./.venv/bin/activate\n"
-            "  pip install datasets numpy pydub tqdm librosa soundfile\n"
+            "Install the required packages:\n"
+            "  pip install datasets numpy pydub tqdm librosa soundfile\n"
         ) from e

As per coding guidelines, "Do not catch exceptions when they are not normally expected to be raised; let code fail with clear errors instead of silently misbehaving".

🧰 Tools
🪛 Ruff (0.15.1)

[warning] 48-53: Unused noqa directives (non-enabled: F401) — remove the unused noqa directives (RUF100)

[warning] 55-61: Avoid specifying long messages outside the exception class (TRY003)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@nemo_skills/dataset/emergent_tts/prepare.py` around lines 46 - 61, In
_require_deps() replace the broad "except Exception" with "except ImportError"
so only failed imports are caught, and update the RuntimeError message to remove
the hardcoded developer path—use a generic installation instruction (e.g.,
"activate your virtualenv and run: pip install datasets numpy pydub tqdm librosa
soundfile") so users get actionable, non-personalized guidance; keep the
original raised-from (from e) behavior when re-raising the RuntimeError.



def _download_wv_mos(dst_path: Path, overwrite: bool) -> None:
dst_path.parent.mkdir(parents=True, exist_ok=True)
if dst_path.exists() and not overwrite:
return
tmp_path = dst_path.with_suffix(dst_path.suffix + ".tmp")

# Zenodo downloads can occasionally fail with ContentTooShortError; retry.
max_attempts = 5
for attempt in range(1, max_attempts + 1):
if tmp_path.exists():
tmp_path.unlink()
try:
urllib.request.urlretrieve(WV_MOS_URL, str(tmp_path))
tmp_path.replace(dst_path)
return
except ContentTooShortError as e:
# Partial download: wait and retry.
wait_s = min(5 * attempt, 30)
print(f"Warning: partial download for wv_mos.ckpt (attempt {attempt}/{max_attempts}): {e}")
time.sleep(wait_s)
except Exception as e:
wait_s = min(5 * attempt, 30)
print(f"Warning: failed downloading wv_mos.ckpt (attempt {attempt}/{max_attempts}): {e}")
time.sleep(wait_s)

raise RuntimeError(f"Failed to download wv_mos.ckpt after {max_attempts} attempts: {WV_MOS_URL}")


def _write_benchmark_init(bench_dir: Path) -> None:
bench_dir.mkdir(parents=True, exist_ok=True)
init_path = bench_dir / "__init__.py"
init_path.write_text(
(
"# EmergentTTS-Eval benchmark (NeMo-Skills)\n\n"
'GENERATION_ARGS = "++prompt_format=openai"\n'
),
encoding="utf-8",
)


def _to_nemo_skills_entry(sample: dict) -> dict:
# MagpieTTS backend expects JSON with at least `text`. We also keep Emergent
# metadata to enable deterministic conversion/scoring later.
payload = {
"text": sample["text_to_synthesize"],
"text_to_synthesize": sample["text_to_synthesize"],
"category": sample["category"],
"evolution_depth": sample["evolution_depth"],
"language": sample["language"],
"unique_id_eval": sample["unique_id_eval"],
# Optional fields used by MagpieTTS evaluation code paths.
"context_audio_filepath": "",
"duration": 5.0,
"context_audio_duration": 5.0,
}
return {
"problem": "",
"messages": [
{"role": "system", "content": SYSTEM_MESSAGE},
{"role": "user", "content": json.dumps(payload, ensure_ascii=False)},
],
}


def main():
_require_deps()
import numpy as np
from datasets import load_dataset
from pydub import AudioSegment
from tqdm import tqdm

parser = argparse.ArgumentParser(description="Prepare EmergentTTS-Eval for NeMo-Skills")
parser.add_argument(
"--output_dir",
type=str,
default=str(Path(__file__).parent),
help="Where to create emergent_tts module structure (default: folder containing this script).",
)
parser.add_argument("--dataset", type=str, default=DEFAULT_DATASET, help="HF dataset name")
parser.add_argument("--split", type=str, default=DEFAULT_SPLIT, help="HF split to download (train contains 1645)")
parser.add_argument(
"--overwrite",
action="store_true",
help="Overwrite existing files (baseline audios, jsonl, wv_mos.ckpt, test.jsonl).",
)
parser.add_argument(
"--num_samples",
type=int,
default=None,
help="Optional: limit number of samples (debug). If set, takes the first N rows.",
)
args = parser.parse_args()

output_dir = Path(args.output_dir).resolve()
data_dir = output_dir / "data"
baseline_audios_dir = data_dir / "baseline_audios"
baseline_audios_dir.mkdir(parents=True, exist_ok=True)

# Emergent expected files
emergent_jsonl_path = data_dir / "emergent_tts_eval_data.jsonl"
wv_mos_path = data_dir / "wv_mos.ckpt"

# NeMo-Skills benchmark module structure
bench_dir = output_dir / "emergent"
test_jsonl_path = bench_dir / "test.jsonl"
_write_benchmark_init(bench_dir)

# Download dataset
dataset_hf = load_dataset(args.dataset, split=args.split)
total = len(dataset_hf) if args.num_samples is None else min(args.num_samples, len(dataset_hf))

if emergent_jsonl_path.exists() and test_jsonl_path.exists() and not args.overwrite:
print(f"Found existing outputs under {output_dir}. Use --overwrite to rebuild.")
else:
if args.overwrite:
for p in [emergent_jsonl_path, test_jsonl_path]:
if p.exists():
p.unlink()

emergent_records: list[dict] = []

# Build emergent jsonl + baseline audios
for i in tqdm(range(total), desc="Preparing EmergentTTS-Eval"):
curr = dataset_hf[i]
unique_id = i

# Save baseline audio
wav_path = baseline_audios_dir / f"{unique_id}.wav"
if args.overwrite or not wav_path.exists():
audio_array = curr["audio"]["array"]
audio_sr = int(curr["audio"]["sampling_rate"])
audio_array_int16 = np.int16(audio_array * 32767)
audio_segment = AudioSegment(
audio_array_int16.tobytes(),
frame_rate=audio_sr,
sample_width=2,
channels=1,
)
audio_segment.export(str(wav_path), format="wav")
Comment on lines +193 to +202 (Contributor):

⚠️ Potential issue | 🟡 Minor

Potential int16 overflow/wrap when audio samples exceed [-1, 1].

np.int16(audio_array * 32767) will silently wrap around if any sample exceeds the [-1.0, 1.0] range (e.g., a value of 1.001 * 32767 = 32799 wraps to -32737 as int16). This produces audible clicks/artifacts. Clip before converting.

Proposed fix
-                audio_array_int16 = np.int16(audio_array * 32767)
+                audio_array_int16 = np.int16(np.clip(audio_array, -1.0, 1.0) * 32767)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@nemo_skills/dataset/emergent_tts/prepare.py` around lines 193 - 202, The
conversion from float samples to int16 can wrap if samples exceed [-1,1]; update
the code that creates audio_array_int16 (working with variables audio_array,
audio_array_int16, and the AudioSegment export to wav_path) to first clamp/clip
audio_array to [-1.0, 1.0] (e.g., via np.clip), then scale and convert to int16
(preferably round then astype) before building AudioSegment and exporting; this
ensures no int16 overflow/wrap and avoids audible artifacts.
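A pure-Python illustration of the clip-then-convert behavior described above (the repo code would use NumPy, e.g. `np.clip(...)` followed by rounding and an int16 cast; this stand-in just makes the wraparound fix concrete):

```python
# Convert float samples in [-1, 1] to int16, clipping first so out-of-range
# values saturate instead of wrapping (e.g. 1.001 maps to 32767, not -32737).
def float_to_int16(samples):
    out = []
    for s in samples:
        s = max(-1.0, min(1.0, s))  # clip before scaling
        out.append(int(round(s * 32767)))
    return out
```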


emergent_records.append(
{
"unique_id_eval": unique_id,
"category": curr["category"],
"text_to_synthesize": curr["text_to_synthesize"],
"evolution_depth": curr["evolution_depth"],
"language": curr["language"],
}
)

# Write emergent jsonl data file
emergent_jsonl_path.parent.mkdir(parents=True, exist_ok=True)
with open(emergent_jsonl_path, "w", encoding="utf-8") as f:
for rec in emergent_records:
f.write(json.dumps(rec, ensure_ascii=False) + "\n")

# Write NeMo-Skills test.jsonl
with open(test_jsonl_path, "w", encoding="utf-8") as f:
for rec in emergent_records:
f.write(json.dumps(_to_nemo_skills_entry(rec), ensure_ascii=False) + "\n")

# Download MOS model checkpoint (used by Emergent scoring)
_download_wv_mos(wv_mos_path, overwrite=args.overwrite)

print("\nPrepared EmergentTTS-Eval:")
print(f" - data dir: {data_dir}")
print(f" - baseline audios: {baseline_audios_dir}")
print(f" - emergent jsonl: {emergent_jsonl_path}")
print(f" - wv_mos.ckpt: {wv_mos_path}")
print(f" - nemo-skills test.jsonl: {test_jsonl_path}")


if __name__ == "__main__":
main()

2 changes: 2 additions & 0 deletions nemo_skills/dataset/emergent_tts/scripts/__init__.py
@@ -0,0 +1,2 @@
"""Scripts for running EmergentTTS-Eval via NeMo-Skills."""
