Add AudioBench and Librispeech-PC benchmarks for speech and audio language models by Jorjeous · Pull Request #1043 · NVIDIA-NeMo/Skills

Jorjeous · 2025-11-14T14:19:32Z

Resolve conflict
Signed-off-by: George Zelenfroind gzelenfroind@nvidia.com

Add AudioBench and fix prepare.py for MMAU-Pro

Jorjeous · 2025-11-19T15:06:17Z

This PR adds evaluation on datasets from AudioBench and applies a minor fix to the manifest format in MMAU-Pro.

To support this, WER and BLEU score calculation was implemented, and the sets were divided into judge and non-judge groups.
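For readers unfamiliar with the metric, here is a minimal sketch of word error rate (WER) as word-level edit distance divided by reference length. It is illustrative only and is not the PR's implementation (which, judging by the later commits, relies on jiwer); the function name `word_error_rate` is hypothetical, and no text normalization is applied.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / number of reference words."""
    ref_words = reference.split()
    hyp_words = hypothesis.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, start=1):
        curr = [i]  # deleting all i reference words to match an empty hypothesis
        for j, hyp_word in enumerate(hyp_words, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr.append(
                min(
                    prev[j] + 1,                      # deletion
                    curr[j - 1] + 1,                  # insertion
                    prev[j - 1] + substitution_cost,  # match or substitution
                )
            )
        prev = curr
    return prev[-1] / len(ref_words)
```

In practice, references and hypotheses are usually normalized (casing, punctuation) before scoring, which matters for a punctuation-and-capitalization benchmark such as LibriSpeech-PC.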

melllinia · 2025-11-21T16:48:54Z

nemo_skills/dataset/audiobench/judge/__init__.py

+# Judge configuration matching AudioBench official implementation
+# Using Llama-3.1-70B with vllm (can be overridden in run scripts)
+JUDGE_PIPELINE_ARGS = {
+    "model": "meta-llama/Meta-Llama-3.1-70B-Instruct",


Please try using the NVIDIA-deployed model from this link instead, and check whether it works: https://build.nvidia.com/meta/llama-3_1-70b-instruct
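A rough sketch of what the suggested change might look like. Only the `model` key appears in the diff above; the `server_type` and `server_address` keys, the endpoint URL, and the hosted model identifier are all assumptions here and would need to be checked against the NeMo-Skills judge pipeline and the model page linked in the comment.

```python
# Hypothetical variant of the judge config that targets an NVIDIA-hosted,
# OpenAI-compatible endpoint instead of a self-hosted vLLM server.
# Every value below is an assumption, not taken from the PR.
JUDGE_PIPELINE_ARGS = {
    "model": "meta/llama-3.1-70b-instruct",  # assumed hosted model identifier
    "server_type": "openai",  # assumed key: use an OpenAI-compatible client
    "server_address": "https://integrate.api.nvidia.com/v1",  # assumed endpoint
}
```

A hosted endpoint typically also requires an API key supplied through the environment, so the run scripts would need to handle that separately.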

gwarmstrong · 2025-12-08T17:05:17Z

Closed in favor of #1060

melllinia self-requested a review November 14, 2025 14:21

melllinia force-pushed the audiobench-benchmark branch from 75c51eb to 97ca5b8 Compare November 21, 2025 15:07

Jorjeous and others added 17 commits November 21, 2025 08:22

Add AudioBench benchmark for speech and audio language models

f77e2ab

Resolve conflict Signed-off-by: George Zelenfroind <gzelenfroind@nvidia.com> Signed-off-by: George Zelenfroind <gzelenfroind@nvidia.com>

update prepare.py for audiobench

0a600d4

Signed-off-by: George Zelenfroind <gzelenfroind@nvidia.com>

Fix on mmau-pro prepare.py

f23e8ce

Signed-off-by: George Zelenfroind <gzelenfroind@nvidia.com>

add absolute path's to prepare.py

e734aab

Signed-off-by: George Zelenfroind <gzelenfroind@nvidia.com>

update names

47122b4

Signed-off-by: George Zelenfroind <gzelenfroind@nvidia.com>

update destination for downloading

325868e

Signed-off-by: George Zelenfroind <gzelenfroind@nvidia.com>

Placeholder for proof verification paper (NVIDIA-NeMo#1037)

c48c7b8

Signed-off-by: Sadegh Mahdavi <smahdavi@nvidia.com> Signed-off-by: George Zelenfroind <gzelenfroind@nvidia.com>

Converting ICPC25 to ICPC evaluation (NVIDIA-NeMo#1045)

460f3e6

Signed-off-by: msamadi <msamadi@nvidia.com> Co-authored-by: msamadi <msamadi@nvidia.com> Signed-off-by: George Zelenfroind <gzelenfroind@nvidia.com>

LibriSpeech PC Benchmark Evaluation

83b3f7f

Signed-off-by: mmkrtchyan <mmkrtchyan@nvidia.com> Signed-off-by: George Zelenfroind <gzelenfroind@nvidia.com>

Testline

09dae53

Signed-off-by: George Zelenfroind <gzelenfroind@nvidia.com>

revert

5e787a5

Signed-off-by: George Zelenfroind <gzelenfroind@nvidia.com>

upd strtucture

ffbfc0f

Signed-off-by: George Zelenfroind <gzelenfroind@nvidia.com>

Change judge config to align with Audiobench's

88f72b7

Signed-off-by: George Zelenfroind <gzelenfroind@nvidia.com>

upd __init__ files

0a87820

Signed-off-by: George Zelenfroind <gzelenfroind@nvidia.com>

changed organization of sets + minor additions

232c5e4

Signed-off-by: George Zelenfroind <gzelenfroind@nvidia.com>

Revert "Converting ICPC25 to ICPC evaluation (NVIDIA-NeMo#1045)"

311f638

Author did not sign the commit. This reverts commit ecfafd1. Signed-off-by: George Zelenfroind <gzelenfroind@nvidia.com>

Revert "Placeholder for proof verification paper (NVIDIA-NeMo#1037)"

2990929

Author sign-off is incorrect. This reverts commit 353c202. Signed-off-by: George Zelenfroind <gzelenfroind@nvidia.com>

Jorjeous force-pushed the audiobench-benchmark branch from 2e81d8e to 2990929 Compare November 21, 2025 16:22

Jorjeous added 2 commits November 21, 2025 08:25

linter

b8981f4

Signed-off-by: George Zelenfroind <gzelenfroind@nvidia.com>

update .gitignore

90cec88

Signed-off-by: George Zelenfroind <gzelenfroind@nvidia.com>

melllinia reviewed Nov 21, 2025

add LS-PnC

be2cd1a

Signed-off-by: George Zelenfroind <gzelenfroind@nvidia.com>

karpnv requested a review from melllinia November 24, 2025 15:36

melllinia added 2 commits November 25, 2025 17:34

Add LibriSpeech-PC documentation and fix jiwer import

aaf7814

- Add comprehensive documentation for LibriSpeech-PC benchmark in speech-audio.md
- Fix jiwer import to be lazy (only import when needed for ASR evaluation)

Signed-off-by: mmkrtchyan <mmkrtchyan@nvidia.com>
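The lazy-import fix described in this commit can be sketched roughly as follows. This is an illustration of the pattern rather than the code from the PR: the function name `score_sample`, the `task_type` argument, and the sample keys are hypothetical. The only library call used is `jiwer.wer(reference, hypothesis)`.

```python
import importlib


def score_sample(sample: dict, task_type: str) -> float:
    """Score one prediction; jiwer is loaded only when an ASR sample is scored."""
    if task_type == "asr":
        # Deferred import: environments that never run ASR evaluation
        # do not need jiwer installed.
        jiwer = importlib.import_module("jiwer")
        return jiwer.wer(sample["reference"], sample["hypothesis"])
    # Non-ASR fallback for illustration: exact match after trimming whitespace.
    return float(sample["reference"].strip() == sample["hypothesis"].strip())
```

With the import inside the function, importing the evaluation module itself no longer fails when jiwer is missing; the error surfaces only if an ASR benchmark is actually run.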

Improving mmau-pro metric calculation

a3a1169

Signed-off-by: mmkrtchyan <mmkrtchyan@nvidia.com>

melllinia changed the title ~~Add AudioBench benchmark for speech and audio language models~~ Add AudioBench and Librispeech-PC benchmarks for speech and audio language models Nov 25, 2025

Merge remote-tracking branch 'origin/main' into librispeech-pc-eval

771f2e6

gwarmstrong closed this Dec 8, 2025

Jorjeous wants to merge 23 commits into NVIDIA-NeMo:main from Jorjeous:audiobench-benchmark

5 participants
