Add new Audio metric DNSMOS #2525

Merged
merged 106 commits into from
Jul 17, 2024
Changes from 44 commits
Commits
e235e35
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Apr 30, 2024
60c7318
+DNSMOS
quancs Apr 30, 2024
213e6ba
update
quancs Apr 30, 2024
fb04f0c
Merge branch 'DNSMOS' of https://github.com/quancs/torchmetrics into …
quancs Apr 30, 2024
284b3aa
fix
quancs Apr 30, 2024
7b01a01
+test
quancs May 1, 2024
0604b7e
fix
quancs May 1, 2024
0da07fe
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] May 1, 2024
a4749f3
fix
quancs May 1, 2024
572b308
fix
quancs May 1, 2024
fc06a33
fix
quancs May 1, 2024
10c14b2
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] May 1, 2024
6cff8cc
fix
quancs May 1, 2024
4be8b47
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] May 1, 2024
f1e1d96
+req
quancs May 1, 2024
6f69898
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] May 1, 2024
87d9e70
fix
quancs May 2, 2024
4f20381
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] May 2, 2024
b5ad81e
fix
quancs May 2, 2024
0811527
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] May 3, 2024
6c52870
fix
quancs May 3, 2024
bdd98ef
fix
quancs May 3, 2024
2862530
fix
quancs May 3, 2024
b6f5200
Merge branch 'DNSMOS' of https://github.com/quancs/torchmetrics into …
quancs May 3, 2024
ba3d1b2
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] May 3, 2024
32b10f5
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] May 3, 2024
43b7e5b
fix
quancs May 3, 2024
f1af642
fix
quancs May 3, 2024
b845498
Merge branch 'DNSMOS' of https://github.com/quancs/torchmetrics into …
quancs May 3, 2024
29510ef
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] May 3, 2024
f96d887
fix
quancs May 3, 2024
1588d51
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] May 3, 2024
d0a602d
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] May 3, 2024
99f8810
fix
quancs May 3, 2024
683e5f1
fix
quancs May 3, 2024
8d23eeb
Merge branch 'DNSMOS' of https://github.com/quancs/torchmetrics into …
quancs May 3, 2024
65b8446
fix
quancs May 3, 2024
5048eaf
fix removeprefix
quancs May 3, 2024
b7a8563
Merge branch 'DNSMOS' of https://github.com/quancs/torchmetrics into …
quancs May 3, 2024
d3daa89
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] May 3, 2024
af48fe2
Apply suggestions from code review
Borda May 6, 2024
d70400f
Merge branch 'master' into DNSMOS
Borda May 6, 2024
808774c
Update CHANGELOG.md
Borda May 6, 2024
21be811
Apply suggestions from code review
Borda May 6, 2024
74cd984
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] May 6, 2024
91964b2
Merge branch 'master' into DNSMOS
Borda May 7, 2024
9293010
add min & max versions
quancs May 7, 2024
04edab6
fix
quancs May 7, 2024
5eb0dd3
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] May 7, 2024
baef964
fix
quancs May 8, 2024
e58c4da
fix
quancs May 8, 2024
78c1eef
fix
quancs May 8, 2024
be0cafc
fix
quancs May 8, 2024
2817fc9
fix
quancs May 8, 2024
e6fdb60
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] May 8, 2024
dba383e
fix
quancs May 11, 2024
e1876cc
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] May 11, 2024
3a75ea4
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] May 13, 2024
fe5f54d
fix
quancs May 13, 2024
f24e363
Merge branch 'master' into DNSMOS
Borda May 14, 2024
d69b03c
Merge branch 'master' into DNSMOS
Borda May 19, 2024
7ee68e0
python==3.11, onnxruntime>=1.17.0
quancs May 20, 2024
7796c1a
Merge branch 'DNSMOS' of https://github.com/quancs/torchmetrics into …
quancs May 20, 2024
fa2c2cc
Merge branch 'master' into DNSMOS
Borda May 21, 2024
c6888be
Apply suggestions from code review
Borda May 21, 2024
36e0962
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] May 21, 2024
bf813a2
with 1.19
Borda May 21, 2024
8575e76
Merge branch 'master' into DNSMOS
Borda May 21, 2024
d77f591
Merge branch 'master' into DNSMOS
Borda May 29, 2024
a248058
Apply suggestions from code review
Borda May 31, 2024
2904d6b
Merge branch 'master' into DNSMOS
Borda May 31, 2024
616f98b
+num_theads
quancs May 31, 2024
0f2b3f8
+P835 paper
quancs May 31, 2024
24a9c46
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] May 31, 2024
d3b364e
Merge branch 'master' into DNSMOS
Borda Jun 2, 2024
a500116
fix req
quancs Jun 19, 2024
bcdad95
Merge branch 'master' into DNSMOS
quancs Jun 19, 2024
4937e41
fix
quancs Jun 19, 2024
14f0b2a
upgrade pip for resolving onnxruntime installation issue
quancs Jun 19, 2024
1fb40b1
fix onnxruntime==1.18 for python==3.11
quancs Jun 19, 2024
8a9284b
Merge branch 'master' into DNSMOS
quancs Jun 25, 2024
080c76a
Merge branch 'master' into DNSMOS
Borda Jun 25, 2024
f591c90
update
quancs Jun 25, 2024
3c58fb9
Merge branch 'master' into DNSMOS
quancs Jun 25, 2024
bf8e655
Merge branch 'master' into DNSMOS
Borda Jul 6, 2024
7001d9a
Merge branch 'master' into DNSMOS
Borda Jul 9, 2024
fca8d29
Apply suggestions from code review
Borda Jul 9, 2024
6a1e1fb
onnxruntime
quancs Jul 10, 2024
7167b99
fix
quancs Jul 10, 2024
a681818
fix
quancs Jul 10, 2024
5e2591a
use onnxruntime-gpu
quancs Jul 13, 2024
427cc35
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Jul 13, 2024
4afc647
Apply suggestions from code review
Borda Jul 13, 2024
7a1d4b7
move to top
quancs Jul 13, 2024
9b1f57b
fix
quancs Jul 13, 2024
9676745
use onnx-gpu for linux & win, onnx for macos
quancs Jul 13, 2024
e627082
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Jul 13, 2024
287957c
fix line too long
quancs Jul 14, 2024
9530c70
print available providers
quancs Jul 14, 2024
9e13bf6
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Jul 14, 2024
7865b02
+warn if failed to compute DNSMOS on GPU
quancs Jul 14, 2024
7f0ffad
Merge branch 'master' into DNSMOS
quancs Jul 14, 2024
ebd303f
Merge branch 'master' into DNSMOS
Borda Jul 16, 2024
5618f7a
Merge branch 'master' into DNSMOS
Borda Jul 17, 2024
5763494
Merge branch 'master' into DNSMOS
Borda Jul 17, 2024
6037c98
Merge branch 'master' into DNSMOS
mergify[bot] Jul 17, 2024
2 changes: 1 addition & 1 deletion CHANGELOG.md
@@ -12,7 +12,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

### Added

-
- Added a new audio metric `DNSMOS` ([#2525](https://github.com/PyTorchLightning/metrics/pull/2525))


### Changed
21 changes: 21 additions & 0 deletions docs/source/audio/deep_noise_suppression_mean_opinion_score.rst
@@ -0,0 +1,21 @@
.. customcarditem::
   :header: Deep Noise Suppression Mean Opinion Score (DNSMOS)
   :image: https://pl-flash-data.s3.amazonaws.com/assets/thumbnails/audio_classification.svg
   :tags: Audio

.. include:: ../links.rst

##################################################
Deep Noise Suppression Mean Opinion Score (DNSMOS)
##################################################

Module Interface
________________

.. autoclass:: torchmetrics.audio.dnsmos.DeepNoiseSuppressionMeanOpinionScore
    :exclude-members: update, compute

Functional Interface
____________________

.. autofunction:: torchmetrics.functional.audio.dnsmos.deep_noise_suppression_mean_opinion_score
1 change: 1 addition & 0 deletions docs/source/links.rst
@@ -110,6 +110,7 @@
.. _Theils Uncertainty coefficient: https://en.wikipedia.org/wiki/Uncertainty_coefficient
.. _Perceptual Evaluation of Speech Quality: https://en.wikipedia.org/wiki/Perceptual_Evaluation_of_Speech_Quality
.. _pesq package: https://github.com/ludlows/python-pesq
.. _Deep Noise Suppression performance evaluation based on Mean Opinion Score: https://ieeexplore.ieee.org/document/9414878
.. _Cees Taal's website: http://www.ceestaal.nl/code/
.. _pystoi package: https://github.com/mpariente/pystoi
.. _stoi ref1: https://ieeexplore.ieee.org/abstract/document/5495701
3 changes: 3 additions & 0 deletions requirements/audio.txt
@@ -6,3 +6,6 @@ pesq @ git+https://github.com/ludlows/python-pesq
pystoi >=0.3.0, <0.5.0
torchaudio >=0.10.0, <2.4.0
gammatone @ https://github.com/detly/gammatone/archive/master.zip#egg=Gammatone
librosa
onnxruntime-gpu
requests
7 changes: 7 additions & 0 deletions src/torchmetrics/audio/__init__.py
@@ -24,6 +24,8 @@
)
from torchmetrics.utilities.imports import (
    _GAMMATONE_AVAILABLE,
    _LIBROSA_AVAILABLE,
    _ONNXRUNTIME_AVAILABLE,
    _PESQ_AVAILABLE,
    _PYSTOI_AVAILABLE,
    _TORCHAUDIO_AVAILABLE,
@@ -54,3 +56,8 @@
    from torchmetrics.audio.srmr import SpeechReverberationModulationEnergyRatio

    __all__ += ["SpeechReverberationModulationEnergyRatio"]

if _LIBROSA_AVAILABLE and _ONNXRUNTIME_AVAILABLE:
    from torchmetrics.audio.dnsmos import DeepNoiseSuppressionMeanOpinionScore

    __all__ += ["DeepNoiseSuppressionMeanOpinionScore"]
166 changes: 166 additions & 0 deletions src/torchmetrics/audio/dnsmos.py
@@ -0,0 +1,166 @@
# Copyright The Lightning team.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from typing import Any, Optional, Sequence, Union

import torch
from torch import Tensor, tensor

from torchmetrics.functional.audio.dnsmos import deep_noise_suppression_mean_opinion_score
from torchmetrics.metric import Metric
from torchmetrics.utilities.imports import (
    _LIBROSA_AVAILABLE,
    _MATPLOTLIB_AVAILABLE,
    _ONNXRUNTIME_AVAILABLE,
    _REQUESTS_AVAILABLE,
)
from torchmetrics.utilities.plot import _AX_TYPE, _PLOT_OUT_TYPE

__doctest_requires__ = {"DeepNoiseSuppressionMeanOpinionScore": ["requests", "librosa", "onnxruntime"]}

if not _MATPLOTLIB_AVAILABLE:
    __doctest_skip__ = ["DeepNoiseSuppressionMeanOpinionScore.plot"]


class DeepNoiseSuppressionMeanOpinionScore(Metric):
    """Calculate `Deep Noise Suppression performance evaluation based on Mean Opinion Score`_ (DNSMOS).

    Human subjective evaluation is the "gold standard" to evaluate speech quality optimized for human perception.
    Perceptual objective metrics serve as a proxy for subjective scores. The conventional and widely used metrics
    require a reference clean speech signal, which is unavailable in real recordings. The no-reference approaches
    correlate poorly with human ratings and are not widely adopted in the research community. One of the biggest
    use cases of these perceptual objective metrics is to evaluate noise suppression algorithms. DNSMOS generalizes
    well in challenging test conditions with a high correlation to human ratings in stack ranking noise suppression
    methods. More details can be found in the `DNSMOS paper <https://arxiv.org/pdf/2010.15258.pdf>`_.

    As input to ``forward`` and ``update`` the metric accepts the following input

    - ``preds`` (:class:`~torch.Tensor`): float tensor with shape ``(..., time)``

    As output of ``forward`` and ``compute`` the metric returns the following output

    - ``dnsmos`` (:class:`~torch.Tensor`): float tensor of DNSMOS values reduced across the batch
      with shape ``(4,)`` indicating [p808_mos, mos_sig, mos_bak, mos_ovr].

    .. note:: using this metric requires you to have ``librosa``, ``onnxruntime`` and ``requests`` installed.
        Install as ``pip install librosa onnxruntime-gpu requests``.

    .. note:: the ``forward`` and ``compute`` methods in this class return a reduced DNSMOS value
        for a batch. To obtain the DNSMOS value for each sample, you may use the functional counterpart in
        :func:`~torchmetrics.functional.audio.dnsmos.deep_noise_suppression_mean_opinion_score`.

    Args:
        fs: sampling frequency
        personalized: whether interfering speaker is penalized
        device: the device used for calculating DNSMOS, can be ``cpu`` or ``cuda:n``, where ``n`` is the GPU index.
            If ``None`` is given, the device of the input is used.

    Raises:
        ModuleNotFoundError:
            If ``librosa``, ``onnxruntime`` or ``requests`` packages are not installed

    Example:
        >>> from torch import randn
        >>> from torchmetrics.audio import DeepNoiseSuppressionMeanOpinionScore
        >>> g = torch.manual_seed(1)
        >>> preds = randn(8000)
        >>> dnsmos = DeepNoiseSuppressionMeanOpinionScore(8000, False)
        >>> dnsmos(preds)
        tensor([2.2285, 2.1132, 1.3972, 1.3652], dtype=torch.float64)

    """

    sum_dnsmos: Tensor
    total: Tensor
    full_state_update: bool = False
    is_differentiable: bool = False
    higher_is_better: bool = True
    plot_lower_bound: float = 0
    plot_upper_bound: float = 5

    def __init__(
        self,
        fs: int,
        personalized: bool,
        device: Optional[str] = None,
        **kwargs: Any,
    ) -> None:
        super().__init__(**kwargs)
        if not _LIBROSA_AVAILABLE or not _ONNXRUNTIME_AVAILABLE or not _REQUESTS_AVAILABLE:
            raise ModuleNotFoundError(
                "DNSMOS metric requires that librosa, onnxruntime and requests are installed."
                " Install as `pip install librosa onnxruntime-gpu requests`."
            )

        self.fs = fs
        self.personalized = personalized
        self.cal_device = device

        self.add_state("sum_dnsmos", default=tensor([0, 0, 0, 0], dtype=torch.float64), dist_reduce_fx="sum")
        self.add_state("total", default=tensor(0), dist_reduce_fx="sum")

    def update(self, preds: Tensor) -> None:
        """Update state with predictions."""
        metric_batch = deep_noise_suppression_mean_opinion_score(
            preds,
            self.fs,
            self.personalized,
            self.cal_device,
        ).to(self.sum_dnsmos.device)

        self.sum_dnsmos += metric_batch.reshape(-1, 4).sum(dim=0)
        self.total += metric_batch.reshape(-1, 4).shape[0]

    def compute(self) -> Tensor:
        """Compute metric."""
        return self.sum_dnsmos / self.total

    def plot(self, val: Union[Tensor, Sequence[Tensor], None] = None, ax: Optional[_AX_TYPE] = None) -> _PLOT_OUT_TYPE:
        """Plot a single or multiple values from the metric.

        Args:
            val: Either a single result from calling `metric.forward` or `metric.compute` or a list of these results.
                If no value is provided, will automatically call `metric.compute` and plot that result.
            ax: A matplotlib axis object. If provided, will add the plot to that axis

        Returns:
            Figure and Axes object

        Raises:
            ModuleNotFoundError:
                If `matplotlib` is not installed

        .. plot::
            :scale: 75

            >>> # Example plotting a single value
            >>> import torch
            >>> from torchmetrics.audio import DeepNoiseSuppressionMeanOpinionScore
            >>> metric = DeepNoiseSuppressionMeanOpinionScore(8000, False)
            >>> metric.update(torch.rand(8000))
            >>> fig_, ax_ = metric.plot()

        .. plot::
            :scale: 75

            >>> # Example plotting multiple values
            >>> import torch
            >>> from torchmetrics.audio import DeepNoiseSuppressionMeanOpinionScore
            >>> metric = DeepNoiseSuppressionMeanOpinionScore(8000, False)
            >>> values = [ ]
            >>> for _ in range(10):
            ...     values.append(metric(torch.rand(8000)))
            >>> fig_, ax_ = metric.plot(values)

        """
        return self._plot(val, ax)
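
The `update`/`compute` pair in this file keeps a running sum of the four scores plus a sample count, so the final value is the mean over every sample seen across all batches. A dependency-free sketch of that bookkeeping follows. `RunningDnsmosMean` is a hypothetical name, plain lists stand in for tensors, and each row is assumed to be one sample's `[p808_mos, mos_sig, mos_bak, mos_ovr]`.

```python
from typing import List, Sequence


class RunningDnsmosMean:
    """Plain-Python stand-in for the sum/total states (illustration only)."""

    def __init__(self) -> None:
        self.sum_scores: List[float] = [0.0, 0.0, 0.0, 0.0]
        self.total: int = 0

    def update(self, rows: Sequence[Sequence[float]]) -> None:
        # Similar in spirit to metric_batch.reshape(-1, 4).sum(dim=0)
        for row in rows:
            if len(row) != 4:
                raise ValueError("each row must hold exactly 4 scores")
            for i, value in enumerate(row):
                self.sum_scores[i] += value
        # One row per sample, so the row count is the number of samples
        self.total += len(rows)

    def compute(self) -> List[float]:
        # Mean over all samples seen so far, not a mean of per-batch means
        return [s / self.total for s in self.sum_scores]


metric = RunningDnsmosMean()
metric.update([[1.0, 2.0, 3.0, 4.0], [3.0, 2.0, 1.0, 4.0]])  # batch of two samples
metric.update([[2.0, 2.0, 2.0, 1.0]])  # batch of one sample
print(metric.compute())  # [2.0, 2.0, 2.0, 3.0]
```

Because sums and counts are accumulated separately, batches of different sizes are weighted correctly, and both states can be combined across processes with a plain sum, which matches the `dist_reduce_fx="sum"` choice in `add_state`.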
7 changes: 7 additions & 0 deletions src/torchmetrics/functional/audio/__init__.py
@@ -24,6 +24,8 @@
)
from torchmetrics.utilities.imports import (
    _GAMMATONE_AVAILABLE,
    _LIBROSA_AVAILABLE,
    _ONNXRUNTIME_AVAILABLE,
    _PESQ_AVAILABLE,
    _PYSTOI_AVAILABLE,
    _TORCHAUDIO_AVAILABLE,
@@ -55,3 +57,8 @@
    from torchmetrics.functional.audio.srmr import speech_reverberation_modulation_energy_ratio

    __all__ += ["speech_reverberation_modulation_energy_ratio"]

if _LIBROSA_AVAILABLE and _ONNXRUNTIME_AVAILABLE:
    from torchmetrics.functional.audio.dnsmos import deep_noise_suppression_mean_opinion_score

    __all__ += ["deep_noise_suppression_mean_opinion_score"]
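
The functional `deep_noise_suppression_mean_opinion_score` exported here receives the `device` argument documented on the class (`cpu`, `cuda:n`, or `None` to follow the input). Its implementation is not shown in this diff, so the following is only a sketch of how such a string could be turned into an ONNX Runtime provider list. `providers_for_device` and its `input_device` parameter are hypothetical; the provider names and the `device_id` option are ONNX Runtime's own.

```python
from typing import List, Optional, Tuple, Union

Provider = Union[str, Tuple[str, dict]]


def providers_for_device(device: Optional[str], input_device: str = "cpu") -> List[Provider]:
    """Map "cpu" / "cuda:n" to an ONNX Runtime style provider list (hypothetical helper)."""
    # None means "follow the device of the input", as the docstring describes
    resolved = input_device if device is None else device
    if resolved == "cpu":
        return ["CPUExecutionProvider"]
    if resolved == "cuda" or resolved.startswith("cuda:"):
        # "cuda" alone is treated as GPU 0 (an assumption); "cuda:n" selects GPU n
        _, _, index = resolved.partition(":")
        device_id = int(index) if index else 0
        # Keep the CPU provider as a fallback in case CUDA is unavailable
        return [("CUDAExecutionProvider", {"device_id": device_id}), "CPUExecutionProvider"]
    raise ValueError(f"unsupported device: {resolved!r}")


print(providers_for_device("cuda:1"))
```

A list built this way could be passed as the `providers` argument when creating an `onnxruntime.InferenceSession`; listing the CPU provider last lets the runtime fall back when no GPU is usable, which is in line with the later commits in this PR that warn when DNSMOS cannot be computed on the GPU.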