Add BaryScore
#852
Comments
Interested!
Great, @ashutoshml! Let's just wait on #849 as we discussed, since there'll be many common dependencies :]
Hi @ashutoshml - I'd like to apologize for my delay here. I've been quite busy and unable to complete my part yet. Gonna try to make as much progress as possible this weekend.
Hello @ashutoshml, very sorry for the really long delay, the last months have been pretty turbulent for me. Let me know if you're still interested in contributing this metric to TorchMetrics.
Hi @stancld. I hope things are ok now.
Hi @ashutoshml, it's completely fine - time is really flying, so we're almost there. If you're still interested, just let me know. I have some WIP we can discuss and we can create an MR together :]
Hello @ashutoshml, any updates here? :]
Hi @stancld - I was occupied with some work. I will have a look at it this weekend.
Great, looking forward to it @ashutoshml 🚀
🚀 Feature
Add BaryScore
Sources:
Motivation
Recent NLG metrics are increasingly based on BERT (or related) embeddings. As such, I believe we should also start adding such metrics to TorchMetrics, with an extra dependency on transformers if a user wants to use any of these metrics. The BaryScore metric belongs to a family of untrained metrics (i.e. the model is not fine-tuned on any specific task), so it should be an easy one to begin with.
Abstract:
A new metric BaryScore to evaluate text generation based on deep contextualized embeddings (e.g., BERT, Roberta, ELMo) is introduced. This metric is motivated by a new framework relying on optimal transport tools, i.e., Wasserstein distance and barycenter. By modelling the layer output of deep contextualized embeddings as a probability distribution rather than by a vector embedding, this framework provides a natural way to aggregate the different outputs through the Wasserstein space topology. In addition, it provides theoretical grounds to our metric and offers an alternative to available solutions (e.g., MoverScore and BertScore). Numerical evaluation is performed on four different tasks: machine translation, summarization, data2text generation and image captioning. Our results show that BaryScore outperforms other BERT based metrics and exhibits more consistent behaviour in particular for text summarization.
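To make the core idea concrete, below is a minimal sketch, not the paper's reference implementation and not an existing TorchMetrics API: the token embeddings from a contextualized encoder are treated as the support of a discrete distribution, and candidate vs. reference are compared via an entropy-regularized (Sinkhorn) approximation of the Wasserstein distance. The model name, the uniform token weights, the cost normalization, and the Sinkhorn hyper-parameters are illustrative assumptions; the full BaryScore additionally aggregates several hidden layers through a Wasserstein barycenter, which this sketch omits.

```python
# Minimal sketch of the Wasserstein-distance idea behind BaryScore.
# NOT the official implementation; model name and hyper-parameters are assumptions.
import torch
from transformers import AutoModel, AutoTokenizer

MODEL_NAME = "bert-base-uncased"  # assumption: any BERT-like encoder would do

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModel.from_pretrained(MODEL_NAME, output_hidden_states=True).eval()


def token_embeddings(text: str, layer: int = -1) -> torch.Tensor:
    """Return one layer's hidden states as a (num_tokens, dim) matrix."""
    enc = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        out = model(**enc)
    return out.hidden_states[layer].squeeze(0)


def sinkhorn_wasserstein(x: torch.Tensor, y: torch.Tensor,
                         eps: float = 0.1, n_iter: int = 100) -> float:
    """Entropy-regularized Wasserstein distance between two uniform discrete
    distributions supported on the rows of x and y (scale-normalized costs)."""
    a = torch.full((x.shape[0],), 1.0 / x.shape[0])  # uniform token weights
    b = torch.full((y.shape[0],), 1.0 / y.shape[0])
    cost = torch.cdist(x, y, p=2)                    # pairwise Euclidean costs
    cost = cost / cost.max()                         # normalize for numerical stability
    K = torch.exp(-cost / eps)
    u = torch.ones_like(a)
    for _ in range(n_iter):                          # standard Sinkhorn updates
        v = b / (K.t() @ u)
        u = a / (K @ v)
    transport = u[:, None] * K * v[None, :]          # approximate transport plan
    return float((transport * cost).sum())


candidate = "the cat sat on the mat"
reference = "a cat was sitting on the mat"
score = sinkhorn_wasserstein(token_embeddings(candidate),
                             token_embeddings(reference))
print(f"approximate Wasserstein distance: {score:.4f}")  # lower = more similar
```

A sketch like this also illustrates why the metric would only need transformers as an extra (optional) dependency: everything else is plain torch.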